mirror of
https://github.com/torproject/torspec.git
synced 2024-11-27 03:40:47 +00:00
315 lines
13 KiB
Plaintext
315 lines
13 KiB
Plaintext
|
|
Tor Protocol Specification
|
|
|
|
Roger Dingledine
|
|
Nick Mathewson
|
|
|
|
0. Preliminaries
|
|
|
|
THIS SPECIFICATION IS OBSOLETE.
|
|
|
|
This document specifies the Tor directory protocol as used in version
|
|
0.1.0.x and earlier. See dir-spec.txt for a current version.
|
|
|
|
1. Basic operation
|
|
|
|
There is a small number of directory authorities, and a larger number of
|
|
caches. Client and servers know public keys for the directory authorities.
|
|
Tor servers periodically upload self-signed "router descriptors" to the
|
|
directory authorities. Each authority publishes a self-signed "directory"
|
|
(containing all the router descriptors it knows, and a statement on which
|
|
are running) and a self-signed "running routers" document containing only
|
|
the statement on which routers are running.
|
|
|
|
All Tors periodically download these documents, downloading the directory
|
|
less frequently than they do the "running routers" document. Clients
|
|
preferentially download from caches rather than authorities.
|
|
|
|
1.1. Document format
|
|
|
|
Router descriptors, directories, and running-routers documents all obey the
|
|
following lightweight extensible information format.
|
|
|
|
The highest level object is a Document, which consists of one or more
|
|
Items. Every Item begins with a KeywordLine, followed by one or more
|
|
Objects. A KeywordLine begins with a Keyword, optionally followed by
|
|
whitespace and more non-newline characters, and ends with a newline. A
|
|
Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
|
|
An Object is a block of encoded data in pseudo-Open-PGP-style
|
|
armor. (cf. RFC 2440)
|
|
|
|
More formally:
|
|
|
|
Document ::= (Item | NL)+
|
|
Item ::= KeywordLine Object*
|
|
KeywordLine ::= Keyword NL | Keyword WS ArgumentsChar+ NL
|
|
Keyword = KeywordChar+
|
|
KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
|
|
ArgumentChar ::= any printing ASCII character except NL.
|
|
WS = (SP | TAB)+
|
|
Object ::= BeginLine Base-64-encoded-data EndLine
|
|
BeginLine ::= "-----BEGIN " Keyword "-----" NL
|
|
EndLine ::= "-----END " Keyword "-----" NL
|
|
|
|
The BeginLine and EndLine of an Object must use the same keyword.
|
|
|
|
When interpreting a Document, software MUST reject any document containing a
|
|
KeywordLine that starts with a keyword it doesn't recognize.
|
|
|
|
The "opt" keyword is reserved for non-critical future extensions. All
|
|
implementations MUST ignore any item of the form "opt keyword ....." when
|
|
they would not recognize "keyword ....."; and MUST treat "opt keyword ....."
|
|
as synonymous with "keyword ......" when keyword is recognized.
|
|
|
|
2. Router descriptor format.
|
|
|
|
Every router descriptor MUST start with a "router" Item; MUST end with a
|
|
"router-signature" Item and an extra NL; and MUST contain exactly one
|
|
instance of each of the following Items: "published" "onion-key" "link-key"
|
|
"signing-key" "bandwidth". Additionally, a router descriptor MAY contain
|
|
any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items.
|
|
Other than "router" and "router-signature", the items may appear in any
|
|
order.
|
|
|
|
The items' formats are as follows:
|
|
"router" nickname address ORPort SocksPort DirPort
|
|
|
|
Indicates the beginning of a router descriptor. "address"
|
|
must be an IPv4 address in dotted-quad format. The last
|
|
three numbers indicate the TCP ports at which this OR exposes
|
|
functionality. ORPort is a port at which this OR accepts TLS
|
|
connections for the main OR protocol; SocksPort is deprecated and
|
|
should always be 0; and DirPort is the port at which this OR accepts
|
|
directory-related HTTP connections. If any port is not supported,
|
|
the value 0 is given instead of a port number.
|
|
|
|
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
|
|
|
|
Estimated bandwidth for this router, in bytes per second. The
|
|
"average" bandwidth is the volume per second that the OR is willing
|
|
to sustain over long periods; the "burst" bandwidth is the volume
|
|
that the OR is willing to sustain in very short intervals. The
|
|
"observed" value is an estimate of the capacity this server can
|
|
handle. The server remembers the max bandwidth sustained output
|
|
over any ten second period in the past day, and another sustained
|
|
input. The "observed" value is the lesser of these two numbers.
|
|
|
|
"platform" string
|
|
|
|
A human-readable string describing the system on which this OR is
|
|
running. This MAY include the operating system, and SHOULD include
|
|
the name and version of the software implementing the Tor protocol.
|
|
|
|
"published" YYYY-MM-DD HH:MM:SS
|
|
|
|
The time, in GMT, when this descriptor was generated.
|
|
|
|
"fingerprint"
|
|
|
|
A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded
|
|
in hex, with a single space after every 4 characters) for this router's
|
|
identity key. A descriptor is considered invalid (and MUST be
|
|
rejected) if the fingerprint line does not match the public key.
|
|
|
|
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
|
|
be marked with "opt" until earlier versions of Tor are obsolete.]
|
|
|
|
"hibernating" 0|1
|
|
|
|
If the value is 1, then the Tor server was hibernating when the
|
|
descriptor was published, and shouldn't be used to build circuits.
|
|
|
|
[We didn't start parsing this line until Tor 0.1.0.6-rc; it should
|
|
be marked with "opt" until earlier versions of Tor are obsolete.]
|
|
|
|
"uptime"
|
|
|
|
The number of seconds that this OR process has been running.
|
|
|
|
"onion-key" NL a public key in PEM format
|
|
|
|
This key is used to encrypt EXTEND cells for this OR. The key MUST
|
|
be accepted for at least XXXX hours after any new key is published in
|
|
a subsequent descriptor.
|
|
|
|
"signing-key" NL a public key in PEM format
|
|
|
|
The OR's long-term identity key.
|
|
|
|
"accept" exitpattern
|
|
"reject" exitpattern
|
|
|
|
These lines, in order, describe the rules that an OR follows when
|
|
deciding whether to allow a new stream to a given address. The
|
|
'exitpattern' syntax is described below.
|
|
|
|
"router-signature" NL Signature NL
|
|
|
|
The "SIGNATURE" object contains a signature of the PKCS1-padded
|
|
hash of the entire router descriptor, taken from the beginning of the
|
|
"router" line, through the newline after the "router-signature" line.
|
|
The router descriptor is invalid unless the signature is performed
|
|
with the router's identity key.
|
|
|
|
"contact" info NL
|
|
|
|
Describes a way to contact the server's administrator, preferably
|
|
including an email address and a PGP key fingerprint.
|
|
|
|
"family" names NL
|
|
|
|
'Names' is a whitespace-separated list of server nicknames. If two ORs
|
|
list one another in their "family" entries, then OPs should treat them
|
|
as a single OR for the purpose of path selection.
|
|
|
|
For example, if node A's descriptor contains "family B", and node B's
|
|
descriptor contains "family A", then node A and node B should never
|
|
be used on the same circuit.
|
|
|
|
"read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
|
|
"write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
|
|
|
|
Declare how much bandwidth the OR has used recently. Usage is divided
|
|
into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines
|
|
the end of the most recent interval. The numbers are the number of
|
|
bytes used in the most recent intervals, ordered from oldest to newest.
|
|
|
|
[We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
|
|
be marked with "opt" until earlier versions of Tor are obsolete.]
|
|
|
|
2.1. Nonterminals in routerdescriptors
|
|
|
|
nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
|
|
|
|
exitpattern ::= addrspec ":" portspec
|
|
portspec ::= "*" | port | port "-" port
|
|
port ::= an integer between 1 and 65535, inclusive.
|
|
addrspec ::= "*" | ip4spec | ip6spec
|
|
ipv4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
|
|
ip4 ::= an IPv4 address in dotted-quad format
|
|
ip4mask ::= an IPv4 mask in dotted-quad format
|
|
num_ip4_bits ::= an integer between 0 and 32
|
|
ip6spec ::= ip6 | ip6 "/" num_ip6_bits
|
|
ip6 ::= an IPv6 address, surrounded by square brackets.
|
|
num_ip6_bits ::= an integer between 0 and 128
|
|
|
|
Ports are required; if they are not included in the router
|
|
line, they must appear in the "ports" lines.
|
|
|
|
3. Directory format
|
|
|
|
A Directory begins with a "signed-directory" item, followed by one each of
|
|
the following, in any order: "recommended-software", "published",
|
|
"router-status", "dir-signing-key". It may include any number of "opt"
|
|
items. After these items, a directory includes any number of router
|
|
descriptors, and a single "directory-signature" item.
|
|
|
|
"signed-directory"
|
|
|
|
Indicates the start of a directory.
|
|
|
|
"published" YYYY-MM-DD HH:MM:SS
|
|
|
|
The time at which this directory was generated and signed, in GMT.
|
|
|
|
"dir-signing-key"
|
|
|
|
The key used to sign this directory; see "signing-key" for format.
|
|
|
|
"recommended-software" comma-separated-version-list
|
|
|
|
A list of which versions of which implementations are currently
|
|
believed to be secure and compatible with the network.
|
|
|
|
"running-routers" whitespace-separated-list
|
|
|
|
A description of which routers are currently believed to be up or
|
|
down. Every entry consists of an optional "!", followed by either an
|
|
OR's nickname, or "$" followed by a hexadecimal encoding of the hash
|
|
of an OR's identity key. If the "!" is included, the router is
|
|
believed not to be running; otherwise, it is believed to be running.
|
|
If a router's nickname is given, exactly one router of that nickname
|
|
will appear in the directory, and that router is "approved" by the
|
|
directory server. If a hashed identity key is given, that OR is not
|
|
"approved". [XXXX The 'running-routers' line is only provided for
|
|
backward compatibility. New code should parse 'router-status'
|
|
instead.]
|
|
|
|
"router-status" whitespace-separated-list
|
|
|
|
A description of which routers are currently believed to be up or
|
|
down, and which are verified or unverified. Contains one entry for
|
|
every router that the directory server knows. Each entry is of the
|
|
format:
|
|
|
|
!name=$digest [Verified router, currently not live.]
|
|
name=$digest [Verified router, currently live.]
|
|
!$digest [Unverified router, currently not live.]
|
|
or $digest [Unverified router, currently live.]
|
|
|
|
(where 'name' is the router's nickname and 'digest' is a hexadecimal
|
|
encoding of the hash of the routers' identity key).
|
|
|
|
When parsing this line, clients should only mark a router as
|
|
'verified' if its nickname AND digest match the one provided.
|
|
|
|
"directory-signature" nickname-of-dirserver NL Signature
|
|
|
|
The signature is computed by computing the digest of the
|
|
directory, from the characters "signed-directory", through the newline
|
|
after "directory-signature". This digest is then padded with PKCS.1,
|
|
and signed with the directory server's signing key.
|
|
|
|
If software encounters an unrecognized keyword in a single router descriptor,
|
|
it MUST reject only that router descriptor, and continue using the
|
|
others. Because this mechanism is used to add 'critical' extensions to
|
|
future versions of the router descriptor format, implementation should treat
|
|
it as a normal occurrence and not, for example, report it to the user as an
|
|
error. [Versions of Tor prior to 0.1.1 did this.]
|
|
|
|
If software encounters an unrecognized keyword in the directory header,
|
|
it SHOULD reject the entire directory.
|
|
|
|
4. Network-status descriptor
|
|
|
|
A "network-status" (a.k.a "running-routers") document is a truncated
|
|
directory that contains only the current status of a list of nodes, not
|
|
their actual descriptors. It contains exactly one of each of the following
|
|
entries.
|
|
|
|
"network-status"
|
|
|
|
Must appear first.
|
|
|
|
"published" YYYY-MM-DD HH:MM:SS
|
|
|
|
(see section 3 above)
|
|
|
|
"router-status" list
|
|
|
|
(see section 3 above)
|
|
|
|
"directory-signature" NL signature
|
|
|
|
(see section 3 above)
|
|
|
|
5. Behavior of a directory server
|
|
|
|
lists nodes that are connected currently
|
|
speaks HTTP on a socket, spits out directory on request
|
|
|
|
Directory servers listen on a certain port (the DirPort), and speak a
|
|
limited version of HTTP 1.0. Clients send either GET or POST commands.
|
|
The basic interactions are:
|
|
"%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n",
|
|
command, url, content-length, host.
|
|
Get "/tor/" to fetch a full directory.
|
|
Get "/tor/dir.z" to fetch a compressed full directory.
|
|
Get "/tor/running-routers" to fetch a network-status descriptor.
|
|
Post "/tor/" to post a server descriptor, with the body of the
|
|
request containing the descriptor.
|
|
|
|
"host" is used to specify the address:port of the dirserver, so
|
|
the request can survive going through HTTP proxies.
|
|
|