$Id$

Tor Spec

Note: This is an attempt to specify Tor as it exists as implemented in
early June, 2003. It is not recommended that others implement this
design as it stands; future versions of Tor will implement improved
protocols.

TODO: (very soon)
- Specify truncate/truncated
- Sendme w/stream0 is circuit sendme
- Integrate -NM and -RD comments

EVEN LATER:
- Do TCP-style sequencing and ACKing of DATA cells so that we can afford
  to lose some data cells.
-

0. Notation:

   PK -- a public key.
   SK -- a private key.
   K  -- a key for a symmetric cipher.

   a|b -- concatenation of 'a' with 'b'.

All numeric values are encoded in network (big-endian) order.

Unless otherwise specified, all symmetric ciphers are AES in counter
mode, with an IV of all 0 bytes. Asymmetric ciphers are either RSA
with 1024-bit keys and exponents of 65537, or DH with the safe prime
from RFC 2409, section 6.2, whose hex representation is:

   "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
   "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
   "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
   "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
   "49286651ECE65381FFFFFFFFFFFFFFFF"


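As an illustration only, the DH modulus quoted in the notation section can
be loaded into an integer as in the Python sketch below. The names DH_PRIME,
DH_LEN and encode_dh_value are hypothetical, and the fixed 128-byte
big-endian encoding of DH values is an assumption made for this example;
the text above only says that numeric values are big-endian.

```python
# Hex digits of the DH modulus, copied from the five quoted lines above.
DH_PRIME_HEX = (
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE65381FFFFFFFFFFFFFFFF"
)

DH_PRIME = int(DH_PRIME_HEX, 16)
DH_LEN = len(DH_PRIME_HEX) // 2  # length of the modulus in bytes


def encode_dh_value(value):
    """Encode a DH value as fixed-width big-endian bytes (assumed width)."""
    if not 0 < value < DH_PRIME:
        raise ValueError("DH value out of range")
    return value.to_bytes(DH_LEN, "big")
```

A fixed width keeps g^x at the 128 bytes that the CREATE payload in
section 4.1 expects.
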
1. System overview

Tor is a connection-oriented anonymizing communication service. Users
build a path known as a "virtual circuit" through the network, in which
each node knows its predecessor and successor, but no others. Traffic
flowing down the circuit is unwrapped by a symmetric key at each node,
which reveals the downstream node.

2. Connections

There are two ways to connect to an onion router (OR). The first is
as an onion proxy (OP), which allows the OP to authenticate the OR
without authenticating itself. The second is as another OR, which
allows mutual authentication.

Tor uses TLS for link encryption, using the cipher suite
"TLS_DHE_RSA_WITH_AES_128_CBC_SHA". An OR always sends a
self-signed X.509 certificate whose commonName is the server's
nickname, and whose public key is in the server directory.

All parties receiving certificates must confirm that the public
key is as it appears in the server directory, and close the
connection if it is not.

Once a TLS connection is established, the two sides send cells
(specified below) to one another. Cells are sent serially. All
cells are 256 bytes long. Cells may be sent embedded in TLS
records of any size or divided across TLS records, but the framing
of TLS records should not leak information about the type or
contents of the cells.

OR-to-OR connections are never deliberately closed. An OP should
close a connection to an OR if there are no circuits running over
the connection, and an amount of time (KeepalivePeriod, defaults to
5 minutes) has passed.

3. Cell Packet format

The basic unit of communication for onion routers and onion
proxies is a fixed-width "cell". Each cell contains the following
fields:

   ACI (anonymous circuit identifier)    [2 bytes]
   Command                               [1 byte]
   Length                                [1 byte]
   Sequence number (unused, set to 0)    [4 bytes]
   Payload (padded with 0 bytes)         [248 bytes]
                                         [Total size: 256 bytes]

The 'Command' field holds one of the following values:
   0 -- PADDING     (Padding)                 (See Sec 6.2)
   1 -- CREATE      (Create a circuit)        (See Sec 4)
   2 -- CREATED     (Acknowledge create)      (See Sec 4)
   3 -- RELAY       (End-to-end data)         (See Sec 5)
   4 -- DESTROY     (Stop using a circuit)    (See Sec 4)

The interpretation of 'Length' and 'Payload' depends on the type of
the cell.
   PADDING: Neither field is used.
   CREATE:  Length is 144; the payload contains the first phase of the
            DH handshake.
   CREATED: Length is 128; the payload contains the second phase of
            the DH handshake.
   RELAY:   Length is a value between 8 and 248; the first 'length'
            bytes of payload contain useful data.
   DESTROY: Neither field is used.

Unused fields are filled with 0 bytes. The payload is padded with
0 bytes.

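The cell layout above can be sketched with Python's struct module. This is
a minimal illustration and not taken from any implementation; the names
pack_cell and unpack_cell are hypothetical. It handles only the cleartext
layout and ignores the relay encryption described in section 4.5.

```python
import struct

CELL_LEN = 256
PAYLOAD_LEN = 248

# ACI (2), command (1), length (1), sequence number (4), payload (248),
# all in network (big-endian) order.
CELL_FORMAT = ">HBBI248s"

CMD_PADDING, CMD_CREATE, CMD_CREATED, CMD_RELAY, CMD_DESTROY = range(5)


def pack_cell(aci, command, payload=b""):
    """Build a 256-byte cell; struct pads the payload with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload too long for one cell")
    # The sequence number is unused and set to 0.
    return struct.pack(CELL_FORMAT, aci, command, len(payload), 0, payload)


def unpack_cell(cell):
    """Split a cell into (aci, command, payload without padding)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are always 256 bytes long")
    aci, command, length, _seq, payload = struct.unpack(CELL_FORMAT, cell)
    return aci, command, payload[:length]
```

For example, pack_cell(0x8001, CMD_CREATE, onion_skin) with a 144-byte
onion skin yields a cell whose Length byte is 144.
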
PADDING cells are currently used to implement connection
keepalive. ORs and OPs send one another a PADDING cell every few
minutes.

CREATE, CREATED, and DESTROY cells are used to manage circuits; see
section 4 below.

RELAY cells are used to send commands and data along a circuit; see
section 5 below.


4. Circuit management

4.1. CREATE and CREATED cells

Users set up circuits incrementally, one hop at a time. To create
a new circuit, users send a CREATE cell to the first node, with the
first half of the DH handshake; that node responds with a CREATED cell
with the second half of the DH handshake. To extend a circuit past
the first hop, the user sends an EXTEND relay cell (see section 5)
which instructs the last node in the circuit to send a CREATE cell
to extend the circuit.

The payload for a CREATE cell is an 'onion skin', consisting of:
   RSA-encrypted data            [128 bytes]
   Symmetrically-encrypted data  [16 bytes]
The RSA-encrypted portion contains:
   Symmetric key                 [16 bytes]
   First part of DH data (g^x)   [112 bytes]
The symmetrically encrypted portion contains:
   Second part of DH data (g^x)  [16 bytes]

The two parts of the DH data, once decrypted and concatenated, form
g^x as calculated by the client.

The relay payload for an EXTEND relay cell consists of:
   Address      [4 bytes]
   Port         [2 bytes]
   Onion skin   [144 bytes]

The address and port fields denote the IPv4 address and port of the
next onion router in the circuit.

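The EXTEND relay payload described above is 4 + 2 + 144 = 150 bytes. The
Python sketch below shows one way to pack and unpack it; the function
names are hypothetical, and the onion skin is treated as an opaque
144-byte string because building it requires RSA and AES, which are
outside this example.

```python
import socket
import struct

ONION_SKIN_LEN = 144
EXTEND_PAYLOAD_LEN = 4 + 2 + ONION_SKIN_LEN


def pack_extend_payload(ipv4_address, port, onion_skin):
    """Concatenate address (4 bytes), port (2 bytes) and onion skin."""
    if len(onion_skin) != ONION_SKIN_LEN:
        raise ValueError("onion skin must be 144 bytes")
    # inet_aton yields the 4-byte network-order form of a dotted quad.
    return socket.inet_aton(ipv4_address) + struct.pack(">H", port) + onion_skin


def unpack_extend_payload(payload):
    """Return (dotted-quad address, port, onion skin)."""
    if len(payload) != EXTEND_PAYLOAD_LEN:
        raise ValueError("EXTEND payload must be 150 bytes")
    address = socket.inet_ntoa(payload[:4])
    (port,) = struct.unpack(">H", payload[4:6])
    return address, port, payload[6:]
```
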
4.2. Setting circuit keys

Once the handshake between the OP and an OR is completed, both
sides can calculate g^xy with ordinary DH. From the base key
material g^xy, they compute two 16-byte keys, called Kf and Kb, as
follows. First, each side represents g^xy as a big-endian
unsigned integer. Next, each side computes 40 bytes of key data
as K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) where "00" is a single
octet whose value is zero, and "01" is a single octet whose value
is one. The first 16 bytes of K form Kf, and the next 16 bytes of
K form Kb.

Kf is used to encrypt the stream of data going from the OP to the
OR, whereas Kb is used to encrypt the stream of data going from the
OR to the OP.

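The key derivation above can be written in a few lines of Python with
hashlib. This is a sketch under one assumption: the caller has already
encoded g^xy as big-endian bytes (the exact byte length of that encoding
is not stated in this section). The name derive_circuit_keys is
hypothetical.

```python
import hashlib


def derive_circuit_keys(gxy_bytes):
    """Return (Kf, Kb) derived from the big-endian encoding of g^xy."""
    # K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]), 40 bytes in total.
    k = (hashlib.sha1(gxy_bytes + b"\x00").digest()
         + hashlib.sha1(gxy_bytes + b"\x01").digest())
    kf = k[:16]    # first 16 bytes: forward key
    kb = k[16:32]  # next 16 bytes: backward key
    return kf, kb
```

Note that Kb straddles the two digests: its first 4 bytes come from the
end of the first SHA-1 output and its last 12 from the start of the
second. The final 8 bytes of K are unused.
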
4.3. Creating circuits

When creating a circuit through the network, the circuit creator
performs the following steps:

   1. Choose a chain of N onion routers (R_1...R_N) to constitute
      the path, such that no router appears in the path twice.

   2. If not already connected to the first router in the chain,
      open a new connection to that router.

   3. Choose an ACI not already in use on the connection with the
      first router in the chain. If we are an onion router and our
      nickname is lexicographically greater than the nickname of the
      other side, then let the high bit of the ACI be 1, else 0.

   4. Send a CREATE cell along the connection, to be received by
      the first onion router.

   5. Wait until a CREATED cell is received; finish the handshake
      and extract the forward key Kf_1 and the back key Kb_1.

   6. For each subsequent onion router R (R_2 through R_N), extend
      the circuit to R.

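Step 3 of the procedure above (choosing an ACI) might look like the
following Python sketch. The function name and arguments are
hypothetical. Two details are assumptions of this example rather than
statements of the spec: nicknames are lowercased before comparison
(section 7.1 calls them case-insensitive), and the ACI is picked at random
among the free values.

```python
import random

ACI_HIGH_BIT = 0x8000  # top bit of the 2-byte ACI field


def choose_aci(in_use, is_router=False, my_nickname="", peer_nickname=""):
    """Pick an unused 16-bit ACI with the high bit set as in step 3."""
    high = 0
    if is_router and my_nickname.lower() > peer_nickname.lower():
        high = ACI_HIGH_BIT
    candidates = [high | low for low in range(ACI_HIGH_BIT)
                  if (high | low) not in in_use]
    if not candidates:
        raise RuntimeError("no free ACI on this connection")
    return random.choice(candidates)
```

Because the two routers on a connection always end up with different high
bits, they cannot pick the same ACI at the same time.
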
To extend the circuit by a single onion router R_M, the circuit
creator performs these steps:

   1. Create an onion skin, encrypting the RSA-encrypted part with
      R_M's public key.

   2. Encrypt and send the onion skin in a relay EXTEND cell along
      the circuit (see section 5).

   3. When a relay EXTENDED cell is received, calculate the shared
      keys. The circuit is now extended.

When an onion router receives an EXTEND relay cell, it sends a
CREATE cell to the next onion router, with the enclosed onion skin
as its payload. The initiating onion router chooses some ACI not
yet used on the connection between the two onion routers. (But see
step 3 of the circuit creation procedure above, concerning choosing
ACIs.)

As an extension (called router twins), if the desired next onion
router R in the circuit is down, and some other onion router R'
has the same key as R, then it's ok to extend to R' rather than R.

When an onion router receives a CREATE cell, if it already has a
circuit on the given connection with the given ACI, it drops the
cell. Otherwise, some time after receiving the CREATE cell, it completes
the DH handshake, and replies with a CREATED cell, containing g^y
as its [128 byte] payload. Upon receiving a CREATED cell, an onion
router packs its payload into an EXTENDED relay cell (see section 5),
and sends that cell up the circuit. Upon receiving the EXTENDED
relay cell, the OP can retrieve g^y.

(As an optimization, OR implementations may delay processing onions
until a break in traffic allows time to do so without harming
network latency too greatly.)

4.4. Tearing down circuits

Circuits are torn down when an unrecoverable error occurs along
the circuit, or when all streams on a circuit are closed and the
circuit's intended lifetime is over. Circuits may be torn down
either completely or hop-by-hop.

To tear down a circuit completely, an OR or OP sends a DESTROY
cell to the adjacent nodes on that circuit, using the appropriate
direction's ACI.

Upon receiving an outgoing DESTROY cell, an OR frees resources
associated with the corresponding circuit. If it's not the end of
the circuit, it sends a DESTROY cell for that circuit to the next OR
in the circuit. If the node is the end of the circuit, then it tears
down any associated edge connections (see section 5.1).

After a DESTROY cell has been processed, an OR ignores all data or
destroy cells for the corresponding circuit.

To tear down part of a circuit, the OP sends a RELAY_TRUNCATE cell
signaling a given OR (Stream ID zero). That OR sends a DESTROY
cell to the next node in the circuit, and replies to the OP with a
RELAY_TRUNCATED cell.

When an unrecoverable error occurs along one connection in a
circuit, the nodes on either side of the connection should, if they
are able, act as follows: the node closer to the OP should send a
RELAY_TRUNCATED cell towards the OP; the node farther from the OP
should send a DESTROY cell down the circuit.

[We'll have to reevaluate this section once we figure out cleaner
circuit/connection killing conventions. -RD]

4.5. Routing data cells

When an OR receives a RELAY cell, it checks the cell's ACI and
determines whether it has a corresponding circuit along that
connection. If not, the OR drops the RELAY cell.

Otherwise, if the OR is not at the OP edge of the circuit (that is,
either an 'exit node' or a non-edge node), it de/encrypts the length
field and the payload with AES/CTR, as follows:
   'Forward' relay cell (same direction as CREATE):
      Use Kf as key; encrypt.
   'Back' relay cell (opposite direction from CREATE):
      Use Kb as key; decrypt.
If the OR recognizes the stream ID on the cell (it is either the ID
of an open stream or the signaling (zero) ID), the OR processes the
contents of the relay cell. Otherwise, it passes the decrypted
relay cell along the circuit if the circuit continues, or drops the
cell if it's the end of the circuit. [Getting an unrecognized
relay cell at the end of the circuit must be allowed for now;
we can reexamine this once we've designed full TCP-style close
handshakes. -RD]

Otherwise, if the data cell is coming from the OP edge of the
circuit, the OP decrypts the length and payload fields with AES/CTR as
follows:
   OP sends data cell to node R_M:
      For I=1...M, decrypt with Kf_I.

Otherwise, if the data cell is arriving at the OP edge of the
circuit, the OP encrypts the length and payload fields with AES/CTR as
follows:
   OP receives data cell:
      For I=N...1,
         Encrypt with Kb_I. If the stream ID is a recognized
         stream for R_I, or if the stream ID is the signaling
         ID (zero), then stop and process the payload.

For more information, see section 5 below.

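The layering in the forward direction can be illustrated with a toy
Python model. Important caveats: the keystream below is a stand-in built
from SHA-1 so that the example needs only the standard library; it is NOT
AES in counter mode. It also restarts the keystream for every cell,
whereas the text speaks of encrypting a stream of data, so a real
implementation would presumably keep counter state per key. All function
names are hypothetical. What the sketch does show is that, with any
XOR-based stream cipher, "encrypt" and "decrypt" are the same operation,
so the M layers applied by the OP are removed one per hop.

```python
import hashlib


def keystream(key, n):
    """Stand-in keystream (SHA-1 over key and counter); NOT AES/CTR."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha1(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def crypt(key, data):
    """XOR data with the keystream; the same call both adds and removes a layer."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def op_wrap_forward(body, forward_keys):
    """OP side: apply one layer for each hop I = 1...M using Kf_I."""
    for kf in forward_keys:
        body = crypt(kf, body)
    return body


def or_process_forward(body, kf):
    """OR side: remove this hop's layer from a forward relay cell."""
    return crypt(kf, body)
```

After every hop from R_1 to R_M has called or_process_forward with its
own Kf, the body that the OP originally wrapped is visible at R_M.
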
5. Application connections and stream management

5.1. Streams

Within a circuit, the OP and the exit node use the contents of
RELAY packets to tunnel end-to-end commands and TCP connections
("Streams") across circuits. End-to-end commands can be initiated
by either edge; streams are initiated by the OP.

The first 8 bytes of each relay cell are reserved as follows:
   Relay command   [1 byte]
   Stream ID       [7 bytes]

The recognized relay commands are:
    1 -- RELAY_BEGIN
    2 -- RELAY_DATA
    3 -- RELAY_END
    4 -- RELAY_CONNECTED
    5 -- RELAY_SENDME
    6 -- RELAY_EXTEND
    7 -- RELAY_EXTENDED
    8 -- RELAY_TRUNCATE
    9 -- RELAY_TRUNCATED
   10 -- RELAY_DROP

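A relay cell's payload can be split into its 8-byte header and body as in
this Python sketch. The names are hypothetical, and the input is assumed
to be the already-decrypted payload, truncated to the cell's Length.

```python
RELAY_COMMANDS = {
    1: "RELAY_BEGIN",
    2: "RELAY_DATA",
    3: "RELAY_END",
    4: "RELAY_CONNECTED",
    5: "RELAY_SENDME",
    6: "RELAY_EXTEND",
    7: "RELAY_EXTENDED",
    8: "RELAY_TRUNCATE",
    9: "RELAY_TRUNCATED",
    10: "RELAY_DROP",
}

SIGNALING_STREAM_ID = bytes(7)  # seven zero bytes


def parse_relay_payload(payload):
    """Return (command name or None, 7-byte stream ID, remaining body)."""
    if len(payload) < 8:
        raise ValueError("relay payload shorter than its 8-byte header")
    command = RELAY_COMMANDS.get(payload[0])
    stream_id = payload[1:8]
    return command, stream_id, payload[8:]
```
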
All RELAY cells pertaining to the same tunneled stream have the
same stream ID. Stream IDs are chosen randomly by the OP. A
stream ID is considered "recognized" on a circuit C by an OP or an
OR if it already has an existing stream established on that
circuit, or if the stream ID is equal to the signaling stream ID,
which is all zero: [00 00 00 00 00 00 00]

To create a new anonymized TCP connection, the OP sends a
RELAY_BEGIN data cell with a payload encoding the address and port
of the destination host. The stream ID is zero. The payload format is:
   NEWSTREAMID | ADDRESS | ':' | PORT | '\000'
where NEWSTREAMID is the newly generated Stream ID to use for
this stream; ADDRESS may be a DNS hostname, or an IPv4 address in
dotted-quad format; and PORT is encoded in decimal.

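A RELAY_BEGIN body in the format above could be built and parsed as
follows. This Python sketch assumes that NEWSTREAMID is the raw 7-byte
stream ID, which the text implies but does not state outright; the
function names are hypothetical.

```python
import os

STREAM_ID_LEN = 7


def build_begin_body(address, port, new_stream_id=None):
    """Return (stream ID, NEWSTREAMID | ADDRESS | ':' | PORT | NUL)."""
    if new_stream_id is None:
        new_stream_id = os.urandom(STREAM_ID_LEN)
        # Avoid the all-zero value, which is the signaling stream ID.
        while new_stream_id == bytes(STREAM_ID_LEN):
            new_stream_id = os.urandom(STREAM_ID_LEN)
    target = "{}:{}".format(address, port).encode("ascii")
    return new_stream_id, new_stream_id + target + b"\x00"


def parse_begin_body(body):
    """Return (stream ID, address, port) from a RELAY_BEGIN body."""
    new_stream_id, rest = body[:STREAM_ID_LEN], body[STREAM_ID_LEN:]
    if not rest.endswith(b"\x00"):
        raise ValueError("missing NUL terminator")
    address, sep, port = rest[:-1].decode("ascii").rpartition(":")
    if not sep:
        raise ValueError("missing ':' between address and port")
    return new_stream_id, address, int(port)
```
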
Upon receiving this packet, the exit node resolves the address as
necessary, and opens a new TCP connection to the target port. If
the address cannot be resolved, or a connection can't be
established, the exit node replies with a RELAY_END cell.
Otherwise, the exit node replies with a RELAY_CONNECTED cell.

The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
package stream data in RELAY_DATA cells, and upon receiving such
cells, echo their contents to the corresponding TCP stream.

RELAY_DROP cells are long-range dummies; upon receiving such
a cell, the OR or OP must drop it.

5.2. Closing streams

[Note -- TCP streams can only be half-closed for reading. Our
Bickford's conversation was incorrect. -NM]

Because TCP connections can be half-open, we follow an equivalent
to TCP's FIN/FIN-ACK/ACK protocol to close streams.

An exit connection can have a TCP stream in one of three states:
'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes
of modeling transitions, we treat 'CLOSED' as a fourth state,
although connections in this state are not, in fact, tracked by the
onion router.

A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from
the corresponding TCP connection, the edge node sends a 'RELAY_END'
cell along the circuit and changes its state to 'DONE_PACKAGING'.
Upon receiving a 'RELAY_END' cell, an edge node sends a 'FIN' to
the corresponding TCP connection (e.g., by calling
shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'.

When a stream already in 'DONE_DELIVERING' receives a 'FIN', it
also sends a 'RELAY_END' along the circuit, and changes its state
to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a
'RELAY_END' cell, it sends a 'FIN' and changes its state to
'CLOSED'.

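The four transitions described above fit in a small table. The Python
sketch below is illustrative only: the event names "FIN" and "RELAY_END"
and the action strings are labels chosen for this example, and
combinations the text does not mention raise an error here rather than
being handled.

```python
OPEN = "OPEN"
DONE_PACKAGING = "DONE_PACKAGING"
DONE_DELIVERING = "DONE_DELIVERING"
CLOSED = "CLOSED"

# (current state, event) -> (next state, action to take)
TRANSITIONS = {
    (OPEN, "FIN"): (DONE_PACKAGING, "send RELAY_END"),
    (OPEN, "RELAY_END"): (DONE_DELIVERING, "send FIN"),
    (DONE_DELIVERING, "FIN"): (CLOSED, "send RELAY_END"),
    (DONE_PACKAGING, "RELAY_END"): (CLOSED, "send FIN"),
}


def step(state, event):
    """Return (next state, action) for an event on a stream."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError("no transition for %s on %s" % (event, state))
```

Either order of events ends in 'CLOSED' after exactly one RELAY_END has
been sent and one FIN has been delivered.
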
[Note: Please rename 'RELAY_END2'. :) -NM ]

If an edge node encounters an error on any stream, it sends a
'RELAY_END2' cell along the circuit (if possible) and closes the
TCP connection immediately. If an edge node receives a
'RELAY_END2' cell for any stream, it closes the TCP connection
completely, and sends nothing along the circuit.

6. Flow control

6.1. Link throttling

Each node should do appropriate bandwidth throttling to keep its
user happy.

Communicants rely on TCP's default flow control to push back when they
stop reading.

6.2. Link padding

Currently nodes are not required to do any sort of link padding or
dummy traffic. Because strong attacks exist even with link padding,
and because link padding greatly increases the bandwidth requirements
for running a node, we plan to leave out link padding until this
tradeoff is better understood.

6.3. Circuit-level flow control

To control a circuit's bandwidth usage, each OR keeps track of
two 'windows', consisting of how many RELAY_DATA cells it is
allowed to package for transmission, and how many RELAY_DATA cells
it is willing to deliver to streams outside the network.
Each 'window' value is initially set to 1000 data cells
in each direction (cells that are not data cells do not affect
the window). When an OR is willing to deliver more cells, it sends a
RELAY_SENDME cell towards the OP, with Stream ID zero. When an OR
receives a RELAY_SENDME cell with stream ID zero, it increments its
packaging window.

Either of these cells increments the corresponding window by 100.

The OP behaves identically, except that it must track a packaging
window and a delivery window for every OR in the circuit.

An OR or OP sends cells to increment its delivery window when the
corresponding window value falls under some threshold (900).

If a packaging window reaches 0, the OR or OP stops reading from
TCP connections for all streams on the corresponding circuit, and
sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell.

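The window bookkeeping above can be modeled as in the following Python
sketch. The class and method names are hypothetical. It reads "falls
under some threshold (900)" as strictly below 900, and it assumes that
each window drops by one per RELAY_DATA cell packaged or delivered; both
readings are assumptions of this example.

```python
class CircuitWindows:
    """Toy model of one node's packaging and delivery windows."""

    START = 1000      # initial value of each window
    INCREMENT = 100   # change per RELAY_SENDME
    THRESHOLD = 900   # send a RELAY_SENDME when delivery window is below this

    def __init__(self):
        self.package = self.START
        self.deliver = self.START

    def can_package(self):
        """False means: stop reading from this circuit's TCP connections."""
        return self.package > 0

    def on_data_packaged(self):
        if not self.can_package():
            raise RuntimeError("packaging window is empty")
        self.package -= 1

    def on_sendme_received(self):
        self.package += self.INCREMENT

    def on_data_delivered(self):
        """Return True if a RELAY_SENDME (stream ID zero) should be sent."""
        self.deliver -= 1
        if self.deliver < self.THRESHOLD:
            self.deliver += self.INCREMENT
            return True
        return False
```

With these numbers, the first RELAY_SENDME goes out on the 101st
delivered data cell, which brings the delivery window from 899 back up
to 999.
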
6.4. Stream-level flow control

Edge nodes use RELAY_SENDME cells to implement end-to-end flow
control for individual connections across circuits. Similarly to
circuit-level flow control, edge nodes begin with a window of cells
(500) per stream, and increment the window by a fixed value (50)
upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
cells when both a) the window is <= 450, and b) there are fewer than
ten cell payloads remaining to be flushed at that edge.


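The two-part condition for initiating a stream-level RELAY_SENDME can be
expressed as a single predicate. In this Python sketch the function and
parameter names are hypothetical, and how an edge node counts "cell
payloads remaining to be flushed" is left to the caller.

```python
STREAM_WINDOW_START = 500
STREAM_WINDOW_INCREMENT = 50
STREAM_SENDME_THRESHOLD = 450
MAX_PENDING_PAYLOADS = 10


def should_send_stream_sendme(window, pending_payloads):
    """True when (a) window <= 450 and (b) fewer than ten payloads are queued."""
    return (window <= STREAM_SENDME_THRESHOLD
            and pending_payloads < MAX_PENDING_PAYLOADS)
```

Condition (b) keeps an edge node from inviting more data while it still
has a backlog to flush toward the application.
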
7. Directories and routers

7.1. Router descriptor format.

(Unless otherwise noted, tokens on the same line are space-separated.)

   Router ::= Router-Line Date-Line Onion-Key Link-Key Signing-Key Exit-Policy Router-Signature NL
   Router-Line ::= "router" nickname address ORPort SocksPort DirPort bandwidth NL
   Date-Line ::= "published" YYYY-MM-DD HH:MM:SS NL
   Onion-Key ::= "onion-key" NL a public key in PEM format NL
   Link-Key ::= "link-key" NL a public key in PEM format NL
   Signing-Key ::= "signing-key" NL a public key in PEM format NL
   Exit-Policy ::= Exit-Line*
   Exit-Line ::= ("accept"|"reject") string NL
   Router-Signature ::= "router-signature" NL Signature
   Signature ::= "-----BEGIN SIGNATURE-----" NL
                 Base-64-encoded-signature NL "-----END SIGNATURE-----" NL

   ORPort ::= port where the router listens for routers/proxies (speaking cells)
   SocksPort ::= where the router listens for applications (speaking socks)
   DirPort ::= where the router listens for directory download requests
   bandwidth ::= maximum bandwidth, in bytes/s

   nickname ::= between 1 and 32 alphanumeric characters. case-insensitive.

Example:
router moria1 moria.mit.edu 9001 9021 9031 100000
published 2003-09-24 19:36:05
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
nZ7kVMRoiXCbjL6VAtNa4Zy1Af/GOm0iCIDpholeujQ95xew7rQnAgMA//8=
-----END RSA PUBLIC KEY-----
signing-key
-----BEGIN RSA PUBLIC KEY-----
7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
f/GOm0iCIDpholeujQ95xew7rnZ7kVMRoiXCbjL6VAtNa4Zy1AQnAgMA//8=
-----END RSA PUBLIC KEY-----
reject 18.0.0.0/24

Note: The extra newline at the end of the router block is intentional.

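As a small illustration of the grammar in section 7.1, the Router-Line
can be parsed as below. This Python sketch covers only that one line; the
names RouterLine and parse_router_line are hypothetical, and the
nickname check treats "alphanumeric" as ASCII letters and digits, which
is an assumption.

```python
from collections import namedtuple

RouterLine = namedtuple(
    "RouterLine",
    "nickname address or_port socks_port dir_port bandwidth",
)


def parse_router_line(line):
    """Parse: "router" nickname address ORPort SocksPort DirPort bandwidth."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[0] != "router":
        raise ValueError("not a router line")
    nickname, address = tokens[1], tokens[2]
    if not (1 <= len(nickname) <= 32
            and nickname.isascii() and nickname.isalnum()):
        raise ValueError("bad nickname")
    or_port, socks_port, dir_port, bandwidth = (int(t) for t in tokens[3:])
    return RouterLine(nickname, address, or_port, socks_port, dir_port,
                      bandwidth)
```

Applied to the first line of the example above, this yields nickname
"moria1", ORPort 9001 and a bandwidth of 100000 bytes/s.
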
7.2. Directory format

   Directory ::= Directory-Header Directory-Router Router* Directory-Signature
   Directory-Header ::= "signed-directory" NL Software-Line NL
   Software-Line ::= "recommended-software" comma-separated-version-list
   Directory-Router ::= Router
   Directory-Signature ::= "directory-signature" NL Signature
   Signature ::= "-----BEGIN SIGNATURE-----" NL
                 Base-64-encoded-signature NL "-----END SIGNATURE-----" NL

Note: The router block for the directory server must appear first.
The signature is computed by computing the SHA-1 hash of the
directory, from the characters "signed-directory", through the newline
after "directory-signature". This digest is then padded with PKCS #1,
and signed with the directory server's signing key.

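The digest step of the directory signature can be sketched in Python as
follows. Only the SHA-1 computation over the signed span is shown; the
PKCS #1 padding and RSA signing are omitted because they need a crypto
library. The function name is hypothetical, and NL is assumed to be a
single "\n" byte.

```python
import hashlib


def directory_digest(directory):
    """SHA-1 over "signed-directory" through the newline after
    "directory-signature" (directory is given as bytes)."""
    start = directory.index(b"signed-directory")
    marker = b"directory-signature\n"
    end = directory.index(marker, start) + len(marker)
    return hashlib.sha1(directory[start:end]).digest()
```

The signature block itself lies after the hashed span, so it can be
appended without changing the digest.
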
7.3. Behavior of a directory server

A directory server lists the nodes that are currently connected.
It speaks HTTP on a socket, and returns the directory on request.

-----------
(for emacs)
Local Variables:
mode:text
indent-tabs-mode:nil
fill-column:77
End: