mirror of
https://github.com/torproject/torspec.git
synced 2024-12-12 04:35:37 +00:00
820 lines
33 KiB
Plaintext
820 lines
33 KiB
Plaintext
Filename: 202-improved-relay-crypto.txt
|
|
Title: Two improved relay encryption protocols for Tor cells
|
|
Author: Nick Mathewson
|
|
Created: 19 Jun 2012
|
|
Status: Meta
|
|
|
|
Note: This is an important development step in improving our relay
|
|
crypto, but it doesn't actually say how to do this.
|
|
|
|
|
|
Overview:
|
|
|
|
This document describes two candidate designs for a better Tor
|
|
relay encryption/decryption protocol, designed to stymie tagging
|
|
attacks and better accord with best practices in protocol design.
|
|
|
|
My hope is that readers will examine these protocols, evaluate their
|
|
security and practicality, improve on them, and help to pick one for
|
|
implementation in Tor.
|
|
|
|
In section 1, I describe Tor's current relay crypto protocol, its
|
|
drawbacks, how it fits in with the rest of Tor, and some
|
|
requirements/desiderata for a replacement. In sections 2 and 3, I
|
|
propose two alternative replacements for this protocol. In section
|
|
4, I discuss their pros and cons.
|
|
|
|
1. Background and Motivation
|
|
|
|
1.0. A short overview of Tor's current protocols
|
|
|
|
The core pieces of the Tor protocol are the link protocol,
|
|
the circuit extend protocol, the relay protocol, and the stream
|
|
protocol. All are documented in [TorSpec].
|
|
|
|
Briefly:
|
|
|
|
- The link protocol handles all direct pairwise communication
|
|
between nodes. Everything else is transmitted over it. It
|
|
uses TLS.
|
|
|
|
- The circuit extend protocol uses public-key crypto to set up
|
|
multi-node virtual tunnels, called "circuits", from a client
|
|
through one or more nodes.
|
|
|
|
*** The relay protocol uses established circuits to communicate
|
|
from a client to a node on a circuit. That's the one we'll
|
|
be talking about here. ***
|
|
|
|
- The stream protocol is tunneled over relay protocol; clients
|
|
use it to tell servers to open anonymous TCP connections, to
|
|
send data, and so forth. Servers use it to report success or
|
|
failure opening anonymous TCP connections, to send data from
|
|
TCP connections back to clients, and so forth.
|
|
|
|
In more detail: The link protocol's job is to connect two
|
|
computers with an encrypted, authenticated stream, to
|
|
authenticate one or both of them to the other, and to provide a
|
|
channel for passing "cells" between them. The circuit extend
|
|
protocol's job is to set up circuits: persistent tunnels that
|
|
connect a Tor client to an exit node through a series of
|
|
(typically three) hops, each of which knows only the previous and
|
|
next hop, and each of which has a set of keys that they share
|
|
only with the client. Finally, the relay protocol's job is to
|
|
allow a client to communicate bidirectionally with the node(s) on
|
|
the circuit, once their shared keys have been established.
|
|
|
|
(We'll ignore the stream protocol for the rest of this document.)
|
|
|
|
Note on terminology: Tor nodes are sometimes called "servers",
|
|
"relays", or "routers". I'll use all these terms more or less
|
|
interchangeably. For simplicity's sake, I will call the party
|
|
who is constructing and using a circuit "the client" or "Alice",
|
|
even though nodes sometimes construct circuits too.
|
|
|
|
Tor's internal packets are called "cells". All the cells we deal
|
|
with here are 512 bytes long.
|
|
|
|
The nodes in a circuit are called its "hops"; most circuits are 3
|
|
hops long. This doesn't include the client: when Alice builds a
|
|
circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is
|
|
"Bob_1" and the last hop is "Bob_3".
|
|
|
|
1.1. The current relay protocol and its drawbacks
|
|
|
|
[This section describes Tor's current relay protocol. It is not a
|
|
proposal; rather it is what we do now. Sections 2 and 3 have my
|
|
proposed replacements for it.]
|
|
|
|
A "relay cell" is a cell that is generated by the client to send
|
|
to a node, or by a node to send to the client. It's called a
|
|
"relay" cell because a node that receives one may need to relay
|
|
it to the next or previous node in the circuit (or to handle the
|
|
cell itself).
|
|
|
|
A relay cell moving towards the client is called "inbound"; a
|
|
cell moving away is called "outbound".
|
|
|
|
When a relay cell is constructed by the client, the client adds one
|
|
layer of crypto for each node that will process the cell, and gives
|
|
the cell to the first node in the circuit. Each node in turn then
|
|
removes one layer of crypto, and either forwards the cell to the next
|
|
node in the circuit or acts on that cell itself.
|
|
|
|
When a relay cell is constructed by a node, it encrypts it and sends
|
|
it to the preceding node in the circuit. Each node between the
|
|
originating node and the client then encrypts the cell and passes it
|
|
back to the preceding node. When the client receives the cell, it
|
|
removes layers of crypto until it has an unencrypted cell, which it
|
|
then acts on.
|
|
|
|
In the current protocol, the body of each relay cell contains, in
|
|
its unencrypted form:
|
|
|
|
Relay command [1 byte]
|
|
Zeros [2 bytes]
|
|
StreamID [2 bytes]
|
|
Digest [4 bytes]
|
|
Length [2 bytes]
|
|
Data [498 bytes]
|
|
|
|
(This only adds up to 509 bytes. That's because the Tor link
|
|
protocol transfers 512-byte cells, and has a 3 byte overhead per
|
|
cell. Not how we would do it if we were starting over at this
|
|
point.)
|
|
|
|
At every node of a circuit, the node relaying a cell
|
|
encrypts/decrypts it with AES128-CTR. If the cell is outbound
|
|
and the "zeros" field is set to all-zeros, the node additionally
|
|
checks whether 'digest' is correct. A correct digest is the
|
|
first 4 bytes of the running SHA1 digest of: a shared secret,
|
|
concatenated with all the relay cells destined for this node on
|
|
this circuit so far, including this cell. If _that's_ true, then
|
|
the node accepts this cell. (See section 6 of [TorSpec] for full
|
|
detail; see section A.1 for a more rigorous description.)
|
|
|
|
The current approach has some actual problems. Notably:
|
|
|
|
* It permits tagging attacks. Because AES_CTR is an XOR-based
|
|
stream cipher, an adversary who controls the first node in a
|
|
circuit can XOR anything he likes into the relay cell, and
|
|
then see whether he/she encounters a correspondingly
|
|
defaced cell at some exit that he also controls.
|
|
|
|
That is, the attacker picks some pattern P, and when he
|
|
would transmit some outbound relay cell C at hop 1, he
|
|
instead transmits C xor P. If an honest exit receives the
|
|
cell, the digest will probably be wrong, and the honest exit
|
|
will reject it. But if the attacker controls the exit, he
|
|
will notice that he has received a cell C' where the digest
|
|
doesn't match, but where C' xor P _does_ have a good digest.
|
|
The attacker will then know that his two nodes are on the
|
|
same circuit, and thereby be able to link the user (whom the
|
|
first node sees) to her activities (which the last node sees).
|
|
|
|
Back when we did the Tor design, this didn't seem like a
|
|
big deal, since an adversary who controls both the first and
|
|
last node in a circuit is presumed to win already based on
|
|
traffic correlation attacks. This attack seemed strictly
|
|
worse than that, since it was trivially detectable in the
|
|
case where the attacker _didn't_ control both ends. See
|
|
section 4.4 of the Tor paper [TorDesign] for our early
|
|
thoughts here; see Xinwen Fu et al's 2009 work for a more
|
|
full explanation of the in-circuit tagging attack [XF]; and
|
|
see "The 23 Raccoons'" March 2012 "Analysis of the Relative
|
|
Severity of Tagging Attacks" mail on tor-dev (and the
|
|
ensuing thread) for a discussion of why we may want to care
|
|
after all, due to attacks that use tagging to amplify route
|
|
capture. [23R]
|
|
|
|
It also has some less practical issues.
|
|
|
|
* The digest portion is too short. Yes, if you're an attacker
|
|
trying to (say) change an "ls *" into an "rm *", you can
|
|
only expect to get one altered cell out of 2^32 accepted --
|
|
and all future cells on the circuit will be rejected with
|
|
similar probability due to the change in the running hash
|
|
-- but 1/2^32 is a pretty high success rate for crypto attacks.
|
|
|
|
* It does MAC-then-encrypt. That makes smart people cringe.
|
|
|
|
* Its approach to MAC is H(Key | Msg), which is vulnerable to
|
|
length extension attack if you've got a Merkle-Damgard hash
|
|
(which we do). This isn't relevant in practice right now,
|
|
since the only parties who see the digest are the two
|
|
parties that rely on it (because of the MAC-then-encrypt).
|
|
|
|
|
|
1.2. Some feature requirements
|
|
|
|
Relay cells can originate at the client or at a relay. Relay cells
|
|
that originate at the client are given to the first node in the
|
|
circuit, and constructed so that they will be decrypted and forwarded
|
|
by the first M-1 nodes in the circuit, and finally decrypted and
|
|
processed by the Mth node, where the client chooses M. (Usually, the
|
|
Mth node is the the last one, which will be an exit node.) Relay
|
|
cells that originate at a hop in the circuit are given to the
|
|
preceding node, and eventually delivered to the client.
|
|
|
|
Tor provides a so called "leaky pipe" circuit topology
|
|
[TorDesign]: a client can send a cell to any node in the circuit,
|
|
not just the last node. I'll try to keep that property, although
|
|
historically we haven't really made use of it.
|
|
|
|
In order to implement telescoping circuit construction (where the
|
|
circuit is built by iteratively asking the last node in the
|
|
circuit to extend the circuit one hop more), it's vital that the
|
|
last hop of the circuit be able to change.
|
|
|
|
Proposal 188 [Prop188] suggests that we implement a "bridge guards"
|
|
feature: making some (non-public) nodes insert an extra hop into
|
|
the circuit after themselves, in a way that will make it harder
|
|
for other nodes in the network to enumerate them. We
|
|
therefore want our circuits to be one-hop re-extensible: when the
|
|
client extends a circuit from Bob1 to Bob2, we want Bob1 to be
|
|
able to introduce a new node "Bob1.5" into the circuit such that
|
|
the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has
|
|
been called "loose source routing".)
|
|
|
|
Any new approach should be able to coexist on a circuit
|
|
with the old approach. That is, if Alice wants to build a
|
|
circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a
|
|
revised relay protocol, then Alice should be able to build a
|
|
circuit such that she can have Bob1 and Bob3 process each cell
|
|
with the current protocol, and Bob2 process it with a revised
|
|
protocol. (Why? Because if all nodes in a circuit needed to use
|
|
the same relay protocol, then each node could learn information
|
|
about the other nodes in the circuit from which relay protocol
|
|
was chosen. For example, if Bob1 supports the new protocol, and
|
|
sees that the old relay protocol is in use, and knows that Bob2
|
|
supports the new one, then Bob1 has learned that Bob3 is some
|
|
node that does not support the new relay protocol.)
|
|
|
|
Cell length needs to be constant as cells move through the
|
|
network. For historical reasons mentioned above in section 1.1,
|
|
the length of the encrypted part of a relay cell needs to be 509
|
|
bytes.
|
|
|
|
1.3. Some security requirements
|
|
|
|
Two adjacent nodes on a circuit can trivially tell that they are
|
|
on the same circuit, and the first node can trivially tell who
|
|
the client is. Other than that, we'd like any attacker who
|
|
controls a node on the circuit not to have a good way to learn
|
|
any other nodes, even if he/she controls those nodes. [*]
|
|
|
|
Relay cells should not be malleable: no relay should be able to
|
|
alter a cell between an honest sender and an honest recipient in
|
|
a way that they cannot detect.
|
|
|
|
Relay cells should be secret: nobody but the sender and recipient
|
|
of a relay cell should be able to learn what it says.
|
|
|
|
Circuits should resist transparent, recoverable tagging attacks:
|
|
if an attacker controls one node in a circuit and alters a relay
|
|
cell there, no non-adjacent point in the circuit should be able
|
|
to recover the relay cell as it would have received it had the
|
|
attacker not altered it.
|
|
|
|
The above properties should apply to sequences of cells too:
|
|
an attacker shouldn't be able to change what sequence of cells
|
|
arrives at a destination (for example, by removing, replaying, or
|
|
reordering one or more cells) without being detected.
|
|
|
|
(Feel free to substitute whatever formalization of the above
|
|
requirements makes you happiest, and add whatever caveats are
|
|
necessary to make you comfortable. I probably missed at least
|
|
two critical properties.)
|
|
|
|
[*] Of course, an attacker who controls two points on a circuit
|
|
can probably confirm this through traffic correlation. But
|
|
we'd prefer that the cryptography not create other, easier
|
|
ways for them to do this.
|
|
|
|
1.4. A note on algorithms
|
|
|
|
This document is deliberately agnostic concerning the choice of
|
|
cryptographic primitives -- not because I have no opinions about
|
|
good ciphers, MACs, and modes of operation -- but because
|
|
experience has taught me that mentioning any particular
|
|
cryptographic primitive will prevent discussion of anything else.
|
|
|
|
Please DO NOT suggest algorithms to use in implementing these
|
|
protocols yet. It will distract! There will be time later!
|
|
|
|
If somebody _else_ suggests algorithms to use, for goodness' sake
|
|
DON'T ARGUE WITH THEM! There will be time later!
|
|
|
|
|
|
2. Design 1: Large-block encryption
|
|
|
|
In this approach, we use a tweakable large-block cipher for
|
|
encryption and decryption, and a tweak-chaining function TC.
|
|
|
|
2.1. Chained large-block what now?
|
|
|
|
We assume the existence of a primitive that provides the desired
|
|
properties of a tweakable[Tweak] block cipher, taking blocks of any
|
|
desired size. (In our case, the block size is 509 bytes[*].)
|
|
|
|
It also takes a Key, and a per-block "tweak" parameter that plays
|
|
the same role that an IV plays in CBC, or that the counter plays
|
|
in counter mode.
|
|
|
|
The Tweak-chaining function TC takes as input a previous tweak, a
|
|
tweak chaining key, and a cell; it outputs a new tweak. Its
|
|
purpose is to make future cells undecryptable unless you have
|
|
received all previous cells. It could probably be something like
|
|
a MAC of the old tweak and the cell using the tweak chaining key
|
|
as the MAC key.
|
|
|
|
(If the initial tweak is secret, I am not sure that TC needs to
|
|
be keyed.)
|
|
|
|
[*] Some large-block cipher constructions use blocks whose size is
|
|
the multiple of some underlying cipher's block size. If we wind
|
|
up having to use one of those, we can use 496-byte blocks instead
|
|
at the cost of 2.5% wasted space.
|
|
|
|
2.2. The protocol
|
|
|
|
2.2.1. Setup phase
|
|
|
|
The circuit construction algorithm needs to produce forward and
|
|
backward keys Kf and Kb, the forward and backward tweak chaining
|
|
keys TCKf and TCKb, as well as initial tweak values Tf and Tb.
|
|
|
|
2.2.2. The cell format
|
|
|
|
We replace the "Digest" and "Zeros" fields of the cell with a
|
|
single Z-byte "Zeros" field to determine when the cell is
|
|
recognized and correctly decrypted; its length is a security
|
|
parameter.
|
|
|
|
2.2.3. The decryption operations
|
|
|
|
For a relay to handle an inbound RELAY cell, it sets Tb_next to
|
|
TC(TCKb, Tb, Cell). Then it encrypts the cell using the large
|
|
block cipher keyed with Kb and tweaked with Tb. Then it sets Tb
|
|
to Tb_next.
|
|
|
|
For a relay to handle an outbound RELAY cell, it sets Tf_next to
|
|
TC(TCKf, Tf, Cell). Then it decrypts the cell using the large
|
|
block cipher keyed with Kf and tweaked with Tf. Then it sets Tf
|
|
to Tf_next. Then it checks the 'Zeros' field on the cell;
|
|
if that field is all [00] bytes, the cell is for us.
|
|
|
|
2.3. Security discussion
|
|
|
|
This approach is fairly simple (at least, no more complex than
|
|
its primitives) and achieves some of our security goals. Because
|
|
of the large block cipher approach, any change to a cell will
|
|
render that cell undecryptable, and indistinguishable from random
|
|
junk. Because of the tweak chaining approach, if even one cell
|
|
is missed or corrupted or reordered, future cells will also
|
|
decrypt into random junk.
|
|
|
|
The tagging attack in this case is turned into a circuit-junking
|
|
attack: an adversary who tries to mount it can probably confirm
|
|
that he was indeed first and last node on a target circuit
|
|
(assuming that circuits which turn to junk in this way are rare),
|
|
but cannot recover the circuit after that point.
|
|
|
|
As a neat side observation, note that this approach improves upon
|
|
Tor's current forward secrecy, by providing forward secrecy while
|
|
circuits are still operational, since each change to the tweak
|
|
should make previous cells undecryptable if the old tweak value
|
|
isn't recoverable.
|
|
|
|
The length of Zeros is a parameter for what fraction of "random
|
|
junk" cells will potentially be accepted by a router or client.
|
|
If Zeros is Z bytes long, then junk cells will be accepted with
|
|
P < 2^-(8*Z + 7). (The '+7' is there because the top 7 bits of
|
|
the Length field must also be 0.)
|
|
|
|
There's no trouble using this protocol in a mixed circuit, where
|
|
some nodes speak the old protocol and some speak the
|
|
large-block-encryption protocol.
|
|
|
|
3. Design 2: short-MAC-and-pad
|
|
|
|
In this design, we behave more similarly to a mix-net design
|
|
(such as Mixmaster or Mixminion's headers). Each node checks a
|
|
MAC, and then re-pads the cell to its chosen length before
|
|
decoding the cell.
|
|
|
|
This design uses as a primitive a MAC and a stream cipher. It
|
|
might also be possible to use an authenticating cipher mode,
|
|
if we can find one that works like a stream cipher and allows us
|
|
to efficiently output authenticators for the stream so far.
|
|
|
|
NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be
|
|
replaced with an authenticated encryption mode without too much
|
|
loss of generality.
|
|
|
|
3.1. The protocol
|
|
|
|
3.1.1 Setup phase
|
|
|
|
The circuit construction algorithm needs to produce forward and
|
|
backward keys Kf and Kb, forward and backward stream cipher IVs
|
|
IVf and IVb, and forward and backward MAC keys Mf and Mb.
|
|
|
|
Additionally, the circuit construction algorithm needs a way for
|
|
the client to securely (and secretly? XXX) tell each hop in the
|
|
circuit a value 'bf' for the number of bytes of MAC it should
|
|
expect on outbound cells and 'bb' for the number of bytes it
|
|
should use on cells it's generating. Each node can get a
|
|
different 'bf' and 'bb'. These values can be 0: if a node's bf
|
|
is 0, it doesn't authenticate cells; if its bb is 0, it doesn't
|
|
originate them.
|
|
|
|
The circuit construction algorithm also needs a way to tell each
|
|
the client to securely (and secretly? XXX) tell each hop in the
|
|
circuit whether it is allowed to be the final destination for
|
|
relay cells.
|
|
|
|
Set the stream Sf and the stream Sb to empty values.
|
|
|
|
3.1.2. The cell format
|
|
|
|
The Zeros and Digest field of the cell format are removed.
|
|
|
|
3.1.3. The relay operations
|
|
|
|
Upon receiving an outbound cell, a node removes the first b bytes
|
|
of the cell, and puts them aside as 'M'. The node then computes
|
|
between 0 and 2 MACs of the stream consisting of all previously
|
|
MAC'd data, plus the remainder of the cell:
|
|
|
|
If b>0 and there is a next hop, the node computes M_relay.
|
|
|
|
If this node was told to deliver traffic, or it is the last
|
|
node in the circuit so far, the node computes M_receive.
|
|
|
|
M_relay is computed as MAC(stream | "relay"); M_receive is
|
|
computed as MAC(stream | "receive").
|
|
|
|
If M = M_receive, this cell is for the node; it should process
|
|
it.
|
|
|
|
If M = M_relay, or if b == 0, this cell should be relayed.
|
|
|
|
If a MAC was computed and neither of the above cases was met,
|
|
then the cell is bogus; the node should discard it and destroy
|
|
the circuit.
|
|
|
|
The node then removes the first bf bytes of the cell, and pads the
|
|
cell at the end with bf zero bytes. Finally, the node decrypts
|
|
the whole remaining padded cell with the stream cipher.
|
|
|
|
To handle an inbound cell, the node simply does a stream cipher
|
|
with no checking.
|
|
|
|
3.1.4. Generating inbound cells.
|
|
|
|
To generate an inbound cell, a node makes a 509-bb byte RELAY
|
|
cell C, encrypts that cell with Kb, appends the encrypted cell to
|
|
Sb, and prepends M_receive(Sb) to the cell.
|
|
|
|
3.1.5. Generating outbound cells
|
|
|
|
Generating an outbound cell is harder, since we need to know what
|
|
padding the earlier nodes will generate in order to know what
|
|
padding the later nodes will receive and compute their MACs, but
|
|
we also need to know what MACs we'll send to the later nodes in
|
|
order to compute which MACs we'll send to the earlier ones.
|
|
|
|
Mixnet clients have needed to do this for ages, though, so the
|
|
algorithms are pretty well settled. I'll give one below in A.3.
|
|
|
|
3.2. Security discussion
|
|
|
|
This approach is also simple and (given good parameter choices)
|
|
can achieve our goals. The trickiest part is the algorithm that
|
|
clients must follow to package cells, but that's not so bad.
|
|
|
|
It's not necessary to check MACs on inbound traffic, because
|
|
nobody but the client can tell scrambled messages from good ones,
|
|
and the client can be trusted to keep the client's own secrets.
|
|
|
|
With this protocol, if the attacker tries to do a tagging attack,
|
|
the circuit should get destroyed by the next node in the circuit
|
|
that has a nonzero "bf" value, with probability == 1-2^-(8*bf).
|
|
(If there are further intervening honest nodes, they also have a
|
|
chance to detect the attack.)
|
|
|
|
Similarly, any attempt to replay, or reorder outbound cells
|
|
should fail similarly.
|
|
|
|
The "bf" values could reveal to each node its position in the
|
|
circuit and the client preferences, depending on how we set them;
|
|
on the other hand, having a fixed bf value would reveal to the
|
|
last node the length of the circuit. Neither option seems great.
|
|
|
|
This protocol doesn't provide any additional forward secrecy
|
|
beyond what Tor has today. We could fix that by changing our use
|
|
of the stream cipher so that instead of running in counter mode
|
|
between cells, we use a tweaked stream cipher and change the
|
|
tweak with each cell (possibly based on the unused portion of the
|
|
MAC).
|
|
|
|
This protocol does support loose source routing, so long as
|
|
no padding bytes are added by any router-added nodes.
|
|
|
|
In a circuit, every node has at least one relay cell sent to it:
|
|
even non-exit nodes get a RELAY_EXTEND cell.
|
|
|
|
4. Discussion
|
|
|
|
I'm not currently seeing a reason to strongly prefer one of these
|
|
approaches over another.
|
|
|
|
In favor of large-block encryption:
|
|
- The space overhead seems smaller: we need to use up fewer
|
|
bytes in order to get equivalent looking security.
|
|
|
|
(For example, if we want to have P < 2^64 that a cell altered
|
|
by hop 1 could be accepted by hop 2 or hop 3, *and* we want P
|
|
< 2^64 that a cell altered by hop 2 could be accepted by hop
|
|
3, the large-block approach needs about 8 bytes for the Zeros
|
|
field, whereas the short-MAC-and-pad approach needs 16 bytes
|
|
worth of MACs.)
|
|
|
|
- We get forward secrecy pretty easily.
|
|
|
|
- The format doesn't leak anything about the length of the
|
|
circuit, or limit it.
|
|
|
|
- We don't need complicated logic to set the 'b' parameters.
|
|
|
|
- It doesn't need tricky padding code.
|
|
|
|
In the favor of short-MAC-and-pad:
|
|
- All of the primitives used are much better analyzed and
|
|
understood. There's nothing esoteric there. The format
|
|
itself is similar to older, well-analyzed formats.
|
|
|
|
- Most of the constructions for the large-block-cipher approach
|
|
seem less efficient in CPU cycles than a good stream cipher
|
|
and a MAC. (But I don't want to discuss that now; see section
|
|
1.4 above!)
|
|
|
|
Unclear:
|
|
|
|
- Suppose that an attacker controls the first and last hop of a
|
|
circuit, and tries an end-to-end tagging attack. With
|
|
large-block encryption, the tagged cell and all future cells
|
|
on the circuit turn to junk after the tagging attack, with
|
|
P~~100%. With short-MAC-and-pad, the circuit is destroyed at
|
|
the second hop, with P ~~ 1- 2^(-8*bf). Is one of these
|
|
approaches materially worse for the attacker?
|
|
|
|
- Can we do better than the "compute two MACs" approach for
|
|
distinguishing the relay and receive cases of the
|
|
short-MAC-and-pad protocol?
|
|
|
|
- To what extent can we improve these protocols?
|
|
|
|
- If we do short-MAC-and-pad, should we apply the forward
|
|
security hack alluded to in section 3.2?
|
|
|
|
5. Acknowledgments
|
|
|
|
Thanks to the many reviewers of the initial drafts of this
|
|
proposal. If you can make any sense of what I'm saying, they
|
|
deserve much of the credit.
|
|
|
|
A. Formal description
|
|
|
|
Note that in all these cases, more efficient descriptions exist.
|
|
|
|
A.1. The current Tor relay protocol.
|
|
|
|
Relay cell format:
|
|
|
|
Relay command [1 byte]
|
|
Zeros [2 bytes]
|
|
StreamID [2 bytes]
|
|
Digest [4 bytes]
|
|
Length [2 bytes]
|
|
Data [498 bytes]
|
|
|
|
Circuit setup:
|
|
|
|
(Specified elsewhere; the client negotiates with each router in
|
|
a circuit the secret AES keys Kf, Kb, and the secret 'digest
|
|
keys' Df, and Db. They initialize AES counters Cf and Cb to
|
|
0. They initialize the digest stream Sf to Df, and Sb to Db.)
|
|
|
|
HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]):
|
|
|
|
1. If the Zeros field of Cell is not [00 00], return False.
|
|
2. Let Cell' = Cell with its Digest field set to [00 00 00 00].
|
|
3. Let S' = Stream | Cell'.
|
|
4. If SHA1(S') = the Digest field of Cell, set Stream to S',
|
|
and return True.
|
|
5. Otherwise return False.
|
|
|
|
HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
|
|
|
|
1. Set the Zeros and Digest field of Cell' to [00] bytes.
|
|
2. Set Stream to Stream | Cell'.
|
|
3. Construct Cell from Cell' by setting the Digest field to
|
|
SHA1(Stream), and taking all other fields as-is.
|
|
4. Return Cell.
|
|
|
|
HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
|
|
1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
|
|
counter 'Ctr'. Increment 'Ctr' by the length of the cell.
|
|
|
|
HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
|
|
1. Same as ENCRYPT.
|
|
|
|
|
|
Router operation, upon receiving an inbound cell -- that is, one
|
|
sent towards the client.
|
|
|
|
1. ENCRYPT(cell, Kb, Cb)
|
|
2. Send the decrypted contents towards the client.
|
|
|
|
Router operation, upon receiving an outbound cell -- that is, one
|
|
sent away from the client.
|
|
|
|
1. DECRYPT(cell, Kf, Cf)
|
|
2. If CHECK(Cell, Sf) is true, this cell is for us. Do not
|
|
relay the cell.
|
|
3. Otherwise, this cell is not for us. Send the decrypted cell
|
|
to the next hop on the circuit, or discard it if there is no
|
|
next hop.
|
|
|
|
Router operation, to create a relay cell that will be delivered
|
|
to the client.
|
|
|
|
1. Construct a Relay cell Cell' with the relay command, length,
|
|
stream ID, and body fields set as appropriate.
|
|
2. Let Cell = CONSTRUCT(Cell', Sb).
|
|
3. ENCRYPT(Cell, Kb, Cb)
|
|
4. Send the encrypted cell towards the client.
|
|
|
|
Client operation, receiving an inbound cell.
|
|
|
|
For each hop H in a circuit, starting with the first hop and
|
|
ending (possibly) with the last:
|
|
|
|
1. DECRYPT(Cell, Kb_H, Cb_H)
|
|
|
|
2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop
|
|
H. Exit the loop, and return the cell in its current
|
|
form.
|
|
|
|
If we reach the end of the loop without finding the right hop,
|
|
the cell is bogus or corrupted.
|
|
|
|
Client operation, sending an outbound cell to hop H.
|
|
|
|
1. Construct a Relay cell Cell' with the relay command, length,
|
|
stream ID, and body fields set as appropriate.
|
|
2. Let Cell = CONSTRUCT(Cell', Sf_H)
|
|
3. For i = H..1:
|
|
A. ENCRYPT(Cell, Sf_i, Cf_i)
|
|
4. Deliver Cell to the first hop in the circuit.
|
|
|
|
A.2. The large-block-cipher protocol
|
|
|
|
Same as A.1, except for the following changes.
|
|
|
|
The cell format is now:
|
|
Zeros [Z bytes]
|
|
Length [2 bytes]
|
|
StreamID [2 bytes]
|
|
Relay command [1 byte]
|
|
Data [504-Z bytes]
|
|
|
|
Ctr is no longer a counter, but a Tweak value.
|
|
|
|
Each key is now a tuple of (Key_Crypt, Key_TC)
|
|
|
|
Streams are no longer used.
|
|
|
|
HELPER FUNCTION: CHECK(Cell [in], Stream [in,out])
|
|
1. If the Zeros field of Cell contains only [00] bytes,
|
|
return True.
|
|
2. Otherwise return false.
|
|
|
|
HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out])
|
|
1. Let Cell be Cell', with its "Zeros" field set to Z [00]
|
|
bytes.
|
|
2. Return Cell'.
|
|
|
|
HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out])
|
|
2. Encrypt Cell using Key and Tweak
|
|
1. Let Tweak' = TC(Key_TC, Tweak, Cell)
|
|
3. Set Tweak to Tweak'.
|
|
|
|
HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out])
|
|
1. Let Tweak' = TC(Key_TC, Tweak, Cell)
|
|
2. Decrypt Cell using Key and Tweak
|
|
3. Set Tweak to Tweak'.
|
|
|
|
A.3. The short-MAC-and-pad protocol.
|
|
|
|
Define M_relay(K,S) as MAC(K, S|"relay").
|
|
Define M_receive(K,S) as MAC(K, S|"receive").
|
|
Define Z(n) as a series of n [00] bytes.
|
|
Define BODY_LEN as 509
|
|
|
|
The cell message format is now:
|
|
|
|
Relay command [1 byte]
|
|
StreamID [2 bytes]
|
|
Length [2 bytes]
|
|
Data [variable bytes]
|
|
|
|
Helper function: CHECK(Cell [in], b [in], K [in], S [in,out]):
|
|
Let M = Cell[0:b]
|
|
Let Rest = Cell[b:...]
|
|
If b == 0:
|
|
Return (nil, Rest)
|
|
Let S' = S | Rest
|
|
If M == M_relay(K,S')[0:b]:
|
|
Set S = S'
|
|
Return ("relay", Rest)
|
|
If M == M_receive(K,S')[0:b]:
|
|
Set S = S'
|
|
Return ("receive", Rest)
|
|
Return ("BAD", nil)
|
|
|
|
HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out])
|
|
1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and
|
|
counter 'Ctr'. Increment 'Ctr' by the length of the cell.
|
|
|
|
HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out])
|
|
1. Same as ENCRYPT.
|
|
|
|
Router operation, upon receiving an inbound cell:
|
|
1. ENCRYPT(cell, Kb, Cb)
|
|
2. Send the decrypted contents towards the client.
|
|
|
|
Router operation, upon receiving an outbound cell:
|
|
1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf)
|
|
2. If Status == "BAD", drop the cell and destroy the circuit.
|
|
3. Let Cell' = Msg | Z(BODY_LEN - len(Msg))
|
|
4. DECRYPT(Cell', Kf, Cf) [*]
|
|
5. If Status == "receive" or (Status == nil and there is no
|
|
next hop), Cell' is for us: process it.
|
|
6. Otherwise, send Cell' to the next node.
|
|
|
|
Router operation, sending a cell towards the client:
|
|
1. Let Body = a relay cell body of BODY_LEN-bb_i bytes.
|
|
2. Let Cell' = ENCRYPT(Body, Kb, Cb)
|
|
3. Let Sb = Sb | Cell'
|
|
4. Let M = M_receive(Mb, Sb)[0:b]
|
|
5. Send the cell M | Cell' back towards the client.
|
|
|
|
Client operation, upon receiving an inbound cell:
|
|
|
|
For each hop H in the circuit, from first to last:
|
|
|
|
1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i)
|
|
2. If Status = "relay", drop the cell and destroy
|
|
the circuit. (BAD is okay; it means that this hop didn't
|
|
originate the cell.)
|
|
3. DECRYPT(Msg, Kb_i, Cb_i)
|
|
4. If Status = "receive", this cell is from hop i; process
|
|
it.
|
|
5. Otherwise, set Cell = Msg.
|
|
|
|
Client operation, sending an outbound cell:
|
|
|
|
Let BF = the total of all bf_i values.
|
|
|
|
1. Construct a relay cell body Msg of BODY_LEN-BF bytes.
|
|
2. For each hop i, let Stream_i = ENCRYPT(Kf_i,Z(CELL_LEN),Cf_i)
|
|
3. Let Pad_0 = "".
|
|
4. For i in range 1..N, where N is the number of hops:
|
|
Let Pad_i = Pad_{i-1} | Z(bf_i)
|
|
Let S_last = the last len(Pad_i) bytes of Stream_i.
|
|
Let Pad_i = Pad_i xor S_last
|
|
Now Pad_i is the padding as it will stand after node i
|
|
has processed it.
|
|
|
|
5. For i in range N..1, where N is the number of hops:
|
|
If this is the last hop, let M_* = M_receive. Else let
|
|
M_* = M_relay.
|
|
|
|
Let Body = Msg xor the first len(Msg) bytes of Stream_i
|
|
|
|
Let M = M_*(Mf, Body | Pad_(i-1))
|
|
|
|
Set Msg = M[:bf_i] | Body
|
|
|
|
6. Send Msg outbound to the first relay in the circuit.
|
|
|
|
|
|
[*] Strictly speaking, we could omit the pad-and-decrypt
|
|
operation once we know we're the final hop.
|
|
|
|
|
|
|
|
R. References
|
|
|
|
[Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses
|
|
https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-guards.txt
|
|
|
|
[TorSpec] The Tor Protocol Specification
|
|
https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.txt
|
|
|
|
[TorDesign] Dingledine et al, "Tor: The Second Generation Onion
|
|
Router",
|
|
https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf
|
|
|
|
[Tweak] Liskov et al, "Tweakable Block Ciphers",
|
|
http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf
|
|
|
|
[XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity"
|
|
|
|
[23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging
|
|
Attacks" http://archives.seul.org/or/dev/Mar-2012/msg00019.html
|
|
(You'll want to read the rest of the thread too.)
|