mirror of
https://github.com/torproject/torspec.git
synced 2024-12-12 04:35:37 +00:00
523 lines
23 KiB
Plaintext
523 lines
23 KiB
Plaintext
Filename: 188-bridge-guards.txt
|
|
Title: Bridge Guards and other anti-enumeration defenses
|
|
Author: Nick Mathewson, Isis Lovecruft
|
|
Created: 14 Oct 2011
|
|
Modified: 10 Sep 2015
|
|
Status: Reserve
|
|
|
|
[NOTE: This proposal is marked as "reserve" because the enumeration
|
|
technique it addresses does not currently seem to be in use. See
|
|
ticket tor#7144 for more information. (2020 July 31)]
|
|
|
|
|
|
1. Overview
|
|
|
|
Bridges are useful against censors only so long as the adversary
|
|
cannot easily enumerate their addresses. I propose a design to make
|
|
it harder for an adversary who controls or observes only a few
|
|
nodes to enumerate a large number of bridges.
|
|
|
|
Briefly: bridges should choose guard nodes, and use the Tor
|
|
protocol's "Loose source routing" feature to re-route all extend
|
|
requests from clients through an additional layer of guard nodes
|
|
chosen by the bridge. This way, only a bridge's guard nodes can
|
|
tell that it is a bridge, and the attacker needs to run many more
|
|
nodes in order to enumerate a large number of bridges.
|
|
|
|
I also discuss other ways to avoid enumeration, recommending some.
|
|
|
|
These ideas are due to a discussion at the 2011 Tor Developers'
|
|
Meeting in Waterloo, Ontario. Practically none of the ideas here
|
|
are mine; I'm just writing up what I remember.
|
|
|
|
2. History and Motivation
|
|
|
|
Under the current bridge design, an attacker who runs a node can
|
|
identify bridges by seeing which "clients" make a large number of
|
|
connections to it, or which "clients" make connections to it in the
|
|
same way clients do. This has been a known attack since early
|
|
versions {XXXX check} of the design document; let's try to fix it.
|
|
|
|
2.1. Related idea: Guard nodes
|
|
|
|
The idea of guard nodes isn't new: since 0.1.1, Tor has used guard
|
|
nodes (first designed as "Helper" nodes by Wright et al in {XXXX})
|
|
to make it harder for an adversary who controls a smaller number of
|
|
nodes to eavesdrop on clients. The rationale was: an adversary who
|
|
controls or observes only one entry and one exit will have a low
|
|
probability of correlating any single circuit, but over time, if
|
|
clients choose a random entry and exit for each circuit, such an
|
|
adversary will eventually see some circuits from each client with a
|
|
probability of 1, thereby building a statistical profile of the
|
|
client's activities. Therefore, let each client choose its entry
|
|
node only from among a small number of client-selected "guard"
|
|
nodes: the client is still correlated with the same probability as
|
|
before, but now the client has a nonzero chance of remaining
|
|
unprofiled.
|
|
|
|
2.2. Related idea: Loose source routing
|
|
|
|
Since the earliest versions of Onion Routing, the protocol has
|
|
provided "loose source routing". In strict source routing, the
|
|
source of a message chooses every hop on the message's path. But
|
|
in loose source routing, the message traverses the selected nodes,
|
|
but may also traverse other nodes as well. In other words, the
|
|
client selects nodes N_a, N_b, and N_c, but the message may in fact
|
|
traverse any sequence of nodes N_1...N_j, so long as N_1=N_a,
|
|
N_x=N_b, and N_y=N_c, for 1 < x < y.
|
|
|
|
Tor has retained this feature, but has not yet made use of it.
|
|
|
|
3. Design
|
|
|
|
Every bridge currently chooses a set of guard nodes for its
|
|
circuits. Bridges should also re-route client circuits through
|
|
these circuits.
|
|
|
|
Specifically, when a bridge receives a request from a client to
|
|
extend a circuit, it should first create a circuit to its guard,
|
|
and then relay that extend cell through the guard. The bridge
|
|
should add an additional layer of encryption to outgoing cells on
|
|
that circuit corresponding to the encryption that the guard will
|
|
remove, and remove a layer of encryption on incoming cells on that
|
|
circuit corresponding to the encryption that the guard will add.
|
|
|
|
3.1. Loose-Source Routed Circuit Construction
|
|
|
|
Alice, an OP, is using a bridge, Bob, and she has chosen the
|
|
following path through the network:
|
|
|
|
Alice -> Bob -> Charlie -> Deidra
|
|
|
|
However, Bob has decided to take advantage of the loose-source
|
|
routing circuit characteristic (for example, in order to use a bridge
|
|
guard), and Bob has chosen N additional loose-source routed hop(s),
|
|
through which he will transparently relays cells.
|
|
|
|
NOTE: For the purposes of bridge guards, N is always 1. However, for
|
|
completion's sake, the following details of the circuit construction
|
|
are generalized to include N > 1. Additionally, the following steps
|
|
should hold for a hop at any position in Alice's circuit that has
|
|
decided to take advantage of the loose-source routing feature, not
|
|
only for bridge ORs.
|
|
|
|
From Alice's perspective, her circuit path matches the one diagrammed
|
|
above. However, the overall path of the circuit is:
|
|
|
|
Alice -> Bob -> Guillaume -> Charlie -> Deidra
|
|
|
|
From Bob's perspective, the circuit's path is:
|
|
|
|
Alice -> Bob -> Guillaume -> Charlie -> UNKNOWN
|
|
|
|
Interestingly, because Bob's behaviour towards Guillaume and choices
|
|
of cell types is that of a normal OP, Guillaume's perspective of the
|
|
circuit's path is:
|
|
|
|
Bob -> Guillaume -> Charlie -> UNKNOWN
|
|
|
|
That is, to Guillaume, Bob appears (for the most part) to be a
|
|
normally connecting client. (See §4.1 for more detailed analysis.)
|
|
|
|
3.1.1. Detailed Steps of Loose-Source Routed Circuit Construction
|
|
|
|
1. Connection from OP
|
|
|
|
Alice has connected to Bob, and she has sent to Bob either a
|
|
CREATE/CREATE_FAST or CREATE2 cell.
|
|
|
|
2. Loose-Source Path Selection
|
|
|
|
In anticipation of Alice's first RELAY_EARLY cell (which will
|
|
contain an EXTEND cell to Alice's next hop), Bob begins
|
|
constructing a loose-source routed circuit. To do so, Bob chooses
|
|
N additional hop(s):
|
|
|
|
2.a. For the first additional hop, H_1, Bob chooses a suitable
|
|
entry guard node, Guillaume, using the same algorithm as OPs.
|
|
See "§5 Guard nodes" of path-spec.txt for additional
|
|
information on the selection algorithm.
|
|
|
|
2.b. Each additional hop, [H_2, ..., H_N], is chosen at random
|
|
from a list of suitable, non-excluded ORs.
|
|
|
|
3. Loose-Source Routed Circuit Extension and Cell Types
|
|
|
|
Bob now follows the same procedure as OPs use to complete the key
|
|
exchanges with his chosen additional hop(s).
|
|
|
|
While undergoing these following substeps, Bob SHOULD continue to
|
|
proceed with Step 4, below, in parallel, as an optimization for
|
|
speeding up circuit construction.
|
|
|
|
3.a. Create Cells
|
|
|
|
Bob sends the appropriate type of create cell to Guillaume.
|
|
For ORs new enough to support the NTor handshake (nearly all
|
|
of them at this point), Bob sends a CREATE2 cell. Otherwise,
|
|
for ORs which only support the older TAP handshake, Bob sends
|
|
either a CREATE or CREATE_FAST cell, using the same
|
|
decision-making logic as OPs.
|
|
|
|
See §4.1 for more information the distinguishability of
|
|
bridges based upon whether they use CREATE versus
|
|
CREATE_FAST. Also note that the CREATE2 cell has since
|
|
become ubiquitous after this proposal was originally drafted.
|
|
Thus, because we prefer ORs which use loose-source routing to
|
|
behave (as much as possible) like OPs, we now prefer to use
|
|
CREATE2.
|
|
|
|
3.b. Created Cells
|
|
|
|
Later, when Bob receives a corresponding CREATED/CREATED_FAST
|
|
or CREATED2 cell from Guillaume, Bob extracts key material
|
|
for the shared forward and reverse keys, KG_f and KG_b,
|
|
respectively.
|
|
|
|
3.c. Extend Cells
|
|
|
|
When N > 1, for each additional hop, H_i, in [H_2, ..., H_N],
|
|
Bob chooses the appropriate type of extend cell for H_i, and
|
|
sends this extend cell to H_i-1, who transforms it into a
|
|
create cell in order to perform the extension. To choose
|
|
which type of extend cell to send, Bob uses the same
|
|
algorithm as an OP to determine whether to use EXTEND or
|
|
EXTEND2. Similar to the CREATE* cells above, for most modern
|
|
ORs, this will very likely mean an EXTEND2 cell.
|
|
|
|
3.d. Extended Cells
|
|
|
|
When a corresponding EXTENDED/EXTENDED2 cell is received for
|
|
an additional hop, H_i, Bob extracts the shared forward and
|
|
reverse keys, Ki_f and Ki_b, respectively.
|
|
|
|
4. Responding to the OP
|
|
|
|
Now that the additional hops in Bob's loose-source routed circuit
|
|
are chosen, and construction of the loose-source routed circuit
|
|
has begun, Bob answers Alice's original CREATE/CREATE_FAST or
|
|
CREATE2 cell (from Step 1) by sending the corresponding created
|
|
cell type.
|
|
|
|
Alice has now built a circuit through Bob, and the two share the
|
|
negotiated forward and reverse keys, KB_n and KB_p, respectively.
|
|
|
|
Note that Bob SHOULD do this step in tandem with the loose-source
|
|
routed circuit construction procedure outlined in Step 3, above.
|
|
|
|
5. OP Circuit Extension
|
|
|
|
Alice then wants to extend the circuit to node Charlie. She makes
|
|
a hybrid-encrypted onionskin, encrypted to Charlie's public key,
|
|
containing her chosen g^x value. She puts this in an extend cell:
|
|
"Extend (Charlie's address) (Charlie's OR Port) (Onionskin)
|
|
(Charlie's ID)". She encrypts this with KB_n and sends it as a
|
|
RELAY_EARLY cell to Bob.
|
|
|
|
Bob's behaviour is now dependent on whether the loose-source
|
|
routed circuit construction steps (as outlined in Step 3, above)
|
|
have already completed.
|
|
|
|
5.a. The Loose-Source Routed Circuit Construction is Incomplete
|
|
|
|
If Bob has not yet finished the loose-source routed circuit
|
|
construction, then Bob MUST store the first outgoing
|
|
(i.e. exitward) RELAY_EARLY cell received from Alice until
|
|
the loose-source routed circuit construction has been
|
|
completed.
|
|
|
|
If any incoming (i.e. toward the OP) RELAY* cell is received
|
|
while the loose-source routed circuit is not fully
|
|
constructed, Bob MUST drop the cell.
|
|
|
|
If Bob has already stored Alice's first RELAY_EARLY cell, and
|
|
Alice sends any additional RELAY* cell, then Bob SHOULD mark
|
|
the entire circuit for close with END_CIRC_REASON_TORPROTOCOL.
|
|
|
|
5.b. The Loose-Source Routed Circuit Construction is Completed
|
|
|
|
Later, when the loose-source routed circuit is fully
|
|
constructed, Bob MUST send any stored cells from Alice
|
|
outward by following the procedure described in Step 6.a.
|
|
|
|
6. Relay Cells
|
|
|
|
When receiving a RELAY* cell in either direction, Bob MAY keep
|
|
statistics on the number of relay cells encountered, as well as
|
|
the number of relay cells relayed.
|
|
|
|
6.a. Outgoing Relay Cells
|
|
|
|
Bob decrypts the RELAY* cell with KB_n. If the cell becomes
|
|
recognized, Bob should now follow the relay command checks
|
|
described in Step 6.c.
|
|
|
|
Bob MUST encrypt the relay cell's underlying payload to each
|
|
additional hop in the loose-source routed circuit, in
|
|
reverse: for each additional hop, H_i, in [H_N, ..., H_1],
|
|
Bob encrypts the relay cell payload to Ki_f, the shared
|
|
forward key for the hop H_i.
|
|
|
|
Bob MUST update the forward digest, DG_f, of the relay cell,
|
|
regardless of whether or not the cell is recognized. See
|
|
6.c. for additional information on recognized cells.
|
|
|
|
Bob now sends the cell outwards through the additional hops.
|
|
At each hop, H_i, the hop removes a layer of the onionskin by
|
|
decrypting the cell with Ki_f, and then hop H_i forwards the
|
|
cell to the next addition additional hop H_i+1. When the
|
|
final additional hop, H_N, received the cell, the OP's cell
|
|
command and payload should be processed by H_N in the normal
|
|
manner for an OR.
|
|
|
|
6.b. Incoming Relay Cells
|
|
|
|
Bob MUST decrypt the relay cell's underlying payload from
|
|
each additional hop in the loose-source routed circuit (in
|
|
forward order, this time): For each additional hop, H_i, in
|
|
[H_1, ..., H_N], Bob decrypts the relay cell payload with
|
|
Ki_b, the shared backward key for the hop H_i.
|
|
|
|
If the cell has becomes recognized after all decryptions, Bob
|
|
should now follow the relay command checks described in Step
|
|
6.c.
|
|
|
|
Bob MUST update the backward digest, DG_b, of the relay cell,
|
|
regardless of whether or not the cell is recognized. See
|
|
6.c. for additional information on recognized cells.
|
|
|
|
Bob encrypts the cell towards the OP with KB_p, and sends the
|
|
cell inwards.
|
|
|
|
6.c. Recognized Cells
|
|
|
|
If a relay cell, either incoming or outgoing, becomes
|
|
recognized (i.e. Bob sees that the cell was intended for him)
|
|
after decryption, and there is no stream attached to the
|
|
circuit, then Bob SHOULD mark the circuit for close if the
|
|
relay command contained within the cell is any of the
|
|
following types:
|
|
|
|
- RELAY_BEGIN
|
|
- RELAY_CONNECTED
|
|
- RELAY_END
|
|
- RELAY_RESOLVE
|
|
- RELAY_RESOLVED
|
|
- RELAY_BEGIN_DIR
|
|
|
|
Apart from the above checks, Bob SHOULD essentially treat
|
|
every cell as "unrecognized" by following the en-/de-cryption
|
|
procedures in Steps 6.a. and 6.b. regardless of whether the
|
|
cell is actually recognized or not. That is, since this is a
|
|
loose-source routed circuit, Bob SHOULD relay cells not
|
|
intended for him *and* cells intended for him through the
|
|
leaky pipe, no matter what the cell's underlying payload and
|
|
command are.
|
|
|
|
3.1.2. Example Loose-Source Circuit Construction
|
|
|
|
For example, given the following circuit path chosen by Alice:
|
|
|
|
Alice -> Bob -> Charlie -> Deidra
|
|
|
|
when Alice wishes to extend to node Charlie, and Bob the bridge is
|
|
using only one additional loose-source routed hop, Guillaume, as his
|
|
bridge guard, the following steps are taken:
|
|
|
|
- Alice packages the extend into a RELAY_EARLY cell and encrypts
|
|
the RELAY_EARLY cell with KB_f to Bob.
|
|
|
|
- Bob receives the RELAY_EARLY cell from Alice, and he follows
|
|
the procedure (outlined in §3.1.1. Step 6.a.) by:
|
|
|
|
* Decrypting the cell with KB_f,
|
|
* Encrypting the cell to the forward key, KG_f, which Bob
|
|
shares with his guard node, Guillaume,
|
|
* Updating the cell forward digest, DG_f, and
|
|
* Sending the cell as a RELAY_EARLY cell to Guillaume.
|
|
|
|
- When Guillaume receives the cell from Bob, he processes it by:
|
|
|
|
* Decrypting the cell with KG_f. Guillaume now sees that it
|
|
is a RELAY_EARLY cell containing an extend cell "intended"
|
|
for him, containing: "Extend (Charlie's address) (Charlie's
|
|
OR Port) (Onionskin) (Charlie's ID)".
|
|
* Performing the circuit extension to the specified node,
|
|
Charlie, by acting accordingly: creating a connection to
|
|
Charlie if he doesn't have one, ensuring that the ID is as
|
|
expected, and then sending the onionskin in a create cell
|
|
on that connection. Note that Guillaume is behaving
|
|
exactly as a regular node would upon receiving an Extend
|
|
cell.
|
|
* Now the handshake finishes. Charlie receives the onionskin
|
|
and sends Guillaume "CREATED g^y,KH".
|
|
* Making an extended cell for Bob which contains
|
|
"E(KG_b, EXTENDED g^y KH)", and
|
|
* Sending the extended cell to Bob. Note that Charlie and
|
|
Guillaume are both still behaving in a manner identical to
|
|
regular ORs.
|
|
|
|
- Bob receives the extended cell from Guillaume, and he follows
|
|
the procedure (outlined in §3.1.1. Step 6.b.) by:
|
|
|
|
* Decrypting the cell with KG_b,
|
|
* Encrypting the cell to Alice with KB_b,
|
|
* Updating the cell backward digest, DG_b, and
|
|
* Sending the cell to Alice.
|
|
|
|
- Alice receives the cell, and she decrypts it with KB_b, just
|
|
as she would have if Bob had extended to Charlie directly.
|
|
She then processes the extended cell contained within to
|
|
extract shared keys with Charlie. Note that Alice's behaviour
|
|
is identical to regular OPs.
|
|
|
|
3.2. Additional Notes on the Construction
|
|
|
|
Note that this design does not require that our stream cipher
|
|
operations be commutative, even though they are.
|
|
|
|
Note also that this design requires no change in behavior from any
|
|
node other than Bob, and as we can see in the above example in §3.1.2
|
|
for Alice's circuit extension, Alice, Guillaume, and Charlie behave
|
|
identical to a normal OP and normal ORs.
|
|
|
|
Finally, observe that even though the circuit N hops longer than it
|
|
would be otherwise, no relay's count of permissible RELAY_EARLY cells
|
|
falls lower than it otherwise would. This is because the extra hop
|
|
that Bob adds is done with RELAY_EARLY cells, then he continues to
|
|
relay Alice's cells as RELAY_EARLY, until the appropriate maximum
|
|
number of RELAY_EARLY cells is reached. Afterwards, further
|
|
RELAY_EARLY cells from Alice are repackaged by Bob as normal RELAY
|
|
cells.
|
|
|
|
4. Alternative designs
|
|
|
|
4.1. Client-enforced bridge guards
|
|
|
|
What if Tor didn't have loose source routing? We could have
|
|
bridges tell clients what guards to use by advertising those guard
|
|
in their descriptors, and then refusing to extend circuits to any
|
|
other nodes. This change would require all clients to upgrade in
|
|
order to be able to use the newer bridges, and would quite possibly
|
|
cause a fair amount of pain along the way.
|
|
|
|
Fortunately, we don't need to go down this path. So let's not!
|
|
|
|
4.2. Separate bridge-guards and client-guards
|
|
|
|
In the design above, I specify that bridges should use the same
|
|
guard nodes for extending client circuits as they use for their own
|
|
circuits. It's not immediately clear whether this is a good idea
|
|
or not. Having separate sets would seem to make the two kinds of
|
|
circuits more easily distinguishable (even though we already assume
|
|
they are distinguishable). Having different sets of guards would
|
|
also seem like a way to keep the nodes who guard our own traffic
|
|
from learning that we're a bridge... but another set of nodes will
|
|
learn that anyway, so it's not clear what we'd gain.
|
|
|
|
One good reason to keep separate guard lists is to prevent the
|
|
*client* of the bridge from being able to enumerate the guards that
|
|
the bridge uses to protect its own traffic (by extending a circuit
|
|
through the bridge to a node it controls, and finding out where the
|
|
extend request arrives from).
|
|
|
|
5. Additional bridge enumeration methods and protections
|
|
|
|
In addition to the design above, there are more ways to try to
|
|
prevent enumeration.
|
|
|
|
Right now, there are multiple ways for the node after a bridge to
|
|
distinguish a circuit extended through the bridge from one
|
|
originating at the bridge. (This lets the node after the bridge
|
|
tell that a bridge is talking to it.)
|
|
|
|
5.1. Make it harder to tell clients from bridges
|
|
|
|
When using the older TAP circuit handshake protocol, one of the
|
|
giveaways is that the first hop in a circuit is created with
|
|
CREATE_FAST cells, but all subsequent hops are created with CREATE
|
|
cells.
|
|
|
|
However, because nearly everything in the network now uses the newer
|
|
NTor circuit handshake protocol, clients send CREATE2 cells to all
|
|
hops, regardless of position. Therefore, in the above design, it's
|
|
no longer quite so simple to distinguish an OP connecting through
|
|
bridge from an actual OP, since all of the circuits that extend
|
|
through a bridge now reach its guards through CREATE2 cells (whether
|
|
the bridge originated them or not), and only as a fallback (e.g. if
|
|
an additional node in the loose-source routed path does not support
|
|
NTor) will the bridge ever use CREATE/CREATE_FAST. (Additionally,
|
|
when using the fallback mathod, the behaviour for choosing either
|
|
CREATE or CREATE_FAST is identical to normal OP behaviour.)
|
|
|
|
The CREATE/CREATE_FAST distinction is not the only way for a
|
|
bridge's guard to tell bridges from orginary clients, however.
|
|
Most importantly, a busy bridge will open far more circuits than a
|
|
client would. More subtly, the timing on response from the client
|
|
will be higher and more highly variable that it would be with an
|
|
ordinary client. I don't think we can make bridges behave wholly
|
|
indistinguishably from clients: that's why we should go with guard
|
|
nodes for bridges.
|
|
|
|
[XXX For further research: we should study the methods by which a
|
|
bridge guard can determine that they are acting as a guard for a
|
|
bridge, rather than for a normal OP, and which methods are likely to
|
|
be more accurate or efficient than others. -IL]
|
|
|
|
5.2. Bridge Reachability Testing
|
|
|
|
Currently, a bridge's reachability is tested both by the bridge
|
|
itself (called "self-testing") and by the BridgeAuthority.
|
|
|
|
5.2.1. Bridge Reachability Self-Testing
|
|
|
|
Before a bridge uploads its descriptors to the BridgeAuthority, it
|
|
creates a special type of testing circuit which ends at itself:
|
|
|
|
Bob -> Guillaume -> Charlie -> Bob
|
|
|
|
Thus, going to all this trouble to later use loose-source routing in
|
|
order to relay Alice's traffic through Guillaume (rather than
|
|
connecting directly to Charlie, as Alice intended) is diminished by
|
|
the fact that Charlie can still passively enumerate bridges by
|
|
waiting to be asked to connect to a node which is not contained
|
|
within the consensus.
|
|
|
|
We could get around this option by disabling self-testing for bridges
|
|
entirely, by automatically setting "AssumeReachable 1" for all bridge
|
|
relays… although I am not sure if this is wise.
|
|
|
|
Our best idea thus far, for bridge reachability self-testing, is to create
|
|
a circuit like so:
|
|
|
|
Bridge → Guard → Middle → OtherMiddle → Guard → Bridge
|
|
|
|
While, clearly, that circuit is just a little bit insane, it must be that
|
|
way because we cannot simply do:
|
|
|
|
Bridge → Guard → Middle → Guard → Bridge
|
|
|
|
because the Middle would refuse to extend back to the previous node
|
|
(all ORs follow this rule). Similarly, it would be inane to do:
|
|
|
|
Bridge → Guard → Middle → OtherMiddle → Bridge
|
|
|
|
because, obviously, that merely shifts the problem to OtherMiddle and
|
|
accomplishes nothing. [XXX Is there something smarter we could do? —IL]
|
|
|
|
5.2.2. Bridge Reachability Testing by the BridgeAuthority
|
|
|
|
After receiving Bob's descriptors, the BridgeAuthority attempts to
|
|
connect to Bob's ORPort by making a direct TLS connection to the
|
|
bridge's advertised ORPort.
|
|
|
|
Should we change this behaviour? One the one hand, at least this
|
|
does not enable any random OR in the entire network to enumerate
|
|
bridges. On the other hand, any adversary who can observe packets
|
|
from the BridgeAuthority is capable of enumeration.
|
|
|
|
6. Other considerations
|
|
|
|
What fraction of our traffic is bridge traffic? Will this alter
|
|
our circuit selection weights?
|