mirror of
https://github.com/torproject/torspec.git
synced 2025-01-05 23:38:23 +00:00
239 lines
11 KiB
Plaintext
239 lines
11 KiB
Plaintext
Filename: 210-faster-headless-consensus-bootstrap.txt
|
||
Title: Faster Headless Consensus Bootstrapping
|
||
Author: Mike Perry, Tim Wilson-Brown, Peter Palfrader
|
||
Created: 01-10-2012
|
||
Last Modified: 02-10-2015
|
||
Status: Superseded
|
||
Target: 0.2.8.x+
|
||
|
||
Status-notes:
|
||
|
||
* This has been partially superseded by the fallback directory code,
|
||
and partially by the exponential-backoff code.
|
||
|
||
Overview and Motiviation
|
||
|
||
This proposal describes a way for clients to fetch the initial
|
||
consensus more quickly in situations where some or all of the directory
|
||
authorities are unreachable. This proposal is meant to describe a
|
||
solution for bug #4483.
|
||
|
||
Design: Bootstrap Process Changes
|
||
|
||
The core idea is to attempt to establish bootstrap connections in
|
||
parallel during the bootstrap process, and download the consensus from
|
||
the first connection that completes.
|
||
|
||
Connection attempts will be performed on an exponential backoff basis.
|
||
Initially, connections will be performed to a randomly chosen hard
|
||
coded directory mirror and a randomly chosen canonical directory
|
||
authority. If neither of these connections complete, additional mirror
|
||
and authority connections are tried. Mirror connections are tried at
|
||
a faster rate than authority connections.
|
||
|
||
Clients represent the majority of the load on the network. They can use
|
||
directory mirrors to download their documents, as the mirrors download
|
||
their documents from the authorities early in the consensus validity
|
||
period.
|
||
|
||
We specify that client mirror connections retry after one second, and
|
||
then double the retry time with every connection attempt:
|
||
0, 1, 2, 4, 8, 16, 32, ...
|
||
(The timers currently implemented in Tor increment with every
|
||
connection failure.)
|
||
|
||
We specify that client directory authority connections retry after
|
||
10 seconds, and then double the retry time with every connection:
|
||
0, 10, 20, ...
|
||
|
||
If a client has both an IPv4 and IPv6 address, it will try IPv4 and
|
||
IPv6 mirrors and authorities on the following schedule:
|
||
IPv4, IPv6, IPv4, IPv6, ...
|
||
|
||
[ TODO: should we add random noise to these scheduled times? - teor
|
||
Tor doesn’t add random noise to the current failure-based
|
||
timers, but as failures are a network event, they are
|
||
somewhat random/arbitrary already. These attempt-based timers
|
||
will go off every few seconds, exactly erraon the second. ]
|
||
|
||
(Relays can’t use directory mirrors to download their documents,
|
||
as they *are* the directory mirrors.)
|
||
|
||
The maximum retry time for all these timers is 3 days + 1 hour. This
|
||
places a small load on the mirrors and authorities, while allowing a
|
||
client that regains a network connection to eventually download a
|
||
consensus.
|
||
|
||
We try IPv4 first to avoid overloading IPv6-enabled authorities and
|
||
mirrors. Each timing schedule uses a separate IPv4/IPv6 schedule.
|
||
This ensures that clients try an IPv6 authority within the first
|
||
10 seconds. This helps implement #8374 and related tickets.
|
||
|
||
We don't want to keep on trying an IP version that always fails.
|
||
Therefore, once sufficient IPv4 and IPv6 connections have been
|
||
attempted, we select an IP version for new connections based on the ratio
|
||
of their failure rates, up to a maximum of 1:5. This may not make a
|
||
substantial difference to consensus downloads, as we only need one
|
||
successful consensus download to bootstrap. However, it is important for
|
||
future features like #17217, where clients try to automatically determine
|
||
if they can use IPv4 or IPv6 to contact the Tor network.
|
||
|
||
The retry timers and IP version schedules must reset on HUP and any
|
||
network reachability events, so that clients that have unreliable networks
|
||
can recover from network failures.
|
||
[ TODO: Do we do this for any other timers?
|
||
I think this needs another proposal, it’s out of scope here.
|
||
- teor ]
|
||
|
||
The first connection to complete will be used to download the consensus
|
||
document and the others will be closed, after which bootstrapping will
|
||
proceed as normal.
|
||
|
||
We expect the vast majority of clients to succeed within 4 seconds,
|
||
after making up to 4 connection attempts to mirrors and 1 connection
|
||
attempt to an authority. Clients which can't connect in the first
|
||
10 seconds, will try 1 more mirror, then try to contact another
|
||
directory authority. We expect almost all clients to succeed within
|
||
10 seconds. This is a much better success rate than the current Tor
|
||
implementation, which fails k/n of clients if k of the n directory
|
||
authorities are down. (Or, if the connection fails in certain ways,
|
||
it will retry once, failing 1-(1-(k/n)^2).)
|
||
|
||
If at any time, the total outstanding bootstrap connection attempts
|
||
exceeds 10, no new connection attempts are to be launched until an
|
||
existing connection attempt experiences full timeout. The retry time
|
||
is not doubled when a connection is skipped.
|
||
|
||
A benefit of connecting to directory authorities is that clients are
|
||
warned if their clock is wrong. Starting the authority and fallback
|
||
schedules at the same time should ensure that some clients check their
|
||
clock with an authority at each bootstrap.
|
||
|
||
Design: Fallback Dir Mirror Selection
|
||
|
||
The set of hard coded directory mirrors from #572 shall be chosen using
|
||
the 100 Guard nodes with the longest uptime.
|
||
|
||
The fallback weights will be set using each mirror's fraction of
|
||
consensus bandwidth out of the total of all 100 mirrors, adjusted to
|
||
ensure no fallback directory sees more than 10% of clients. We will
|
||
also exclude fallback directories that are less than 1/1000 of the
|
||
consensus weight, as they are not large enough to make it worthwhile
|
||
including them.
|
||
|
||
This list of fallback dir mirrors should be updated with every
|
||
major Tor release. In future releases, the number of dir mirrors
|
||
should be set at 20% of the current Guard nodes (approximately 200 as
|
||
of October 2015), rather than fixed at 100.
|
||
|
||
[TODO: change the script to dynamically calculate an upper limit.]
|
||
|
||
Performance: Additional Load with Current Parameter Choices
|
||
|
||
This design and the connection count parameters were chosen such that
|
||
no additional bandwidth load would be placed on the directory
|
||
authorities. In fact, the directory authorities should experience less
|
||
load, because they will not need to serve the entire consensus document
|
||
for a connection in the event that one of the directory mirrors complete
|
||
their connection before the directory authority does.
|
||
|
||
However, the scheme does place additional TLS connection load on the
|
||
fallback dir mirrors. Because bootstrapping is rare, and all but one of
|
||
the TLS connections will be very short-lived and unused, this should not
|
||
be a substantial issue.
|
||
|
||
The dangerous case is in the event of a prolonged consensus failure
|
||
that induces all clients to enter into the bootstrap process. In this
|
||
case, the number of TLS connections to the fallback dir mirrors within
|
||
the first second would be 2*C/100, or 40,000 for C=2,000,000 users. If
|
||
no connections complete before the 10 retries, 7 of which go to
|
||
mirrors, this could reach as high as 140,000 connection attempts, but
|
||
this is extremely unlikely to happen in full aggregate.
|
||
|
||
However, in the no-consensus scenario today, the directory authorities
|
||
would already experience 2*C/9 or 444,444 connection attempts. (Tor
|
||
currently tries 2 authorities, before delaying the next attempt.) The
|
||
10-retry scheme, 3 of which go to authorities, increases their total
|
||
maximum load to about 666,666 connection attempts, but again this is
|
||
unlikely to be reached in aggregate. Additionally, with this scheme,
|
||
even if the dirauths are taken down by this load, the dir mirrors
|
||
should be able to survive it.
|
||
|
||
Implementation Notes: Code Modifications
|
||
|
||
The implementation of the bootstrap process is unfortunately mixed
|
||
in with many types of directory activity.
|
||
|
||
The process starts in update_consensus_networkstatus_downloads(),
|
||
which initiates a single directory connection through
|
||
directory_get_from_dirserver(). Depending on bootstrap state,
|
||
a single directory server is selected and a connection is
|
||
eventually made through directory_initiate_command_rend().
|
||
|
||
There appear to be a few options for altering this code to retry multiple
|
||
simultaneous connections. It looks like we can modify
|
||
update_consensus_networkstatus_downloads() to make connections more often
|
||
if the purpose is DIR_PURPOSE_FETCH_CONSENSUS and there is no valid
|
||
(reasonably live) consensus. We can make multiple connections from
|
||
update_consensus_networkstatus_downloads(), as the sockets are non-blocking.
|
||
(This socket appears to be non-blocking on Unixes (SOCK_NONBLOCK & O_NONBLOCK)
|
||
and Windows (FIONBIO).) As long as we can tolerate a timer resolution of
|
||
~1 second (due to the use of second_elapsed_callback and time_t), this
|
||
requires no additional timers or callbacks. We can make 1 connection for each
|
||
schedule per second, for a maximum of 2 per second.
|
||
|
||
The schedules can be specified in:
|
||
TestingClientBootstrapConsensusAuthorityDownloadSchedule
|
||
TestingClientBootstrapConsensusFallbackDownloadSchedule
|
||
(Similar to the existing TestingClientConsensusDownloadSchedule.)
|
||
|
||
TestingServerIPVersionPreferenceSchedule
|
||
(Consisting of a CSV like “4,6,4,6”, or perhaps “0,1,0,1”.)
|
||
|
||
update_consensus_networkstatus_downloads() checks the list of pending
|
||
connections and, if it is 10 or greater, skip the connection attempt,
|
||
and leave the retry time constant.
|
||
|
||
The code in directory_send_command() and connection_finished_connecting()
|
||
would need to be altered to check that we are not already downloading the
|
||
consensus. If we’re not, then download the consensus on this connection, and
|
||
close any other pending consensus dircons.
|
||
|
||
We might also need to make similar changes in authority_certs_fetch_missing(),
|
||
as we can’t use a consensus until we have enough authority certificates.
|
||
However, Tor already makes multiple requests (one per certificate), and only
|
||
needs a majority of certificates to validate a consensus. Therefore, we will
|
||
only need to modify authority_certs_fetch_missing() if clients download a
|
||
consensus, then end up getting stuck downloading certificates. (Current tests
|
||
show bootstrapping working well without any changes to authority certificate
|
||
fetches.)
|
||
|
||
Reliability Analysis
|
||
|
||
We make the pessimistic assumptions that 50% of connections to directory
|
||
mirrors fail, and that 20% of connections to authorities fail. (Actual
|
||
figures depend on relay churn, age of the fallback list, and authority
|
||
uptime.)
|
||
|
||
We expect the first 10 connection retry times to be:
|
||
(Research shows users tend to lose interest after 40 seconds.)
|
||
Mirror: 0s 1s 2s 4s 8s 16s 32s
|
||
Auth: 0s 10s 20s
|
||
Success: 90% 95% 97% 98.7% 99.4% 99.89% 99.94% 99.988% 99.994%
|
||
|
||
97% of clients succeed in the first 2 seconds.
|
||
99.4% of clients succeed without trying a second authority.
|
||
99.89% of clients succeed in the first 10 seconds.
|
||
0.11% of clients remain, but in this scenario, 2 authorities are
|
||
unreachable, so the client is most likely blocked from the Tor
|
||
network. Alternately, they will likely succeed on relaunch.
|
||
|
||
The current implementation makes 1 or 2 authority connections within the
|
||
first second, depending on exactly how the first connection fails. Under
|
||
the 20% authority failure assumption, these clients would have a success
|
||
rate of either 80% or 96% within a few seconds. The scheme above has a
|
||
greater success rate in the first few seconds, while spreading the load
|
||
among a larger number of directory mirrors. In addition, if all the
|
||
authorities are blocked, current clients will inevitably fail, as they
|
||
do not have a list of directory mirrors.
|