mirror of
https://github.com/torproject/torspec.git
synced 2024-12-13 21:48:45 +00:00
Updated to remove dropping of failing guards and just focus
on the specifics of recording, storing, and learning circuitbuildtimeout parameters. svn:r16511
parent dfbeee69a6
commit 95969867fc
@@ -10,8 +10,8 @@ Overview

The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and avoiding failing guard nodes. This proposal
describes a method of tracking buildtime statistics, and using those
statistics to adjust the CircuitBuildTimeout and the number of guards.
describes a method of tracking buildtime statistics at the client, and
using those statistics to adjust the CircuitBuildTimeout.

Motivation

@@ -22,71 +22,91 @@ Motivation

Implementation

Storing Build Times

Circuit build times will be stored in the circular array
'circuit_build_times' consisting of uint16_t elements as milliseconds.
The total size of this array will be based on the number of circuits
it takes to converge on a good fit of the long term distribution of
the circuit builds for a fixed link. We do not want this value to be
too large, because it will make it difficult for clients to adapt to
moving between different links.

From our initial observations, this value appears to be on the order
of 1000, but will be configurable in a #define NCIRCUITS_TO_OBSERVE.
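
The circular array described above can be sketched as follows. This is
a minimal illustration, not the proposal's implementation: only
circuit_build_times and NCIRCUITS_TO_OBSERVE come from the text, while
the build_times_t struct, its index fields, build_times_add(), and the
clamping of values above UINT16_MAX are assumptions made here.

```c
#include <stddef.h>
#include <stdint.h>

/* "On the order of 1000" per the text. */
#define NCIRCUITS_TO_OBSERVE 1000

/* Hypothetical container; only the array name comes from the text. */
typedef struct {
  uint16_t circuit_build_times[NCIRCUITS_TO_OBSERVE]; /* milliseconds */
  size_t next_idx; /* slot the next sample will overwrite */
  size_t total;    /* number of valid samples, capped at the array size */
} build_times_t;

/* Record one build time, overwriting the oldest sample once the array
 * is full. Times that do not fit in 16 bits are clamped (assumption). */
static void
build_times_add(build_times_t *bt, uint32_t time_ms)
{
  if (time_ms > UINT16_MAX)
    time_ms = UINT16_MAX;
  bt->circuit_build_times[bt->next_idx] = (uint16_t)time_ms;
  bt->next_idx = (bt->next_idx + 1) % NCIRCUITS_TO_OBSERVE;
  if (bt->total < NCIRCUITS_TO_OBSERVE)
    bt->total++;
}
```

Because old samples are overwritten in place, a client that moves to a
different link gradually forgets the old link's build times, which is
the adaptation property the text asks for.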

Long Term Storage

The long-term storage representation will be implemented by storing a
histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
writing out the statistics to disk. The format of this histogram on disk
is yet to be finalized, but it will likely be of the format
'CircuitBuildTimeBin <bin> <count>'.
Example:

CircuitBuildTimeBin 1 100
CircuitBuildTimeBin 2 50
...

Reading the histogram in will entail multiplying each bin by the
BUILDTIME_BIN_WIDTH and then inserting <count> values into the
circuit_build_times array each with the value of
<bin>*BUILDTIME_BIN_WIDTH.
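
The write and read steps above can be sketched as a pair of helpers.
This is a hedged illustration only: the function names, the choice to
ignore samples beyond the last bin, and the capacity check are
assumptions of this sketch, and parsing of the on-disk lines is left
out because the format is not yet final.

```c
#include <stddef.h>
#include <stdint.h>

#define BUILDTIME_BIN_WIDTH 50 /* milliseconds per bin; default from the text */

/* Count how many samples fall into each bin. Bin b covers the range
 * [b*BUILDTIME_BIN_WIDTH, (b+1)*BUILDTIME_BIN_WIDTH) milliseconds.
 * Samples whose bin index is >= nbins are ignored in this sketch. */
static void
build_times_to_histogram(const uint16_t *times, size_t n,
                         uint32_t *bins, size_t nbins)
{
  for (size_t b = 0; b < nbins; b++)
    bins[b] = 0;
  for (size_t i = 0; i < n; i++) {
    size_t b = times[i] / BUILDTIME_BIN_WIDTH;
    if (b < nbins)
      bins[b]++;
  }
}

/* Expand one "<bin> <count>" entry: append <count> copies of
 * <bin>*BUILDTIME_BIN_WIDTH to times[], never exceeding cap elements.
 * Returns the new number of elements in times[]. */
static size_t
histogram_entry_expand(uint16_t *times, size_t n, size_t cap,
                       uint32_t bin, uint32_t count)
{
  uint64_t value = (uint64_t)bin * BUILDTIME_BIN_WIDTH;
  if (value > UINT16_MAX)
    value = UINT16_MAX;
  for (uint32_t i = 0; i < count && n < cap; i++)
    times[n++] = (uint16_t)value;
  return n;
}
```

Note that the round trip is lossy: a 120 ms sample lands in bin 2 and is
read back as 100 ms, so the restored array only approximates the
original to within one bin width.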

Learning the CircuitBuildTimeout

Based on studies of build times, we found that the distribution of
circuit buildtimes appears to be a Pareto distribution. The number
of circuits to observe (ncircuits_to_cutoff) before changing the
CircuitBuildTimeout will be tunable. From our measurements,
ncircuits_to_cutoff appears to be on the order of 100.

In addition, the total number of circuits gathered
(ncircuits_to_observe) will also be tunable. It is likely that
ncircuits_to_observe will be somewhere on the order of 1000. The values
can be represented compactly in Tor in milliseconds as a circular array
of 16 bit integers. More compact long-term storage representations can
be implemented by simply storing a histogram with 50 millisecond buckets
when writing out the statistics to disk.
circuit buildtimes appears to be a Pareto distribution.

Calculating the preferred CircuitBuildTimeout
We will calculate the parameters for a Pareto distribution
fitting the data using the estimators at
http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.

Circuits that have longer buildtimes than some x% of the estimated
CDF of the Pareto distribution will be excluded. x will be tunable
as well.
The timeout itself will be calculated by solving the CDF for the
percentile cutoff BUILDTIME_PERCENT_CUTOFF. This value
represents the percentage of paths the Tor client will accept out of
the total number of paths. We have not yet determined a good
cutoff for this mathematically, but 85% seems a good choice for now.

Circuit timeouts
From http://en.wikipedia.org/wiki/Pareto_distribution#Definition,
the calculation we need is pow(BUILDTIME_PERCENT_CUTOFF/100.0, k)/Xm.

In the event of a timeout, backoff values should include the 100-x%
of expected CDF of timeouts. Also, in the event of network failure,
the observation mechanism should stop collecting timeout data.
When to Begin Calculation

Dropping Failed Guards
The number of circuits to observe (NCIRCUITS_TO_CUTOFF) before
changing the CircuitBuildTimeout will be tunable via a #define. From
our measurements, a good value for NCIRCUITS_TO_CUTOFF appears to be
on the order of 100.

In addition, we have noticed that some entry guards are much more
failure prone than others. In particular, the circuit failure rates for
the fastest entry guards were approximately 20-25%, whereas slower
guards exhibit failure rates as high as 45-50%. In [1], it was
demonstrated that failing guard nodes can deliberately bias path
selection to improve their success at capturing traffic. For both these
reasons, failing guards should be avoided.
Dealing with Timeouts

Timeouts should be counted as the expectation of the region
of the Pareto distribution beyond the cutoff. The proposal will
be updated with this value soon.
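
The text leaves this value open. One candidate, offered here purely as
an assumption, is the conditional mean of a Pareto variable given that
it exceeds the cutoff c. The tail of a Pareto distribution beyond c is
again Pareto with scale c and the same shape k, so for k > 1 that mean
is k*c/(k-1); for k <= 1 it is infinite. The function name
pareto_tail_mean_ms() is hypothetical.

```c
#include <math.h>

/* Expected build time (ms) of a circuit, given that it exceeded
 * cutoff_ms, for a Pareto distribution with shape k. The mean of the
 * tail does not exist for k <= 1, so INFINITY is returned there. */
static double
pareto_tail_mean_ms(double k, double cutoff_ms)
{
  if (k <= 1.0)
    return INFINITY;
  return k * cutoff_ms / (k - 1.0);
}
```

For example, with k = 2 and a 500 ms cutoff, each timed-out circuit
would be counted as 1000 ms. Since the result can be infinite or exceed
65535, an implementation storing it in the uint16_t array would need to
clamp it.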

Also, in the event of network failure, the observation mechanism
should stop collecting timeout data.

Circuits that time out will be destroyed, as this indicates one
or more of their respective nodes are currently overloaded.

Client Hints

Some research still needs to be done to provide initial values
for CircuitBuildTimeout based on values learned from modem
users, DSL users, Cable Modem users, and dedicated links. A
radio button in Vidalia should eventually be provided that
sets CircuitBuildTimeout to one of these values and also
provides the option of purging all learned data, should any exist.

These values can either be published in the directory, or
shipped hardcoded for a particular Tor version.

We propose increasing the number of entry guards to five, and gathering
circuit failure statistics on each entry guard. Any guards that exceed
the average failure rate of all guards by 10% after we have
gathered ncircuits_to_observe circuits will be replaced.


Issues

Impact on anonymity

Since this follows a Pareto distribution, large reductions on the
timeout can be achieved without cutting off a great number of the
total paths. However, hard statistics on which cutoff percentage
gives optimal performance have not yet been gathered.

Guard Turnover

We contend that the risk from failing guards biasing path selection
outweighs the risk of exposure to larger portions of the network
for the first hop. Furthermore, from our observations, it appears
that circuit failure is strongly correlated to node load. Allowing
clients to migrate away from failing guards should naturally
rebalance the network, and eventually clients should converge on
a stable set of reliable guards. It is also likely that once clients
begin to migrate away from failing guards, their load should go
down, causing their failure rates to drop as well.


[1] http://www.crhc.uiuc.edu/~nikita/papers/relmix-ccs07.pdf

total paths. This will eliminate a great deal of the performance
variation of Tor usage.