Tor Bandwidth List Format
juga
teor

1. Scope and preliminaries

This document describes the format of Tor's Bandwidth List,
versions 1.0.0, 1.1.0, and later.
It is a new specification for the existing format 1.0.0.
It also describes a new format, 1.1.0, which is backwards compatible
with 1.0.0 parsers.

Since Tor version 0.2.4.12-alpha, the directory authorities use
the Bandwidth List file called "V3BandwidthsFile" generated by
Torflow [1]. The format is described in Torflow's README.spec.txt and
is considered to be version 1.0.0.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119.

1.2. Acknowledgements

The original bandwidth generator (Torflow) and format were
created by mike. Teor suggested writing this specification while
contributing to pastly's new bandwidth generator implementation.

This specification was revised after feedback from:

Nick Mathewson (nickm)
Iain Learmonth (irl)

1.3. Outline

Sections 3.4.1 and 3.4.2 of the Tor directory protocol
(dir-spec.txt [3]) use the term "bandwidth measurements" to refer to
what is called the Bandwidth List here.
A Bandwidth List file contains information on relays' bandwidth
capacities and is produced by bandwidth generators, previously known
as bandwidth scanners.

1.4. Format Versions

1.0.0 - The legacy fallback Bandwidth List format

1.1.0 - Adds KeyValue Lines to the Header List section, adds KeyValues
to RelayLines, and adds format versions.

All Tor versions can consume format version 1.0.0.
All Tor versions can consume format version 1.1.0,
but they warn on additional header Lines.
[TODO: this might be fixed; if it is, state in which version of Tor]

2. Format details

The Bandwidth List MUST contain the following sections:
- Header List (exactly once)
- Relays' Bandwidth List (zero or more times)
If it does not contain these sections, parsers SHOULD ignore the file.

2.1. Definitions

The following nonterminals are defined in the Tor directory protocol,
sections 1.2., 2.1.1., and 2.1.3.:

Int
SP (space)
NL (newline)
Keyword
ArgumentChar
nickname
hexdigest (a '$', followed by 40 hexadecimal characters
([A-Fa-f0-9]))

The following nonterminal is defined in section 2 of
version-spec.txt [4]:

version_number

We define the following nonterminals:

Line ::= ArgumentChar* NL
RelayLine ::= KeyValue (SP KeyValue)* NL
KeyValue ::= Keyword "=" Value
Value ::= ArgumentCharValue+
ArgumentCharValue ::= any printing ASCII character except NL and SP.
Terminator ::= "====="
Timestamp ::= Int
Bandwidth ::= Int
MasterKey ::= a base64-encoded Ed25519 public key, with
padding characters omitted.
DateTime ::= "YYYY-MM-DDTHH:MM:SS", as in ISO 8601

Note that key_value and value are defined in the Tor directory protocol
with different formats from KeyValue and Value here.

All Lines in the file MUST be 510 characters or less, to allow for the
trailing newline and NUL characters.
The previous limit was 254 characters in Tor 0.2.6.2-alpha and
earlier.
The parser MAY ignore longer Lines.
[TODO: Change this restriction in 1.1.0 or later]

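As an illustration of the KeyValue grammar and the Line length limit
above, here is a minimal Python sketch. It is not taken from any
implementation. The names parse_key_value, line_is_acceptable and
MAX_LINE_CHARS are hypothetical. The Keyword character set (A-Z, a-z,
0-9 and "-") is how dir-spec.txt is understood here; accepting "_" as
well is an assumption of this sketch, made because this document uses
keys such as "software_version".

```python
import re

# Line length limit from section 2.1, not counting the trailing NL.
MAX_LINE_CHARS = 510

# Keyword "=" Value, where Value is one or more printing ASCII
# characters other than SP. Allowing "_" in the Keyword is an
# assumption of this sketch.
KEYVALUE_RE = re.compile(r"([A-Za-z0-9_-]+)=([\x21-\x7e]+)")


def parse_key_value(token):
    """Return (keyword, value) for one KeyValue token, or None."""
    match = KEYVALUE_RE.fullmatch(token)
    if match is None:
        return None
    return match.group(1), match.group(2)


def line_is_acceptable(line):
    """Check the length limit; `line` must not include the NL."""
    return len(line) <= MAX_LINE_CHARS
```

Because a Keyword cannot contain "=", a token such as "a=b=c" splits
at the first "=" and yields the Value "b=c".
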
2.2. Header List format

Some header Lines MUST appear in specific positions, as documented
below.
All other Lines can appear in any order.
If a parser does not recognize any extra material in a header Line,
the Line MUST be ignored.
If a header Line does not conform to this format, the Line SHOULD be
ignored by parsers.

The Header List consists of:

Timestamp NL

[At start, exactly once.]

The Unix Epoch time in seconds when the file was created.
It does not follow the KeyValue format, for backwards
compatibility with version 1.0.0.

"version=" version_number NL

[In second position, zero or one time.]

The version of this specification that the document's format follows.
It uses semantic versioning [5].

This Line was added in version 1.1.0 of this specification.

Version 1.0.0 documents do not contain this Line, and the
version_number is considered to be "1.0.0".

"software=" Value NL

[Zero or one time.]

The name of the software that created the document.

This Line was added in version 1.1.0 of this specification.

Version 1.0.0 documents do not contain this Line, and the software
is considered to be "torflow".

"software_version=" Value NL

[Zero or one time.]

The version of the software that created the document.
The version may be a version_number, a git commit, or some other
version scheme.

This Line was added in version 1.1.0 of this specification.

"generator_started=" DateTime NL

[Zero or one time.]

The date and time, in ISO 8601 format and the UTC time zone,
when the generator started.

This Line was added in version 1.1.0 of this specification.

"earliest_bandwidth=" DateTime NL

[Zero or one time.]

The date and time, in ISO 8601 format and the UTC time zone,
when the first relay bandwidth was obtained.

This Line was added in version 1.1.0 of this specification.

KeyValue NL

[Zero or more times.]

There MUST NOT be multiple KeyValue header Lines with the same key.
If there are, the parser SHOULD choose an arbitrary Line.

If a parser does not recognize a Keyword in a KeyValue Line, the Line
MUST be ignored.

Future format versions may include additional KeyValue header Lines.
Additional header Lines will be accompanied by a minor version
increment.

Implementations MAY add additional header Lines as needed. This
specification SHOULD be updated to avoid conflicting meanings for
the same header keys.

Parsers MUST NOT rely on the order of these additional Lines.

Additional header Lines MUST NOT use any keywords specified in the
relay measurements format.
If they do, the parser MAY ignore the conflicting keywords.

Terminator NL

[Zero or one time.]

The Header List section ends with this Terminator.

In version 1.0.0, the Header List ends when the first relay bandwidth
conforming to the next section is found.
Implementations of version 1.1.0 SHOULD include this Line.

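The Header List rules above can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the parser
used by Tor. The names parse_header and looks_like_relay_line are
hypothetical. Detecting the first relay Line of a version 1.0.0
document by looking for a "bw=" pair plus an identity pair is a
heuristic chosen for this sketch. For duplicated keys the sketch keeps
the first Line, which is one of the arbitrary choices the text allows.

```python
TERMINATOR = "====="


def looks_like_relay_line(line):
    """Heuristic (assumption): a RelayLine carries a "bw=" pair and at
    least one relay identity pair."""
    tokens = line.split(" ")
    has_identity = any(
        t.startswith(("node_id=", "master_key_ed25519=")) for t in tokens
    )
    return has_identity and any(t.startswith("bw=") for t in tokens)


def parse_header(lines):
    """Split Lines (NL already stripped) into (timestamp, headers, rest)."""
    if not lines or not (lines[0].isascii() and lines[0].isdigit()):
        raise ValueError("the first Line must be a Timestamp")
    timestamp = int(lines[0])
    headers = {}
    index = 1
    while index < len(lines):
        line = lines[index]
        if line == TERMINATOR:
            index += 1  # consume the Terminator; relay Lines follow
            break
        if looks_like_relay_line(line):
            break  # version 1.0.0 style: no Terminator
        key, sep, value = line.partition("=")
        if sep and key and value and " " not in line:
            # "version" only counts in second position (index 1).
            if key != "version" or index == 1:
                headers.setdefault(key, value)  # duplicate: keep first
        # Lines that do not conform are ignored.
        index += 1
    if "version" not in headers:
        # Defaults for version 1.0.0 documents.
        headers["version"] = "1.0.0"
        headers.setdefault("software", "torflow")
    return timestamp, headers, lines[index:]
```

The third element of the result holds the remaining Lines, which a
caller would hand to a RelayLine parser.
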
2.3. Relays' Bandwidth List format

It consists of zero or more RelayLines containing the relays'
bandwidths, in arbitrary order.

There MUST NOT be multiple KeyValue pairs with the same key in the same
RelayLine.
If there are, the parser SHOULD choose an arbitrary Value.

There MUST NOT be multiple RelayLines per relay identity (node_id or
master_key_ed25519).
If there are, parsers SHOULD issue a warning and MAY choose an arbitrary
RelayLine or ignore all of them.

If a parser does not recognize any extra material in a RelayLine,
the extra material MUST be ignored.

Each RelayLine MUST include the following KeyValue pairs:
In version 1.0.0, node_id MUST NOT be at the end of the Line.
In version 1.1.0, the KeyValue pairs can be in any order.
[TODO: list of Tor versions that support it, when it's done]

"node_id=" hexdigest

[Exactly once.]

The fingerprint for the relay's RSA identity key.

"master_key_ed25519=" MasterKey

[Zero or one time.]

The relay's master Ed25519 key, base64 encoded,
without trailing "="s, to avoid ambiguity with the KeyValue "="
character.

Implementations of version 1.1.0 SHOULD include both node_id and
master_key_ed25519.
Parsers SHOULD accept Lines that contain at least one of them.

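A parser might check the two identity Values along the following
lines. This is a sketch, not part of the format. The function names
are hypothetical, and the 32-byte length check relies on the fact that
an Ed25519 public key is 32 bytes long.

```python
import base64
import binascii
import re

# hexdigest: a '$' followed by 40 hexadecimal characters.
NODE_ID_RE = re.compile(r"\$[A-Fa-f0-9]{40}")


def is_valid_node_id(value):
    """Return True if `value` has the hexdigest form."""
    return NODE_ID_RE.fullmatch(value) is not None


def decode_master_key(value):
    """Return the 32 key bytes, or None if `value` is not a MasterKey."""
    if "=" in value:
        return None  # padding characters must be omitted
    # Restore the padding that the format strips before decoding.
    padded = value + "=" * (-len(value) % 4)
    try:
        key = base64.b64decode(padded, validate=True)
    except (binascii.Error, ValueError):
        return None
    return key if len(key) == 32 else None
```

An unpadded base64 encoding of 32 bytes is 43 characters long, so a
Value of any other length cannot decode to a key of the right size.
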
"bw=" Bandwidth

[Exactly once.]

The measured bandwidth of this relay.

Tor accepts zero bandwidths, but they trigger bugs in older Tor
implementations. Therefore, implementations SHOULD NOT produce zero
bandwidths. Instead, they SHOULD use one as their minimum bandwidth.
If there are zero bandwidths, the parser MAY ignore them.

Multiple measurements can be aggregated using an averaging scheme,
such as a mean, median, or decaying average.

Torflow scales bandwidths to kilobytes per second. Other
implementations SHOULD use kilobytes per second for their initial
bandwidth scaling.

If different implementations or configurations are used in votes for
the same network, their measurements MAY need further scaling. See
Appendix B for information about scaling, and one possible scaling
method.

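For example, a generator could derive the "bw=" Value from several raw
measurements as sketched below. The median is just one of the
averaging schemes mentioned above. The function name, the input unit
(bytes per second) and the use of 1000 bytes per kilobyte are
assumptions of this sketch, not requirements of the format.

```python
import statistics

# Assumption of this sketch: 1 kilobyte = 1000 bytes.
BYTES_PER_KILOBYTE = 1000


def bandwidth_value(measurements_bytes_per_sec):
    """Aggregate raw measurements into a "bw=" Value in kilobytes/s."""
    if not measurements_bytes_per_sec:
        raise ValueError("at least one measurement is needed")
    median = statistics.median(measurements_bytes_per_sec)
    kilobytes = round(median / BYTES_PER_KILOBYTE)
    # Never produce a zero bandwidth: use one as the minimum.
    return max(1, kilobytes)
```
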
KeyValue

[Zero or more times.]

Future format versions may include additional KeyValue pairs on a
RelayLine.
Additional KeyValue pairs will be accompanied by a minor version
increment.

Implementations MAY add additional relay KeyValue pairs as needed.
This specification SHOULD be updated to avoid conflicting meanings
for the same Keywords.

Parsers MUST NOT rely on the order of these additional KeyValue
pairs.

Additional KeyValue pairs MUST NOT use any keywords specified in the
header format.
If they do, the parser MAY ignore the conflicting keywords.

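The RelayLine rules of this section could be applied as in the
following Python sketch. It is illustrative only. The function names
are hypothetical. Keeping the first Value for a duplicated key and the
first RelayLine for a duplicated relay are choices made for this
sketch; the text allows any arbitrary choice. The sketch indexes
relays by node_id when present, so it would not notice a relay listed
once by node_id and once only by master_key_ed25519.

```python
import logging


def parse_relay_line(line):
    """Return a dict of KeyValue pairs, or None if the Line is unusable."""
    pairs = {}
    for token in line.split(" "):
        key, sep, value = token.partition("=")
        if not (sep and key and value):
            continue  # extra material is ignored
        pairs.setdefault(key, value)  # duplicated key: keep the first
    if "node_id" not in pairs and "master_key_ed25519" not in pairs:
        return None  # at least one identity is needed
    bw = pairs.get("bw", "")
    if not (bw.isascii() and bw.isdigit()):
        return None  # "bw=" must appear with an Int Value
    pairs["bw"] = int(bw)
    return pairs


def parse_relay_lines(lines):
    """Map each relay identity to its KeyValue pairs."""
    relays = {}
    for line in lines:
        pairs = parse_relay_line(line)
        if pairs is None:
            continue
        identity = pairs.get("node_id") or pairs["master_key_ed25519"]
        if identity in relays:
            logging.warning("duplicate RelayLine for %s", identity)
            continue  # keep the first RelayLine for this relay
        relays[identity] = pairs
    return relays
```
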
2.4. Implementation notes

This section lists the KeyValue pairs in RelayLines that current
implementations generate.

2.4.1. Simple Bandwidth Scanner

Every RelayLine in sbws version 0.1.0 consists of:

"node_id=" hexdigest SP

As above.

"bw=" Bandwidth SP

As above.

"nick=" nickname SP

[Exactly once.]

The relay nickname.

"rtt=" Int SP

[Exactly once.]

The Round Trip Time in milliseconds to obtain 1 byte of data.

"time=" DateTime NL

[Exactly once.]

The date and time, in ISO 8601 format and the UTC time zone,
when the last bandwidth was obtained.

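A RelayLine with the five KeyValue pairs listed above could be
produced as in this sketch. It is not the sbws code. The function name
and its parameters are hypothetical, and requiring a timezone-aware
datetime is a design choice of the sketch, made so that the conversion
to UTC is unambiguous.

```python
from datetime import datetime, timezone


def format_relay_line(node_id, bw, nick, rtt_ms, measured_at):
    """Build one RelayLine: node_id, bw, nick, rtt and time, then NL."""
    if measured_at.tzinfo is None:
        raise ValueError("measured_at must be timezone-aware")
    # DateTime ::= "YYYY-MM-DDTHH:MM:SS", expressed in UTC.
    time_text = measured_at.astimezone(timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%S"
    )
    pairs = [
        ("node_id", node_id),
        ("bw", int(bw)),
        ("nick", nick),
        ("rtt", int(rtt_ms)),
        ("time", time_text),
    ]
    return " ".join(f"{key}={value}" for key, value in pairs) + "\n"
```
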
2.4.2. Torflow

Torflow RelayLines include node_id and bw, and other KeyValue pairs [2].

References:

1. https://gitweb.torproject.org/torflow.git
2. https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/README.spec.txt#n332
3. https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
4. https://gitweb.torproject.org/torspec.git/tree/version-spec.txt
5. https://semver.org/

A. Sample data

The following samples were not obtained from any real measurement.

A.1. Generated by Torflow

This is an example version 1.0.0 document:

1523911758
node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 bw=760 nick=Test measured_at=1523911725 updated_at=1523911725 pid_error=4.11374090719 pid_error_sum=4.11374090719 pid_bw=57136645 pid_delta=2.12168374577 circ_fail=0.2 scanner=/filepath
node_id=$96C15995F30895689291F455587BD94CA427B6FC bw=189 nick=Test2 measured_at=1523911623 updated_at=1523911623 pid_error=3.96703337994 pid_error_sum=3.96703337994 pid_bw=47422125 pid_delta=2.65469736988 circ_fail=0.0 scanner=/filepath

A.2. Generated by sbws version 0.1.X
[TODO: this needs to be implemented when this spec is finished]

1523911758
version=1.1.0
software=sbws
software_version=0.1.0
generator_started=2018-05-08T16:13:25
earliest_bandwidth=2018-05-08T16:13:26
=====
node_id=$68A483E05A2ABDCA6DA5A3EF8DB5177638A27F80 master_key_ed25519=YaqV4vbvPYKucElk297eVdNArDz9HtIwUoIeo0+cVIpQ bw=760 nick=Test rtt=380 time=2018-05-08T16:13:26
node_id=$96C15995F30895689291F455587BD94CA427B6FC master_key_ed25519=a6a+dZadrQBtfSbmQkP7j2ardCmLnm5NJ4ZzkvDxbo0I bw=189 nick=Test2 rtt=378 time=2018-05-08T16:13:36

B. Scaling bandwidths

B.1. Scaling requirements

Tor accepts zero bandwidths, but they trigger bugs in older Tor
implementations. Therefore, scaling methods SHOULD perform the
following checks:
* If the total bandwidth is zero, all relays should be given equal
  bandwidths.
* If the scaled bandwidth is zero, it should be rounded up to one.

Initial experiments indicate that scaling may not be needed for
torflow and sbws, because their measured bandwidths are similar
enough already.

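The two checks above might look like this in Python. This is a sketch
under assumptions: the function name is hypothetical, and giving every
relay a bandwidth of one when the total is zero is just one way to
give all relays equal bandwidths.

```python
def apply_zero_checks(scaled):
    """Apply both checks to a dict mapping relay identity to bandwidth."""
    if not scaled:
        return {}
    if sum(scaled.values()) == 0:
        # Total is zero: give every relay the same bandwidth.
        # Using one as that bandwidth is an assumption of this sketch.
        return {relay: 1 for relay in scaled}
    # Round any individual zero bandwidth up to one.
    return {relay: max(1, bw) for relay, bw in scaled.items()}
```
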
B.2. A linear scaling method

If scaling is required, here is a simple linear bandwidth scaling
method, which ensures that all bandwidth votes contain approximately
the same total bandwidth:

1. Calculate the relay quota by dividing the total measured bandwidth
   in all votes by the number of relays with measured bandwidth
   votes. In the public tor network, this is approximately 7500 as of
   April 2018. The quota should be a consensus parameter, so it can be
   adjusted for all generators on the network.

2. Calculate a vote quota by multiplying the relay quota by the number
   of relays this bandwidth authority has measured
   bandwidths for.

3. Calculate a scaling factor by dividing the vote quota by the
   total unscaled measured bandwidth in this bandwidth
   authority's upcoming vote.

4. Multiply each unscaled measured bandwidth by the scaling
   factor.

Now, the total scaled bandwidth in the upcoming vote is
approximately equal to the vote quota.

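The four steps above can be sketched in Python as follows. The
function names and the dict-based representation of a vote (relay
identity mapped to measured bandwidth) are assumptions made for
illustration. Rounding to integers and the minimum of one follow the
Bandwidth definition and the zero-bandwidth advice in this document,
and are the reason the scaled total is only approximately equal to the
vote quota.

```python
def compute_relay_quota(votes):
    """Step 1: total measured bandwidth in all votes, divided by the
    number of relays with measured bandwidth votes."""
    total = sum(sum(vote.values()) for vote in votes)
    count = sum(len(vote) for vote in votes)
    return total / count


def scale_vote(measured, relay_quota):
    """Steps 2 to 4 for one bandwidth authority's upcoming vote."""
    total = sum(measured.values())
    if total == 0:
        raise ValueError("cannot scale a vote whose total is zero")
    vote_quota = relay_quota * len(measured)      # step 2
    factor = vote_quota / total                   # step 3
    return {                                      # step 4
        relay: max(1, round(bw * factor))
        for relay, bw in measured.items()
    }
```

For instance, with a relay quota of 1000 and measured bandwidths of
100 and 300, the vote quota is 2000, the scaling factor is 5, and the
scaled bandwidths are 500 and 1500.
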
B.3. Quota changes

If all generators are using scaling, the quota can be gradually
reduced or increased as needed. Smaller quotas decrease the size
of uncompressed consensuses, and may decrease the size of
consensus diffs and compressed consensuses. But if the relay
quota is too small, some relays may be over- or under-weighted.