Merge branch 'netflow_clarification'

2024-11-27 03:40:47 +00:00 · 2022-05-27 14:26:02 -04:00 · 2022-05-27 14:26:02 -04:00 · ffceda4ac2
commit ffceda4ac2
parent e80e874964 1272bd0db5
1 changed files with 57 additions and 30 deletions
--- a/padding-spec.txt
+++ b/padding-spec.txt
@@ -143,6 +143,12 @@ Table of Contents
  user traffic in that time period is multiplexed over a single connection
  (as it is with Tor).

+  Though flow measurement in principle can be bidirectional (counting cells
+  sent in both directions between a pair of IPs) or unidirectional (counting
+  only cells sent from one IP to another), we assume for safety that all
+  measurement is unidirectional, and so traffic must be sent by both parties
+  in order to prevent record splitting.
+
 2.2. Implementation

  Tor clients currently maintain one TLS connection to their Guard node to
@@ -154,35 +160,41 @@ Table of Contents
  connections, and pad them, but otherwise not pad between normal relays.

  Both clients and Guards will maintain a timer for all application (ie:
-  non-directory) TLS connections. Every time a non-padding packet is sent or
-  received by either end, that endpoint will sample a timeout value from
-  between 1.5 seconds and 9.5 seconds using the max(X,X) distribution
-  described in Section 2.3. The time range is subject to consensus
+  non-directory) TLS connections. Every time a padding packet is sent by an
+  endpoint, that endpoint will sample a timeout value from
+  the max(X,X) distribution described in Section 2.3. The default
+  range is from 1.5 seconds to 9.5 seconds, subject to consensus
  parameters as specified in Section 2.6.

-  If the connection becomes active for any reason before this timer
-  expires, the timer is reset to a new random value between 1.5 and 9.5
-  seconds. If the connection remains inactive until the timer expires, a
-  single CELL_PADDING cell will be sent on that connection.
+  (The timing is randomized to avoid making it obvious which cells are
+  padding.)

-  In this way, the connection will only be padded in the event that it is
-  idle, and will always transmit a packet before the minimum 10 second inactive
-  timeout.
+  If another cell is sent for any reason before this timer expires, the timer
+  is reset to a new random value.
+
+  If the connection remains inactive until the timer expires, a
+  single CELL_PADDING cell will be sent on that connection (which will
+  also start a new timer).
+
+  In this way, the connection will only be padded in a given direction in
+  the event that it is idle in that direction, and will always transmit a
+  packet before the minimum 10 second inactive timeout.
+
+  (In practice, an implementation may not be able to determine when,
+  exactly, a cell is sent on a given channel.  For example, even though the
+  cell has been given to the kernel via a call to `send(2)`, the kernel may
+  still be buffering that cell.  In cases such as these, implementations
+  should use a reasonable proxy for the time at which a cell is sent: for
+  example, when the cell is queued.  If this strategy is used,
+  implementations should try to observe the innermost (closest to the wire)
+  queue that they practically can, and if this queue is already nonempty,
+  padding should not be scheduled until after the queue does become empty.)
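
The per-direction timer described in this hunk can be sketched as follows. This is a minimal illustration, not the Tor implementation: the class name `PaddingTimer`, its method names, and the millisecond units are assumptions made for the example, and the sketch assumes the high end of the range is strictly greater than the low end. The 1500 and 9500 defaults mirror the 1.5 to 9.5 second range stated above.

```python
import random


class PaddingTimer:
    """Hypothetical per-direction padding timer; all names are illustrative."""

    def __init__(self, low_ms=1500, high_ms=9500, rng=None):
        self.low_ms = low_ms
        self.high_ms = high_ms
        self.rng = rng or random.Random()
        self.deadline_ms = None  # no timer runs until a cell has been sent

    def sample_timeout_ms(self):
        # max(X, X): the larger of two independent uniform draws from
        # 0..R-1 (R = high - low), offset by the low end of the range.
        r = self.high_ms - self.low_ms
        return self.low_ms + max(self.rng.randrange(r), self.rng.randrange(r))

    def on_cell_sent(self, now_ms):
        # Any cell sent in this direction, padding or not, restarts the timer.
        self.deadline_ms = now_ms + self.sample_timeout_ms()

    def padding_due(self, now_ms):
        # True once this direction has stayed idle until the timer expired.
        return self.deadline_ms is not None and now_ms >= self.deadline_ms
```

A caller would poll `padding_due()`, send one CELL_PADDING cell when it returns true, and then call `on_cell_sent()` for that padding cell, which starts the next timer.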

 2.3. Padding Cell Timeout Distribution Statistics

-  It turns out that because the padding is bidirectional, and because both
-  endpoints are maintaining timers, this creates the situation where the time
-  before sending a padding packet in either direction is actually
-  min(client_timeout, server_timeout).
-
-  If client_timeout and server_timeout are uniformly sampled, then the
-  distribution of min(client_timeout,server_timeout) is no longer uniform, and
-  the resulting average timeout (Exp[min(X,X)]) is much lower than the
-  midpoint of the timeout range.
-
-  To compensate for this, instead of sampling each endpoint timeout uniformly,
-  we instead sample it from max(X,X), where X is uniformly distributed.
+  To limit the amount of padding sent, instead of sampling each endpoint
+  timeout uniformly, we instead sample it from max(X,X), where X is
+  uniformly distributed.

  If X is a random variable uniform from 0..R-1 (where R=high-low), then the
  random variable Y = max(X,X) has Prob(Y == i) = (2.0*i + 1)/(R*R).
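
The probability formula above can be checked numerically. The sketch below uses exact rational arithmetic; the function names `max_xx_pmf` and `max_xx_mean` are illustrative and do not come from the spec.

```python
from fractions import Fraction


def max_xx_pmf(i, r):
    """Prob(max(X, X) == i) for X uniform on 0..r-1, per the formula above."""
    return Fraction(2 * i + 1, r * r)


def max_xx_mean(r):
    """Exact mean of max(X, X): the sum of i * Prob(Y == i) over 0..r-1."""
    return sum(i * max_xx_pmf(i, r) for i in range(r))
```

Summing `max_xx_pmf(i, r)` over `i` in `0..r-1` gives exactly 1, and `float(max_xx_mean(15000))` comes out just under 9999.5, which appears to agree with the 9999.5 figure in the 15000 row of the table in this section.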
@@ -206,9 +218,6 @@ Table of Contents
     15000   7499.5    7995       4999.5           9999.5
     20000   9900.5    10661      6666.2           13332.8

-  In this way, we maintain the property that the midpoint of the timeout range
-  is the expected mean time before a padding packet is sent in either
-  direction.

 2.4. Maximum overhead bounds

@@ -253,6 +262,13 @@ Table of Contents
  CELL_PADDING_NEGOTIATE to instruct the relay not to pad, and then does not
  send any further padding itself.

+  Currently, clients negotiate padding only when a channel is created,
+  immediately after sending their NETINFO cell.  Recipients SHOULD, however,
+  accept padding negotiation messages at any time.
+
+  Clients and bridges MUST reject padding negotiation messages from relays,
+  and close the channel if they receive one.
+
 2.6. Consensus Parameters Governing Behavior

  Connection-level padding is controlled by the following consensus parameters:
@@ -277,11 +293,22 @@ Table of Contents
      - Default: 14000

    * nf_conntimeout_clients
-      - The number of seconds to keep circuits opened and available for
-        clients to use. Note that the actual client timeout is randomized
-        uniformly from this value to twice this value. This governs client
-        OR conn lifespan. Reduced padding clients use half the consensus
+      - The number of seconds to keep never-used circuits opened and
+        available for clients to use. Note that the actual client timeout is
+        randomized uniformly from this value to twice this value.
+      - The number of seconds to keep idle (not currently used) canonical
+        channels open and available. (We do this to ensure a sufficient
+        time duration of padding, which is the ultimate goal.)
+      - This value is also used to determine how long, after a port has been
+        used, we should attempt to keep building predicted circuits for that
+        port. (See path-spec.txt section 2.1.1.)  This behavior was
+        originally added to work around implementation limitations, but it
+        serves as a reasonable default regardless of implementation.
+      - For all use cases, reduced padding clients use half the consensus
        value.
+      - Implementations MAY mark circuits held open past the reduced padding
+        quantity (half the consensus value) as "not to be used for streams",
+        to prevent their use from becoming a distinguisher.
      - Default: 1800

    * nf_pad_before_usage