- Bridge network statuses contain a "published" line containing the
publication timestamp, so that parsers don't have to learn that
timestamp from the file name anymore.
- Bridge network status entries are ordered by hex-encoded
fingerprint, not by base64-encoded fingerprint, which is mostly a
cosmetic change.
- Server descriptors and extra-info descriptors are stored under the
SHA1 hashes of the descriptor identifiers of their non-scrubbed
forms. Previously, descriptors were (supposed to be; see #5607)
stored under the digests of their scrubbed forms. The reason for
hashing digests is to prevent anyone from using a non-scrubbed
descriptor digest to look up the original descriptor at the bridge
authority. With this change, we no longer have to repair references
between statuses, server descriptors, and extra-info descriptors,
which turned out to be error-prone (#5608). Server descriptors and
extra-info descriptors contain a new "router-digest" line with the
hex-formatted descriptor identifier. These lines are necessary
because the identifier can no longer be calculated from the
sanitized descriptor and because we don't want to rely on the file
name.
- Stop sanitizing bridge nicknames (#5684).
- Stop sanitizing *-stats lines (#5807).
- All sanitized bridge descriptors contain @type annotations (#5651).
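
The identifier scheme described in the list above can be sketched roughly as follows. This is only an illustration, not the metrics-db code: the function names are hypothetical, and it assumes that the SHA1 is taken over the raw 20-byte digest (not its hex string) and that the "router-digest" line uses upper-case hex.

```python
import hashlib


def sanitized_identifier(descriptor_digest_hex: str) -> str:
    """Return the hex SHA1 of a non-scrubbed descriptor digest.

    Assumption: the hash covers the raw digest bytes, not the hex text.
    """
    raw_digest = bytes.fromhex(descriptor_digest_hex)
    return hashlib.sha1(raw_digest).hexdigest()


def router_digest_line(descriptor_digest_hex: str) -> str:
    """Build a "router-digest" line (upper-case hex is an assumption)."""
    return "router-digest " + sanitized_identifier(descriptor_digest_hex).upper()
```

Because SHA1 is one-way, the published identifier cannot be turned back into the digest that the bridge authority knows, which is the property the change relies on.
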
In #5805 we found that we have quite a few files that are either empty or
truncated. Turns out that the current metrics-db code doesn't allow
writing empty files, but it does allow writing truncated files. We now
parse all descriptors with metrics-lib and only store valid descriptors to
disk. Fixes part of #5813.
See #5124 for plans to get rid of 'opt' strings in descriptors. Looks
like the bridge descriptor sanitizer was the only code relying on them to
be there. This is fixed now.
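
As a rough illustration of what not relying on 'opt' can look like, a parser may simply drop the optional prefix before looking at the keyword. The helper below is a hypothetical sketch, not the sanitizer's actual code.

```python
def split_keyword(line: str) -> tuple:
    """Split a descriptor line into (keyword, rest).

    A leading "opt " prefix is ignored, so lines parse the same way
    whether or not the prefix is present.
    """
    if line.startswith("opt "):
        line = line[len("opt "):]
    keyword, _, rest = line.partition(" ")
    return keyword, rest
```

With this, `opt fingerprint AAAA BBBB` and `fingerprint AAAA BBBB` yield the same keyword and arguments.
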
In theory, overwriting files with the same content doesn't hurt. But it
makes it harder to identify only the files changed in the past 3 days for
the rsync/ directory.
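
One way to avoid touching unchanged files is to compare the existing bytes before writing. This is a minimal sketch under that assumption; `write_if_changed` is a hypothetical helper and not necessarily how metrics-db does it.

```python
from pathlib import Path


def write_if_changed(path: Path, content: bytes) -> bool:
    """Write content to path unless the file already holds these bytes.

    Returns True if the file was written, False if it was left alone,
    in which case its modification time stays untouched.
    """
    if path.is_file() and path.read_bytes() == content:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return True
```

Leaving the modification time alone is what keeps a "changed in the past 3 days" selection meaningful.
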
Make sure that there's always a bridge-stats-end line preceding the
bridge-ips line.
We should add more such checks in the future. This is probably something
to implement in metrics-lib once it's more stable.
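
A check of this kind could look roughly like the sketch below. It reads "preceding" as "appearing somewhere earlier in the same descriptor", which is an assumption, and the function name is made up for illustration.

```python
def bridge_ips_has_stats_end(descriptor: str) -> bool:
    """Return False if a bridge-ips line appears before any
    bridge-stats-end line in the given extra-info descriptor."""
    seen_stats_end = False
    for line in descriptor.splitlines():
        keyword = line.split(" ", 1)[0]
        if keyword == "bridge-stats-end":
            seen_stats_end = True
        elif keyword == "bridge-ips" and not seen_stats_end:
            return False
    return True
```

Descriptors without any bridge-ips line pass the check, since there is nothing to precede.
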
In the past, we sometimes had problems rsync'ing the v3-status-votes file
from gabelmoo and were left with a truncated file. When parsing this file,
we successfully parsed the complete votes and failed to parse the last,
truncated vote. However, we still marked this vote as downloaded and made no
further attempts to download it from the directory authorities.
The fix is to only mark a vote as downloaded if it contains a valid
directory footer.
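
The idea can be sketched as follows. This is a simplified illustration, not the actual fix: it assumes each vote starts with a "network-status-version" line and treats a footer as valid if a "directory-footer" line is followed by at least one "directory-signature" line and the vote ends with "-----END SIGNATURE-----". All function names are hypothetical.

```python
def split_votes(data: str) -> list:
    """Split concatenated votes at each "network-status-version" line."""
    chunks, current = [], []
    for line in data.splitlines():
        if line.startswith("network-status-version ") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


def has_complete_footer(vote: str) -> bool:
    """Simplified footer check; see the assumptions stated above."""
    lines = [line for line in vote.splitlines() if line.strip()]
    if "directory-footer" not in lines:
        return False
    footer = lines[lines.index("directory-footer"):]
    has_signature = any(l.startswith("directory-signature ") for l in footer)
    return has_signature and footer[-1] == "-----END SIGNATURE-----"


def votes_to_mark_downloaded(data: str) -> list:
    """Return only the votes that are safe to mark as downloaded."""
    return [vote for vote in split_votes(data) if has_complete_footer(vote)]
```

A vote cut off anywhere before its final signature fails the check, so it stays unmarked and can be fetched again from the directory authorities.
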