We did download bastet's votes in the past after reading the
consensus. But there could have been situations when there was no
consensus that we did not explicitly ask for bastet's vote.
Found per chance while setting up a CollecTor instance with a webstats
module.
Clean log lines immediately when they are read and also make use of sanitized
log's high redundancy immediately, i.e., continue with maps of
<LocalDate, <Map<String, Long>>.
Rename method(s) to reflect what they do.
Adapt to latest changes of metrics-lib (task-25329) and make use of the high
redundancy of logs (e.g. a 3G file might only contain 350 different lines).
This avoids OOM and array out of bounds exceptions for large files (>2G) and
gives a speed-up of roughly 50%. (The earlier 66min are down to 34min for
meronense&weschniakowii files plus two larger files.)
There is a BATCH constant, which could be tuned for processing speed. It is
logged for each webstats module run. Currently, it is set to 100k. This
was more or less arbitrarily chosen and used for all the tests. A test run
using 500k didn't show significant differences.
To avoid possible inconsistencies DescriptorBuilder is finalized after the first
call to 'toString' and cannot be altered anymore. Any attempt to add more leads
to an IllegalStateException.
The class doesn't 'know' about descriptor sanitization, it is only a sort of
container for writing descriptors. It could be actually moved to some util
package and used in other parsing steps, too.
Also rename test helper classes to avoid naming conflicts.
Remove 'descriptor' from variable names.
Make DescriptorBuilder public.
Adapt other classes as well as tests.
Typically, the "published" line appears before the "fingerprint" line.
However, an alternative Tor implementation orders these two lines
differently, which is valid due to the spec. We need to handle this
case by accepting lines in either order.
Fixes#23981.
This includes adding property 'OnionPerfSources' and renaming
some markers properly. In addition, all camel-case occurrences
of 'OnionPerf' have a capitalized 'P' now.
Part of task-21759.