Commit Graph

949 Commits

Author SHA1 Message Date
Karsten Loesing
8c1bdd9a84 Avoid calling bytesFor() unnecessarily.
Found while looking into #25522.
2018-03-17 10:45:55 +01:00
iwakeh
2c00f28ab7 Add lines according to their count.
Making the test from the previous commit pass and fixing task-25522.
2018-03-07 13:46:55 +00:00
iwakeh
2e6fa506b3 Add a failing test.
Making a static method more easily accessible for tests.
2018-03-07 13:46:53 +00:00
Karsten Loesing
727d4e54ef Add bastet to directory authorities to download votes for.
We did download bastet's votes in the past after reading the
consensus. But there could have been situations without a consensus
in which we did not explicitly ask for bastet's vote.

Found by chance while setting up a CollecTor instance with a webstats
module.
2018-03-07 14:46:50 +01:00
iwakeh
190d90a800 Describe file protocol for Tor web server logs.
Part of task-20234.
2018-02-26 15:24:50 +00:00
Karsten Loesing
ef1dfb6d32 Bump version to 1.5.0-dev. 2018-02-26 16:24:49 +01:00
Karsten Loesing
ddfa7bad24 Prepare for 1.5.0 release. 2018-02-26 15:06:51 +01:00
iwakeh
d05b4e4aee Circumvent Collection (integer) size limit.
Clean log lines immediately when they are read and also make use of sanitized
log's high redundancy immediately, i.e., continue with maps of
<LocalDate, Map<String, Long>>.

Rename method(s) to reflect what they do.
2018-02-26 14:16:07 +01:00
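
The counting idea described in the commit above can be illustrated with a minimal sketch. It assumes each cleaned log line arrives together with its date; the class and method names (LineCountSketch, add, totalLines) are hypothetical and not taken from the CollecTor sources. Identical lines are stored once with a Long counter, so no single collection has to hold one element per line read.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not taken from the CollecTor sources. */
final class LineCountSketch {

  /** Records one occurrence of a cleaned line under its date. */
  static void add(Map<LocalDate, Map<String, Long>> counts,
      LocalDate date, String line) {
    counts.computeIfAbsent(date, d -> new TreeMap<>())
        .merge(line, 1L, Long::sum);
  }

  /** Sums all counters, i.e., the number of lines originally read. */
  static long totalLines(Map<LocalDate, Map<String, Long>> counts) {
    long total = 0L;
    for (Map<String, Long> perDate : counts.values()) {
      for (long count : perDate.values()) {
        total += count;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    Map<LocalDate, Map<String, Long>> counts = new TreeMap<>();
    LocalDate day = LocalDate.of(2018, 2, 26);
    add(counts, day, "GET /a");
    add(counts, day, "GET /a");
    add(counts, day, "GET /b");
    System.out.println(counts.get(day).size() + " distinct of "
        + totalLines(counts) + " lines");
  }
}
```

Because the counters are Long values, the total can exceed the int limit that applies to the size of a single Java collection, while the maps themselves stay small for highly redundant input.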
iwakeh
8557bf6255 Reduce memory footprint and wall time.
Adapt to latest changes of metrics-lib (task-25329) and make use of the high
redundancy of logs (e.g. a 3G file might only contain 350 different lines).
This avoids OOM and array out of bounds exceptions for large files (>2G) and
gives a speed-up of roughly 50%. (The earlier 66min are down to 34min for
meronense&weschniakowii files plus two larger files.)

There is a BATCH constant, which could be tuned for processing speed. It is
logged for each webstats module run.  Currently, it is set to 100k.  This
was more or less arbitrarily chosen and used for all the tests.  A test run
using 500k didn't show significant differences.
2018-02-20 16:30:13 +00:00
iwakeh
fbb35f75da Adapt CollecTor to latest metrics-lib master branch. 2018-02-20 16:30:08 +00:00
iwakeh
5b68aaf8aa Add hasContent method to make even more use of DescriptorBuilder. 2018-02-20 17:30:07 +01:00
iwakeh
43cd158766 Make logging statements comply with Metrics' standards.
Also edit here and there for more readability and fewer lines.
2018-02-20 17:30:07 +01:00
iwakeh
4e61bb792b Use DescriptorBuilder more often.
Add convenience constructor accepting the first string as argument.
2018-02-20 17:30:07 +01:00
iwakeh
afe07d8efd Add a finalized state to DescriptorBuilder.
To avoid possible inconsistencies DescriptorBuilder is finalized after the first
call to 'toString' and cannot be altered anymore.  Any attempt to add more leads
to an IllegalStateException.
2018-02-20 17:30:03 +01:00
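
The finalized state described in the commit above might look roughly like the following sketch. It is an illustration under assumptions, not the actual DescriptorBuilder: the class name FinalizableBuilderSketch and the method name append are hypothetical, and only the behavior stated in the message (finalize on the first toString call, IllegalStateException afterwards) is modeled.

```java
/** Hypothetical sketch; the real DescriptorBuilder may differ. */
final class FinalizableBuilderSketch {

  private final StringBuilder content = new StringBuilder();

  /** Set on the first call to toString(). */
  private boolean finalized = false;

  /** Appends a part, unless the builder was already finalized. */
  FinalizableBuilderSketch append(String part) {
    if (finalized) {
      throw new IllegalStateException(
          "Builder is finalized; nothing more can be added.");
    }
    content.append(part);
    return this;
  }

  /** Returns the content and finalizes the builder. */
  @Override
  public String toString() {
    finalized = true;
    return content.toString();
  }

  public static void main(String[] args) {
    FinalizableBuilderSketch builder = new FinalizableBuilderSketch()
        .append("router ").append("example\n");
    System.out.print(builder);
    try {
      builder.append("too late");
    } catch (IllegalStateException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

One consequence of this design is that any implicit toString call, such as string concatenation in a log statement, also finalizes the builder.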
iwakeh
2457eb5be7 Use Java8 idiom for toString method. 2018-02-20 17:29:39 +01:00
iwakeh
fbfa16c05b Make DescriptorBuilder also accept DescriptorBuilders.
This might facilitate easier processing of descriptors.
2018-02-20 17:29:39 +01:00
iwakeh
266051f339 Rename SanitizedBridgeDescriptorBuilder to DescriptorBuilder.
The class doesn't 'know' about descriptor sanitization, it is only a sort of
container for writing descriptors.  It could be actually moved to some util
package and used in other parsing steps, too.

Also rename test helper classes to avoid naming conflicts.
Remove 'descriptor' from variable names.
Make DescriptorBuilder public.
Adapt other classes as well as tests.
2018-02-20 17:29:38 +01:00
Karsten Loesing
d5aba97f9b Separate parsing and sanitizing steps for bridge descriptors.
First step towards implementing #20549.
2018-02-20 17:29:38 +01:00
iwakeh
06d1a81d4c Avoid repeated validation of clean and validated log lines. 2018-02-05 18:00:15 +01:00
iwakeh
bd948070e0 Optimize parallel processing and use static imports for readability. 2018-02-05 18:00:10 +01:00
iwakeh
15db1e2a79 Parallelize two more processing steps. 2018-02-05 18:00:05 +01:00
iwakeh
2a0aa8c7f8 Use enum Method from metrics-lib. 2018-02-05 18:00:00 +01:00
iwakeh
97e577ae73 Add webstats module with sync and local import functionality.
Implements task-22428.
2018-01-31 14:01:20 +01:00
Karsten Loesing
7f01208aed Update copyright to 2018. 2018-01-09 10:23:10 +01:00
Karsten Loesing
ee7f1353a2 Update metrics-base. 2017-12-15 17:01:27 +01:00
Karsten Loesing
b23232bd44 Exclude lastModifiedMillis in index.json.
Fixes #24621.
2017-12-14 10:13:11 +01:00
Karsten Loesing
60dfface97 Bump version to 1.4.1-dev. 2017-10-26 10:16:35 +02:00
Karsten Loesing
56a303ecdb Prepare for 1.4.1 release. 2017-10-25 20:51:30 +02:00
Karsten Loesing
7cfb69ad03 Add change log entries for the two #23981 changes. 2017-10-25 20:50:29 +02:00
Karsten Loesing
3a0ba1baba Update metrics-base. 2017-10-25 20:50:00 +02:00
Karsten Loesing
e54bca3f95 Retain "bridge-distribution-request" lines. 2017-10-25 20:49:18 +02:00
Karsten Loesing
a38f7c2771 Handle bridge descriptors with unusual line order.
Typically, the "published" line appears before the "fingerprint" line.
However, an alternative Tor implementation orders these two lines
differently, which is valid according to the spec. We need to handle this
case by accepting lines in either order.

Fixes #23981.
2017-10-25 17:29:40 +02:00
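
Accepting the two lines in either order, as the commit above describes, can be sketched by collecting lines by keyword instead of relying on their position. This is a simplified illustration, not the actual fix: the class name LineOrderSketch, the method extract, and the sample descriptor text are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the actual fix for #23981. */
final class LineOrderSketch {

  /** Collects the values of the two keywords, whatever their order. */
  static Map<String, String> extract(String descriptor) {
    Map<String, String> found = new HashMap<>();
    for (String line : descriptor.split("\n")) {
      for (String keyword : new String[] {"published", "fingerprint"}) {
        if (line.startsWith(keyword + " ")) {
          found.put(keyword, line.substring(keyword.length() + 1));
        }
      }
    }
    return found;
  }

  public static void main(String[] args) {
    // Made-up sample with "fingerprint" before "published".
    String unusual = "fingerprint AAAA BBBB\n"
        + "published 2017-10-25 12:00:00\n";
    Map<String, String> values = extract(unusual);
    System.out.println(values.get("published"));
    System.out.println(values.get("fingerprint"));
  }
}
```

Any step that needs both values would then run only after all lines have been read, rather than as soon as the first of the two lines is seen.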
Karsten Loesing
3a95892de3 Add test that will fail #23981. 2017-10-25 17:19:46 +02:00
Karsten Loesing
ebb13b12d0 Bump version to 1.4.0-dev. 2017-10-17 21:22:04 +02:00
Karsten Loesing
95051e9480 Prepare for 1.4.0 release. 2017-10-09 14:23:52 +02:00
iwakeh
0a586c0be0 Add the build revision to index.json files.
Implements the final part of task-21414 for CollecTor.
2017-10-09 14:20:44 +02:00
iwakeh
e656d57351 Added overview page for javadoc. 2017-09-20 10:15:31 +00:00
iwakeh
41dd260513 Add changelog entries. 2017-09-20 12:15:30 +02:00
iwakeh
4b3c2fef58 Adapt and add tests for OnionPerf sync-runs. 2017-09-20 12:14:32 +02:00
iwakeh
8ffdfd6c2c Enable OnionPerf 'Sync' runs. 2017-09-20 12:14:25 +02:00
iwakeh
c4ab51e8dc Make OnionPerf adhere to the standard CollecTorMain.
This includes adding property 'OnionPerfSources' and renaming
some markers properly.  In addition, all camel-case occurrences
of 'OnionPerf' have a capitalized 'P' now.

Part of task-21759.
2017-09-20 12:14:18 +02:00
Karsten Loesing
74e683ab4a Tweak the change log a bit. 2017-09-19 14:34:03 +02:00
Karsten Loesing
6e300b5e49 Un-prettify directory listings.
With #22836 being deployed, Tor Metrics parses our index.json and
provides its own directory listings. Time to stop prettifying ours.
2017-09-19 10:13:02 +02:00
Karsten Loesing
1d2e2479b3 Bump version to 1.3.0-dev. 2017-09-15 16:08:43 +02:00
Karsten Loesing
18006f9561 Prepare for 1.3.0 release. 2017-09-15 11:36:04 +02:00
Karsten Loesing
e6d1496286 Update to metrics-lib 2.1.0 and to Java 8. 2017-09-15 11:32:12 +02:00
iwakeh
68f5301bd1 Keep annotations of given descriptors.
Makes test pass again and implements task-23215.
Changes a test descriptor to contain a second annotation.
2017-09-07 12:14:16 +02:00
iwakeh
1042a7683c Changed test data, which makes some tests fail.
Part of task-23215.
2017-09-07 12:13:18 +02:00
iwakeh
d2677cba06 Use metrics-lib's JSON handling classes.
Implements task-23286 using metrics-lib-2.0.0
2017-08-18 13:03:04 +00:00
Karsten Loesing
d162d90299 Bump version to 1.2.1-dev. 2017-08-18 15:03:03 +02:00