Commit Graph

913 Commits

Author SHA1 Message Date
Karsten Loesing
2472f01296 Recognize new bridge authority Serge.
Fixes #26786.
2018-07-14 20:41:11 +02:00
iwakeh
9d8a0ae6ec Prevent weird values in cut-off time calculation.
Rename variable to include the unit (which is days).
Log cut-off date on info level.

Implements task-20224.
2018-07-09 14:39:22 +02:00
Karsten Loesing
98244f3ee1 Fix bug in tarball-creating script.
The bug was that some tarballs were not compressed in a run following
an aborted run.

Fixes #26193.
2018-07-09 14:30:32 +02:00
Karsten Loesing
9f05667878 Rename packages.
Rename root package org.torproject.collector to
org.torproject.metrics.collector and ..index to ..indexer.

Implements #24291.
2018-07-05 10:59:03 +02:00
Karsten Loesing
55a9f76d10 Bump version to 1.6.0-dev. 2018-05-25 15:03:49 +02:00
Karsten Loesing
78d01e5b01 Prepare for 1.6.0 release. 2018-05-23 22:14:27 +02:00
Karsten Loesing
8184888365 Replace Gson with Jackson.
Implements #26162.
2018-05-23 22:08:41 +02:00
iwakeh
574a3ec4c6 Adapt to metrics-lib 2.3.0 changes.
This concerns #25523.
2018-04-19 15:55:37 +02:00
iwakeh
970dd3d1e1 Describe 'contrib' path contents. 2018-03-26 13:01:55 +02:00
iwakeh
5d2bdbb8ca Add 'contrib' directory to index.json. 2018-03-20 08:31:46 +00:00
iwakeh
770f55cf82 Corrected description of vote descriptor path.
Implements task-20287.
2018-03-20 09:31:45 +01:00
Karsten Loesing
4e166b7832 Include webstats in create-tarballs.sh. 2018-03-19 20:59:34 +01:00
Karsten Loesing
88867bf3b4 Bump version to 1.5.1-dev. 2018-03-19 16:26:13 +01:00
Karsten Loesing
60ac193616 Prepare for 1.5.1 release. 2018-03-19 15:07:21 +01:00
Karsten Loesing
8c1bdd9a84 Avoid calling bytesFor() unnecessarily.
Found while looking into #25522.
2018-03-17 10:45:55 +01:00
iwakeh
2c00f28ab7 Add lines according to their count.
Makes the test from the previous commit pass and fixes task-25522.
2018-03-07 13:46:55 +00:00
iwakeh
2e6fa506b3 Add a failing test.
Make a static method more easily accessible for tests.
2018-03-07 13:46:53 +00:00
Karsten Loesing
727d4e54ef Add bastet to directory authorities to download votes for.
We did download bastet's votes in the past after reading the
consensus. But when there was no consensus, we may not have
explicitly asked for bastet's vote.

Found by chance while setting up a CollecTor instance with a webstats
module.
2018-03-07 14:46:50 +01:00
iwakeh
190d90a800 Describe file protocol for Tor web server logs.
Part of task-20234.
2018-02-26 15:24:50 +00:00
Karsten Loesing
ef1dfb6d32 Bump version to 1.5.0-dev. 2018-02-26 16:24:49 +01:00
Karsten Loesing
ddfa7bad24 Prepare for 1.5.0 release. 2018-02-26 15:06:51 +01:00
iwakeh
d05b4e4aee Circumvent Collection (integer) size limit.
Clean log lines immediately when they are read and also make use of sanitized
logs' high redundancy immediately, i.e., continue with maps of type
Map<LocalDate, Map<String, Long>>.

Rename method(s) to reflect what they do.
2018-02-26 14:16:07 +01:00
iwakeh
8557bf6255 Reduce memory footprint and wall time.
Adapt to latest changes of metrics-lib (task-25329) and make use of the high
redundancy of logs (e.g. a 3G file might only contain 350 different lines).
This avoids OOM and array out of bounds exceptions for large files (>2G) and
gives a speed-up of roughly 50%. (The earlier 66min are down to 34min for
meronense&weschniakowii files plus two larger files.)

There is a BATCH constant, which could be tuned for processing speed. It is
logged for each webstats module run.  Currently, it is set to 100k.  This
was more or less arbitrarily chosen and used for all the tests.  A test run
using 500k didn't show significant differences.
2018-02-20 16:30:13 +00:00
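The redundancy idea in the commit above can be illustrated with a minimal sketch: instead of holding every log line in memory, keep each distinct line once together with its count. The class and method names below (`LineCountSketch`, `countLines`) are hypothetical and not taken from CollecTor; the sketch also leaves out the BATCH constant and any parallel processing.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical sketch, not CollecTor's actual implementation. */
class LineCountSketch {

  /** Maps each distinct line to the number of times it occurs. */
  static Map<String, Long> countLines(BufferedReader reader) {
    return reader.lines()
        .collect(Collectors.groupingBy(Function.identity(),
            TreeMap::new, Collectors.counting()));
  }

  public static void main(String[] args) {
    // Made-up input standing in for a highly redundant log file.
    String log = "GET /index.html\nGET /index.html\nGET /about.html\n";
    Map<String, Long> counts
        = countLines(new BufferedReader(new StringReader(log)));
    counts.forEach((line, count) -> System.out.println(count + " " + line));
  }
}
```

With such a map, memory use grows with the number of distinct lines rather than with the file size, which is the property the commit message relies on.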
iwakeh
fbb35f75da Adapt CollecTor to latest metrics-lib master branch. 2018-02-20 16:30:08 +00:00
iwakeh
5b68aaf8aa Add hasContent method to make even more use of DescriptorBuilder. 2018-02-20 17:30:07 +01:00
iwakeh
43cd158766 Make logging statements comply with Metrics' standards.
Also edit here and there for more readability and fewer lines.
2018-02-20 17:30:07 +01:00
iwakeh
4e61bb792b Use DescriptorBuilder more often.
Add convenience constructor accepting the first string as argument.
2018-02-20 17:30:07 +01:00
iwakeh
afe07d8efd Add a finalized state to DescriptorBuilder.
To avoid possible inconsistencies, DescriptorBuilder is finalized after the first
call to 'toString' and cannot be altered anymore.  Any attempt to add more leads
to an IllegalStateException.
2018-02-20 17:30:03 +01:00
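The finalized state described in the commit above might look roughly like the following sketch. It is an assumption about the general pattern only: the class name `FinalizingBuilderSketch` and its `append` method are hypothetical and do not reproduce the real DescriptorBuilder API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a builder that is finalized by toString(). */
class FinalizingBuilderSketch {

  private final List<String> parts = new ArrayList<>();

  private boolean finalized = false;

  /** Appends a part, or throws once the builder has been finalized. */
  FinalizingBuilderSketch append(String part) {
    if (this.finalized) {
      throw new IllegalStateException("Builder is finalized.");
    }
    this.parts.add(part);
    return this;
  }

  /** Joins all parts and marks the builder as finalized. */
  @Override
  public String toString() {
    this.finalized = true;
    return String.join("", this.parts);
  }

  public static void main(String[] args) {
    FinalizingBuilderSketch builder = new FinalizingBuilderSketch();
    builder.append("line one\n").append("line two\n");
    System.out.print(builder);
    try {
      builder.append("too late\n");
    } catch (IllegalStateException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

Finalizing on the first `toString` call means that any text already handed out can never diverge from the builder's contents.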
iwakeh
2457eb5be7 Use Java8 idiom for toString method. 2018-02-20 17:29:39 +01:00
iwakeh
fbfa16c05b Make DescriptorBuilder also accept DescriptorBuilders.
This might facilitate easier processing of descriptors.
2018-02-20 17:29:39 +01:00
iwakeh
266051f339 Rename SanitizedBridgeDescriptorBuilder to DescriptorBuilder.
The class doesn't 'know' about descriptor sanitization, it is only a sort of
container for writing descriptors.  It could actually be moved to some util
package and used in other parsing steps, too.

Also rename test helper classes to avoid naming conflicts.
Remove 'descriptor' from variable names.
Make DescriptorBuilder public.
Adapt other classes as well as tests.
2018-02-20 17:29:38 +01:00
Karsten Loesing
d5aba97f9b Separate parsing and sanitizing steps for bridge descriptors.
First step towards implementing #20549.
2018-02-20 17:29:38 +01:00
iwakeh
06d1a81d4c Avoid repeated validation of clean and validated log lines. 2018-02-05 18:00:15 +01:00
iwakeh
bd948070e0 Optimize parallel processing and use static imports for readability. 2018-02-05 18:00:10 +01:00
iwakeh
15db1e2a79 Parallelize two more processing steps. 2018-02-05 18:00:05 +01:00
iwakeh
2a0aa8c7f8 Use enum Method from metrics-lib. 2018-02-05 18:00:00 +01:00
iwakeh
97e577ae73 Add webstats module with sync and local import functionality.
Implements task-22428.
2018-01-31 14:01:20 +01:00
Karsten Loesing
7f01208aed Update copyright to 2018. 2018-01-09 10:23:10 +01:00
Karsten Loesing
ee7f1353a2 Update metrics-base. 2017-12-15 17:01:27 +01:00
Karsten Loesing
b23232bd44 Exclude lastModifiedMillis in index.json.
Fixes #24621.
2017-12-14 10:13:11 +01:00
Karsten Loesing
60dfface97 Bump version to 1.4.1-dev. 2017-10-26 10:16:35 +02:00
Karsten Loesing
56a303ecdb Prepare for 1.4.1 release. 2017-10-25 20:51:30 +02:00
Karsten Loesing
7cfb69ad03 Add change log entries for the two #23981 changes. 2017-10-25 20:50:29 +02:00
Karsten Loesing
3a0ba1baba Update metrics-base. 2017-10-25 20:50:00 +02:00
Karsten Loesing
e54bca3f95 Retain "bridge-distribution-request" lines. 2017-10-25 20:49:18 +02:00
Karsten Loesing
a38f7c2771 Handle bridge descriptors with unusual line order.
Typically, the "published" line appears before the "fingerprint" line.
However, an alternative Tor implementation orders these two lines
differently, which is valid according to the spec. We need to handle this
case by accepting lines in either order.

Fixes #23981.
2017-10-25 17:29:40 +02:00
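One way to accept the two lines in either order is to collect them by keyword first and only use them once the whole descriptor has been read. The sketch below illustrates that idea under stated assumptions: the class `LineOrderSketch`, the method `extract`, and the sample values are all hypothetical, and the actual fix in CollecTor may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not CollecTor's actual sanitizing code. */
class LineOrderSketch {

  /** Returns the "published" and "fingerprint" values, in any line order. */
  static Map<String, String> extract(String descriptor) {
    Map<String, String> found = new HashMap<>();
    for (String line : descriptor.split("\n")) {
      if (line.startsWith("published ")) {
        found.put("published", line.substring("published ".length()));
      } else if (line.startsWith("fingerprint ")) {
        found.put("fingerprint", line.substring("fingerprint ".length()));
      }
    }
    if (!found.containsKey("published")
        || !found.containsKey("fingerprint")) {
      throw new IllegalArgumentException("Missing required line.");
    }
    return found;
  }

  public static void main(String[] args) {
    // Made-up sample values; the second descriptor swaps the two lines.
    String usual = "published 2017-10-25 17:29:40\nfingerprint 0000 1111\n";
    String swapped = "fingerprint 0000 1111\npublished 2017-10-25 17:29:40\n";
    System.out.println(extract(usual).equals(extract(swapped)));
  }
}
```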
Karsten Loesing
3a95892de3 Add test that will fail #23981. 2017-10-25 17:19:46 +02:00
Karsten Loesing
ebb13b12d0 Bump version to 1.4.0-dev. 2017-10-17 21:22:04 +02:00
Karsten Loesing
95051e9480 Prepare for 1.4.0 release. 2017-10-09 14:23:52 +02:00
iwakeh
0a586c0be0 Add the build revision to index.json files.
Implements the final part of task-21414 for CollecTor.
2017-10-09 14:20:44 +02:00