mirror of
https://github.com/torproject/metrics-web.git
synced 2024-11-23 09:39:47 +00:00
0aed44785b
The update of the build information is part of a different ticket.
446 lines
15 KiB
Plaintext
446 lines
15 KiB
Plaintext
This file contains outdated information and will be updated soon!
|
|
|
|
Tor Metrics
|
|
===========
|
|
|
|
Tor Metrics aggregates publicly available data about the Tor network and
|
|
visualizes that data on a website.
|
|
|
|
This software package, metrics-web, contains (1) the code to aggregate Tor
|
|
network data, (2) the code to generate graphs and .CSV output, and (3) the
|
|
code for a dynamic web application. metrics-web is based on Java, Ant,
|
|
PostgreSQL, R, Apache HTTP Server, and Apache Tomcat.
|
|
|
|
This README explains all necessary steps to install metrics-web including
|
|
any databases (Section 1), the graphing engine (Section 2), and the web
|
|
application (Section 3).
|
|
|
|
|
|
1. Installing the metrics database
|
|
==================================
|
|
|
|
The metrics database contains data about the Tor Network coming from
|
|
different sources, including the Tor directory authorities, Torperf
|
|
performance measurement installations, and others.
|
|
|
|
|
|
1.1. Preparing the operating system
|
|
===================================
|
|
|
|
This README describes the steps for installing metrics-web on a Debian
|
|
GNU/Linux Jessie server. Instructions for other operating systems may
|
|
vary.
|
|
|
|
In the following it is assumed that sudo (or root) privileges are available.
|
|
|
|
Start by adding a metrics user that will be used to execute all commands
|
|
that do not require root privileges.
|
|
|
|
$ sudo adduser metrics
|
|
|
|
The database importer and website sources will be installed in
|
|
/srv/metrics.torproject.org/ that is created as follows:
|
|
|
|
$ sudo mkdir /srv/metrics.torproject.org/
|
|
$ sudo chmod g+ws /srv/metrics.torproject.org/
|
|
$ sudo chown metrics:metrics /srv/metrics.torproject.org/
|
|
|
|
Clone the metrics-web Git repository:
|
|
|
|
$ cd /srv/metrics.torproject.org/
|
|
$ git clone git://git.torproject.org/metrics-web metrics
|
|
|
|
Install OpenJDK 7, Ant 1.9.4, and PostgreSQL 9.4 that are necessary for
|
|
setting up the metrics database.
|
|
|
|
$ sudo apt-get install openjdk-7-jdk ant postgresql-9.4
|
|
|
|
Setting up the graphing engine (cf. 2.) requires installing R 2.8 or
|
|
higher as well as the ggplot2 library.
|
|
|
|
$ sudo apt-get install r-base r-cran-rserve r-cran-ggplot2 r-cran-reshape \
|
|
r-cran-scales r-cran-java
|
|
|
|
Check the versions of the newly installed tools.
|
|
|
|
$ java -version
|
|
java version "1.7.0_101"
|
|
OpenJDK Runtime Environment (IcedTea 2.6.6) (7u101-2.6.6-2~deb8u1)
|
|
OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
|
|
|
|
$ ant -version
|
|
Apache Ant(TM) version 1.9.4 compiled on October 7 2014
|
|
|
|
$ psql --version
|
|
psql (PostgreSQL) 9.4.8
|
|
|
|
Now prepare the library folder for all ant projects.
|
|
|
|
$ cd /srv/metrics.torproject.org/metrics/
|
|
$ mkdir shared/lib
|
|
|
|
Download .jar files listed below. Metrics usually uses Debian stable
|
|
provided libraries, but you can also just download them elsewhere.
|
|
|
|
Copy or link the following jars, annotated with file names in Debian
|
|
stable packages, to /srv/metrics.torproject.org/metrics/shared/lib:
|
|
commons-codec-1.9.jar
|
|
[/usr/share/java/commons-codec-1.9.jar in libcommons-codec-java]
|
|
commons-compress-1.9.jar
|
|
[/usr/share/java/commons-compress-1.9.jar in libcommons-compress-java]
|
|
commons-lang-2.6.jar
|
|
[/usr/share/java/commons-lang-2.6.jar in libcommons-lang-java]
|
|
gson-2.2.4.jar
|
|
[/usr/share/java/gson.jar in libgoogle-gson-java]
|
|
jstl1.1-1.1.2.jar
|
|
[/usr/share/java/jstl1.1-1.1.2.jar in libjstl1.1-java]
|
|
junit4-4.11.jar
|
|
[/usr/share/java/junit4-4.11.jar in junit4]
|
|
postgresql-jdbc3-9.2.jar
|
|
[/usr/share/java/postgresql-jdbc3-9.2.jar in libpostgresql-jdbc-java]
|
|
REngine.jar
|
|
[/usr/lib/R/site-library/Rserve/java/REngine.jar in r-cran-rserve]
|
|
Rserve.jar
|
|
[/usr/lib/R/site-library/Rserve/java/Rserve.jar in r-cran-rserve]
|
|
servlet-api-3.0.jar
|
|
[/usr/share/java/servlet-api-3.0.jar in libservlet3.0-java]
|
|
standard-1.1.2.jar
|
|
[/usr/share/java/standard-1.1.2.jar in libjakarta-taglibs-standard-java]
|
|
xz-1.5.jar
|
|
[/usr/share/java/xz-1.5.jar in libxz-java]
|
|
|
|
DescripTor is provided by The Tor Project and can be found here:
|
|
https://dist.torproject.org/descriptor/
|
|
Download the tar.gz file with the version number listed in build.xml.
|
|
The README inside the tar.gz file has all the information about DescripTor
|
|
and explains how to verify the downloaded files.
|
|
Copy descriptor-<version>.jar to /srv/metrics.torproject.org/shared/lib
|
|
|
|
1.2. Configuring the database
|
|
=============================
|
|
|
|
The first step in setting up the metrics database is to configure the
|
|
PostgreSQL database and import a database schema.
|
|
|
|
Start by creating a new metrics database user. There is no need to give
|
|
the metrics user superuser privileges or allow it to create databases or
|
|
new roles. You will be prompted for the password.
|
|
|
|
$ sudo -u postgres createuser -P metrics
|
|
|
|
Create a new database tordir owned by user metrics.
|
|
|
|
$ sudo -u postgres createdb -O metrics tordir
|
|
|
|
Import the metrics database schema.
|
|
|
|
$ sudo -u metrics psql -f /srv/metrics.torproject.org/metrics/modules/legacy/tordir.sql tordir
|
|
|
|
Confirm that the database now contains tables to hold metrics data. In
|
|
the following, => will be used as the database prompt.
|
|
|
|
$ sudo -u metrics psql tordir
|
|
=> \dt+
|
|
=> \q
|
|
|
|
|
|
1.3. Importing relay descriptor tarballs
|
|
========================================
|
|
|
|
In most cases it makes sense to populate the metrics database with
|
|
archived relay descriptors from the official metrics website.
|
|
|
|
Download the relay descriptor tarballs from the CollecTor website at
|
|
https://collector.torproject.org/archive/relay-descriptors/
|
|
and extract them to /srv/metrics.torproject.org/archives/ . The database
|
|
importer can process v3 votes, v3 consensuses, server descriptors, and extra-infos.
|
|
|
|
Edit the config file ~/metrics-web/config (or create it if it's not there)
|
|
to contain the following five lines (be sure to remove the linebreak in
|
|
the line defining the JDBC string and insert the real password there):
|
|
|
|
ImportDirectoryArchives 1
|
|
DirectoryArchivesDirectory archives/
|
|
KeepDirectoryArchiveImportHistory 1
|
|
WriteRelayDescriptorDatabase 1
|
|
RelayDescriptorDatabaseJDBC
|
|
jdbc:postgresql://localhost/tordir?user=metrics&password=password
|
|
|
|
Compile and run the Java database importer.
|
|
|
|
$ cd /srv/metrics.torproject.org/metrics/
|
|
$ ./run-web.sh
|
|
|
|
The database import will take a while. Once it's complete, check that the
|
|
database tables now contain metrics data:
|
|
|
|
$ sudo -u metrics psql tordir
|
|
=> \dt+
|
|
=> \q
|
|
|
|
It's safe to delete the relay descriptor files in ~/metrics-web/archives/
|
|
once they are imported.
|
|
|
|
An alternative to importing relay descriptor tarballs directly into the
|
|
database is to convert them into a data format that psql's \copy command
|
|
can process. Look for the config option WriteRelayDescriptorsRawFiles in
|
|
/srv/metrics.torproject.org/config.template for more information on this
|
|
experimental feature.
|
|
|
|
In a future version of metrics-web it may also be possible to update local
|
|
relay descriptor tarballs from the official metrics server via rsync and
|
|
import only the changes into the metrics database. The idea is to simply
|
|
rsync the data/ directory from the CollecTor server and have all information
|
|
available. However, this feature is not implemented yet.
|
|
|
|
|
|
1.4. Importing relay descriptors from a local Tor data directory
|
|
================================================================
|
|
|
|
In order to keep the data in the metrics database up-to-date, the metrics
|
|
database importer can import the cached descriptors from a local Tor data
|
|
directory.
|
|
|
|
Configure a local Tor client to fetch all known descriptors as early as
|
|
possible by adding these config options to its torrc file:
|
|
|
|
DownloadExtraInfo 1
|
|
FetchUselessDescriptors 1
|
|
FetchDirInfoEarly 1
|
|
FetchDirInfoExtraEarly 1
|
|
|
|
Tell the metrics database importer where to find the cached descriptor
|
|
files. One way to achieve this is to add symbolic links to
|
|
/srv/metrics.torproject.org/archives/ like this. Tor's data directory is
|
|
assumed to be /srv/tor/ here.
|
|
|
|
$ cd /srv/metrics.torproject.org/archives/
|
|
$ ln -s /srv/tor/cached-* .
|
|
|
|
Add a crontab entry for the database importer to run once per hour:
|
|
|
|
15 * * * * cd /srv/metrics.torproject.org/metrics/ && ./run-web.sh
|
|
|
|
|
|
1.5. Pre-calculating relay statistics
|
|
=====================================
|
|
|
|
The relay graphs on the metrics website rely on pre-calculated statistics
|
|
in the metrics database. These statistics are not calculated after every
|
|
completed import, which would usually be once per hour. In general it's
|
|
sufficient to pre-calculate statistics 2 or 4 times a day.
|
|
|
|
Calculate statistics manually after large imports (this may take a while):
|
|
|
|
$ sudo -u metrics psql tordir -c 'SELECT * FROM refresh_all();'
|
|
|
|
If the metrics database gets updated automatically, write a script and add
|
|
a crontab entry for pre-calculating statistics every 6 or 12 hours.
|
|
|
|
|
|
1.6. Importing sanitized bridge descriptors
|
|
===========================================
|
|
|
|
The metrics database can store aggregate statistics about running bridges
|
|
and bridge usage. These statistics are added by parsing sanitized bridge
|
|
descriptors available on the official metrics website.
|
|
|
|
Download a sanitized bridge descriptor tarball from the metrics website at
|
|
https://collector.torproject.org/archive/bridge-descriptors/ and extract
|
|
it to, e.g., /srv/metrics.torproject.org/bridges/bridge-descriptors-2011-05
|
|
|
|
Edit /srv/metrics.torproject.org/config to contain the following options:
|
|
|
|
ImportSanitizedBridges 1
|
|
SanitizedBridgesDirectory bridges/
|
|
KeepSanitizedBridgesImportHistory 1
|
|
WriteBridgeStats 1
|
|
|
|
Note that the bridge usage statistics require parsing relay descriptors of
|
|
the same time period in order to filter bridges that have been running as
|
|
relays from the results. When parsing sanitized bridge descriptors for
|
|
the first time it may be necessary to delete the relay descriptor import
|
|
history in /srv/metrics.torproject.org/stats/archives-import-history and
|
|
import all relay descriptors once again.
|
|
|
|
Run the database import:
|
|
|
|
$ ./run-web.sh
|
|
|
|
|
|
1.7. Importing Torperf performance data
|
|
=======================================
|
|
|
|
Torperf measures the performance of the Tor network as users experience
|
|
it. Torperf's measurement data are available on the metrics website and
|
|
can be imported into the metrics database, too.
|
|
|
|
Download the Torperf measurement files from the metrics website at
|
|
https://collector.torproject.org/archive/torperf/ and put them in a
|
|
subdirectory, e.g., /srv/metrics.torproject.org/torperf/ .
|
|
|
|
Edit /srv/metrics.torproject.org/config to contain the following options:
|
|
|
|
ImportWriteTorperfStats 1
|
|
TorperfDirectory torperf/
|
|
|
|
Run the database import:
|
|
|
|
$ ./run-web.sh
|
|
|
|
|
|
2. Installing the graphing engine
|
|
=================================
|
|
|
|
The metrics graphing engine generates custom graphs of Tor network data
|
|
based on user-provided parameters. The graphing engine requires the
|
|
metrics database to be installed as described in the previous section.
|
|
|
|
The graphing engine uses R and Rserve to generate its graphs. Rserve is a
|
|
TCP/IP server that makes it easy for other tools to use R without spawning
|
|
their own R process. Rserve also pre-loads R code and R libraries which
|
|
saves time when processing user requests.
|
|
|
|
In this configuration, Rserve will run in the context of the metrics user.
|
|
|
|
Start Rserve, this time with the metrics-web-specific configuration that
|
|
includes pre-loading the graph code:
|
|
|
|
$ cd /srv/metrics.torproject.org/rserve/ && ./start.sh
|
|
|
|
Add a crontab entry to start Rserve on reboot:
|
|
|
|
@reboot cd /srv/metrics.torproject.org/metrics/website/rserve/ && ./start.sh
|
|
|
|
Rserve will pre-load the graph code at startup. If changes are made to
|
|
the graph code, Rserve must be restarted:
|
|
|
|
$ cd /srv/metrics.torproject.org/metrics/website/rserve/
|
|
$ killall Rserve; ./start.sh
|
|
|
|
|
|
3. Installing the metrics website
|
|
=================================
|
|
|
|
The metrics website lets web users search parts of the metrics database
|
|
and visualizes custom graphs.
|
|
|
|
Note that the description here has a few specific parts that only apply to
|
|
the official metrics website. These parts should be changed when setting
|
|
up a non-official metrics website.
|
|
|
|
|
|
3.1. Configuring Apache HTTP Server
|
|
===================================
|
|
|
|
The Apache HTTP Server is used as the front-end web server that serves
|
|
static resources itself and forwards requests for dynamic resources to
|
|
Apache Tomcat.
|
|
|
|
Start by installing Apache 2:
|
|
|
|
$ sudo apt-get install apache2
|
|
|
|
Disable Apache's default site.
|
|
|
|
$ sudo a2dissite 000-default
|
|
|
|
Enable mod_rewrite to tell Apache where to find static resources on disk.
|
|
Also enable mod_proxy to forward requests to Tomcat.
|
|
|
|
$ sudo a2enmod rewrite proxy_http
|
|
|
|
Create a new virtual host configuration and store it in a new file
|
|
/etc/apache2/sites-available/metrics.torproject.org with the following
|
|
content:
|
|
|
|
<VirtualHost *:80>
|
|
ServerName metrics.torproject.org
|
|
ServerAdmin torproject-admin@torproject.org
|
|
ErrorLog /var/log/apache2/error.log
|
|
CustomLog /var/log/apache2/access.log combined
|
|
ServerSignature On
|
|
<IfModule mod_proxy.c>
|
|
<Proxy *>
|
|
Order deny,allow
|
|
Allow from all
|
|
</Proxy>
|
|
ProxyPass / http://127.0.0.1:8080/metrics/ retry=15
|
|
ProxyPassReverse / http://127.0.0.1:8080/metrics/
|
|
ProxyPreserveHost on
|
|
</IfModule>
|
|
</VirtualHost>
|
|
|
|
Enable the new virtual host.
|
|
|
|
$ sudo a2ensite metrics.torproject.org
|
|
|
|
Restart Apache just to be sure that all changes are effective.
|
|
|
|
$ sudo service apache2 restart
|
|
|
|
|
|
3.2. Configuring Apache Tomcat
|
|
==============================
|
|
|
|
Apache Tomcat will process requests for dynamic resources, including web
|
|
pages and graphs.
|
|
|
|
Install Tomcat 8:
|
|
|
|
$ sudo apt-get install tomcat8
|
|
|
|
Replace Tomcat's default configuration in /etc/tomcat8/server.xml with the
|
|
following configuration:
|
|
|
|
<Server port="8005" shutdown="SHUTDOWN">
|
|
<Service name="Catalina">
|
|
<Connector port="8080" maxHttpHeaderSize="8192"
|
|
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
|
|
enableLookups="false" redirectPort="8443" acceptCount="100"
|
|
connectionTimeout="20000" disableUploadTimeout="true"
|
|
compression="off" compressionMinSize="2048"
|
|
noCompressionUserAgents="gozilla, traviata"
|
|
compressableMimeType="text/html,text/xml,text/plain" />
|
|
<Engine name="Catalina" defaultHost="metrics.torproject.org">
|
|
<Host name="metrics.torproject.org" appBase="webapps"
|
|
unpackWARs="true" autoDeploy="true"
|
|
xmlValidation="false" xmlNamespaceAware="false">
|
|
<Alias>metrics.torproject.org</Alias>
|
|
<Valve className="org.apache.catalina.valves.AccessLogValve"
|
|
directory="logs" prefix="metrics_access_log." suffix=".txt"
|
|
pattern="%l %u %t %r %s %b" resolveHosts="false"/>
|
|
</Host>
|
|
</Engine>
|
|
</Service>
|
|
</Server>
|
|
|
|
Be sure to replace *.torproject.org with something else, unless this is
|
|
a re-installation of the official metrics website.
|
|
|
|
Update the paths starting with /srv/metrics.torproject.org/ in
|
|
/srv/metrics.torproject.org/metrics/website/etc/web.xml to the correct
|
|
paths in /srv/metrics.torproject.org/. The default paths in that file are
|
|
correct for the official metrics website setup which is slightly different
|
|
than the one described here.
|
|
|
|
Now generate the web application.
|
|
|
|
$ ant war
|
|
|
|
Create a symbolic link to the ernie.war file:
|
|
|
|
$ sudo ln -s /srv/metrics.torproject.org/metrics/website/metrics.war /var/lib/tomcat8/webapps/
|
|
|
|
Tomcat will now attempt to deploy the web application automatically.
|
|
|
|
Whenever the metrics website needs to be redeployed, generate a new .war
|
|
file and Tomcat will reload the web application automatically.
|
|
|
|
Restart Tomcat to make all configuration changes effective:
|
|
|
|
$ sudo service tomcat8 restart
|
|
|
|
The metrics website should now work.
|
|
|