Mig5's parser was a fine proof of concept but stem parses everything within the
spec. Our list_hidden_service_auth() method now returns either a single
credential or a dictionary of credentials, depending on whether we request one
service or everything.
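As a rough illustration of that return convention (the function name and the
in-memory mapping below are hypothetical, not stem's implementation), a lookup
can return one value when a service is named and the whole mapping otherwise...

```python
from typing import Dict, Optional, Union

# Hypothetical stand-in for the credentials tor reports, keyed by service id.
CREDENTIALS: Dict[str, str] = {
  'service_a': 'credential_a',
  'service_b': 'credential_b',
}

def list_auth(service_id: Optional[str] = None) -> Union[str, Dict[str, str]]:
  """
  Provides a single credential when a service is named, otherwise a copy
  of the whole mapping.
  """

  if service_id is not None:
    return CREDENTIALS[service_id]

  return dict(CREDENTIALS)
```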
Our methods document parameter types twice...
def my_method(my_arg: str) -> None:
  """
  Dummy method.
  :param str my_arg: sample argument
  """
The method signature and :param: reStructuredText tag both cite our my_arg type.
By using Sphinx's sphinx_autodoc_typehints plugin we can drop our parameter
types...
https://github.com/agronholm/sphinx-autodoc-typehints
Deduplication aside this makes our API docs look nicer by stripping Python
type hints from our signatures. In other words...
stem.util.system.is_available(command: str, cached: bool = True) → bool
Parameters:
command (str) -- command to search for
cached (bool) -- makes use of available cached results if True
... becomes...
stem.util.system.is_available(command, cached=True)
Parameters:
command (str) -- command to search for
cached (bool) -- makes use of available cached results if True
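For reference, enabling the plugin is a small change to Sphinx's conf.py. This
is only a sketch: the file path is an assumption, and the rest of our
configuration is left out...

```python
# docs/conf.py (excerpt, path assumed)

extensions = [
  'sphinx.ext.autodoc',        # renders our API pages from docstrings
  'sphinx_autodoc_typehints',  # takes parameter types from the annotations
]
```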
Sphinx emits the following warnings...
WARNING: missing attribute mentioned in :members: or __all__: module stem, attribute directory
WARNING: missing attribute mentioned in :members: or __all__: module stem, attribute process
WARNING: missing attribute mentioned in :members: or __all__: module stem.descriptor, attribute remote
WARNING: missing attribute mentioned in :members: or __all__: module stem.response, attribute getinfo
WARNING: missing attribute mentioned in :members: or __all__: module stem.response, attribute getconf
WARNING: missing attribute mentioned in :members: or __all__: module stem.response, attribute authchallenge
WARNING: missing attribute mentioned in :members: or __all__: module stem.util, attribute lru_cache
WARNING: missing attribute mentioned in :members: or __all__: module stem.util, attribute ordereddict
WARNING: missing attribute mentioned in :members: or __all__: module stem.util, attribute term
WARNING: missing attribute mentioned in :members: or __all__: module stem.util, attribute test_tools
These submodules all exist, but the package that importlib.import_module()
returns does not have attributes for them unless their code was imported
(directly or transitively) before Sphinx attempts to reference them.
Honestly I don't fully grok the nuance behind how importlib works, but this
configuration flag resolves these warnings so calling it good.
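The underlying behavior is easy to reproduce outside of Sphinx. The sketch
below builds a throwaway package (demo_pkg and sub are made-up names) and shows
that the submodule only becomes an attribute of its package after something
imports it...

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
  # Build a throwaway package: demo_pkg/__init__.py and demo_pkg/sub.py
  pkg_dir = os.path.join(tmp, 'demo_pkg')
  os.mkdir(pkg_dir)

  for name in ('__init__.py', 'sub.py'):
    open(os.path.join(pkg_dir, name), 'w').close()

  sys.path.insert(0, tmp)
  importlib.invalidate_caches()

  try:
    pkg = importlib.import_module('demo_pkg')
    had_attr_before = hasattr(pkg, 'sub')  # submodule not imported yet

    importlib.import_module('demo_pkg.sub')
    had_attr_after = hasattr(pkg, 'sub')   # import bound it onto the package
  finally:
    sys.path.remove(tmp)

print(had_attr_before, had_attr_after)  # False True
```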
The only visible difference I can see between including '__init__' versus
excluding it for automodule declarations is class names. If we include it the
page cites classes as...
stem.descriptor.__init__.DigestHash
... whereas without it the page renders...
stem.descriptor.DigestHash
The second is obviously correct so dropping the suffix.
The favicon on our website works, but the build warns...
WARNING: logo file 'logo.png' does not exist
WARNING: favicon file 'favicon.png' does not exist
Sphinx deprecated the configuration option we used to disable smartquotes...
RemovedInSphinx17Warning: html_use_smartypants option is deprecated. Smart
quotes are on by default; if you want to disable or customize them, use the
smart_quotes option in docutils.conf.
Despite this warning the actual configuration name they went with doesn't have
an underscore...
https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-smartquotes
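In conf.py the replacement is a single assignment. A sketch, assuming we still
want smart quotes disabled as before...

```python
# docs/conf.py (excerpt, path assumed)

smartquotes = False  # replaces the deprecated html_use_smartypants option
```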
Just correcting a few minor sphinx warnings...
/home/atagar/Desktop/stem/docs/download.rst:165: WARNING: Inline interpreted text or phrase reference start-string without end-string.
/home/atagar/Desktop/stem/docs/tutorials/east_of_the_sun.rst:65: WARNING: Literal block ends without a blank line; unexpected unindent.
/home/atagar/Desktop/stem/docs/tutorials/mirror_mirror_on_the_wall.rst:2: WARNING: Duplicate explicit target name: "benchmark script".
/home/atagar/Desktop/stem/docs/tutorials/mirror_mirror_on_the_wall.rst:2: WARNING: Duplicate explicit target name: "benchmark script".
Oops! Transport lines effectively never appear aside from raw bridge
descriptors (which we never see), so I didn't have a live example to
test with.
Now we have one. DocTor's descriptor validation check is failing with...
03/05/2020 00:35:33 [WARNING] Unable to retrieve the extrainfo descriptors: Transport line has a malformed address: transport obfs4 [2001:985:e77:5:fd34:f56b:c2d1:e98c]:10394 cert=dJ/a+vnP+eFv7FDaVUqWCVlyrqf8FlOva2YAEkDUwiGQuorZf4Oc6FXSdyn8b4pUmZj/WA,iat-mode=0
Caught thanks to GeKo.
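For illustration, here is one way to split the address token of such a line so
bracketed IPv6 addresses are accepted. This is a standalone sketch built on the
standard library (parse_transport_address is a hypothetical helper), not the
code stem uses...

```python
import ipaddress

def parse_transport_address(value):
  """
  Splits an 'address:port' token such as '1.2.3.4:443' or
  '[2001:db8::1]:443' into an (address, port) tuple.

  :raises: ValueError if the address or port is malformed
  """

  # split on the last colon so the colons within IPv6 addresses are left alone
  address, separator, port = value.rpartition(':')

  if not separator or not port.isdigit():
    raise ValueError('Malformed port: %s' % value)

  # IPv6 addresses are wrapped in brackets
  if address.startswith('[') and address.endswith(']'):
    address = address[1:-1]

  ipaddress.ip_address(address)  # raises ValueError if invalid

  return address, int(port)
```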
We're dropping stem.descriptor's reader and export module due to lack of use...
* I wrote stem.descriptor.reader at Karsten's suggestion to read descriptors
from disk, and track when those on-disk files change. The design seemed to
be for usage within CollecTor, but never was.
In practice stem.descriptor.from_file() provides a simpler mechanism to
read descriptors from disk.
* stem.descriptor.export was contributed by a university student in Stem's
early days. I've never used it nor found anyone else who does.
This module serializes descriptors to a CSV, which is moot since
descriptors already have a string representation we can read and
write...
with open('/path/to/descriptor', 'w') as descriptor_file:
  descriptor_file.write(str(my_descriptor))
my_descriptor = stem.descriptor.from_file('/path/to/descriptor', 'server-descriptor 1.0')
Our jenkins tests are failing pretty routinely with python 3.7. It turns out
that this is my bad - unlike python 2.x the socket module frequently (but
inconsistently) raises a BrokenPipeError when closing a file based socket.
Why would anyone care that the socket they're closing isn't working?
That's... kinda the point. Oh well - ignoring these exceptions.
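The workaround amounts to something like the following. close_quietly is a
hypothetical helper name used for this sketch rather than the function we
changed...

```python
def close_quietly(resource):
  """
  Closes a socket or socket file, ignoring a BrokenPipeError since we're
  discarding the resource anyway.
  """

  try:
    resource.close()
  except BrokenPipeError:
    pass  # the other end is already gone, nothing left for us to do
```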
Tor is preparing to move to Gitlab. Rather than follow it I'm moving to GitHub.
Just finished migrating our tickets so now updating the bug tracker links.
Our Event's arrived_at attribute has a couple wrinkles...
* This timestamp reflects when the event was **parsed** rather than
**received**, so it becomes inaccurate if our event loop gets bogged down.
* There's nothing event specific about this attribute. It should apply to all
controller messages.
As such moving this up to the parent class. I first spotted the bug via the
following script...
import time
from stem.control import EventType, Controller
def slow_handler(event):
  print("processing a BW event that's %0.1f seconds old" % (time.time() - event.arrived_at))
  time.sleep(5)
with Controller.from_port() as controller:
  controller.authenticate()
  controller.add_event_listener(slow_handler, EventType.BW)
  time.sleep(10)
Previously this produced...
% python demo.py
processing a BW event that's 0.0 seconds old
processing a BW event that's 0.0 seconds old
processing a BW event that's 0.0 seconds old
processing a BW event that's 0.0 seconds old
... and now we get...
% python demo.py
processing a BW event that's 0.4 seconds old
processing a BW event that's 4.4 seconds old
processing a BW event that's 8.4 seconds old
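The growing ages follow directly from stamping messages on receipt. This small
simulation (no tor required, and simulate_ages is a made-up name) assumes
events arrive once a second while the handler takes five seconds apiece...

```python
def simulate_ages(arrival_times, handler_duration):
  """
  Provides how old each message is when its handler starts, given that
  messages are stamped on arrival and handled one at a time.
  """

  ages = []
  now = 0.0

  for arrived_at in arrival_times:
    now = max(now, arrived_at)  # can't handle a message before it arrives
    ages.append(now - arrived_at)
    now += handler_duration     # handler blocks the event loop

  return ages

print(simulate_ages([0.0, 1.0, 2.0], 5.0))  # [0.0, 4.0, 8.0]
```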
Tor now provides bandwidth files over its DirPort. Adding our corresponding
function to download them...
https://trac.torproject.org/projects/tor/ticket/26902
From what I can tell these statistics are available in practice for our next
hour's consensus, but not the present one...
atagar@morrigan:~$ curl -s 128.31.0.34:9131/tor/status-vote/next/bandwidth | wc -l
8987
atagar@morrigan:~$ curl -s 128.31.0.34:9131/tor/status-vote/current/bandwidth | wc -l
0
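The same requests can be made from Python with nothing but the standard
library. This is a sketch that mirrors the curl commands above, not the
function this commit adds. Both helper names are made up...

```python
import urllib.request

def bandwidth_file_url(address, dir_port, consensus='next'):
  """
  Provides the DirPort url for the bandwidth file of the 'current' or
  'next' consensus.
  """

  if consensus not in ('current', 'next'):
    raise ValueError("consensus must be 'current' or 'next', not %r" % consensus)

  return 'http://%s:%i/tor/status-vote/%s/bandwidth' % (address, dir_port, consensus)

def download_bandwidth_file(address, dir_port, consensus='next', timeout=30):
  # performs a network request, the response is empty if nothing is available
  url = bandwidth_file_url(address, dir_port, consensus)

  with urllib.request.urlopen(url, timeout=timeout) as response:
    return response.read().decode('utf-8')
```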
While authoring stem's website I used this script to automatically
republish changes. Years ago I wised up and swapped to cron...
stem@staticiforme:~$ crontab -l
*/5 * * * * /home/stem/build_site
stem@staticiforme:~$ cat /home/stem/build_site
#!/bin/sh
export PATH=/home/stem/bin:$PATH
export PYTHONPATH=/home/stem/lib/python
cd /home/stem/stem
git pull
cd docs
make clean
make html
sudo -u mirroradm static-master-update-component stem.torproject.org
echo "$(date)" > /home/stem/site_last_built
This republication script was specifically for our own site and hasn't been
used in years. Dropping the script in response to...
https://trac.torproject.org/projects/tor/ticket/30593
Adding the directory module to our api docs, and fixing a few table of contents
issues...
/home/atagar/Desktop/stem/docs/api/directory.rst: WARNING: document isn't included in any toctree
/home/atagar/Desktop/stem/docs/contents.rst: WARNING: document isn't included in any toctree
/home/atagar/Desktop/stem/docs/tutorials/examples/check_digests.rst: WARNING: document isn't included in any toctree
/home/atagar/Desktop/stem/docs/tutorials/examples/download_descriptor.rst: WARNING: document isn't included in any toctree
The last outstanding one (contents.rst) is for the table of contents itself,
and adding it causes a circular reference warning...
/home/atagar/Desktop/stem/docs/contents.rst: WARNING: circular toctree references detected, ignoring: contents <- contents
Not quite sure what sphinx wants us to do about that one so leaving it alone.
Python 3.6 is deprecating invalid escape sequences [1][2], and as such
pycodestyle 2.5.0 generates warnings for them [3]...
* /home/atagar/Desktop/stem/stem/descriptor/bandwidth_file.py
line 264 - W605 invalid escape sequence '\*' | :var dict measurements: **\*** mapping of relay fingerprints to their
line 267 - W605 invalid escape sequence '\*' | :var dict header: **\*** header metadata
line 268 - W605 invalid escape sequence '\*' | :var datetime timestamp: **\*** time when these metrics were published
line 269 - W605 invalid escape sequence '\*' | :var str version: **\*** document format version
line 294 - W605 invalid escape sequence '\*' | **\*** attribute is either required when we're parsed with validation or has
The trick is that there's two layers of escaping at play...
* For Python '\*' is not a valid escape sequence, so as a string it's
equivalent to '\\*'...
>>> '\*' == '\\*'
True
* For Sphinx and regexes '\*' is meaningful. All the 'invalid escapes' cited
by pycodestyle are for those.
Simple to fix. This replaces all invalid escape sequences with their valid
counterpart.
[1] https://docs.python.org/3/whatsnew/3.6.html#deprecated-python-behavior
[2] https://bugs.python.org/issue27364
[3] https://trac.torproject.org/projects/tor/ticket/27270
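As a quick check of the two valid spellings (a standalone sketch, not part of
the change itself), a doubled backslash and a raw string produce the same two
characters, and the regex layer still sees an escaped star...

```python
import re

escaped = '\\*'  # backslash escaped for Python
raw = r'\*'      # raw string, Python leaves the backslash alone

print(escaped == raw)  # True
print(len(raw))        # 2

# the regex engine receives '\*', which matches a literal star
print(re.match(raw, '*') is not None)  # True
```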
Interesting catch from juga! I don't have a repro for the stacktrace they cite,
but the only way our input_rules could potentially be None is if two threads
attempt to parse the input_rules at the same time.
Creating a local reference for the input_rules to avoid this while remaining
lock-free...
https://trac.torproject.org/projects/tor/ticket/29899
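The pattern looks roughly like this. LazyPolicy is a simplified, hypothetical
class for illustration, with string stripping standing in for the real rule
parsing...

```python
class LazyPolicy:
  def __init__(self, *rules):
    self._input_rules = rules  # unparsed input, cleared once parsed
    self._rules = None         # parsed rules, populated lazily

  def _get_rules(self):
    # Local reference so a concurrent call that clears self._input_rules
    # can't swap it to None between our check and our use of it.
    input_rules = self._input_rules

    if self._rules is None and input_rules is not None:
      rules = [rule.strip() for rule in input_rules]

      # publish the result before clearing the input
      self._rules = rules
      self._input_rules = None

    return self._rules
```

Two threads may still both parse the input, but they produce the same result
so the duplicate work is harmless.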
On reflection, process initialization time is cachable whereas uptime is not.
Adding a get_start_time() method, not only for its own usefulness but because
it inherently lets us make get_uptime() a cached method.
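A minimal sketch of the idea, using functools.lru_cache and a fake lookup in
place of the real system query (_query_start_time and LOOKUPS are hypothetical
and only exist for this example)...

```python
import functools
import time

LOOKUPS = []  # records each time we perform the expensive lookup

def _query_start_time(pid):
  # Hypothetical stand-in for an expensive query such as reading proc
  # contents. Pretends the process began a minute ago.
  LOOKUPS.append(pid)
  return time.time() - 60

@functools.lru_cache(maxsize=None)
def get_start_time(pid):
  # a process' start time never changes, so it's safe to cache
  return _query_start_time(pid)

def get_uptime(pid):
  # uptime changes every moment, but derives cheaply from the cached value
  return time.time() - get_start_time(pid)
```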
Tor recently added HSFETCH support for v3 services...
https://trac.torproject.org/projects/tor/ticket/25417
https://gitweb.torproject.org/torspec.git/commit/?id=34518e1
Leveraging this new capability takes nothing more than
providing the longer hidden service v3 address to the
command. This is neat since it means all we need to do
on our end is stop rejecting v3 addresses.
Our only use of is_valid_hidden_service_address() is
get_hidden_service_descriptor(), so simply adjusting our
helper to recognize both v2 and v3 addresses.
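A simplified sketch of such a check follows. It assumes v2 addresses are 16
base32 characters and v3 addresses are 56, without the '.onion' suffix, and it
only checks length and character set (it does not verify the v3 checksum)...

```python
import re

HS_V2_ADDRESS = re.compile('[a-z2-7]{16}')
HS_V3_ADDRESS = re.compile('[a-z2-7]{56}')

def is_valid_hidden_service_address(entry):
  """
  Checks if a string has the form of a v2 or v3 hidden service address.
  """

  if not isinstance(entry, str):
    return False

  return bool(HS_V2_ADDRESS.fullmatch(entry) or HS_V3_ADDRESS.fullmatch(entry))
```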
Sadly I forget where it was pointed out, but invoking the control port via
shell is a *lot* faster than stem...
#!/bin/bash -e
cmd="$@"
pass="ControlPortPassword"
function test_tor() {
  echo "$1" >&3
  sed "/^250 OK\r$/q" <&3
  echo QUIT >&3
  exec 3<&-
}
exec 3<>/dev/tcp/127.0.0.1/9051
echo AUTHENTICATE \"$pass\" >&3
read -u 3
test_tor "$cmd"
====================
atagar@morrigan:~$ time ./bench.sh 'GETINFO version' 1>/dev/null
real 0m0.007s
user 0m0.004s
sys 0m0.003s
====================
atagar@morrigan:~$ time tor-prompt --run 'GETINFO version' 1>/dev/null
real 0m0.186s
user 0m0.072s
sys 0m0.030s
Generally speaking this is expected. Spinning up an interpreter takes time. But
a quick investigation showed this is quite a bit slower than it needs to
be...
total tor-prompt runtime          0.186 seconds
--------------------------------------------------------
python interpreter startup        0.016 seconds (9%)
import statements                 0.079 seconds (42%)
check if tor is running           0.014 seconds (8%)
connect to tor                    0.009 seconds (5%)
autocomplete setup                0.065 seconds (34%)
invoke tor controller command     0.003 seconds (2%)
Autocompletion is only relevant when the user is presented with an interactive
interpreter. If we're merely invoking a command it's pointless.
So TL;DR: tor-prompt now takes ~34% less time when used to invoke controller
commands.
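In outline the change is to defer the costly setup until we know we need it.
The names below are hypothetical stand-ins rather than tor-prompt's actual
functions...

```python
SETUP_CALLS = []  # records when the costly setup runs

def setup_autocomplete():
  # Hypothetical stand-in for the autocompletion setup.
  SETUP_CALLS.append('autocomplete')

def run(run_command=None):
  if run_command is not None:
    # one-off command, nobody is going to press tab
    return 'ran: %s' % run_command

  setup_autocomplete()  # only pay this cost for interactive sessions
  return 'interactive'
```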
Listing a couple projects I've been meaning to take a peek at. Only took a
cursory glance but seems Mike and Iain did fantastic jobs.
I also hoped to list SelekTOR [1] somewhere. I don't know much about it, but
sounds like an interesting GUI project. However, it's java (and by extension
doesn't use stem) so I can't list it on the examples page. I have a separate
section for alternate language libraries but it's not that either, so guess I
don't have a spot to plop it. Oh well.
[1] https://www.dazzleships.net/selektor-for-linux/
Waste not, want not. I wrote this demo script for my recent status report
(https://blog.atagar.com/november2018/), but on reflection it makes a good
example for how to use our new digest methods.