mirror of
https://github.com/torproject/torspec.git
synced 2024-11-26 19:30:32 +00:00
ab9cf8322b
Now I've updated the status of everything in my proposal-status list.
201 lines
9.3 KiB
Plaintext
201 lines
9.3 KiB
Plaintext
Filename: 194-mnemonic-urls.txt
|
||
Title: Mnemonic .onion URLs
|
||
Author: Sai, Alex Fink
|
||
Created: 29-Feb-2012
|
||
Status: Superseded
|
||
|
||
1. Overview
|
||
|
||
Currently, canonical Tor .onion URLs consist of a naked 80-bit hash[1]. This
|
||
is not something that users can even recognize for validity, let alone produce
|
||
directly. It is vulnerable to partial-match fuzzing attacks[2], where a
|
||
would-be MITM attacker generates a very similar hash and uses various social
|
||
engineering, wiki poisoning, or other methods to trick the user into visiting
|
||
the spoof site.
|
||
|
||
This proposal gives an alternative method for displaying and entering .onion
|
||
and other URLs, such that they will be easily remembered and generated by end
|
||
users, and easily published by hidden service websites, without any dependency
|
||
on a full domain name type system like e.g. namecoin[3]. This makes it easier
|
||
to implement (requiring only a change in the proxy).
|
||
|
||
This proposal could equally be used for IPv4, IPv6, etc, if normal DNS is for
|
||
some reason untrusted.
|
||
|
||
This is not a petname system[4], in that it does not allow service providers
|
||
or users[5] to associate a name of their choosing to an address[6]. Rather, it
|
||
is a mnemonic system that encodes the 80 bit .onion address into a
|
||
meaningful[7] and memorable sentence. A full petname system (based on
|
||
registration of some kind, and allowing for shorter, service-chosen URLs) can
|
||
be implemented in parallel[8].
|
||
|
||
This system has the three properties of being secure, distributed, and
|
||
human-meaningful — it just doesn't also have choice of name (except of course
|
||
by brute force creation of multiple keys to see if one has an encoding the
|
||
operator likes).
|
||
|
||
This is inspired by Jonathan Ackerman's "Four Little Words" proposal[9] for
|
||
doing the same thing with IPv4 addresses. We just need to handle 80+ bits, not
|
||
just 32 bits.
|
||
|
||
It is similar to Markus Jakobsson & Ruj Akavipat's FastWord system[10], except
|
||
that it does not permit user choice of passphrase, does not know what URL a
|
||
user will enter (vs verifying against a single stored password), and again has
|
||
to encode significantly more data.
|
||
|
||
This is also similar to RFC1751[11], RFC2289[12], and multiple other
|
||
fingerprint encoding systems[13] (e.g. PGPfone[14] using the PGP
|
||
wordlist[15], and Arturo Filatsò's OnionURL[16]), but we aim to make something
|
||
that's as easy as possible for users to remember — and significantly easier
|
||
than just a list of words or pseudowords, which we consider only useful as an
|
||
active confirmation tool, not as something that can be fully memorized and
|
||
recalled, like a normal domain name.
|
||
|
||
2. Requirements
|
||
|
||
2.1. encodes at least 80 bits of random data (preferably more, eg for a
|
||
checksum)
|
||
|
||
2.2. valid, visualizable English sentence — not just a series of words[17]
|
||
|
||
2.3. words are common enough that non-native speakers and bad spellers will have
|
||
minimum difficulty remembering and producing (perhaps with some spellcheck help)
|
||
|
||
2.4. not syntactically confusable (e.g. order should not matter)
|
||
|
||
2.5. short enough to be easily memorized and fully recalled at will, not just
|
||
recognized
|
||
|
||
2.6. no dependency on an external service
|
||
|
||
2.7. dictionary size small enough to be reasonable for end users to download as
|
||
part of the onion package
|
||
|
||
2.8. consistent across users (so that websites can e.g. reinforce their random
|
||
hash's phrase with a clever drawing)
|
||
|
||
2.9. not create offensive sentences that service providers will reject
|
||
|
||
2.10. resistant against semantic fuzzing (e.g. by having uniqueness against
|
||
WordNet synsets[18])
|
||
|
||
3. Possible implementations
|
||
|
||
This section is intentionally left unfinished; full listing of template
|
||
sentences and the details of their parser and generating implementation is
|
||
co-dependent on the creation of word class dictionaries fulfilling the above
|
||
criteria. Since that's fairly labor-intensive, we're pausing at this stage for
|
||
input first, to avoid wasting work.
|
||
|
||
3.1. Have a fixed number of template sentences, such as:
|
||
|
||
1. Adj subj adv vtrans adj obj
|
||
2. Subj and subj vtrans adj obj
|
||
3. … etc
|
||
|
||
For a 6 word sentence, with 8 (3b) templates, we need ~12b (4k word)
|
||
dictionaries for each word category.
|
||
|
||
If multiple words of the same category are used, they must either play
|
||
different grammatical roles (eg subj vs obj, or adj on a different item), be
|
||
chosen from different dictionaries, or there needs to be an order-agnostic way
|
||
to join them at the bit level. Preferably this should be avoided, just to
|
||
prevent users forgetting the order.
|
||
|
||
3.2. As 3.1, but treat sentence generation as decoding a prefix code, and have
|
||
a Huffman code for each word class.
|
||
|
||
We suppose it’s okay if the generated sentence has a few more words than it
|
||
might, as long as they’re common lean words. E.g., for adjectives, "good"
|
||
might cost only six bits while "unfortunate" costs twelve.
|
||
|
||
Choice between different sentence syntaxes could be worked into the prefix
|
||
code as well, and potentially done separately for each syntactic constituent.
|
||
|
||
4. Usage
|
||
|
||
To form mnemonic .onion URL, just join the words with dashes or underscores,
|
||
stripping minimal words like 'a', 'the', 'and' etc., and append '.onion'. This
|
||
can be readily distinguished from standard hash-style .onion URLs by form.
|
||
|
||
Translation should take place at the client — though hidden service servers
|
||
should also be able to output the mnemonic form of hashes too, to assist
|
||
website operators in publishing them (e.g. by posting an amusing drawing of
|
||
the described situation on their website to reinforce the mnemonic).
|
||
|
||
After the translation stage of name resolution, everything proceeds as normal
|
||
for an 80-bit hash onion URL.
|
||
|
||
The user should be notified of the mnemonic form of hash URL in some way, and
|
||
have an easy way in the client UI to translate mnemonics to hashes and vice
|
||
versa. For the purposes of browser URLs and the like though, the mnemonic
|
||
should be treated on par with the hash; if the user enters a mnemonic URL they
|
||
should not become redirected to the hash version. (If anything, the opposite
|
||
may be true, so that users become used to seeing and verifying the mnemonic
|
||
version of hash URLs, and gain the security benefits against partial-match
|
||
fuzzing.)
|
||
|
||
Ideally, inputs that don't validly resolve should have a response page served
|
||
by the proxy that uses a simple spell-check system to suggest alternate domain
|
||
names that are valid hash encodings. This could hypothetically be done inline
|
||
in URL input, but would require changes on the browser (normally domain names
|
||
aren't subject so spellcheck), and this avoids that implementation problem.
|
||
|
||
5. International support
|
||
|
||
It is not possible for this scheme to support non-English languages without
|
||
a) (usually) Unicode in domains (which is not yet well supported by browsers),
|
||
and
|
||
b) fully customized dictionaries and phrase patterns per language
|
||
|
||
The scheme must not be used in an attempted 'translation' by simply replacing
|
||
English words with glosses in the target language. Several of the necessary
|
||
features would be completely mangled by this (e.g. other languages have
|
||
different synonym, homonym, etc groupings, not to mention completely different
|
||
grammar).
|
||
|
||
It is unlikely a priori that URLs constructed using a non-English
|
||
dictionary/pattern setup would in any sense 'translate' semantically to
|
||
English; more likely is that each language would have completely unrelated
|
||
encodings for a given hash.
|
||
|
||
We intend to only make an English version at first, to avoid these issues
|
||
during testing.
|
||
|
||
________________
|
||
|
||
[1] https://trac.torproject.org/projects/tor/wiki/doc/HiddenServiceNames
|
||
https://gitweb.torproject.org/torspec.git/blob/HEAD:/address-spec.txt
|
||
[2] http://www.thc.org/papers/ffp.html
|
||
[3] http://dot-bit.org/Namecoin
|
||
[4] https://en.wikipedia.org/wiki/Zooko's_triangle
|
||
[5] https://addons.mozilla.org/en-US/firefox/addon/petname-tool/
|
||
[6] However, service operators can generate a large number of hidden service
|
||
descriptors and check whether their hashes result in a desirable phrasal
|
||
encoding (much like certain hidden services currently use brute force generated
|
||
hashes to ensure their name is the prefix of their raw hash). This won't get you
|
||
whatever phrase you want, but will at least improve the likelihood that it's
|
||
something amusing and acceptable.
|
||
[7] "Meaningful" here inasmuch as e.g. "Barnaby thoughtfully mangles simplistic
|
||
yellow camels" is an absurdist but meaningful sentence. Absurdness is a feature,
|
||
not a bug; it decreases the probability of mistakes if the scenario described is
|
||
not one that the user would try to fit into a template of things they have
|
||
previously encountered IRL. See research into linguistic schema for further
|
||
details.
|
||
[8] https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-oni
|
||
on-nyms.txt
|
||
[9] http://blog.rabidgremlin.com/2010/11/28/4-little-words/
|
||
[10] http://fastword.me/
|
||
[11] https://tools.ietf.org/html/rfc1751
|
||
[12] http://tools.ietf.org/html/rfc2289
|
||
[13] https://github.com/singpolyma/mnemonicode
|
||
http://mysteryrobot.com
|
||
https://github.com/zacharyvoase/humanhash
|
||
[14] http://www.mathcs.duq.edu/~juola/papers.d/icslp96.pdf
|
||
[15] http://en.wikipedia.org/wiki/PGP_word_list
|
||
[16] https://github.com/hellais/Onion-url
|
||
https://github.com/hellais/Onion-url/blob/master/dev/mnemonic.py
|
||
[17] http://www.reddit.com/r/technology/comments/ecllk
|
||
[18] http://wordnet.princeton.edu/wordnet/man2.1/wnstats.7WN.html
|
||
|