mirror of
https://github.com/torproject/torspec.git
synced 2024-12-14 05:58:33 +00:00
765 lines
31 KiB
Plaintext
765 lines
31 KiB
Plaintext
Filename: 270-newhope-hybrid-handshake.txt
|
||
Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope
|
||
Author: Isis Lovecruft, Peter Schwabe
|
||
Created: 16 Apr 2016
|
||
Updated: 22 Jul 2016
|
||
Status: Obsolete
|
||
Depends: prop#220 prop#249 prop#264 prop#270
|
||
|
||
§0. Introduction
|
||
|
||
RebelAlliance is a post-quantum secure hybrid handshake, comprised of an
|
||
alliance between the X25519 and NewHope key exchanges.
|
||
|
||
NewHope is a post-quantum-secure lattice-based key-exchange protocol based
|
||
on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid
|
||
handshake for Tor, based on a combination of Tor's current NTor handshake
|
||
and a shared key derived through a NewHope ephemeral key exchange.
|
||
|
||
For further details on the NewHope key exchange, the reader is referred to
|
||
"Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and
|
||
Schwabe [0][1].
|
||
|
||
For the purposes of brevity, we consider that NTor is currently the only
|
||
handshake protocol in Tor; the older TAP protocol is ignored completely, due
|
||
to the fact that it is currently deprecated and nearly entirely unused.
|
||
|
||
|
||
§1. Motivation
|
||
|
||
An attacker currently monitoring and storing circuit-layer NTor handshakes
|
||
who later has the ability to run Shor's algorithm on a quantum computer will
|
||
be able to break Tor's current handshake protocol and decrypt previous
|
||
communications.
|
||
|
||
It is unclear if and when such attackers equipped with large quantum
|
||
computers will exist, but various estimates by researchers in quantum
|
||
physics and quantum engineering give estimates of only 1 to 2 decades.
|
||
Clearly, the security requirements of many Tor users include secrecy of
|
||
their messages beyond this time span, which means that Tor needs to update
|
||
the key exchange to protect against such attackers as soon as possible.
|
||
|
||
|
||
§2. Design
|
||
|
||
An initiator and responder, in parallel, conduct two handshakes:
|
||
|
||
- An X25519 key exchange, as described in the description of the NTor
|
||
handshake in Tor proposal #216.
|
||
- A NewHope key exchange.
|
||
|
||
The shared keys derived from these two handshakes are then concatenated and
|
||
used as input to the SHAKE-256 extendable output function (XOF), as described
|
||
in FIPS-PUB-202 [2], in order to produce a shared key of the desired length.
|
||
The testvectors in §C assume that this key has a length of 32 bytes, but the
|
||
use of a XOF allows arbitrary lengths to easily support future updates of
|
||
the symmetric primitives using the key. See also §3.3.1.
|
||
|
||
|
||
§3. Specification
|
||
|
||
§3.1. Notation
|
||
|
||
Let `a || b` be the concatenation of a with b.
|
||
|
||
Let `a^b` denote the exponentiation of a to the bth power.
|
||
|
||
Let `a == b` denote the equality of a with b, and vice versa.
|
||
|
||
Let `a := b` be the assignment of the value of b to the variable a.
|
||
|
||
Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in
|
||
FIPS-PUB-202) applied to message x.
|
||
|
||
Let X25519 refer to the curve25519-based key agreement protocol described
|
||
in RFC7748 §6.1. [3]
|
||
|
||
Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do
|
||
the appropriate manipulations when generating the secret key (clearing the
|
||
low bits, twidding the high bits). Additionally, EXP() MUST include the
|
||
check for all-zero output due to the input point being of small
|
||
order (cf. RFC7748 §6).
|
||
|
||
Let `X25519_KEYID(B) == B` where B is a valid X25519 public key.
|
||
|
||
When representing an element of the Curve25519 subgroup as a byte string,
|
||
use the standard (32-byte, little-endian, x-coordinate-only) representation
|
||
for Curve25519 points.
|
||
|
||
Let `ID` be a router's identity key taken from the router microdescriptor.
|
||
In the case for relays possessing Ed25519 identity keys (cf. Tor proposal
|
||
#220), this is a 32-byte string representing the public Ed25519 identity key.
|
||
For backwards and forwards compatibility with routers which do not possess
|
||
Ed25519 identity keys, this is a 32-byte string created via the output of
|
||
H(ID).
|
||
|
||
We refer to the router as the handshake "responder", and the client (which
|
||
may be an OR or an OP) as the "initiator".
|
||
|
||
|
||
ID_LENGTH [32 bytes]
|
||
H_LENGTH [32 bytes]
|
||
G_LENGTH [32 bytes]
|
||
|
||
PROTOID := "pqtor-x25519-newhope-shake256-1"
|
||
T_MAC := PROTOID || ":mac"
|
||
T_KEY := PROTOID || ":key_extract"
|
||
T_VERIFY := PROTOID || ":verify"
|
||
|
||
(X25519_SK, X25519_PK) := X25519_KEYGEN()
|
||
|
||
|
||
§3.2. Protocol
|
||
|
||
========================================================================================
|
||
| |
|
||
| Fig. 1: The NewHope-X25519 Hybrid Handshake. |
|
||
| |
|
||
| Before the handshake the Initiator is assumed to know Z, a public X25519 key for |
|
||
| the Responder, as well as the Responder's ID. |
|
||
----------------------------------------------------------------------------------------
|
||
| |
|
||
| Initiator Responder |
|
||
| |
|
||
| SEED := H(randombytes(32)) |
|
||
| x, X := X25519_KEYGEN() |
|
||
| a, A := NEWHOPE_KEYGEN(SEED) |
|
||
| CLIENT_HDATA := ID || Z || X || A |
|
||
| |
|
||
| --- CLIENT_HDATA ---> |
|
||
| |
|
||
| y, Y := X25519_KEYGEN() |
|
||
| NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) |
|
||
| M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A) |
|
||
| SERVER_HDATA := Y || AUTH || M |
|
||
| sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
|
||
| |
|
||
| <-- SERVER_HDATA ---- |
|
||
| |
|
||
| NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) |
|
||
| NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a) |
|
||
| sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) |
|
||
| |
|
||
========================================================================================
|
||
|
||
|
||
§3.2.1. The NTor Handshake
|
||
|
||
§3.2.1.1. Prologue
|
||
|
||
Take a router with identity ID. As setup, the router generates a secret key z,
|
||
and a public onion key Z with:
|
||
|
||
z, Z := X25519_KEYGEN()
|
||
|
||
The router publishes Z in its server descriptor in the "ntor-onion-key" entry.
|
||
Henceforward, we refer to this router as the "responder".
|
||
|
||
|
||
§3.2.1.2. Initiator
|
||
|
||
To send a create cell, the initiator generates a keypair:
|
||
|
||
x, X := X25519_KEYGEN()
|
||
|
||
and creates the NTor portion of a CREATE2V cell's HDATA section:
|
||
|
||
CLIENT_NTOR := ID || Z || X [96 bytes]
|
||
|
||
The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in
|
||
the event the responder OR has recently rotated keys, the responder can
|
||
determine which keypair to use.
|
||
|
||
The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2),
|
||
to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1)
|
||
to the responder.
|
||
|
||
CLIENT_NEWHOPE [1824 bytes] (see §3.2.2)
|
||
CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE [1920 bytes]
|
||
|
||
If the responder does not respond with a CREATED2V cell, the initiator SHOULD
|
||
NOT attempt to extend the circuit through the responder by sending fragmented
|
||
EXTEND2 cells, since the responder's lack of support for CREATE2V cells is
|
||
assumed to imply the responder also lacks support for fragmented EXTEND2
|
||
cells. Alternatively, for initiators with a sufficiently late consensus
|
||
method, the initiator MUST check that "proto" line in the responder's
|
||
descriptor (cf. Tor proposal #264) advertises support for the "Relay"
|
||
subprotocol version 3 (see §5).
|
||
|
||
|
||
§3.2.1.3. Responder
|
||
|
||
The responder generates a keypair of y, Y = X25519_KEYGEN(), and does
|
||
NTOR_SHAREDB() as follows:
|
||
|
||
(NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B):
|
||
secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID
|
||
NTOR_KEY := H(secret_input, T_KEY)
|
||
verify := H(secret_input, T_VERIFY)
|
||
auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
|
||
AUTH := H(auth_input, T_MAC)
|
||
|
||
The responder sends a CREATED2V cell containing:
|
||
|
||
SERVER_NTOR := Y || AUTH [64 bytes]
|
||
SERVER_NEWHOPE [2048 bytes] (see §3.2.2)
|
||
SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE [2112 bytes]
|
||
|
||
and sends this to the initiator.
|
||
|
||
|
||
§3.2.1.4. Finalisation
|
||
|
||
The initiator then checks Y is in G^* [see NOTE below], and does
|
||
NTOR_SHAREDA() as follows:
|
||
|
||
(NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH)
|
||
secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID
|
||
NTOR_KEY := H(secret_input, T_KEY)
|
||
verify := H(secret_input, T_VERIFY)
|
||
auth_input := verify || ID || Z || Y || X || PROTOID || "Server"
|
||
if AUTH == H(auth_input, T_MAC)
|
||
return NTOR_KEY
|
||
|
||
Both parties now have a shared value for NTOR_KEY. They expand this into
|
||
the keys needed for the Tor relay protocol.
|
||
|
||
[XXX We think we want to omit the final hashing in the production of NTOR_KEY
|
||
here, and instead put all the inputs through SHAKE-256. --isis, peter]
|
||
|
||
[XXX We probably want to remove ID and B from the input to the shared key
|
||
material, since they serve for authentication but, as pre-established
|
||
"prologue" material to the handshake, they should not be used in attempts to
|
||
strengthen the cryptographic suitability of the shared key. Also, their
|
||
inclusion is implicit in the DH exponentiations. I should probably ask Ian
|
||
about the reasoning for the original design choice. --isis]
|
||
|
||
|
||
§3.2.2. The NewHope Handshake
|
||
|
||
§3.2.2.1. Parameters & Mathematical Structures
|
||
|
||
Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient
|
||
ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials
|
||
modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials
|
||
modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to
|
||
a polynomial, we mean an element of Rq.
|
||
|
||
n := 1024
|
||
q := 12289
|
||
|
||
SEED [32 Bytes]
|
||
NEWHOPE_POLY [1792 Bytes]
|
||
NEWHOPE_REC [256 Bytes]
|
||
NEWHOPE_KEY [32 Bytes]
|
||
|
||
NEWHOPE_MSGA := (NEWHOPE_POLY || SEED)
|
||
NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC)
|
||
|
||
|
||
§3.2.2.2. High-level Description of Newhope API Functions
|
||
|
||
For a description of internal functions, see §B.
|
||
|
||
(NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED):
|
||
â := gen_a(seed)
|
||
s := poly_getnoise()
|
||
e := poly_getnoise()
|
||
ŝ := poly_ntt(s)
|
||
ê := poly_ntt(e)
|
||
b̂ := pointwise(â, ŝ) + ê
|
||
sp := poly_tobytes(ŝ)
|
||
bp := poly_tobytes(b̂)
|
||
return (sp, (bp || seed))
|
||
|
||
(NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA):
|
||
s' := poly_getnoise()
|
||
e' := poly_getnoise()
|
||
e" := poly_getnoise()
|
||
b̂ := poly_frombytes(bp)
|
||
â := gen_a(seed)
|
||
ŝ' := poly_ntt(s')
|
||
ê' := poly_ntt(e')
|
||
û := poly_pointwise(â, ŝ') + ê'
|
||
v := poly_invntt(poly_pointwise(b̂,ŝ')) + e"
|
||
r := helprec(v)
|
||
up := poly_tobytes(û)
|
||
k := rec(v, r)
|
||
return ((up || r), k)
|
||
|
||
NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY):
|
||
û := poly_frombytes(up)
|
||
ŝ := poly_frombytes(sp)
|
||
v' := poly_invntt(poly_pointwise(û, ŝ))
|
||
k := rec(v', r)
|
||
return k
|
||
|
||
When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use
|
||
that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further
|
||
discussion.
|
||
|
||
|
||
§3.3. Key Expansion
|
||
|
||
The client and server derive a shared key, SHARED, by:
|
||
|
||
HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR"
|
||
SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey)
|
||
|
||
|
||
§3.3.1. Note on the Design Choice
|
||
|
||
The reader may wonder why one would use SHAKE-256 to produce a 256-bit
|
||
output, since the security strength in bits for SHAKE-256 is min(d/2,256)
|
||
for collision resistance and min(d,256) for first- and second-order
|
||
preimages, where d is the output length.
|
||
|
||
The reasoning is that we should be aiming for 256-bit security for all of
|
||
our symmetric cryptography. One could then argue that we should just use
|
||
SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an
|
||
easy way to derive longer shared secrets in the future without requiring a
|
||
new handshake. The construction is odd, but the future is bright.
|
||
As we are already using SHAKE-256 for the 32-byte output hash, we are also
|
||
using it for all other 32-byte hashes involved in the protocol. Note that
|
||
the only difference between SHA3-256 and SHAKE-256 with 32-byte output is
|
||
one domain-separation byte.
|
||
|
||
[XXX why would you want 256-bit security for the symmetric side? Are you
|
||
talking pre- or post-quantum security? --peter]
|
||
|
||
|
||
§4. Security & Anonymity Implications
|
||
|
||
This handshake protocol is one-way authenticated. That is, the server is
|
||
authenticated, while the client remains anonymous.
|
||
|
||
The client MUST NOT cache and reuse SEED. Doing so gives non-trivial
|
||
adversarial advantages w.r.t. all-for-the-price-of-one attacks during the
|
||
caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA
|
||
is reused for handshakes along the same circuit or multiple different
|
||
circuits, an adversary conducting a sybil attack somewhere along the path(s)
|
||
will be able to correlate the identity of the client across circuits or
|
||
hops.
|
||
|
||
|
||
§5. Compatibility
|
||
|
||
Because our proposal requires both the client and server to send more than
|
||
the 505 bytes possible within a CREATE2 cell's HDATA section, it depends
|
||
upon the implementation of a mechanism for allowing larger CREATE cells
|
||
(cf. Tor proposal #249).
|
||
|
||
We reserve the following handshake type for use in CREATE2V/CREATED2V and
|
||
EXTEND2V/EXTENDED2V cells:
|
||
|
||
0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE]
|
||
|
||
We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264
|
||
§5.3) to signify support this handshake, and hence for the CREATE2V and
|
||
fragmented EXTEND2 cells which it requires.
|
||
|
||
There are no additional entries or changes required within either router
|
||
descriptors or microdescriptors to support this handshake method, due to the
|
||
NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519
|
||
public keys already being included within the "ntor-onion-key" entry.
|
||
|
||
Add a "UseNewHopeKEX" configuration option and a corresponding consensus
|
||
parameter to control whether clients prefer using this NewHope hybrid
|
||
handshake or some previous handshake protocol. If the configuration option
|
||
is "auto", clients SHOULD obey the consensus parameter. The default
|
||
configuration SHOULD be "auto" and the consensus value SHOULD initially be "0".
|
||
|
||
|
||
§6. Implementation
|
||
|
||
The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software
|
||
implementations of NewHope, one C reference implementation and an optimized
|
||
implementation using AVX2 vector instructions. Those implementations are
|
||
available at [1].
|
||
|
||
Additionally, there are implementations in Go by Yawning Angel, available
|
||
from [4] and in Rust by Isis Lovecruft, available from [5].
|
||
|
||
The software used to generate the test vectors in §C is based on the C
|
||
reference implementation and available from:
|
||
|
||
https://code.ciph.re/isis/newhope-tor-testvectors
|
||
https://github.com/isislovecruft/newhope-tor-testvectors
|
||
|
||
|
||
§7. Performance & Scalability
|
||
|
||
The computationally expensive part in the current NTor handshake is the
|
||
X25519 key-pair generation and the X25519 shared-key computation. The
|
||
current implementation in Tor is a wrapper to support various highly optimized
|
||
implementations on different architectures. On Intel Haswell processors, the
|
||
fastest implementation of X25519, as reported by the eBACS benchmarking
|
||
project [6], takes 169920 cycles for key-pair generation and 161648 cycles
|
||
for shared-key computation; these add up to a total of 331568 cycles on each
|
||
side (initiator and responder).
|
||
|
||
The C reference implementation of NewHope, also benchmarked on Intel
|
||
Haswell, takes 358234 cycles for the initiator and 402058 cycles for the
|
||
Responder. The core computation of the proposed combination of NewHope and
|
||
X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator
|
||
and a slowdown by a factor of 2.2 for the Responder compared to the current
|
||
NTor handshake. These numbers assume a fully optimized implementation of the
|
||
NTor handshake and a C reference implementation of NewHope. With optimized
|
||
implementations of NewHope, such as the one for Intel Haswell described in
|
||
[0], the computational slowdown will be considerably smaller than a factor
|
||
of 2.
|
||
|
||
|
||
§8. References
|
||
|
||
[0]: https://cryptojedi.org/papers/newhope-20160328.pdf
|
||
[1]: https://cryptojedi.org/crypto/#newhope
|
||
[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061
|
||
[3]: https://tools.ietf.org/html/rfc7748#section-6.1
|
||
[4]: https://github.com/Yawning/newhope
|
||
[5]: https://code.ciph.re/isis/newhopers
|
||
[6]: http://bench.cr.yp.to
|
||
|
||
|
||
§A. Cell Formats
|
||
|
||
§A.1. CREATE2V Cells
|
||
|
||
The client portion of the handshake should send CLIENT_HDATA, formatted
|
||
into a CREATE2V cell as follows:
|
||
|
||
CREATE2V { [2114 bytes]
|
||
HTYPE := 0x0003 [2 bytes]
|
||
HLEN := 0x0780 [2 bytes]
|
||
HDATA := CLIENT_HDATA [1920 bytes]
|
||
IGNORED := 0x00 [194 bytes]
|
||
}
|
||
|
||
[XXX do we really want to pad with IGNORED to make CLIENT_HDATA the
|
||
same number of bytes as SERVER_HDATA? --isis]
|
||
|
||
§A.2. CREATED2V Cells
|
||
|
||
The server responds to the client's CREATE2V cell with SERVER_HDATA,
|
||
formatted into a CREATED2V cell as follows:
|
||
|
||
CREATED2V { [2114 bytes]
|
||
HLEN := 0x0800 [2 bytes]
|
||
HDATA := SERVER_HDATA [2112 bytes]
|
||
IGNORED := 0x00 [0 bytes]
|
||
}
|
||
|
||
§A.3. Fragmented EXTEND2 Cells
|
||
|
||
When the client wishes to extend a circuit, the client should fragment
|
||
CLIENT_HDATA into four EXTEND2 cells:
|
||
|
||
EXTEND2 {
|
||
NSPEC := 0x02 { [1 byte]
|
||
LINK_ID_SERVER [22 bytes] XXX
|
||
LINK_ADDRESS_SERVER [8 bytes] XXX
|
||
}
|
||
HTYPE := 0x0003 [2 bytes]
|
||
HLEN := 0x0780 [2 bytes]
|
||
HDATA := CLIENT_HDATA[0,461] [462 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := CLIENT_HDATA[462,954] [492 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := CLIENT_HDATA[955,1447] [492 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := 0x00[172] [172 bytes]
|
||
}
|
||
|
||
The client sends this to the server to extend the circuit from, and that
|
||
server should format the fragmented EXTEND2 cells into a CREATE2V cell, as
|
||
described in §A.1.
|
||
|
||
§A.4. Fragmented EXTENDED2 Cells
|
||
|
||
EXTENDED2 {
|
||
NSPEC := 0x02 { [1 byte]
|
||
LINK_ID_SERVER [22 bytes] XXX
|
||
LINK_ADDRESS_SERVER [8 bytes] XXX
|
||
}
|
||
HTYPE := 0x0003 [2 bytes]
|
||
HLEN := 0x0800 [2 bytes]
|
||
HDATA := SERVER_HDATA[0,461] [462 bytes]
|
||
}
|
||
EXTENDED2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := SERVER_HDATA[462,954] [492 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := SERVER_HDATA[955,1447] [492 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := SERVER_HDATA[1448,1939] [492 bytes]
|
||
}
|
||
EXTEND2 {
|
||
NSPEC := 0x00 [1 byte]
|
||
HTYPE := 0xFFFF [2 bytes]
|
||
HLEN := 0x0000 [2 bytes]
|
||
HDATA := SERVER_HDATA[1940,2112] [172 bytes]
|
||
}
|
||
|
||
|
||
§B. NewHope Internal Functions
|
||
|
||
gen_a(SEED): returns a uniformly random poly
|
||
poly_getnoise(): returns a poly sampled from a centered binomial
|
||
poly_ntt(poly): number-theoretic transform; returns a poly
|
||
poly_invntt(poly): inverse number-theoretic transform; returns a poly
|
||
poly_pointwise(poly, poly): pointwise multiplication; returns a poly
|
||
poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array
|
||
poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly
|
||
|
||
helprec(poly): returns a NEWHOPE_REC byte array
|
||
rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY
|
||
|
||
|
||
--- Description of the Newhope internal functions ---
|
||
|
||
gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands
|
||
this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128
|
||
is considered a sequence of 16-bit little-endian integers. This sequence is
|
||
used to initialize the coefficients of the returned polynomial from the least
|
||
significant (coefficient of X^0) to the most significant (coefficient of
|
||
X^1023) coefficient. For each of the 16-bit integers first eliminate the
|
||
highest two bits (to make it a 14-bit integer) and then use it as the next
|
||
coefficient if it is smaller than q=12289.
|
||
Note that the amount of output required from SHAKE to initialize all 1024
|
||
coefficients of the polynomial varies depending on the input seed.
|
||
Note further that this function does not process any secret data and thus does
|
||
not need any timing-attack protection.
|
||
|
||
|
||
poly_getnoise() first generates 4096 bytes of uniformly random data. This can
|
||
be done by reading these bytes from the system's RNG; efficient
|
||
implementations will typically only read a 32-byte seed from the system's RNG
|
||
and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR
|
||
mode). The output of the PRG is considered an array of 2048 16-bit integers
|
||
r[0],...,r[2047]. The coefficients of the output polynomial are computed as
|
||
HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW
|
||
stands for Hamming weight.
|
||
Note that the choice of RNG is a local decision; different implementations are
|
||
free to use different RNGs.
|
||
Note further that the output of this function is secret; the PRG (and the
|
||
computation of HW) need to be protected against timing attacks.
|
||
|
||
|
||
poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a
|
||
pseudocode description of a very naive in-place transformation of an input
|
||
polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
|
||
following code (all arithmetic on coefficients performed modulo q):
|
||
|
||
psi = 7
|
||
omega = 49
|
||
|
||
for i in range(0,n):
|
||
t[i] = f[i] * psi^i
|
||
|
||
for i in range(0,n):
|
||
f[i] = 0
|
||
for j in range(0,n):
|
||
f[i] += t[j] * omega^((i*j)%n)
|
||
|
||
Note that this is not how poly_ntt should be implemented if performance is
|
||
an issue; in particular, efficient algorithms for the number-theoretic
|
||
transform take time O(n*log(n)) and not O(n^2)
|
||
Note further that all arithmetic in poly_ntt has to be protected against
|
||
timing attacks.
|
||
|
||
|
||
poly_invntt(poly f): For a mathematical description of poly_invntt see the
|
||
[0]; a pseudocode description of a very naive in-place transformation of an
|
||
input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the
|
||
following code (all arithmetic on coefficients performed modulo q):
|
||
|
||
invpsi = 8778;
|
||
invomega = 1254;
|
||
invn = 12277;
|
||
|
||
for i in range(0,n):
|
||
t[i] = f[i];
|
||
|
||
for i in range(0,n):
|
||
f[i]=0;
|
||
for j in range(0,n):
|
||
f[i] += t[j] * invomega^((i*j)%n)
|
||
f[i] *= invpsi^i
|
||
f[i] *= invn
|
||
|
||
Note that this is not how poly_invntt should be implemented if performance
|
||
is an issue; in particular, efficient algorithms for the inverse
|
||
number-theoretic transform take time O(n*log(n)) and not O(n^2)
|
||
Note further that all arithmetic in poly_invntt has to be protected against
|
||
timing attacks.
|
||
|
||
|
||
poly_pointwise(poly f, poly g) performs pointwise multiplication of the two
|
||
polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... +
|
||
f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes
|
||
and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0,
|
||
h1 = f1*g1,..., h1023 = f1023*g1023.
|
||
|
||
|
||
poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e.,
|
||
brings them to the interval [0,q-1]. Denote these reduced coefficients as
|
||
f0,..., f1023; note that they all fit into 14 bits. The function then packs
|
||
those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed
|
||
little-endian representation", i.e.,
|
||
r[0] = f[0] & 0xff;
|
||
r[1] = (f[0] >> 8) & ((f[1] & 0x03) << 6)
|
||
r[2] = (f[1] >> 2) & 0xff;
|
||
r[3] = (f[1] >> 10) & ((f[2] & 0x0f) << 4)
|
||
.
|
||
.
|
||
.
|
||
r[1790] = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2)
|
||
r[1791] = f[1023] >> 6
|
||
Note that this function needs to be protected against timing attacks. In
|
||
particular, avoid non-constant-time conditional subtractions (or other
|
||
non-constant-time expressions) in the reduction modulo q of the coefficients.
|
||
|
||
|
||
poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives
|
||
as input an array of 1792 bytes and coverts it into the internal
|
||
representation of a poly. Note that poly_frombytes does not need to check
|
||
whether the coefficients are reduced modulo q or reduce coefficients modulo
|
||
q. Note further that the function must not leak any information about its
|
||
inputs through timing information, as it is also applied to the secret key
|
||
of the initiator.
|
||
|
||
|
||
helprec(poly f) computes 256 bytes of reconciliation information from the
|
||
input poly f. Internally, one byte of reconciliation information is computed
|
||
from four coefficients of f by a function helprec4. Let the input polynomial f
|
||
= (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be
|
||
r[0],...r[256]. This output byte array is computed as
|
||
r[0] = helprec4(f0,f256,f512,f768)
|
||
r[1] = helprec4(f1,f257,f513,f769)
|
||
r[2] = helprec4(f2,f258,f514,f770)
|
||
.
|
||
.
|
||
.
|
||
r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following:
|
||
|
||
helprec4(x0,x1,x2,x3):
|
||
b = randombit()
|
||
r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b)
|
||
r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6)
|
||
return r
|
||
|
||
The function CVPD4 does the following:
|
||
|
||
CVPD4(y0,y1,y2,y3):
|
||
v00 = round(y0/2q)
|
||
v01 = round(y1/2q)
|
||
v02 = round(y2/2q)
|
||
v03 = round(y3/2q)
|
||
v10 = round((y0-1)/2q)
|
||
v11 = round((y1-1)/2q)
|
||
v12 = round((y2-1)/2q)
|
||
v13 = round((y3-1)/2q)
|
||
t = abs(y0 - 2q*v00)
|
||
t += abs(y1 - 2q*v01)
|
||
t += abs(y2 - 2q*v02)
|
||
t += abs(y3 - 2q*v03)
|
||
if(t < 2q):
|
||
v0 = v00
|
||
v1 = v01
|
||
v2 = v02
|
||
v3 = v03
|
||
k = 0
|
||
else
|
||
v0 = v10
|
||
v1 = v11
|
||
v2 = v12
|
||
v3 = v13
|
||
r = 1
|
||
return (v0-v3,v1-v3,v2-v3,k+2*v3)
|
||
|
||
In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to
|
||
the largest integer that does not exceed x; abs() returns the absolute
|
||
value.
|
||
Note that all computations involved in helprec operate on secret data and must
|
||
be protected against timing attacks.
|
||
|
||
|
||
rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from
|
||
f and r. Specifically, it computes one bit of key from 4 coefficients of f and
|
||
one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r =
|
||
r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let
|
||
the bits of the output by k0,...,k255, where
|
||
k0 = k[0] & 0x01
|
||
k1 = (k[0] >> 1) & 0x01
|
||
k2 = (k[0] >> 2) & 0x01
|
||
.
|
||
.
|
||
.
|
||
k8 = k[1] & 0x01
|
||
k9 = (k[1] >> 1) & 0x01
|
||
.
|
||
.
|
||
.
|
||
k255 = (k[32] >> 7)
|
||
The function rec computes k0,...,k255 as
|
||
k0 = rec4(f0,f256,f512,f768,r[0])
|
||
k1 = rec4(f1,f257,f513,f769,r[1])
|
||
.
|
||
.
|
||
.
|
||
k255 = rec4(f255,f511,f767,f1023,r[255])
|
||
|
||
The function rec4 does the following:
|
||
|
||
rec4(y0,y1,y2,y3,r):
|
||
r0 = r & 0x03
|
||
r1 = (r >> 2) & 0x03
|
||
r2 = (r >> 4) & 0x03
|
||
r3 = (r >> 6) & 0x03
|
||
Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3)
|
||
|
||
The function Decode does the following:
|
||
|
||
Decode(v0,v1,v2,v3):
|
||
t0 = round(v0/8q)
|
||
t1 = round(v1/8q)
|
||
t2 = round(v2/8q)
|
||
t3 = round(v3/8q)
|
||
t = abs(v0 - 8q*t0)
|
||
t += abs(v0 - 8q*t0)
|
||
t += abs(v0 - 8q*t0)
|
||
t += abs(v0 - 8q*t0)
|
||
if(t > 1) return 1
|
||
else return 0
|
||
|
||
|
||
§C. Test Vectors
|