So it looks like sed added a '\r' between the closing paren and the semi. Grepping for '^;' failed because the '\r' was considered part of the previous line, so it showed no hits. I finally had to write a C program to properly identify and fix those damn stray semicolons.
* remove superfluous semicolon
* Remove C-style casts from public headers
clang warns about them with -Wold-style-cast. It also warns about
implicitly casting away const with -Wcast-qual. Fix both by removing
unnecessary casts and converting the remaining ones to C++ casts.
Add m_isSpecial, m_mandatoryBlockSize and m_optimalBufferSize members. The additional members stabilize running times and avoid some unnecessary calculations. Previously we were calculating some values in each call to Put and LastPut.
StreamTransformationFilter had a small hack to accomodate AuthenticatedEncryptionFilter and AuthenticatedDecryptionFilter. The hack was enough to support CCM, EAX and GCM modes, which looks a lot like a regular stream cipher from the filter framework point of view.
OCB is slightly different. To the filter framework it looks like a block cipher with an unusual last block size and padding scheme. OCB uses MandatoryBlockSize() == BlockSize() and MinLastBlockSize() == 1 with custom padding of the last block (see the handling of P_* and A_* in the RFC). The unusual config causes the original StreamTransformationFilter assert to fire even though OCB is in a normal configuration.
For the time being, we are trying to retain the assert becuase it is a useful diagnostic. Its possible another authenticated encryption mode, like AEZ or NORX, will cause the assert to incorrectly fire (yet again). We will cross that bridge when we come to it.
The strategy of "cleanup under-aligned buffers" is not scaling well. Corner cases are still turing up. The library has some corner-case breaks, like old 32-bit Intels. And it still has not solved the AltiVec and Power8 alignment problems.
For now we are backing out the changes and investigating other strategies
This commit supports the upcoming AltiVec and Power8 processor support. The commit favors AlignedSecByteBlock over SecByteBlock in places where messages are handled on the AltiVec and Power8 processor data paths. The data paths include all block cipher modes of operation, and some filters like
Intel and ARM processors are tolerant of under-aligned buffers when using crypto intstructions. AltiVec and Power8 are less tolerant, and they simply ignore the three low-order bits to ensure an address is aligned. The AltiVec and Power8 have caused a fair number of wild writes on the stack and in the heap.
Testing on a 64-bit Intel Skylake show a marked improvement in performance. We suspect GCC is generating better code since it knows the alignment of the pointers, and does not have to emit fixup code for under-aligned and mis-aligned data. Here are some data points:
SecByteBlock
- Poly1305: 3.4 cpb
- Blake2s: 6.7 cpb
- Blake2b: 4.5 cpb
- SipHash-2-4: 3.1 cpb
- SipHash-4-8: 3.5 cpb
- ChaCha20: 7.4 cpb
- ChaCha12: 4.6 cpb
- ChaCha8: 3.5 cpb
AlignedSecByteBlock
- Poly1305: 2.9 cpb
- Blake2s: 5.5. cpb
- Blake2b: 3.9 cpb
- SipHash-2-4: 1.9 cpb
- SipHash-4-8: 3.3 cpb
- ChaCha20: 6.0 cpb
- ChaCha12: 4.0 cpb
- ChaCha8: 2.9 cpb
Testing on an mid-2000's 32-bit VIA C7-D with SSE2+SSSE3 showed no improvement, and no performance was lost.
This reverts commit eb3b27a6a5. The change broke GCC 4.8 and unknown version of Clang on OS X. UB reported the OS X break, and JW found duplicated the break on a ARM CubieTruck with GCC 4.8.