This commit adds support for the upcoming AltiVec and Power8 processors. This commit affects a number of classes due to the ubiquitous use of AES. The commit adds debug asserts that warn of under-aligned and misaligned buffers in debug builds.
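A minimal sketch of the kind of debug-time check this describes. ProcessAesBlock is a hypothetical call site; IsAlignedOn (misc.h) and CRYPTOPP_ASSERT (trap.h) are existing Crypto++ facilities, but the exact call sites in the commit may differ:

    // Sketch only: a debug-build alignment check in the spirit of this
    // commit. ProcessAesBlock is hypothetical; IsAlignedOn and
    // CRYPTOPP_ASSERT are real Crypto++ facilities.
    #include "misc.h"   // IsAlignedOn
    #include "trap.h"   // CRYPTOPP_ASSERT

    void ProcessAesBlock(const CryptoPP::byte* inBlock, CryptoPP::byte* outBlock)
    {
        // Warn in debug builds when a caller hands us an under-aligned
        // buffer; AltiVec/Power8 loads and stores want 16-byte alignment.
        CRYPTOPP_ASSERT(CryptoPP::IsAlignedOn(inBlock, 16));
        CRYPTOPP_ASSERT(CryptoPP::IsAlignedOn(outBlock, 16));
        // ... AES rounds elided ...
    }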
This commit adds support for the upcoming AltiVec and Power8 processors. This commit affects a number of classes due to the ubiquitous use of AES. The commit documents the data alignment requirements.
This commit adds the upcoming AltiVec and Power8 processor support for stream ciphers. This commit affects GlobalRNG() most because it's an AES-based generator. The commit favors AlignedSecByteBlock over SecByteBlock in places where messages are handled on the AltiVec and Power8 processor data paths. The data paths include all block cipher modes of operation, and some filters like FilterWithBufferedInput.
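The flavor of the change, as a sketch (the buffer size is arbitrary; AlignedSecByteBlock and SecByteBlock are the library's own types):

    // Sketch of the swap this commit makes on hot data paths.
    // AlignedSecByteBlock guarantees 16-byte alignment; SecByteBlock
    // does not.
    #include "secblock.h"

    void Example()
    {
        // Before: CryptoPP::SecByteBlock msg(1024);
        CryptoPP::AlignedSecByteBlock msg(1024);  // safe on AltiVec/Power8 paths
        // ... fill and process msg ...
    }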
Intel and ARM processors are tolerant of under-aligned buffers when using crypto instructions. AltiVec and Power8 are less tolerant: the hardware simply ignores the low-order bits of an address to ensure it is aligned. On AltiVec and Power8 this has caused a fair number of wild writes on the stack and in the heap.
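As a worked illustration of that truncation (a standalone sketch, not library code), an under-aligned pointer ends up accessing the wrong 16-byte block:

    // Sketch: what the hardware effectively does to a vector load/store
    // address on AltiVec/Power8. For illustration only.
    #include <cstdint>
    #include <cstdio>

    int main()
    {
        std::uintptr_t addr = 0x1008;  // under-aligned by 8 bytes
        std::uintptr_t effective = addr & ~static_cast<std::uintptr_t>(15);
        // The access lands at 0x1000, not 0x1008 -- a wild read or write
        // from the caller's point of view.
        std::printf("requested 0x%jx, accessed 0x%jx\n",
                    static_cast<std::uintmax_t>(addr),
                    static_cast<std::uintmax_t>(effective));
        return 0;
    }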
Testing on a 64-bit Intel Skylake showed a marked improvement in performance. We suspect GCC is generating better code since it knows the alignment of the pointers and does not have to emit fixup code for under-aligned and misaligned data. Testing on a mid-2000s 32-bit VIA C7-D with SSE2+SSSE3 showed no improvement, and no performance was lost.
This commit adds the upcoming AltiVec and Power8 processor support for DefaultEncryptors and DefaultDecryptors. The commit favors AlignedSecByteBlock over SecByteBlock in places where messages are handled on the AltiVec and Power8 processor data paths. The data paths include all block cipher modes of operation, and some filters like FilterWithBufferedInput.
Intel and ARM processors are tolerant of under-aligned buffers when using crypto instructions. AltiVec and Power8 are less tolerant: the hardware simply ignores the low-order bits of an address to ensure it is aligned. On AltiVec and Power8 this has caused a fair number of wild writes on the stack and in the heap.
Testing on a 64-bit Intel Skylake showed a marked improvement in performance. We suspect GCC is generating better code since it knows the alignment of the pointers and does not have to emit fixup code for under-aligned and misaligned data. Testing on a mid-2000s 32-bit VIA C7-D with SSE2+SSSE3 showed no improvement, and no performance was lost.
This commit adds the upcoming AltiVec and Power8 processor support. The commit favors AlignedSecByteBlock over SecByteBlock in places where messages are handled on the AltiVec and Power8 processor data paths. The data paths include all block cipher modes of operation, and some filters like FilterWithBufferedInput.
Intel and ARM processors are tolerant of under-aligned buffers when using crypto instructions. AltiVec and Power8 are less tolerant: the hardware simply ignores the low-order bits of an address to ensure it is aligned. On AltiVec and Power8 this has caused a fair number of wild writes on the stack and in the heap.
Testing on a 64-bit Intel Skylake showed a marked improvement in performance. We suspect GCC is generating better code since it knows the alignment of the pointers and does not have to emit fixup code for under-aligned and misaligned data. Here are some data points:
    Algorithm      SecByteBlock   AlignedSecByteBlock
    Poly1305       3.4 cpb        2.9 cpb
    Blake2s        6.7 cpb        5.5 cpb
    Blake2b        4.5 cpb        3.9 cpb
    SipHash-2-4    3.1 cpb        1.9 cpb
    SipHash-4-8    3.5 cpb        3.3 cpb
    ChaCha20       7.4 cpb        6.0 cpb
    ChaCha12       4.6 cpb        4.0 cpb
    ChaCha8        3.5 cpb        2.9 cpb
Testing on a mid-2000s 32-bit VIA C7-D with SSE2+SSSE3 showed no improvement, and no performance was lost.
Since removing the allocator overloads that handled the wipe mark, we have to route deallocate into the standard one. The standard one fires an assert for what is now normal operation.
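A hypothetical sketch of the routing described above (the names are invented; the actual allocator and assert condition are the library's):

    // Sketch only, with invented names: the former wipe-mark overload is
    // gone, so all callers now land in the standard deallocate.
    #include <cassert>
    #include <cstdlib>

    struct SketchAllocator
    {
        void deallocate(void* ptr, std::size_t size)
        {
            // This assert predates the overload removal and now trips on
            // a formerly-special, now-normal code path.
            assert(ptr != nullptr && size != 0);
            std::free(ptr);
        }
    };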
There was no aliasing violation in practice. We used a union to assign the right pointer. If the compiler had removed the unneeded assignment based on T_64bit, then we would not have been flagged.
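A minimal sketch of the pointer-through-a-union technique (GetWordPtr is a hypothetical name; the library's actual code differs):

    // Sketch of the technique (hypothetical name GetWordPtr): the union
    // stores the byte pointer once, and T_64bit selects which typed view
    // the caller gets, instead of a reinterpret_cast chain that static
    // analysis flags as an aliasing violation.
    #include <cstdint>

    template <bool T_64bit>
    const void* GetWordPtr(const unsigned char* data)
    {
        union
        {
            const unsigned char* bytes;
            const std::uint64_t* words64;
            const std::uint32_t* words32;
        } u;

        u.bytes = data;
        // Only one branch is taken for a given T_64bit; the other view
        // is the "unneeded assignment" the warning flagged.
        return T_64bit ? static_cast<const void*>(u.words64)
                       : static_cast<const void*>(u.words32);
    }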
* Fix build on FreeBSD 10.3 x86 with clang++ 3.4.1. The x64 build (also clang++ 3.4.1) doesn't require CRYPTOPP_DISABLE_SHA_ASM. It appears to be a bug specific to the x86 version of clang++.
* Based on suggestion from @noloader, don't split x86/x64 clang++ version detection. Just wait until clang++ is consistently working in both x86/x64.
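A hedged sketch of what such a guard could look like. CRYPTOPP_DISABLE_SHA_ASM is the real Crypto++ macro; the version test below is an assumption, not the commit's actual diff:

    // Illustrative sketch only: gate the SHA ASM for older clang++ as a
    // whole rather than splitting detection by x86/x64.
    #if defined(__clang__) && \
        ((__clang_major__ < 3) || (__clang_major__ == 3 && __clang_minor__ <= 4))
    # ifndef CRYPTOPP_DISABLE_SHA_ASM
    #  define CRYPTOPP_DISABLE_SHA_ASM
    # endif
    #endif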
16-byte alignment is the default for most systems nowadays, so we sidestepped alignment problems on all platforms except 32-bit Solaris. We need the 16-byte alignment for Intel compatibles shipped since the late 1990s, which is nearly every processor in the class.
The worst case is that a processor lacking SSE2 gets an aligned SecBlock anyway. The last time we saw processors without these features was the 486 and early Pentium era, around 1996. Even low-end processors like the Intel Atom and VIA C7 have SSE2+SSSE3.
Also see "Enable 16-byte alignment full-time for i386 and x86_64?" (https://groups.google.com/forum/#!topic/cryptopp-users/ubp-gFC1BJI) for a discussion.
This reverts commit 64def346cd. It broke the AppVeyor and Travis builds (it tested fine locally on Intel, Aarch64 and Solaris i86). CMake is so fucked up. I regret the day we added it to the project.