Commit Graph

39 Commits

Author SHA1 Message Date
Jeffrey Walton
7f374faf52
Cleanup adv-simd.h for x86 2018-08-12 19:04:14 -04:00
Jeffrey Walton
3d6c8d9589
Update comments 2018-08-12 01:12:00 -04:00
Jeffrey Walton
6cd7f83346
Cleanup PPC vector functions
The Crypto++ functions follow IBM's lead and provide VectorLoad, VectorLoadBE, VectorStore, and VectorStoreBE. Additionally, VectorLoadKey was removed in favor of vanilla VectorLoad.
2018-08-06 05:15:12 -04:00
Tanzinul Islam
da00422d3c Fix build with Embarcadero C++Builder 10.2.3 (#696)
Fix two compilation errors encountered with C++Builder (Starter Edition):

 - In `cpu.cpp`, 0ccdc197b introduced a dependency on `_xgetbv()` from `<immintrin.h>` that doesn't exist on C++Builder. Enlist it for the workaround, similar to SunCC in 692ed2a2b.

 - In `adv-simd.h`, `<pmmintrin.h>` is being #included under the `CRYPTOPP_SSE2_INTRIN_AVAILABLE` macro. This header, [which apparently provides SSE3 intrinsics](https://stackoverflow.com/a/11228864/1433768), is not shipped with C++Builder. (This section of code was recently downgraded from a SSSE3 to a SSE2 block in 09c8ae28, followed by moving away from `<immintrin.h>` in bc8da71a, followed by reintroducing the SSSE3 check in d1e646a5.) Split the SSE2 and SSSE3 cases such that `<pmmintrin.h>` is not #included for SSE2. This seems safe to do, because some `git grep` analysis shows that:
    - `adv-simd.h` is not #included by any other header, but only directly #included by some `.cpp` files.
    - Among those `.cpp` files, only `sm4-simd.cpp` has a `CRYPTOPP_SSE2_INTRIN_AVAILABLE` preprocessor block, and there it again includes the other two headers (`<emmintrin.h>` and `<xmmintrin.h>`).

NOTE: I was compiling via the IDE after [setting up a project file](https://github.com/tanzislam/cryptopals/wiki/Importing-into-Embarcadero-C%E2%94%BC%E2%94%BCBuilder-Starter-10.2#using-the-crypto-library). My compilation command was effectively:

```
bcc32c.exe -DCRYPTOPP_NO_CXX11 -DCRYPTOPP_DISABLE_SSSE3 -D__SSE2__ -D__SSE__ -D__MMX__
```
2018-08-04 22:54:36 -04:00
Jeffrey Walton
d1e646a589
Fix SunStudio 12.6 compile on i386 2018-07-16 09:37:08 -04:00
Jeffrey Walton
4e3a1ea962
Add ARMv8.4 cpu feature detection support (GH #685) (#687)
This PR adds ARMv8.4 cpu feature detection support. Previously we only needed ARMv8.1 and things were much easier. For example, ARMv8.1 `__ARM_FEATURE_CRYPTO` meant PMULL, AES, SHA-1 and SHA-256 were available. ARMv8.4 `__ARM_FEATURE_CRYPTO` means PMULL, AES, SHA-1, SHA-256, SHA-512, SHA-3, SM3 and SM4 are  available. 

We still use the same pattern as before. We make something available based on compiler version and/or preprocessor macros. But this time around we had to tighten things up a bit to ensure ARMv8.4 did not cross-pollinate down into ARMv8.1.

ARMv8.4 is largely untested at the moment. There is no hardware in the field and CI lacks QEMU with the relevant patches/support. We will probably have to revisit some of this stuff in the future.

Since this update applies to ARM gadgets we took the time to expand Android and iOS testing on Travis. Travis now tests more platforms, and includes Autotools and CMake builds, too.
2018-07-15 08:35:14 -04:00
Jeffrey Walton
7cc6531dd2
Clear unused variable warning 2018-07-14 12:59:42 -04:00
Jeffrey Walton
c6c44aa5d1
Add PtrAdd and PtrSub helper functions
This helps contain UB on pointer subtraction by ensuring a ptrdiff_t is used. The code is a little uglier but it is also more portable.
2018-07-10 05:00:02 -04:00
Jeffrey Walton
bc8da71ab3
Fix early Fedora compiles 2018-07-06 01:14:28 -04:00
Jeffrey Walton
ac1439de59
Update documentation 2018-07-01 22:25:07 -04:00
Jeffrey Walton
7eaccfa47b
Update comments 2018-07-01 04:03:30 -04:00
Jeffrey Walton
d6cde47bbd
Update documentation 2018-07-01 03:53:45 -04:00
Jeffrey Walton
c58ea35e23
Update documentation 2018-07-01 03:42:17 -04:00
Jeffrey Walton
64d15aff66
Update documentation 2018-07-01 03:29:12 -04:00
Jeffrey Walton
4a7814be7e
Remove alignment of double for 64-bit template 2018-07-01 02:00:10 -04:00
Jeffrey Walton
810f5c1859
Remove GCC_NO_UBSAN and double casts 2018-07-01 01:23:35 -04:00
Jeffrey Walton
09c8ae2835
Use inline for LEA_Encryption and LEA_Decryption 2018-06-23 12:58:55 -04:00
Jeffrey Walton
8279fab432
Fix AdvancedProcessBlocks128_6x1_NEON template name 2018-06-23 12:35:06 -04:00
Jeffrey Walton
527613df22
Update documentation 2018-06-23 12:27:25 -04:00
Jeffrey Walton
9980d30734
Add LEA-128 NEON and ARMv8 implementation (GH #669)
LEA-128(128) from 35.6 cpb to 14.11 cpb on a LeMaker HiKey dev-board. LEA-128 from 12.60 cpb to 11.89 cpb on AMD Opteron 1100.
2018-06-23 03:54:51 -04:00
Jeffrey Walton
fa7714f6cb
Add LEA-128 SSSE3 implementation (GH #669)
LEA-128(128) from 6.73 cpb to 2.84 cpb on modern Core-i5 6400. LEA-128 from 10.12 cpb to 7.84 cpb antique Core2 Duo.
2018-06-22 16:26:27 -04:00
Jeffrey Walton
b00a378a8d
Add CHAM64 SSSE3 implementation (PR #670)
CHAM64 from 20 cpb to 14 cpb on modern iCore. CHAM64 from 90 cpb to 18 cpb antique Core2 Duo
2018-06-21 00:37:10 -04:00
Jeffrey Walton
a80b1d35b0
Parameterize word type for subkeys in AdvancedProcessBlocks templates
This was needed a while ago but we mostly side-stepped the issues with casts. CHAM64 uses a word16 type for subkeys and a cast won't fix it because we favor word32 for 64-bit block sizes.
2018-06-20 19:25:52 -04:00
Jeffrey Walton
8146eda6a3
Clear unused variable warnings under GCC 2018-03-09 06:45:32 -05:00
Fabrice Fontaine
3c01bcc352 Allow user to set -DCRYPTOPP_ARM_NEON_AVAILABLE=0 (#595)
Disable neon through -DCRYPTOPP_ARM_NEON_AVAILABLE=0,
replace "if defined(CRYPTOPP_ARM_NEON_AVAILABLE)" by
"if (CRYPTOPP_ARM_NEON_AVAILABLE)"

Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
2018-03-05 18:49:10 -05:00
Jeffrey Walton
143f5a3079
Handle C++17 std::uncaught_exceptions (GH #590) 2018-02-21 09:59:52 -05:00
Jeffrey Walton
bd8c20562c
Clear unused variable warnings 2018-02-20 17:03:32 -05:00
Jeffrey Walton
33c10bc027
Fix ODR violation in AdvancedProcessBlocks_{ARCH} (GH #585)
The ALTIVEC function required an inline declaration. Lack of inline caused the self test failure. Two NEON functions needed the same. We also cleaned up constants in unnamed namespaces
2018-02-20 13:17:05 -05:00
Jeffrey Walton
85993b2529
Add xorInput and xorOutput flags to adv-simd classes
Analysis tools are generating findings when the pointer xorBlocks is used as the flag. The other missing piece is, xorBlocks is never NULL when either BT_XorInput or BT_XorOuput. But we don't know how to train the analyzers with the information, so we make it explicit with the boolean flags xorInput and xorOutput.
Switching to the explicit flags costs us about 0.01 cpb on a modern Intel Core processor. In the typical case 0.01 is negligible.
2018-01-24 12:06:15 -05:00
Jeffrey Walton
e4e1fbe0ed
Clear Coverity findings CID 186951, 186950, 186947
Coverity does not realize xorBlocks is always non-NULL when BT_XorInput is set
2018-01-19 19:42:03 -05:00
Jeffrey Walton
4f2c605209
Add Power4 unaligned Load and Store 2018-01-05 21:27:27 -05:00
Jeffrey Walton
d6d53f2e9d
Add Power4 Vector Load, Store, Add and Xor 2018-01-02 08:13:42 -05:00
Jeffrey Walton
fac3a44a84
Move Altivec AdvancedProcessBlocks into adv-simd.h 2018-01-02 07:08:13 -05:00
Jeffrey Walton
7b14ead0f3
Fix unaligned load for _mm_loaddup_pd with GCC and UBsan
This function was missed earlier. Unfortunately, it does not squash all of the unaligned load findings. I'm pretty sure it is a GCC problem
2017-12-28 01:16:17 -05:00
Jeffrey Walton
19deccf3ba
Fix Clang 5.0 "runtime error: addition of unsigned offset to 0xXXXX overflowed to 0xYYYY" (GH #549) 2017-12-16 18:18:53 -05:00
Jeffrey Walton
dc21de2483
Fix UBsan overflow finding
We were cating UBsan findings under Clang similar to "adv-simd.h:1138:26: runtime error: addition of unsigned offset to 0x000002d41410 overflowed to 0x000002d41400". The problem was CRYPTOPP_CONSTANT, which used an enum. The compiler is allowed to pick the underlying data type, and Clang was picking a signed type
2017-12-16 14:21:08 -05:00
Jeffrey Walton
195ac2c7c9
Refactor rijndael-simd.cpp and simon.simd.cpp to use adv-simd.h 2017-12-10 11:09:50 -05:00
Jeffrey Walton
e90cc9a028
Update comments 2017-12-10 05:41:19 -05:00
Jeffrey Walton
8a5911e6eb
Refactor <cipher>_AdvancedProcessBlocks_<arch> into adv-simd.h
This also fixes the SPECK64 bug where CTR mode self tests fail. It was an odd failure because it only affected 64-bit SPECK. SIMON was fine and it used nearly the same code. We tracked it down through trial and error to the table based rotates.
2017-12-09 21:04:25 -05:00