Jeffrey Walton
3dcceb55f5
Squash MS LNK4221 and libtool warnings
2018-07-06 03:46:25 -04:00
Jeffrey Walton
7f86f498d6
Remove GCC_NO_UBSAN attribute
2018-07-01 01:02:33 -04:00
Jeffrey Walton
80ae9f4f0a
Add AVX512 rotates to RotateLeft and RotateRight templates
2018-06-22 17:44:16 -04:00
Fabrice Fontaine
3c01bcc352
Allow user to set -DCRYPTOPP_ARM_NEON_AVAILABLE=0 ( #595 )
...
Disable neon through -DCRYPTOPP_ARM_NEON_AVAILABLE=0,
replace "if defined(CRYPTOPP_ARM_NEON_AVAILABLE)" by
"if (CRYPTOPP_ARM_NEON_AVAILABLE)"
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
2018-03-05 18:49:10 -05:00
Jeffrey Walton
e5a362c026
Re-add Simon and Speck, enable NEON and Aarch64 (GH #585 )
...
This commit re-adds Simon and Speck. The commit includes NEON, Aarch32 and Aarch64
2018-02-19 04:47:19 -05:00
Jeffrey Walton
5da795bf56
Whitespace check-in
2018-02-18 23:44:23 -05:00
Jeffrey Walton
e416b243d3
Re-add Simon and Speck, enable SSE (GH #585 )
...
This commit re-adds Simon and Speck. The commit includes C++, SSSE3 and SSE4. NEON, Aarch32 and Aarch64 are disabled at the moment.
2018-02-18 23:23:50 -05:00
Jeffrey Walton
15b14cc618
Remove Simon and Speck ciphers (GH #585 )
...
We recently learned our Simon and Speck implementation was wrong. The removal will stop harm until we can loop back and fix the issue.
The issue is, the paper, the test vectors and the ref-impl do not align. Each produces slightly different result. We followed the test vectors but they turned out to be wrong for the ciphers.
We have one kernel test vector but we don't have a working implementation to observe it to fix our implementation. Ugh...
2018-02-14 04:06:16 -05:00
Jeffrey Walton
5cee4a6573
Improve logic for <arm_acle.h> include (GH #568 )
2018-01-20 13:23:41 -05:00
Jeffrey Walton
fac3a44a84
Move Altivec AdvancedProcessBlocks into adv-simd.h
2018-01-02 07:08:13 -05:00
Jeffrey Walton
5f083d652e
Clear signed/unsigned warnings
2017-12-31 03:54:33 -05:00
Jeffrey Walton
4d9c91b425
Fix missing define for MSVC
2017-12-26 15:07:28 -05:00
Jeffrey Walton
4904d0fc8d
Fix unaligned load for _mm_loaddup_pd with GCC and UBsan
2017-12-26 14:55:10 -05:00
Jeffrey Walton
3fff9e85df
Fix unaligned load for _mm_loaddup_pd with GCC and UBsan
2017-12-26 12:41:04 -05:00
Jeffrey Walton
ae445c0b0f
Clear signed/unsigned warnings with GCC and -Wall -Wextra
2017-12-26 11:48:11 -05:00
Jeffrey Walton
195ac2c7c9
Refactor rijndael-simd.cpp and simon.simd.cpp to use adv-simd.h
2017-12-10 11:09:50 -05:00
Jeffrey Walton
e90cc9a028
Update comments
2017-12-10 05:41:19 -05:00
Jeffrey Walton
8a5911e6eb
Refactor <cipher>_AdvancedProcessBlocks_<arch> into adv-simd.h
...
This also fixes the SPECK64 bug where CTR mode self tests fail. It was an odd failure because it only affected 64-bit SPECK. SIMON was fine and it used nearly the same code. We tracked it down through trial and error to the table based rotates.
2017-12-09 21:04:25 -05:00
Jeffrey Walton
e457ca26f7
Add SSE3 <pmmintrin.h> for SImon and Speck
...
Add additional comments for WORKAROUND_GCC_OPTERON_ISSUE
2017-12-08 13:54:00 -05:00
Jeffrey Walton
148202369b
Fix Speck-64 CTR mode
...
It looks like the delay was due to some GCC 7 issue. We had to disable parallel blocks on Aarch64 with GCC 7. We may be running out of registers and that could be causing problems. It looks like GCC uses up to v30.
2017-12-07 22:30:03 -05:00
Jeffrey Walton
02037b5ce6
Fix Simon-64 CTR mode
...
This fixes CTR mode for Simon-64. We were only incrementing half the counters.
We still have Speck-64 to cleanup.
2017-12-07 19:45:32 -05:00
Jeffrey Walton
07f2a4fc3f
Fix Simon-64 and Speck-64 CTR mode
...
This fixes CTR mode for IA-32. We were only incrementing half the counters.
Added additional test vectors
2017-12-07 16:55:23 -05:00
Jeffrey Walton
86acc8ed45
Use 6x-2x-1x for Simon and Speck on IA-32
...
For Simon-64 and Speck-64 this means we are effectively using 12x-4x-1x. We are mostly at the threshold for IA-32 and parallelization. At any time 10 to 13 XMM registers are being used.
Prefer movsd by way of _mm_load_sd and _mm_store_sd.
Fix "error C3861: _mm_cvtsi128_si64x identifier not found".
2017-12-06 06:18:46 -05:00
Jeffrey Walton
e9654192f2
Remove unneeded temp[] array
2017-12-05 20:35:57 -05:00
Jeffrey Walton
490701acca
Use 12x-4x-1x for Simon and Speck on ARM
2017-12-05 18:43:53 -05:00
Jeffrey Walton
7bc621da62
Enable NEON/ASIMD for Simon and Speck on Aarch32/Aarch64 (GH #545 )
2017-12-05 14:02:48 -05:00
Jeffrey Walton
9b61d4143d
Add big- and little-endian rotates for Aarch32 and Aarch64
2017-12-05 12:32:26 -05:00
Jeffrey Walton
9faa504a24
Fix Aarch32 and Aarch64 rotates
2017-12-05 11:15:26 -05:00
Jeffrey Walton
c18793f862
Fix SIMON-64 missing transform
2017-12-05 09:14:58 -05:00
Jeffrey Walton
4990ffe5b8
Add SIMON-64 NEON intrinsics
2017-12-05 08:53:57 -05:00
Jeffrey Walton
e09e6af1f8
Enable multi-block for SPECK-64 and SIMON-64
...
Also cleaned up SIMON-64 vector permute code. Thanks again to Peter Cordes
2017-12-05 04:19:44 -05:00
Jeffrey Walton
46271660a1
Switch to uint64x2_t for SIMON-128
2017-12-04 05:47:34 -05:00
Jeffrey Walton
f0e49785f6
Fix incorrect SPECK-128 decrypt when blocks >= 6
...
Add defines for CRYPTOPP_SPECK64_ADVANCED_PROCESS_BLOCKS and CRYPTOPP_SPECK128_ADVANCED_PROCESS_BLOCKS
2017-12-03 09:00:39 -05:00
Jeffrey Walton
081afde0fd
Add SIMON-64 SSE intrinsics
...
Performance went from about 29 cpb (C++) to about 11.1 cpb (SSE)
2017-12-03 04:10:55 -05:00
Jeffrey Walton
25493ded49
Add AVX512VL rotate support
2017-12-01 09:39:05 -05:00
Jeffrey Walton
4792578f09
Rearrange statements and avoid intermediates
...
The folding of statements helps GCC elimate some of the intermediate stores it was performing. The elimination saved about 1.0 cpb. SIMON-128 is now running around 10 cpb, but it is still off the Simon and Speck team's numbers of 3.5 cpb
2017-12-01 04:11:31 -05:00
Jeffrey Walton
b7ced67892
Update comments
2017-12-01 02:38:19 -05:00
Jeffrey Walton
a7fec9c0f6
Fix assert in Debug builds
...
This was copy/paste from the template function
2017-11-30 11:54:21 -05:00
Jeffrey Walton
22257c4b6e
Remove SunCC const cast workaround
...
This code does not suffer SunCC losing const-ness
2017-11-29 12:56:19 -05:00
Jeffrey Walton
39594a53b0
Add fast rotate-by-8 for Aarch32 and Aarch64
2017-11-29 12:33:34 -05:00
Jeffrey Walton
532f13fe53
Fix compile using SunCC 12.4
2017-11-29 12:10:19 -05:00
Jeffrey Walton
16ebfa72bf
Cleanup comments and whitespace
2017-11-29 10:15:41 -05:00
Jeffrey Walton
6e829cebee
Use EPI8 Shuffle rather than Shifts and Or for rotate when R=8
...
Louis Wingers and Bryan Weeks from the Simon and Speck team offered the suggestion. The change save 0.7 cpb for Speck, and 5 cpb for Simon on x86_64.
Speck is now running very close to the Team's time sor SSE4. Simon is still off, but we know the root cause. For Simon, the Team used a fast bit-sliced implementation
2017-11-29 08:53:48 -05:00
Jeffrey Walton
a29b36c197
Whitespace check-in
2017-11-27 01:51:27 -05:00
Jeffrey Walton
568e608ea6
Add NEON and ASIMD intrinsics for SPECK-128 (GH #539 )
...
Performance increased by about 200% on a 980 MHz BananaPi dev-board. Throughput went from about 176.6 cpb to about 60.3 cpb.
2017-11-27 00:36:45 -05:00