59 Commits

Author SHA1 Message Date
Jeffrey Walton
20962baf44
Fix ChaCha AVX2 implementation (GH #1069)
Many thanks to Jack Lloyd
2021-09-20 12:31:32 -04:00
Jeffrey Walton
3accc3d083
Disable ChaCha20 AVX2 implementation (GH #1069) 2021-09-17 20:49:12 -04:00
Jeffrey Walton
f7e6af6344
Add EnumToInt conversion macro for enum-enum conversion warnings (GH #1016) 2021-03-09 22:51:19 -05:00
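For context, C++20 deprecates arithmetic that mixes two different enumeration types, which is what the enum-enum conversion warnings refer to. The following is a minimal sketch of how a conversion macro can silence them. The macro body and the names `KeyFlag`, `IvFlag`, `CombineFlags` and `CombineFlagsCheck` are assumptions for illustration, not taken from the commit.

```cpp
#include <cassert>

// Hypothetical stand-in for the macro named in the commit subject.
// The library's actual definition may differ.
#define EnumToInt(v) static_cast<int>(v)

// Two unrelated unscoped enumerations (illustrative names only).
enum KeyFlag { HAS_KEY = 1 };
enum IvFlag  { HAS_IV  = 2 };

// Writing (HAS_KEY | HAS_IV) directly mixes two enum types, which C++20
// deprecates. Converting each operand to int first avoids the warning.
inline int CombineFlags()
{
    return EnumToInt(HAS_KEY) | EnumToInt(HAS_IV);
}

// Small usage check: both bits are set in the combined value.
inline void CombineFlagsCheck()
{
    assert(CombineFlags() == 3);
}
```

An explicit cast at each use site would work equally well; a macro simply keeps the call sites short and easy to search for.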
Jeffrey Walton
de45ebeec1
Clear C++20 enum-enum conversion warnings (GH #1016) 2021-03-09 21:56:24 -05:00
Jeffrey Walton
d53f064c9b
Use Altivec as minimum ISA of ChaCha Simon64 and Speck64 2020-04-05 17:23:32 -04:00
Jeffrey Walton
fa39314b7a
Add XLC 12 loads and stores for AIX (PR #907)
2019-10-26 22:11:49 -04:00
Jeffrey Walton
39418a8512
Use PowerPC unaligned loads and stores with Power8 (GH #825, PR #826)
Use PowerPC unaligned loads and stores with Power8. Formerly we were using Power7 as the floor because the IBM POWER Architecture manuals said unaligned loads and stores were available. However, some compilers generate bad code for unaligned loads and stores using `-march=power7`, so bump to a known good.
2019-04-27 20:35:01 -04:00
Jeffrey Walton
95bc90adc4
Clear unused warnings with MSVC 2019-02-22 08:38:20 -05:00
Jeffrey Walton
e499131ea6
Latch previous ROUNDS in Salsa and ChaCha ciphers (GH #800, PR #804) 2019-02-12 16:56:01 -05:00
Jeffrey Walton
161d680434
Back-off ChaCha assert at the moment (GH #790)
We don't know what we are supposed to do at the moment. We need the CFRG or IETF to say what is supposed to happen.
2019-02-11 11:40:05 -05:00
Jeffrey Walton
76b47204df
Add IETF XChaCha20Poly1305 (GH #727, PR #795) 2019-02-06 04:14:39 -05:00
Jeffrey Walton
26c83877ef
Add IETF XChaCha (GH #727, PR #794) 2019-02-06 01:03:28 -05:00
Jeffrey Walton
c1ad534038
Update comments 2019-01-30 01:45:09 -05:00
Jeffrey Walton
b9d2310beb
Use ROUNDS constant for ChaChaTLS 2019-01-25 23:27:48 -05:00
Jeffrey Walton
76bdb328a6
Switch to RFC 8439 for ChaChaTLS
Unfortunately the block counter wrap problem is still present.
2019-01-25 21:51:43 -05:00
Jeffrey Walton
82f80124e6
Update comments 2019-01-25 19:49:17 -05:00
Jeffrey Walton
779e28a9b0
Update comments 2019-01-25 19:04:34 -05:00
Jeffrey Walton
6a68abea0a
Update comments 2019-01-25 08:14:23 -05:00
Jeffrey Walton
97df2b960b
Update comments 2019-01-25 07:54:00 -05:00
Jeffrey Walton
dcd9e67eeb
Refactor ChaCha and ChaChaTLS to use a common core 2019-01-25 06:40:12 -05:00
Jeffrey Walton
70dcd29e0b
Refactor ChaCha and ChaChaTLS to use a common core 2019-01-25 06:18:58 -05:00
Jeffrey Walton
d25ba0c59a
Enable SIMD implementation for ChaChaTLS (GH #265) 2019-01-25 02:57:11 -05:00
Jeffrey Walton
acde2f8e5e
Use word64 for ChaChaTLS InitialBlock (GH #265) 2019-01-25 02:34:07 -05:00
Jeffrey Walton
f23b58b73c
Remove rounds from ChaChaTLS
Rounds are always 20 in the IETF implementation.
2019-01-24 22:26:15 -05:00
Jeffrey Walton
a29b734a0f
Fix AlgorithmProvider for ChaChaTLS 2019-01-24 09:46:56 -05:00
Jeffrey Walton
5603661eec
Add ChaChaTLS implementation (GH #265)
We tweaked ChaCha to arrive at the IETF's implementation specified by RFC 7539. We are not sure how to handle block counter wrap. At the moment the caller is responsible for managing it. We were not able to find a reference implementation so we disable SIMD implementations like SSE, AVX, NEON and Power4. We need the wide block tests for corner cases to ensure our implementation is correct.
2019-01-24 09:36:05 -05:00
Jeffrey Walton
8838f78ec4
Fix ChaCha compiler crash for GCC 3.3 2018-12-29 01:08:43 -05:00
Jeffrey Walton
2e68e95a92
Add BLAKE2s and ChaCha CORE SIMD function (GH #656)
The CORE function provides the implementation for ChaCha_OperateKeystream_ALTIVEC, ChaCha_OperateKeystream_POWER7, BLAKE2_Compress32_ALTIVEC and BLAKE2_Compress32_POWER7. Depending on the options used to compile the source files, either POWER7 or ALTIVEC will be used.
This is needed to support the "new toolchain, ancient hardware" use case.
2018-11-18 14:43:48 -05:00
Jeffrey Walton
e28b2e0f02
Switch between POWER7 and POWER4 (GH #656)
This is kind of tricky. We automatically drop from POWER7 to POWER4 if POWER7 is not available. However, if POWER7 is available and the drop does not occur, the runtime test checks for HasAltivec(), and not HasPower7().
All of this goodness is happening on an old Apple G4 laptop with Gentoo. It is a "new toolchain on old hardware".
2018-11-18 12:42:04 -05:00
Jeffrey Walton
10f85d6596
Make Altivec vector wraps friendly to downgrades
The way the existing ppc_simd.h is written makes it hard to switch between the old Altivec loads and stores and the new POWER7 loads and stores. This checkin rewrites the wrappers to use _ALTIVEC_, _ARCH_PWR7 and _ARCH_PWR8. The wrappers in this file now honor -maltivec, -mcpu=power7 and -mcpu=power8. It allows users to compile a source file, like chacha_simd.cpp, with a lower ISA and things just work for them.
2018-11-15 02:11:00 -05:00
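The compile-time selection described in the commit above can be sketched roughly as follows. This is only an illustration, assuming the double-underscore spelling `__ALTIVEC__` that GCC predefines under -maltivec, together with `_ARCH_PWR7` and `_ARCH_PWR8`. The names `CORE_ISA_NAME`, `CoreIsaName` and `CoreIsaNameCheck` are hypothetical and do not come from ppc_simd.h.

```cpp
#include <cassert>
#include <cstring>

// Pick a label from the ISA the compiler was told to target. On PowerPC
// the macros below are predefined when the matching -mcpu or -maltivec
// option is in effect. On other platforms none are defined, so the
// portable branch is taken.
#if defined(_ARCH_PWR8)
# define CORE_ISA_NAME "POWER8"
#elif defined(_ARCH_PWR7)
# define CORE_ISA_NAME "POWER7"
#elif defined(__ALTIVEC__)
# define CORE_ISA_NAME "Altivec"
#else
# define CORE_ISA_NAME "portable C++"
#endif

// Hypothetical accessor reporting which branch was compiled in.
inline const char* CoreIsaName()
{
    return CORE_ISA_NAME;
}

// Small usage check: exactly one branch is always selected.
inline void CoreIsaNameCheck()
{
    assert(std::strlen(CoreIsaName()) > 0);
}
```

Because the choice is made by the preprocessor, the same source file can be built once per ISA with different compiler options, which is the downgrade path the commit message describes.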
Jeffrey Walton
225ab6cb7b
Drop ChaCha requirements to POWER7
This costs about 0.6 cpb (700 MB/s on GCC112), but it makes the faster algorithm available to more machines. In the future we may want to provide both POWER7 and POWER8.
2018-11-14 08:19:13 -05:00
Jeffrey Walton
d9011f07d2
Add ChaCha AVX2 implementation (GH #735) 2018-11-08 16:20:31 -05:00
Jeffrey Walton
6cc763939e
Skip unneeded wrap check in SIMD book keeping (GH #732) 2018-11-04 15:35:34 -05:00
Jeffrey Walton
29be6ed97a
Work-around potential counter increment problem in ChaCha20 (GH #732)
This is only a work-around for the moment. The issue only affects SIMD code. The problem is, the algorithm we use performs a 32-bit add as an intermediate result, but we really need a 64-bit add. We are running 4 transforms in parallel, and we can't add and carry the way we need to.

The work-around is to use the C version of the transform whenever we could cross the 32-bit counter boundary. We determine the cross-over point by 'bool safe = 0xffffffff - state.low > 4'. When not safe, we skip the SIMD version of the algorithm and use the C version. Once we are safe again, we switch back to the SIMD version.

The work-around costs us about 0.1 to 0.2 cpb. At 1.10 or 1.15 cpb that equates to about 200 MB/s on a Skylake. We'd like to get it back eventually.
2018-11-04 14:49:26 -05:00
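The safety test quoted in the commit above can be sketched as a small standalone function. This is a minimal illustration built only from the quoted expression. The names `kParallelBlocks`, `SimdPathIsSafe` and `SimdPathIsSafeCheck` are hypothetical, and the real code operates on the cipher state rather than a bare integer.

```cpp
#include <cassert>
#include <cstdint>

// The commit message says four transforms run in parallel per SIMD call.
constexpr std::uint32_t kParallelBlocks = 4;

// Mirrors the quoted expression:
//   bool safe = 0xffffffff - state.low > 4
// 'low' stands for the low 32-bit word of the block counter. When the
// remaining headroom is not more than four blocks, the caller would fall
// back to the portable C++ transform, which can carry into the high word.
inline bool SimdPathIsSafe(std::uint32_t low)
{
    return 0xffffffffu - low > kParallelBlocks;
}

// Small usage check around the boundary.
inline void SimdPathIsSafeCheck()
{
    assert(SimdPathIsSafe(0u));            // plenty of headroom
    assert(SimdPathIsSafe(0xfffffffau));   // headroom of 5 blocks
    assert(!SimdPathIsSafe(0xfffffffbu));  // headroom of exactly 4
    assert(!SimdPathIsSafe(0xffffffffu));  // no headroom at all
}
```

The comparison is strict, so the test is slightly conservative: a headroom of exactly four blocks is also routed to the C version.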
Jeffrey Walton
d7d76fa5f7
Add ChaCha Power8 implementation 2018-10-27 08:40:07 -04:00
Jeffrey Walton
c0b273dac8
Remove xorInput parameter from ChaCha SIMD functions
We can use the input pointer directly after checking KeystreamOperation.
2018-10-26 10:10:52 -04:00
Jeffrey Walton
8da2b91cba
Add ChaCha AlgorithmName override 2018-10-26 03:13:06 -04:00
Jeffrey Walton
b4b3623938
Whitespace check-in 2018-10-25 12:15:33 -04:00
Jeffrey Walton
b1050636a6
Add ChaCha NEON implementation 2018-10-25 12:08:32 -04:00
Jeffrey Walton
b4c4c5aa14
Add SSSE3 rotates when available
This change obtains the remaining 0.1 to 0.15 cpb. It should be engaged with -march=native.
2018-10-24 15:34:54 -04:00
Jeffrey Walton
18dcbdf514
Move input xor to ChaCha_OperateKeystream_SSE2
This picks up about 0.2 cpb in ChaCha::OperateKeystream. It may not sound like much, but it puts the SSE2 intrinsics version on par with the ASM version of Salsa20. Salsa20 leads ChaCha by 0.1 to 0.15 cpb, which equates to about 50 MB/s.
2018-10-24 11:00:35 -04:00
Jeffrey Walton
d230999b40
Fix ChaCha compile on ARM and MIPS 2018-10-24 01:11:45 -04:00
Jeffrey Walton
6a5d2ab03d
Remove unneeded params from ChaCha_OperateKeystream_SSE2 2018-10-23 08:52:29 -04:00
Jeffrey Walton
028a9f0494
Remove old comments from chacha.cpp
This should have been done at 916c4484a2
2018-10-23 08:12:02 -04:00
Jeffrey Walton
916c4484a2
Add ChaCha SSE2 implementation
Thanks to Jack Lloyd and Botan for allowing us to use the implementation.
The numbers for SSE2 are very good. When compared with Salsa20 ASM the results are:
  * Salsa20 2.55 cpb; ChaCha/20 2.90 cpb
  * Salsa20/12 1.61 cpb; ChaCha/12 1.90 cpb
  * Salsa20/8 1.34 cpb; ChaCha/8 1.5 cpb
2018-10-23 07:57:59 -04:00
Jeffrey Walton
48f2d95b0f
Fix ChaCha debug builds
This broke at https://github.com/weidai11/cryptopp/commit/e2be0cdecce7
2018-08-18 01:31:35 -04:00
Jeffrey Walton
e2be0cdecc
Make ChaCha and Salsa use the same design pattern 2018-08-17 06:19:30 -04:00
Jeffrey Walton
2f83777e9b
Backout ChaCha changes to Crypto++ 7.0
These changes made it in by accident at Commit b74a6f4445. We were going to try to let them ride but they broke versioning. They may be added later but we should avoid the change at this time.
2018-07-25 16:25:41 -04:00
Jeffrey Walton
b74a6f4445
Add algorithm provider member function to Algorithm class 2018-07-06 09:23:37 -04:00
Jeffrey Walton
a074722bfa
Switch to rotlConstant and rotrConstant
This will help Clang and its need for a constexpr.
2017-11-25 02:52:19 -05:00
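A rotate that takes its amount as a template argument might look roughly like the sketch below. It illustrates the general idea of making the rotate count a compile-time constant. The names `RotlConstantSketch` and `RotlConstantSketchCheck` are hypothetical, and the library's own rotlConstant may be declared differently.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word left by R bits. Because R is a template argument,
// the compiler always sees the rotate amount as a constant expression.
template <unsigned int R>
inline std::uint32_t RotlConstantSketch(std::uint32_t x)
{
    static_assert(R > 0 && R < 32, "rotate amount must be in 1..31");
    return (x << R) | (x >> (32 - R));
}

// Small usage check.
inline void RotlConstantSketchCheck()
{
    // The top byte wraps around to the bottom.
    assert(RotlConstantSketch<8>(0x12345678u) == 0x34567812u);
    // The top bit wraps around to bit 0.
    assert(RotlConstantSketch<1>(0x80000000u) == 0x00000001u);
}
```

The static_assert rejects amounts of 0 and 32, which would otherwise produce a shift by the full word width, an undefined operation in C++.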