ext-cryptopp

mirror of https://github.com/shadps4-emu/ext-cryptopp.git synced 2024-11-23 18:09:48 +00:00

Author	SHA1	Message	Date
Jeffrey Walton	0cee6f01f0	Squash MS LNK4221 and libtool warnings	2018-07-06 01:22:38 -04:00
Jeffrey Walton	8279fab432	Fix AdvancedProcessBlocks128_6x1_NEON template name	2018-06-23 12:35:06 -04:00
Ilja	ec6c442cc6	Remove extra ; from rijndael-simd.cpp (PR #621)	2018-03-31 13:04:42 -04:00
Jeffrey Walton	bd8c20562c	Clear unused variable warnings	2018-02-20 17:03:32 -05:00
Jeffrey Walton	244c40ed61	Remove unneeded round parameter on Rijndael_UncheckedSetKey_SSE4_AESNI	2018-02-20 13:32:53 -05:00
Jeffrey Walton	33c10bc027	Fix ODR violation in AdvancedProcessBlocks_{ARCH} (GH #585). The ALTIVEC function required an inline declaration. Lack of inline caused the self test failure. Two NEON functions needed the same. We also cleaned up constants in unnamed namespaces	2018-02-20 13:17:05 -05:00
Jeffrey Walton	c80e28eec8	Remove unneeded parameter for Rijndael_UncheckedSetKey_POWER8	2018-02-20 06:42:43 -05:00
Jeffrey Walton	d30afa4d01	Whitespace check-in	2018-02-20 04:18:58 -05:00
Jeffrey Walton	2b2303bc75	Remove unneeded Rijndael_Subkey_POWER8 (GH #588). This is due to the removal of a path in Rijndael_UncheckedSetKey_POWER8	2018-02-20 02:24:09 -05:00
Jeffrey Walton	5b09d46665	Cleanup signed integer overflow on ppc64 (GH #588). The code below was flagged by the undefined behavior sanitizer under GCC 8. The offender was the doubling at "r4 = vec_add(r4, r4)". R4 is rcon and an unsigned type. It depends on integer wrap but GCC is generating code that is being flagged for signed overflow. GCC 7 and below are OK. for (unsigned int i=0; i<8; ++i) { r1 = Rijndael_Subkey_POWER8(r1, r4, r5); r4 = vec_add(r4, r4); skptr = IncrementPointerAndStore(r1, skptr); } // Final two rounds using table lookup ...	2018-02-20 02:10:17 -05:00
Jeffrey Walton	5cee4a6573	Improve logic for <arm_acle.h> include (GH #568)	2018-01-20 13:23:41 -05:00
Jeffrey Walton	d6d53f2e9d	Add Power4 Vector Load, Store, Add and Xor	2018-01-02 08:13:42 -05:00
Jeffrey Walton	fac3a44a84	Move Altivec AdvancedProcessBlocks into adv-simd.h	2018-01-02 07:08:13 -05:00
Jeffrey Walton	66da740ad3	Use M128_CAST and CONST_M128_CAST for Clang. Also see http://bugs.llvm.org/show_bug.cgi?id=20670	2017-12-26 11:20:18 -05:00
Jeffrey Walton	8e916e7bac	Use M128_CAST and CONST_M128_CAST for Clang. Also see http://bugs.llvm.org/show_bug.cgi?id=20670	2017-12-26 11:16:52 -05:00
Jeffrey Walton	2c79be7a54	Add CRYPTOPP_POWER5_AVAILABLE. Power4 lacks 'vector long long'. Rename datatypes such as 'uint8x16_p8' to 'uint8x16_p'. Originally the p8 suffix indicated use with Power8 in-core crypto. We are now using Altivec/Power4 for general vector operations.	2017-12-12 08:17:17 -05:00
Jeffrey Walton	b7e636ac51	Rename ppc-crypto.h to ppc-simd.h	2017-12-12 07:15:59 -05:00
Jeffrey Walton	195ac2c7c9	Refactor rijndael-simd.cpp and simon-simd.cpp to use adv-simd.h	2017-12-10 11:09:50 -05:00
Jeffrey Walton	69c8a4f9c6	Prefix IS_LITTLE_ENDIAN and IS_BIG_ENDIAN with CRYPTOPP	2017-11-10 14:15:30 -05:00
Jeffrey Walton	ea3c80c949	Move Rijndael_AdvancedProcessBlocks_ARMV8 into anonymous namespace	2017-09-23 05:28:59 -04:00
Jeffrey Walton	26597059d9	Move to anonymous namespaces in rijndael-simd.cpp	2017-09-23 02:13:16 -04:00
Jeffrey Walton	12953fd0e4	Add IncrementPointerAndStore. This speeds up XL C/C++ by 0.1 to 0.2 cpb	2017-09-22 20:35:18 -04:00
Jeffrey Walton	1057f89363	Move Power8 crypto functions into ppc-crypto.h	2017-09-22 05:23:29 -04:00
Jeffrey Walton	3e55817819	Add C++ templates for additional Vector ops. Removed lower-level C-like functions such as Store8x16 and Store64x2	2017-09-22 04:15:33 -04:00
Jeffrey Walton	441e944a66	Switch to vec_vsx_ld, remove unaligned loads. Partially unroll the Rijndael_UncheckedSetKey_POWER8 loop. It saves about another 60 cycles	2017-09-22 02:53:08 -04:00
Jeffrey Walton	d9592a303c	Updated comments	2017-09-21 21:45:23 -04:00
Jeffrey Walton	dabad4b409	Cleanup asserts and casts	2017-09-21 20:55:35 -04:00
Jeffrey Walton	1edea5a80f	Vectorize tail of Rijndael_UncheckedSetKey_POWER8	2017-09-21 20:02:40 -04:00
Jeffrey Walton	e43c0eee74	Fold ConditionalByteReverse for non-Power8 paths	2017-09-21 19:17:42 -04:00
Jeffrey Walton	f763bf3da6	Updated comments	2017-09-21 12:08:54 -04:00
Jeffrey Walton	e78464a1af	Enable little endian Rijndael_UncheckedSetKey_POWER8 using built-ins. The problem was vec_sld is endian sensitive. The built-in required more than us setting up arguments to ensure the vsx load resulted in a big endian value. Thanks to Paul R on Stack Overflow for sharing the information that IBM did not provide. Also see http://stackoverflow.com/q/46341923/608639	2017-09-21 09:56:37 -04:00
Jeffrey Walton	c6b096ddd4	Move Rijndael_UncheckedSetKey_POWER8 prior to GetUserKey call. Arg... GetUserKey was performing a 32-bit word reverse. It was part of the problem on little endian machines	2017-09-21 01:08:44 -04:00
Jeffrey Walton	9fd5d023f9	Load r5 mask once for key expansion	2017-09-20 20:27:58 -04:00
Jeffrey Walton	35c0fa82fd	Use <time.h> for Borland/Embarcadero (GH #512)	2017-09-20 18:10:07 -04:00
Jeffrey Walton	c5a427d690	Add PowerPC VectorLoadKeyUnaligned for AES-192. Make internal functions static. We get better optimizations despite using unnamed namespaces. Add PowerPC uint32x4 functions for handling 32-bit rcon and mask	2017-09-20 08:57:53 -04:00
Jeffrey Walton	c94d076aa1	Move r1 write to caller; remove from Rijndael_Subkey_POWER8 Signed-off-by: Jeffrey Walton <noloader@gmail.com>	2017-09-20 04:38:53 -04:00
Jeffrey Walton	5159d0803d	Add Power8 key expansion for big endian. This is AES-128 key expansion for big endian. Little endian has a bug in it so it can't be enabled at the moment. GDB is acting up on GCC112, so I've had trouble investigating it	2017-09-20 03:34:54 -04:00
Jeffrey Walton	6102333fc3	Add CRYPTOPP_NO_CPU_FEATURE_PROBES (GH #511). We determine machine capabilities by performing an os/platform query first, like getauxval(). If the query fails, we move on to a cpu probe. The cpu probe tries to execute an instruction and then catches a SIGILL on Linux or the exception EXCEPTION_ILLEGAL_INSTRUCTION on Windows. Some OSes fail to handle a SIGILL gracefully, like Apple OSes. Apple machines corrupt memory and variables around the probe.	2017-09-19 21:08:37 -04:00
Jeffrey Walton	6440921723	Add Rijndael_UncheckedSetKey_POWER8. We are going to attempt to perform key setup using Power8 in-core vector instructions	2017-09-19 04:55:15 -04:00
Jeffrey Walton	923cf95571	ByteReverseArray → ReverseByteArrayLE	2017-09-18 18:40:19 -04:00
Jeffrey Walton	2c18fe8af8	Refactor LoadT() and StoreT(). Add separate ReverseT() for little endian machines The refactoring has no effect on little endian machines. However, on big endian GCC119 using GCC 7.1 the performance improved by 2.5x for ECB and CTR modes: BEFORE: <TR><TH>AES/CTR (128-bit key)<TD>2723<TD>1.4<TD>0.163<TD>670 <TR><TH>AES/CTR (192-bit key)<TD>2560<TD>1.5<TD>0.175<TD>719 <TR><TH>AES/CTR (256-bit key)<TD>2728<TD>1.4<TD>0.183<TD>749 <TR><TH>AES/CBC (128-bit key)<TD>1204<TD>3.2<TD>0.135<TD>554 <TR><TH>AES/CBC (192-bit key)<TD>1066<TD>3.7<TD>0.148<TD>605 <TR><TH>AES/CBC (256-bit key)<TD>948<TD>4.1<TD>0.155<TD>635 <TR><TH>AES/OFB (128-bit key)<TD>1019<TD>3.8<TD>0.158<TD>648 <TR><TH>AES/CFB (128-bit key)<TD>949<TD>4.1<TD>0.192<TD>787 <TR><TH>AES/ECB (128-bit key)<TD>3564<TD>1.1<TD>0.082<TD>337 AFTER: <TR><TH>AES/CTR (128-bit key)<TD>6484<TD>0.6<TD>0.163<TD>677 <TR><TH>AES/CTR (192-bit key)<TD>5641<TD>0.7<TD>0.176<TD>728 <TR><TH>AES/CTR (256-bit key)<TD>5005<TD>0.8<TD>0.183<TD>761 <TR><TH>AES/CBC (128-bit key)<TD>1223<TD>3.2<TD>0.135<TD>559 <TR><TH>AES/CBC (192-bit key)<TD>1080<TD>3.7<TD>0.147<TD>611 <TR><TH>AES/CBC (256-bit key)<TD>966<TD>4.1<TD>0.155<TD>642 <TR><TH>AES/OFB (128-bit key)<TD>1057<TD>3.7<TD>0.158<TD>656 <TR><TH>AES/CFB (128-bit key)<TD>1217<TD>3.3<TD>0.186<TD>774 <TR><TH>AES/ECB (128-bit key)<TD>7289<TD>0.5<TD>0.082<TD>342	2017-09-18 18:15:25 -04:00
Jeffrey Walton	f0c2324f6b	Fix armeabi and armv7-a for Android (GH #509)	2017-09-17 20:07:53 -04:00
Jeffrey Walton	adea69ab68	Avoid increment during stores of 6x blocks. This provides another 0.1 cpb with GCC	2017-09-14 21:06:44 -04:00
Jeffrey Walton	25efb7a140	Use 6x blocks for ARMv8 AES rather than 4x. We gain 0.1 to 0.3 cpb, depending on the mode	2017-09-14 20:32:06 -04:00
Jeffrey Walton	58890ff053	Use 6x blocks for Power8 AES rather than 4x. Performance increased for all modes when performing 6x vs 4x. 8x and 12x performed worse. Here are the numbers: 4x Blocks: <TR><TH>AES/CTR (128-bit key)<TD>1563<TD>2.1<TD>0.409<TD>1392 <TR><TH>AES/CTR (192-bit key)<TD>1403<TD>2.3<TD>0.450<TD>1529 <TR><TH>AES/CTR (256-bit key)<TD>1280<TD>2.5<TD>0.482<TD>1639 <TR><TH>AES/CBC (128-bit key)<TD>582<TD>5.6<TD>0.359<TD>1222 <TR><TH>AES/CBC (192-bit key)<TD>517<TD>6.3<TD>0.394<TD>1339 <TR><TH>AES/CBC (256-bit key)<TD>474<TD>6.8<TD>0.432<TD>1469 <TR><TH>AES/OFB (128-bit key)<TD>533<TD>6.1<TD>0.402<TD>1368 <TR><TH>AES/CFB (128-bit key)<TD>563<TD>5.8<TD>0.461<TD>1568 <TR><TH>AES/ECB (128-bit key)<TD>1829<TD>1.8<TD>0.240<TD>817 6x Blocks: <TR><TH>AES/CTR (128-bit key)<TD>1750<TD>1.7<TD>0.406<TD>1300 <TR><TH>AES/CTR (192-bit key)<TD>1638<TD>1.9<TD>0.447<TD>1432 <TR><TH>AES/CTR (256-bit key)<TD>1528<TD>2.0<TD>0.482<TD>1541 <TR><TH>AES/CBC (128-bit key)<TD>582<TD>5.2<TD>0.358<TD>1145 <TR><TH>AES/CBC (192-bit key)<TD>517<TD>5.9<TD>0.394<TD>1260 <TR><TH>AES/CBC (256-bit key)<TD>474<TD>6.4<TD>0.431<TD>1379 <TR><TH>AES/OFB (128-bit key)<TD>533<TD>5.7<TD>0.400<TD>1281 <TR><TH>AES/CFB (128-bit key)<TD>563<TD>5.4<TD>0.461<TD>1476 <TR><TH>AES/ECB (128-bit key)<TD>1950<TD>1.6<TD>0.238<TD>763	2017-09-14 16:07:21 -04:00
Jeffrey Walton	08e4ee422e	Avoid increment during stores of 4x blocks. This provides another 0.1 cpb with GCC	2017-09-14 15:12:07 -04:00
Jeffrey Walton	ddeae859d0	Use vec_xl_be and vec_xst_be for IBM XL C/C++ compiler	2017-09-14 13:27:49 -04:00
Jeffrey Walton	63a0af4efa	Fix endianness for s_one on ARM big-endian	2017-09-13 22:52:29 -04:00
Jeffrey Walton	8e52ce6dd2	Load correct value of 1 under ARM big endian	2017-09-13 21:42:15 -04:00
Jeffrey Walton	c22507e38b	Clear unused variable warnings under Clang	2017-09-13 21:37:55 -04:00

64 Commits