linux/arch/x86/crypto
Mathias Krause 80dca4734b crypto: aesni - fix counter overflow handling in "by8" variant
The "by8" CTR AVX implementation fails to propperly handle counter
overflows. That was the reason it got disabled in commit 7da4b29d49
("crypto: aesni - disable "by8" AVX CTR optimization").

Fix the overflow handling by incrementing the counter block as a double
quad word, i.e. a 128 bit, and testing for overflows afterwards. We need
to use VPTEST to do so as VPADD* does not set the flags itself and
silently drops the carry bit.

As this change adds branches to the hot path, minor performance
regressions  might be a side effect. But, OTOH, we now have a conforming
implementation -- the preferable goal.

A tcrypt test on a SandyBridge system (i7-2620M) showed almost identical
numbers for the old and this version with differences within the noise
range. A dm-crypt test with the fixed version gave even slightly better
results for this version. So the performance impact might not be as big
as expected.

Tested-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-10-02 14:35:03 +08:00
..
sha-mb crypto: sha-mb - sha1_mb_alg_state can be static 2014-08-26 14:40:52 +08:00
aes_ctrby8_avx-x86_64.S crypto: aesni - fix counter overflow handling in "by8" variant 2014-10-02 14:35:03 +08:00
aes_glue.c
aes-i586-asm_32.S crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets 2013-01-20 10:16:47 +11:00
aes-x86_64-asm_64.S crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets 2013-01-20 10:16:47 +11:00
aesni-intel_asm.S crypto: aesni_intel - fix accessing of unaligned memory 2013-06-13 14:57:42 +08:00
aesni-intel_avx-x86_64.S crypto: aesni - fix build on x86 (32bit) 2014-01-15 11:36:34 +08:00
aesni-intel_glue.c crypto: aes - AES CTR x86_64 "by8" AVX optimization 2014-06-20 21:27:58 +08:00
blowfish_glue.c crypto: remove a duplicate checks in __cbc_decrypt() 2014-02-27 05:56:54 +08:00
blowfish-x86_64-asm_64.S crypto: blowfish-x86_64: use ENTRY()/ENDPROC() for assembler functions and localize jump targets 2013-01-20 10:16:48 +11:00
camellia_aesni_avx2_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
camellia_aesni_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
camellia_glue.c crypto: camellia-x86-64 - replace commas by semicolons and adjust code alignment 2013-08-21 21:08:32 +10:00
camellia-aesni-avx2-asm_64.S crypto: camellia-aesni-avx2 - tune assembly code for more performance 2013-06-21 14:44:23 +08:00
camellia-aesni-avx-asm_64.S crypto: x86/camellia-aesni-avx - add more optimized XTS code 2013-04-25 21:01:52 +08:00
camellia-x86_64-asm_64.S crypto: camellia-x86_64/aes-ni: use ENTRY()/ENDPROC() for assembler functions and localize jump targets 2013-01-20 10:16:48 +11:00
cast5_avx_glue.c crypto: remove a duplicate checks in __cbc_decrypt() 2014-02-27 05:56:54 +08:00
cast5-avx-x86_64-asm_64.S crypto: cast5-avx: use ENTRY()/ENDPROC() for assembler functions and localize jump targets 2013-01-20 10:16:48 +11:00
cast6_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
cast6-avx-x86_64-asm_64.S crypto: cast6-avx: use new optimized XTS code 2013-04-25 21:01:52 +08:00
crc32-pclmul_asm.S x86, crc32-pclmul: Fix build with older binutils 2013-05-30 16:36:23 -07:00
crc32-pclmul_glue.c crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation 2013-01-20 10:16:45 +11:00
crc32c-intel_glue.c crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction 2012-10-15 22:18:24 +08:00
crc32c-pcl-intel-asm_64.S crypto: crc32c-pclmul - Shrink K_table to 32-bit words 2014-06-20 21:27:57 +08:00
crct10dif-pcl-asm_64.S Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework" 2013-09-07 12:56:26 +10:00
crct10dif-pclmul_glue.c Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework" 2013-09-07 12:56:26 +10:00
des3_ede_glue.c crypto: des3_ede-x86_64 - fix parse warning 2014-06-25 21:38:43 +08:00
des3_ede-asm_64.S crypto: des_3des - add x86-64 assembly implementation 2014-06-20 21:27:58 +08:00
fpu.c
ghash-clmulni-intel_asm.S crypto: ghash-clmulni-intel - Use u128 instead of be128 for internal key 2014-04-04 21:06:14 +08:00
ghash-clmulni-intel_glue.c crypto: ghash-clmulni-intel - Use u128 instead of be128 for internal key 2014-04-04 21:06:14 +08:00
glue_helper-asm-avx2.S crypto: twofish - add AVX2/x86_64 assembler implementation of twofish cipher 2013-04-25 21:09:05 +08:00
glue_helper-asm-avx.S crypto: x86 - add more optimized XTS-mode for serpent-avx 2013-04-25 21:01:51 +08:00
glue_helper.c crypto: x86 - add more optimized XTS-mode for serpent-avx 2013-04-25 21:01:51 +08:00
Makefile crypto: sha-mb - SHA1 multibuffer job manager and glue code 2014-08-25 20:32:30 +08:00
salsa20_glue.c crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_* 2013-01-20 10:16:50 +11:00
salsa20-i586-asm_32.S crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_* 2013-01-20 10:16:50 +11:00
salsa20-x86_64-asm_64.S crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_* 2013-01-20 10:16:50 +11:00
serpent_avx2_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
serpent_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
serpent_sse2_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
serpent-avx2-asm_64.S crypto: serpent - add AVX2/x86_64 assembler implementation of serpent cipher 2013-04-25 21:09:07 +08:00
serpent-avx-x86_64-asm_64.S crypto: x86 - add more optimized XTS-mode for serpent-avx 2013-04-25 21:01:51 +08:00
serpent-sse2-i586-asm_32.S crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets 2013-01-20 10:16:50 +11:00
serpent-sse2-x86_64-asm_64.S crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets 2013-01-20 10:16:50 +11:00
sha1_avx2_x86_64_asm.S crypto: x86/sha1 - reduce size of the AVX2 asm implementation 2014-03-25 20:25:43 +08:00
sha1_ssse3_asm.S crypto: x86/sha1 - assembler clean-ups: use ENTRY/ENDPROC 2013-01-20 10:16:51 +11:00
sha1_ssse3_glue.c crypto: x86/sha1 - re-enable the AVX variant 2014-03-25 20:25:42 +08:00
sha256_ssse3_glue.c crypto: sha256_ssse3 - also test for BMI2 2013-10-07 14:17:10 +08:00
sha256-avx2-asm.S crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions 2013-04-03 09:06:32 +08:00
sha256-avx-asm.S crypto: sha256_ssse3 - fix stack corruption with SSSE3 and AVX implementations 2013-05-28 13:46:47 +08:00
sha256-ssse3-asm.S crypto: sha256_ssse3 - fix stack corruption with SSSE3 and AVX implementations 2013-05-28 13:46:47 +08:00
sha512_ssse3_glue.c crypto: sha512_ssse3 - fix byte count to bit count conversion 2014-06-25 21:55:02 +08:00
sha512-avx2-asm.S crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX2 RORX instruction. 2013-04-25 21:00:58 +08:00
sha512-avx-asm.S crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX instructions. 2013-04-25 21:00:58 +08:00
sha512-ssse3-asm.S crypto: sha512 - Optimized SHA512 x86_64 assembly routine using Supplemental SSE3 instructions. 2013-04-25 21:00:58 +08:00
twofish_avx_glue.c crypto: move x86 to the generic version of ablk_helper 2013-09-24 06:02:24 +10:00
twofish_glue_3way.c crypto: x86/glue_helper - use le128 instead of u128 for CTR mode 2012-10-24 21:10:54 +08:00
twofish_glue.c
twofish-avx-x86_64-asm_64.S crypto: x86/twofish-avx - use optimized XTS code 2013-04-25 21:01:51 +08:00
twofish-i586-asm_32.S crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels 2013-01-20 10:16:51 +11:00
twofish-x86_64-asm_64-3way.S crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels 2013-01-20 10:16:51 +11:00
twofish-x86_64-asm_64.S crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels 2013-01-20 10:16:51 +11:00