SIMON and SPECK keys can be pre-splatted in the forward direction when Altivec instructions will be used. Pre-splatting does not work for the reverse transformation. It breaks modes like CBC, so the speed-up is only applied to the forward transformation.
* Test fix ARM headers
This problem has been festering for some time. The header file includes are slightly different than the ISA options. Some platforms need an include, others don't.
* Fix cryptest-android.sh and cryptest-ios.sh
* Fix MSVC ARM32 and ARM64 compile
* Split ARM32 and ARM64 recipes in GNUmakefile
The Gentoo folks caught a bug at https://bugs.gentoo.org/689162. The 689162 bug uses -march=bdver1 -msse4.1 on a AMD Bulldozer machine.
Investigating the issue we are missing the XOP header blake2b_simd.cpp. However, adding the XOP header is not enough for this particular config. Four source files fail to compile with the expected headers. We are waiting on the GCC folks to get back to us with a fix.
Use PowerPC unaligned loads and stores with Power8. Formerly we were using Power7 as the floor because the IBM POWER Architecture manuals said unaligned loads and stores were available. However, some compilers generate bad code for unaligned loads and stores using `-march=power7`, so bump to a known good.