mirror of
https://github.com/RPCS3/llvm.git
synced 2024-12-26 22:26:16 +00:00
8887371782
Instead of expanding a packed shift into a sequence of scalar shifts, the backend now tries (when possible) to convert the vector shift into a vector multiply. Before this change, a shift of a MVT::v8i16 vector by a build_vector of constants was always scalarized into a long sequence of "vector extracts + scalar shifts + vector insert". With this change, if there is SSE2 support, we emit a single vector multiply. This change also affects SSE4.1, AVX, AVX2 shifts: - A shift of a MVT::v4i32 vector by a build_vector of non uniform constants is now lowered when possible into a single SSE4.1 vector multiply. - Packed v16i16 shift left by constant build_vector are now expanded when possible into a single AVX2 vpmullw. This change also improves the lowering of AVX512f vector shifts. Added test CodeGen/X86/vec_shift6.ll with some code examples that are affected by this change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201271 91177308-0d34-0410-b5e6-96231b3b80d8 |
||
---|---|---|
.. | ||
AArch64 | ||
ARM | ||
CPP | ||
Generic | ||
Hexagon | ||
Inputs | ||
Mips | ||
MSP430 | ||
NVPTX | ||
PowerPC | ||
R600 | ||
SPARC | ||
SystemZ | ||
Thumb | ||
Thumb2 | ||
X86 | ||
XCore |