mirror of
https://github.com/RPCS3/llvm.git
synced 2024-12-14 07:31:53 +00:00
34963650b8
This patch attempts to represent a shuffle as a repeating shuffle (recognisable by is128BitLaneRepeatedShuffleMask) with the source input(s) in their original lanes, followed by a single permutation of the 128-bit lanes to their final destinations. On AVX2 we can additionally attempt to match using 64-bit sub-lane permutation. AVX2 can also now match a similar 'broadcasted' repeating shuffle.

This patch has several benefits:

* Avoids prematurely matching with lowerVectorShuffleByMerging128BitLanes which can require both inputs to have their input lanes permuted before shuffling.
* Can replace PERMPS/PERMD instructions - although these are useful for cross-lane unary shuffling, they require their shuffle mask to be pre-loaded (and increase register pressure).
* Matching the repeating shuffle makes use of a lot of existing shuffle lowering.

There is an outstanding minor AVX1 regression (combine_unneeded_subvector1 in vector-shuffle-combining.ll) of a previously 128-bit shuffle + subvector splat being converted to a subvector splat + (2 instruction) 256-bit shuffle, I intend to fix this in a followup patch for review.

Differential Revision: http://reviews.llvm.org/D16537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260834 91177308-0d34-0410-b5e6-96231b3b80d8
||
---|---|---|
.. | ||
Analysis | ||
Assembler | ||
Bindings | ||
Bitcode | ||
BugPoint | ||
CodeGen | ||
DebugInfo | ||
Examples | ||
ExecutionEngine | ||
Feature | ||
FileCheck | ||
Instrumentation | ||
Integer | ||
JitListener | ||
LibDriver | ||
Linker | ||
LTO | ||
MC | ||
Object | ||
Other | ||
SymbolRewriter | ||
TableGen | ||
tools | ||
Transforms | ||
Unit | ||
Verifier | ||
YAMLParser | ||
.clang-format | ||
CMakeLists.txt | ||
lit.cfg | ||
lit.site.cfg.in | ||
TestRunner.sh |
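
The decomposition described in the commit message above (a lane-repeated shuffle with inputs kept in their original lanes, followed by one permutation of whole 128-bit lanes) can be illustrated with a small standalone sketch. This is not the code from the patch: `LaneDecomposition`, `decomposeLaneRepeated`, and `applyDecomposition` are hypothetical names, the sketch assumes C++17, and it handles only unary shuffles with `-1` marking an undefined mask element. The 64-bit sub-lane and 'broadcasted' variants mentioned in the message are not modelled.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical result type (not an LLVM type): one in-lane mask shared by
// every lane, plus the source lane that feeds each destination lane.
struct LaneDecomposition {
  std::vector<int> RepeatedMask; // in-lane element offsets, -1 = undef
  std::vector<int> LanePerm;     // source lane per destination lane, -1 = undef
};

// Try to split a unary shuffle mask into (1) an in-lane shuffle that is
// identical in every lane and (2) a permutation of whole lanes.
// Returns std::nullopt when the mask does not have that shape.
std::optional<LaneDecomposition>
decomposeLaneRepeated(const std::vector<int> &Mask, int LaneSize) {
  const int Size = static_cast<int>(Mask.size());
  assert(LaneSize > 0 && Size % LaneSize == 0);
  const int NumLanes = Size / LaneSize;

  LaneDecomposition D{std::vector<int>(LaneSize, -1),
                      std::vector<int>(NumLanes, -1)};
  for (int i = 0; i != Size; ++i) {
    const int M = Mask[i];
    if (M < 0)
      continue; // undef elements constrain nothing
    assert(M < Size && "unary shuffle only");
    const int DstLane = i / LaneSize, Slot = i % LaneSize;
    const int SrcLane = M / LaneSize, Offset = M % LaneSize;

    // Every element of a destination lane must come from one source lane.
    if (D.LanePerm[DstLane] >= 0 && D.LanePerm[DstLane] != SrcLane)
      return std::nullopt;
    // Every lane must use the same in-lane offset for a given slot.
    if (D.RepeatedMask[Slot] >= 0 && D.RepeatedMask[Slot] != Offset)
      return std::nullopt;

    D.LanePerm[DstLane] = SrcLane;
    D.RepeatedMask[Slot] = Offset;
  }
  return D;
}

// Apply the two steps to concrete data. Undef entries are treated as
// offset 0 / lane 0 here, which is an arbitrary choice for the sketch.
std::vector<int> applyDecomposition(const std::vector<int> &Src,
                                    const LaneDecomposition &D) {
  const int LaneSize = static_cast<int>(D.RepeatedMask.size());
  const int NumLanes = static_cast<int>(D.LanePerm.size());
  assert(static_cast<int>(Src.size()) == LaneSize * NumLanes);

  // Step 1: the repeated shuffle, each lane shuffled in place.
  std::vector<int> Tmp(Src.size());
  for (int L = 0; L != NumLanes; ++L)
    for (int J = 0; J != LaneSize; ++J) {
      const int Off = D.RepeatedMask[J] < 0 ? 0 : D.RepeatedMask[J];
      Tmp[L * LaneSize + J] = Src[L * LaneSize + Off];
    }

  // Step 2: move whole lanes to their final destinations.
  std::vector<int> Dst(Src.size());
  for (int L = 0; L != NumLanes; ++L) {
    const int From = D.LanePerm[L] < 0 ? 0 : D.LanePerm[L];
    for (int J = 0; J != LaneSize; ++J)
      Dst[L * LaneSize + J] = Tmp[From * LaneSize + J];
  }
  return Dst;
}
```

For an 8-element vector with 4-element (128-bit) lanes, the mask `{6, 5, 4, 7, 2, 1, 0, 3}` splits into the repeated in-lane mask `{2, 1, 0, 3}` and the lane permutation `{1, 0}`. A mask such as `{0, 5, 2, 3, 4, 1, 6, 7}` is rejected because its first destination lane reads from two different source lanes.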