llvm-mirror/lib
Simon Pilgrim c3425b72b9 [X86][SSE] Vectorized i8 and i16 shift operators
This patch ensures that SHL/SRL/SRA shifts of i8 and i16 vectors avoid scalarization. It builds on the existing vectorized i8 SHL implementation, which moves the shift amount bits up to the sign bit position and performs the 4, 2 and 1 bit shifts as separate conditional steps, and makes several improvements:

1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - pre-SSE41 targets were masking and comparing against a 0x80 constant. We avoid this by using the fact that a set sign bit means a negative integer, which can be compared against zero and then fed into VSELECT. This removes the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of an i16 so that the i16 psraw instruction can be used to get correct sign extension. This is more work than for SHL/SRL, but perf tests indicate that it is still beneficial.
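
The per-lane logic described in the points above can be sketched in portable scalar C++. This is only an illustrative emulation of a single vector lane, not the code from the patch: the names selectOnSign8, shl8, sra8 and checkAgainstReference are hypothetical, and the sketch assumes two's complement conversions and arithmetic right shift of negative values (guaranteed from C++20, and the behaviour of common compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pick `ifSet` when the sign bit of `amt` is set.
// Models a signed compare against zero feeding a select, as in point 2.
static uint8_t selectOnSign8(uint8_t amt, uint8_t ifSet, uint8_t ifClear) {
  uint8_t mask = (static_cast<int8_t>(amt) < 0) ? 0xFF : 0x00;
  return static_cast<uint8_t>((ifSet & mask) | (ifClear & ~mask));
}

// One i8 lane of SHL: amount bit 2 is moved to the sign bit, then the
// 4, 2 and 1 bit shifts are applied conditionally, doubling the amount
// between steps so the next amount bit reaches the sign bit.
uint8_t shl8(uint8_t value, uint8_t amount) {
  uint8_t amt = static_cast<uint8_t>(amount << 5);
  uint8_t r = value;
  r = selectOnSign8(amt, static_cast<uint8_t>(r << 4), r);
  amt = static_cast<uint8_t>(amt + amt);
  r = selectOnSign8(amt, static_cast<uint8_t>(r << 2), r);
  amt = static_cast<uint8_t>(amt + amt);
  r = selectOnSign8(amt, static_cast<uint8_t>(r << 1), r);
  return r;
}

// One i8 lane of SRA: the value and the amount are placed in the upper
// byte of a 16-bit lane so that a 16-bit arithmetic shift (the role of
// psraw) extends the sign of the original byte, as in point 3.
int8_t sra8(int8_t value, uint8_t amount) {
  int16_t r = static_cast<int16_t>(value * 256);             // byte in bits 8..15
  uint16_t amt = static_cast<uint16_t>(amount << 13);        // amount bit 2 on bit 15
  const int steps[] = {4, 2, 1};
  for (int step : steps) {
    int16_t shifted = static_cast<int16_t>(r >> step);       // arithmetic shift
    r = (static_cast<int16_t>(amt) < 0) ? shifted : r;       // select on sign bit
    amt = static_cast<uint16_t>(amt + amt);
  }
  return static_cast<int8_t>(r >> 8);                        // back to the byte
}

// Exhaustive comparison against plain scalar shifts for amounts 0..7.
void checkAgainstReference() {
  for (int v = 0; v < 256; ++v) {
    for (int a = 0; a < 8; ++a) {
      uint8_t u = static_cast<uint8_t>(v);
      int8_t s = static_cast<int8_t>(u);
      assert(shl8(u, static_cast<uint8_t>(a)) == static_cast<uint8_t>(u << a));
      assert(sra8(s, static_cast<uint8_t>(a)) == static_cast<int8_t>(s >> a));
    }
  }
}
```

On SSE41 targets the select step would instead correspond to (v)pblendvb reading the sign bit of each amount byte directly, so the compare in selectOnSign8 has no counterpart there.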

The i16 implementation is similar to, but simpler than, the i8 one: we have to do 8, 4, 2 and 1 bit shifts, but less shift masking is involved. However, SSE41 use of (v)pblendvb requires the i16 shift amount to be splatted to both bytes.
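
To illustrate why a byte-wise blend needs the amount in both bytes, here is a small scalar C++ model of one i16 lane. It is a sketch under assumptions, not the patch itself: blendBytes and shl16 are hypothetical names, and blendBytes only imitates the per-byte sign-bit selection of (v)pblendvb on a single 16-bit lane.

```cpp
#include <cstdint>

// Hypothetical model of a byte-wise blend on one 16-bit lane: each result
// byte is taken from `ifSet` when the sign bit of the matching mask byte
// is set, otherwise from `ifClear`.
static uint16_t blendBytes(uint16_t mask, uint16_t ifSet, uint16_t ifClear) {
  uint16_t lo = (mask & 0x0080) ? 0x00FF : 0x0000;
  uint16_t hi = (mask & 0x8000) ? 0xFF00 : 0x0000;
  uint16_t sel = static_cast<uint16_t>(lo | hi);
  return static_cast<uint16_t>((ifSet & sel) | (ifClear & ~sel));
}

// One i16 lane of SHL with 8, 4, 2 and 1 bit conditional steps.
// Amount bit 3 is placed on the sign bit of BOTH bytes, so the byte-wise
// blend picks the same source for the low and high halves of the lane.
uint16_t shl16(uint16_t value, uint16_t amount) {
  uint16_t amt = static_cast<uint16_t>((amount << 4) | (amount << 12));
  uint16_t r = value;
  const int steps[] = {8, 4, 2, 1};
  for (int step : steps) {
    r = blendBytes(amt, static_cast<uint16_t>(r << step), r);
    amt = static_cast<uint16_t>(amt + amt);  // next amount bit to the sign bits
  }
  return r;
}
```

If the amount were only moved to the upper byte, the low byte of each lane would always keep the unshifted value, which is the mismatch the splat avoids.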

Tested on SSE2, SSE41 and AVX machines.

Differential Revision: http://reviews.llvm.org/D9474

llvm-svn: 239509
2015-06-11 07:46:37 +00:00
..
Analysis [GVN] Set proper debug locations for some instructions created by GVN. 2015-06-10 17:37:38 +00:00
AsmParser Fix doxygen comments. NFC 2015-06-07 06:40:24 +00:00
Bitcode Use early return idiom. NFC 2015-06-06 20:44:53 +00:00
CodeGen [PHIElim] Use ranges and const-ify, NFC. 2015-06-11 07:45:05 +00:00
DebugInfo
ExecutionEngine fix crash 2015-06-10 03:06:06 +00:00
Fuzzer
IR Revert "Move dllimport name mangling to IR mangler." 2015-06-11 01:31:48 +00:00
IRReader
LibDriver LibDriver, llvm-lib: introduce. 2015-06-09 21:50:22 +00:00
LineEditor
Linker
LTO
MC Replace string GNU Triples with llvm::Triple in MCSubtargetInfo and create*MCSubtargetInfo(). NFC. 2015-06-10 12:11:26 +00:00
Object Remove object_error::success and use std::error_code() instead 2015-06-09 15:20:42 +00:00
Option
Passes
ProfileData
Support Add more wrappers for symbol APIs to the C API. 2015-06-09 15:57:30 +00:00
TableGen
Target [X86][SSE] Vectorized i8 and i16 shift operators 2015-06-11 07:46:37 +00:00
Transforms ArgumentPromotion: Drop sret attribute on functions that are only called directly. 2015-06-10 21:14:34 +00:00
CMakeLists.txt LibDriver, llvm-lib: introduce. 2015-06-09 21:50:22 +00:00
LLVMBuild.txt LibDriver, llvm-lib: introduce. 2015-06-09 21:50:22 +00:00
Makefile LibDriver, llvm-lib: introduce. 2015-06-09 21:50:22 +00:00