llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-12-11 13:37:07 +00:00

Chandler Carruth a34a4a834e [x86] Teach the 128-bit vector shuffle lowering routines to take advantage of the existence of a reasonable blend instruction.

The 256-bit vector shuffle lowering has leveraged the general technique of decomposed shuffles and blends for quite some time, but this never made it back into the 128-bit code, and there are a large number of patterns where this is substantially better. For example, this removes almost all domain crossing in vector shuffles that involve some blend and some permutation with SSE4.1 and later. See the massive reduction in 'shufps' for integer test cases in this commit.

This isn't perfect yet for a few reasons:
1) The v8i16 shuffle lowering continues to plague me. We don't always form an unpack-based blend when that would be better. But the wins pretty drastically outstrip the losses here.
2) The v16i8 shuffle lowering is just a disaster here. I never went and implemented blend support here for some terrible reason. I'll do that next probably. I've not updated it for now.

More variations on this technique are coming as well -- we don't shuffle-into-unpack or shuffle-into-palignr, both of which would also be profitable.

Note that some test cases grow significantly in the number of instructions, but I expect to actually be faster. We use pshufd+pshufd+blendw instead of a single shufps, but the pshufd's are very likely to pipeline well (two ports on most modern intel chips) and the blend is a very fast instruction. The domain switch penalty will essentially always be more than a blend instruction, which is the only increase in tree height.

llvm-svn: 229350		2015-02-16 01:52:02 +00:00
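The "decomposed shuffles and blends" idea in the commit message above can be illustrated at the level of shuffle masks. The following is a minimal Python sketch, not LLVM's lowering code: it models a two-input, four-lane shuffle as one single-input permute per operand (the role pshufd plays in the message) followed by a per-lane blend. All function names here are hypothetical and chosen only for this illustration, and the model ignores instruction selection, undef lanes, and blendw's 16-bit granularity.

```python
def decompose_shuffle(mask, lanes=4):
    """Split a two-input shuffle mask into (perm_a, perm_b, blend).

    Mask entries 0..lanes-1 select from input A, lanes..2*lanes-1 from input B.
    This is an illustrative model only, not LLVM's actual algorithm.
    """
    perm_a, perm_b, blend = [], [], []
    for m in mask:
        if not 0 <= m < 2 * lanes:
            raise ValueError(f"mask index {m} out of range")
        from_b = m >= lanes
        # A lane that the blend will not read can hold anything; 0 is arbitrary.
        perm_a.append(0 if from_b else m)
        perm_b.append(m - lanes if from_b else 0)
        blend.append(from_b)
    return perm_a, perm_b, blend


def permute(vec, perm):
    """Single-input permute: lane i of the result is vec[perm[i]]."""
    return [vec[i] for i in perm]


def blend_lanes(a, b, blend):
    """Per-lane select: take the lane from b where blend is True, else from a."""
    return [y if take_b else x for x, y, take_b in zip(a, b, blend)]


def shuffle_reference(a, b, mask):
    """Direct two-input shuffle, used to check the decomposition."""
    both = list(a) + list(b)
    return [both[m] for m in mask]


def shuffle_via_blend(a, b, mask):
    """Two permutes followed by one blend, mirroring pshufd+pshufd+blend."""
    perm_a, perm_b, blend = decompose_shuffle(mask, len(a))
    return blend_lanes(permute(a, perm_a), permute(b, perm_b), blend)


if __name__ == "__main__":
    a, b = [10, 11, 12, 13], [20, 21, 22, 23]
    mask = [3, 4, 1, 6]
    assert shuffle_via_blend(a, b, mask) == shuffle_reference(a, b, mask)
```

In this model every mask decomposes, because a single-input permute can place any element in any lane. The two permutes are independent of each other, so the blend is the only step added to the dependency chain, which matches the message's remark about tree height.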
Analysis	Fixed a bug where CFLAA would crash the compiler.	2015-02-12 03:07:07 +00:00
Assembler	AsmWriter/Bitcode: MDImportedEntity	2015-02-13 01:46:02 +00:00
Bindings	[OCaml] Add Llvm.build_empty_phi.	2015-02-06 13:42:03 +00:00
Bitcode	[Bitcode reader] Fix a few assertions when reading invalid files	2015-02-16 00:03:11 +00:00
BugPoint	IR: Move MDLocation into place	2015-01-14 22:27:36 +00:00
CodeGen	[x86] Teach the 128-bit vector shuffle lowering routines to take	2015-02-16 01:52:02 +00:00
DebugInfo	Add the missing testcase for r228764.	2015-02-10 23:32:56 +00:00
ExecutionEngine	[Orc] Make OrcMCJITReplacement::addObject calls transfer buffer ownership to the	2015-02-02 19:51:18 +00:00
Feature	Don't promote asynch EH invokes of nounwind functions to calls	2015-02-11 01:23:16 +00:00
FileCheck
Instrumentation	tsan: do not instrument not captured values	2015-02-12 09:55:28 +00:00
Integer
JitListener	IR: Move MDLocation into place	2015-01-14 22:27:36 +00:00
Linker	Add run line that was missing in r228999.	2015-02-13 16:00:03 +00:00
LTO	Introduce llvm/test/LTO/X86. LTO tests may be assumed as target-specific.	2015-01-30 10:09:26 +00:00
MC	[X86] Add assembly parser support for mnemonic aliases for AVX-512 vpcmp instructions.	2015-02-15 07:13:48 +00:00
Object	[ELFYAML] Provide default value 0 for YAML relocation addendum field	2015-01-29 06:56:24 +00:00
Other	Don't promote asynch EH invokes of nounwind functions to calls	2015-02-11 01:23:16 +00:00
SymbolRewriter	SymbolRewriter: allow rewriting with comdats	2015-01-27 22:57:39 +00:00
TableGen
tools	gold-plugin: fix test to allow default visibility on local symbols	2015-02-15 09:32:30 +00:00
Transforms	FileCheck-ize a test to make it easier to migrate to typeless pointers	2015-02-15 04:14:00 +00:00
Unit
Verifier	Verifier: Check for null operands in !llvm.module.flags	2015-02-11 09:13:06 +00:00
YAMLParser
.clang-format
CMakeLists.txt	Back out two accidental changes that snuck in with r229245. Sorry these	2015-02-14 09:05:58 +00:00
lit.cfg	[gold] Consolidate the gold plugin options and actually search for	2015-02-14 09:43:57 +00:00
lit.site.cfg.in	[gold] Consolidate the gold plugin options and actually search for	2015-02-14 09:43:57 +00:00
Makefile	[gold] Consolidate the gold plugin options and actually search for	2015-02-14 09:43:57 +00:00
Makefile.tests
TestRunner.sh