Reapplying the patch, as it was reverted when first committed because of
an assertion failure when the mrrc2 intrinsic was called in ARM mode. The
failure was happening because the instruction was being built in
ARMISelDAGToDAG.cpp, while the TableGen description for the mrrc2
instruction doesn't allow a predicate.
The ARM architecture manuals do say that mrrc2 in ARM mode can be
predicated with AL in assembly, but this has no effect on the encoding of
the instruction: the top 4 bits will always be 1111, not 1110, which is
the encoding for the condition AL.
Differential Revision: http://reviews.llvm.org/D21408
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272982 91177308-0d34-0410-b5e6-96231b3b80d8
When calculating a square root using Newton-Raphson with two constants,
a naive implementation uses five multiplications (four muls to calculate
the reciprocal square root and another one to calculate the square root
itself).
However, after some reassociation and CSE the same result can be obtained
with only four multiplications. Unfortunately, there's no reliable way to do
such a reassociation in the back-end. So, the patch modifies NR code itself
so that it directly builds optimal code for SQRT and doesn't rely on any
further reassociation.
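A minimal C sketch of the reassociation (illustrative only: the patch
changes the DAG-level NR expansion itself, and 'est' stands for the
hardware's reciprocal-square-root estimate):
```
/* Naive: refine rsqrt (four muls), then multiply by a (a fifth mul). */
float sqrt_naive(float a, float est) {
  float r = est * (1.5f - 0.5f * a * est * est); /* muls 1-4 */
  return a * r;                                  /* mul 5    */
}

/* Reassociated: fold the final multiply by a into the iteration,
   reusing h = a*est, so only four muls are needed in total. */
float sqrt_opt(float a, float est) {
  float h = a * est;                  /* mul 1: h ~= sqrt(a) */
  return h * (1.5f - 0.5f * h * est); /* muls 2-4            */
}
```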
Patch by Nikolai Bozhenov!
Differential Revision: http://reviews.llvm.org/D21127
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272920 91177308-0d34-0410-b5e6-96231b3b80d8
The R_ARM_PLT32 relocation is deprecated and is not produced by MC.
This means that the code being deleted is dead from the .o point of
view and was making the .s more confusing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272909 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
[ls][bh] and [ls][bh]u cannot use sp-relative addresses and must therefore
lower frameindex nodes such that there is a copy to a CPU16Regs register. This
is now done consistently using a separate addressing mode that does not
permit frameindex nodes.
As part of this I've had to remove an optimization that reduced the
number of instructions needed to work around the lack of sp-relative
addresses on [ls][bh] and [ls][bh]u. This optimization used one of the
eight CPU16Regs registers as a copy of the stack pointer and its
implementation was the root cause of many of the register vs register
class mismatches.
lw/sw can use sp-relative addresses, but we ought to ensure that we use
the correct version of lw/sw internally for things like IAS. This is not
currently the case and this change does not fix it. However, it does
clean things up sufficiently well to fix the machine verifier failures.
Also removed irrelevant functions from stchar.ll.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D21062
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272882 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The Mips implementation only covers the feature bits described by the ELF
e_flags so far. Mips stores additional feature bits such as MSA in the
.MIPS.abiflags section.
Also fixed a small bug this revealed where microMIPS wouldn't add the
EF_MIPS_MICROMIPS flag when using -filetype=obj.
Reviewers: echristo, rafael
Subscribers: rafael, mehdi_amini, dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D21125
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272880 91177308-0d34-0410-b5e6-96231b3b80d8
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272865 91177308-0d34-0410-b5e6-96231b3b80d8
Of course the assembly was right, but because the opcode was MOVZWi it
was encoded as "movz w16, #65535, lsl #32", which is an unallocated
encoding and would go horribly wrong on a CPU.
No idea how this bug survived this long. It seems nobody is using that aspect
of patchpoints.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272831 91177308-0d34-0410-b5e6-96231b3b80d8
This allows us to emit native IR in Clang (next commit).
Also, update the intrinsic tests to show that codegen already knows how to handle
the IR that Clang will soon produce.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272806 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes two related bugs. First, the generic optimization passes
unfortunately generate negative constant offsets but the hardware treats
SOffset as an unsigned value.
Second, there is a hardware bug on SI and CI, where address clamping in MUBUF
instructions does not work correctly when SOffset is larger than the buffer
size. This patch works around this bug by never using SOffset.
An alternative workaround would be to do the clamping manually when SOffset
is too large, but generating the required code sequence during instruction
selection would be rather involved, and in any case the resulting code would
probably be worse.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21326
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272761 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
... when the offset is not statically known.
Prioritize addresses relative to the stack pointer in the stackmap, but
fall back gracefully to other modes of addressing if the offset to the
stack pointer is not a known constant.
Patch by Oscar Blumberg!
Reviewers: sanjoy
Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm
Differential Revision: http://reviews.llvm.org/D21259
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272756 91177308-0d34-0410-b5e6-96231b3b80d8
Nearly all the changes to this pass have been done while maintaining and
updating other parts of LLVM. LLVM has had another pass, SROA, which
has superseded ScalarReplAggregates for quite some time.
Differential Revision: http://reviews.llvm.org/D21316
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272737 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: With a runtime profile, we have more confidence in the branch
probabilities, so during basic block layout we set a lower
hot-probability threshold so that blocks can be laid out optimally.
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20991
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272729 91177308-0d34-0410-b5e6-96231b3b80d8
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272709 91177308-0d34-0410-b5e6-96231b3b80d8
For an <N x i32> mul, pmuludq will be used on targets without SSE41,
which often introduces many extra pack and unpack instructions in the
vectorized loop body because pmuludq generates an <N/2 x i64> result.
However, when the operands of the <N x i32> mul are extended from smaller
values like i8 and i16, the mul may be shrunk to use pmullw +
pmulhw/pmulhuw instead of pmuludq, which generates better code. On
targets with SSE41, pmulld is supported, so no shrinking is needed.
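For illustration, a hypothetical C loop of the affected kind (the
shrinking happens on the vectorized IR when the mul operands are known
to be extended from narrower types):
```
#include <stdint.h>

/* Each i32 product has i16 operands, so on pre-SSE41 targets the
   vectorizer's <N x i32> mul can be shrunk to pmullw + pmulhuw
   instead of going through <N/2 x i64> pmuludq. */
void mul_u16(uint32_t *d, const uint16_t *a, const uint16_t *b, int n) {
  for (int i = 0; i < n; ++i)
    d[i] = (uint32_t)a[i] * (uint32_t)b[i];
}
```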
Differential Revision: http://reviews.llvm.org/D20931
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272694 91177308-0d34-0410-b5e6-96231b3b80d8
Change EmitGlobalVariable to check that the final assembler section is in
BSS before using the .lcomm/.comm directive. This prevents globals from
being put into .bss erroneously when -data-sections is used.
This fixes PR26570.
Reviewers: echristo, rafael
Subscribers: llvm-commits, mehdi_amini
Differential Revision: http://reviews.llvm.org/D21146
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272674 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of always using addu to adjust the stack pointer when the size
is out of the range of an addiu instruction, use subu so that a smaller
constant can be generated.
This can give savings of ~3 instructions whenever a function has a stack
frame whose size is out of range of an addiu instruction.
This change may break some naive stack unwinders.
Partially resolves PR/26291.
Thanks to David Chisnall for reporting the issue.
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D21321
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272666 91177308-0d34-0410-b5e6-96231b3b80d8
We can only generate immediates up to #510 with a MOV+ADD, not #511, because there's no such instruction as add #256.
Found by Oliver Stannard and csmith!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272665 91177308-0d34-0410-b5e6-96231b3b80d8
PR27458 highlights that the MIPS backend does not have well-formed
MIR for atomic operations (among other errors).
This patch expands and corrects the LL/SC descriptions and their uses
for MIPS(64).
Reviewers: dsanders, vkalintiris
Differential Review: http://reviews.llvm.org/D19719
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272655 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The machine verifier reports 'Explicit operand marked as def' when it is
manually specified even though it agrees with the operand info.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: http://reviews.llvm.org/D21065
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272646 91177308-0d34-0410-b5e6-96231b3b80d8
The exit-on-error flag in the ARM test is necessary in order to avoid an
unreachable in the DAGTypeLegalizer when trying to expand a physical
register.
We can also avoid this situation by introducing a bitcast early on, where the
invalid scalar-to-vector conversion is detected.
We also add a test for PowerPC, which goes through a similar code path in the
SelectionDAGBuilder.
Fixes PR27765.
Differential Revision: http://reviews.llvm.org/D21061
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272644 91177308-0d34-0410-b5e6-96231b3b80d8
In rL272580 I accidentally added a test case to test/CodeGen when
test/Transforms/DeadStoreElimination/ is a better place for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272581 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
AAResults::callCapturesBefore would previously ignore operand
bundles. It was possible for a later instruction to miss its memory
dependency on a call site that would only access the pointer through a
bundle.
Patch by Oscar Blumberg!
Reviewers: sanjoy
Differential Revision: http://reviews.llvm.org/D21286
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272580 91177308-0d34-0410-b5e6-96231b3b80d8
The need for these intrinsics has been obviated by r272564 which
reimplements their functionality using generic IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272566 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Mesa and other users must set this to enable coalescing:
- STRIDE = 0
- SWIZZLE_ENABLE = 1
This makes one particular compute shader 8x faster.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D21136
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272556 91177308-0d34-0410-b5e6-96231b3b80d8
This enables use of the 'R' and 'T' memory constraints for inline ASM
operands on SystemZ, which allow an index register as well as an
immediate displacement. This patch includes corresponding documentation
and test case updates.
As with the last patch of this kind, I moved the 'm' constraint to the
most general case, which is now 'T' (base + 20-bit signed displacement +
index register).
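A hedged sketch of what this enables (hypothetical inline asm; 'l' is
the SystemZ 32-bit load, whose RX form matches the base + index + 12-bit
displacement that 'R' describes):
```
/* With 'R', the operand may be formed as base + index + 12-bit
   displacement; 'T' additionally allows a 20-bit signed displacement. */
int load_via_R(int *p) {
  int v;
  asm("l %0, %1" : "=r"(v) : "R"(*p));
  return v;
}
```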
Author: colpell
Differential Revision: http://reviews.llvm.org/D21239
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272547 91177308-0d34-0410-b5e6-96231b3b80d8
The MRRC/MRRC2 instructions write to two registers. The intrinsic
definition returns a single uint64_t to represent the write; this is a
compact way of representing a write to two 32-bit registers. The
alternative might have been to return a struct of two uint32_t's, but
this isn't as nice.
Differential Revision:
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272544 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is intended to solve:
https://llvm.org/bugs/show_bug.cgi?id=28044
By changing the definition of X86ISD::CMPP to use float types, we allow it to be created
and pass legalization for an SSE1-only target where v4i32 is not legal.
The motivational trail for this change includes:
https://llvm.org/bugs/show_bug.cgi?id=28001
and eventually makes this trigger:
http://reviews.llvm.org/D21190
I.e., after this step, we should be free to have Clang generate FP compare IR instead of x86
intrinsics for the SSE C packed compare intrinsics. (We can auto-upgrade and remove the LLVM
sse.cmp intrinsics as a follow-up step.) Once we're generating vector IR instead of x86
intrinsics, a big pile of generic optimizations can trigger.
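As a hedged illustration of that end state (the Clang-side change is
D21190, not this patch):
```
#include <xmmintrin.h>

/* Once Clang emits this as an IR 'fcmp olt' plus sign-extension
   instead of an llvm.x86.sse.cmp.ps call, the float-typed
   X86ISD::CMPP node lets it legalize even on SSE1-only targets. */
__m128 cmplt(__m128 a, __m128 b) {
  return _mm_cmplt_ps(a, b);
}
```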
Differential Revision: http://reviews.llvm.org/D21235
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272511 91177308-0d34-0410-b5e6-96231b3b80d8
A lot of the codegen is pretty awful for these, as they are mostly implemented as generic bit-twiddling ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272508 91177308-0d34-0410-b5e6-96231b3b80d8
The script now replaces '.LCPI888_8' style asm symbols with the
{{\.LCPI.*}} regex pattern - this helps stop hardcoded symbols in 32-bit
x86 tests changing with every edit of the file.
Refreshed some tests to demonstrate the new check.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272488 91177308-0d34-0410-b5e6-96231b3b80d8
PSHUFB can speed up BITREVERSE of byte vectors by performing a LUT on
the low/high nibbles separately and ORing the results. Wider integer
vector types are already BSWAP'd beforehand, so they also make use of
this approach.
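The scalar idea behind the lowering, sketched in C (PSHUFB performs this
16-entry table lookup on every byte lane in parallel):
```
#include <stdint.h>

/* Reverse of each 4-bit value; PSHUFB uses this as its lookup table. */
static const uint8_t rev_nibble[16] = {
  0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
  0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};

uint8_t bitreverse8(uint8_t b) {
  /* LUT on the low and high nibbles separately, then OR the results. */
  return (uint8_t)((rev_nibble[b & 0xF] << 4) | rev_nibble[b >> 4]);
}
```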
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272477 91177308-0d34-0410-b5e6-96231b3b80d8
Ensure that PALIGNR/PSLLDQ/PSRLDQ are byte vectors so that they can be correctly decoded for target shuffle combining
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272471 91177308-0d34-0410-b5e6-96231b3b80d8
The vector cases don't change because we already have folds in X86ISelLowering
to look through and remove bitcasts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272427 91177308-0d34-0410-b5e6-96231b3b80d8
Support and generate Compare and Traps like CRT, CIT, etc.
Support Trap as a legal DAG opcode and generate "j .+2" for it by default.
Add support for Conditional Traps and use the If Converter to convert them into
the corresponding compare and trap opcodes.
Differential Revision: http://reviews.llvm.org/D21155
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272419 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to set the fixup type to FK_Data_4 for the
SCRATCH_RSRC_DWORD[01] symbols, since these require absolute
relocations, and fixup_si_rodata is for relative relocations.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21153
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272417 91177308-0d34-0410-b5e6-96231b3b80d8
Adds a MachineFunctionPass that scans the body to find calls, and
updates the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21180
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272414 91177308-0d34-0410-b5e6-96231b3b80d8
Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.
When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.
The analysis is split in two parts. RegUsageInfoCollector is the
MachineFunction pass that runs post-RA and collects the list of
clobbered registers to produce a register mask.
An immutable pass, RegisterUsageInfo, stores the RegMask produced by
RegUsageInfoCollector and keeps it available. A future transformation
pass will use this information to update every call site after
instruction selection.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D20769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272403 91177308-0d34-0410-b5e6-96231b3b80d8
Somehow, the codegen logic for these sequences has gone completely untested
until now (note the 2 compare instructions generated per test).
There's also an *Intel* AVX optimization opportunity exposed in these cases
and the existing tests. Intel's (but not AMD's) AVX spec shows that extra FP
predicates were added, so a single comparison should always be sufficient,
and operand commutation should never be necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272397 91177308-0d34-0410-b5e6-96231b3b80d8
The memory operand is new for AVX512 (SSE/AVX2 didn't support it).
Also dropped the 'mask' from the tests (VPSLLDQ/VPSRLDQ don't support
masked operations).
Regenerated the VPALIGNR test now that the shuffle comments work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272383 91177308-0d34-0410-b5e6-96231b3b80d8
The test case is not great, especially because it is still cumbersome to
run the regalloc pass with run-pass. (We miss a bunch of initializers
that need to be properly implemented.)
Related to llvm.org/PR27983
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272360 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we could run only one machine pass with the run-pass option.
With this patch, we can now specify several passes with several run-pass
options (or just one option with a comma-separated list of passes) and
llc will build the related pipeline.
This is great for testing the interaction of two passes that are not
necessarily next to each other in the pipeline, or for playing with pass
ordering.
Now, we should be at parity with opt for the flexibility of running
passes.
Note: I also moved the run pass option from CommandFlags.h to llc.cpp
because, really, this is needed only there!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272356 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes a bug with the ds_*permute instructions where, if the
instruction was passed a constant address, the offset operand would get
assigned a register operand instead of an immediate.
Reviewers: scchan, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19994
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272349 91177308-0d34-0410-b5e6-96231b3b80d8
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272345 91177308-0d34-0410-b5e6-96231b3b80d8
This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.
There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272344 91177308-0d34-0410-b5e6-96231b3b80d8
Auto-upgrade to generic shuffles, like the sse/avx2 implementations, now that we can lower to VPSLLDQ/VPSRLDQ.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272308 91177308-0d34-0410-b5e6-96231b3b80d8
512-bit VPSLLDQ/VPSRLDQ can only be used for avx512bw targets, so lowerVectorShuffleAsShift had to be adjusted to include the subtarget.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272300 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently clang emits these instructions via inline (volatile) asm in
the CUDA headers. Switching to intrinsics will let the optimizer reason
across calls to these intrinsics.
Reviewers: tra
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D21160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272298 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies commit r271930, r271915, r271923. They hit a bug in
Thumb which is fixed in r272258 now.
The original message:
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272267 91177308-0d34-0410-b5e6-96231b3b80d8
This enables use of the 'S' constraint for inline ASM operands on
SystemZ, which allows for a memory reference with a signed 20-bit
immediate displacement. This patch includes corresponding documentation
and test case updates.
I've changed the 'T' constraint to match the new behavior for 'S', as
'T' also uses a long displacement (though index constraints are still
not implemented). I also changed 'm' to match the behavior for 'S' as
this will allow for a wider range of displacements for 'm', though
correct me if that's not the right decision.
Author: colpell
Differential Revision: http://reviews.llvm.org/D21097
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272266 91177308-0d34-0410-b5e6-96231b3b80d8
This is made possible by removing an assert in llc that assumed
MIRParser::parseLLVMModule would exit on error. MIRParser's documentation states
that it returns null if a parsing error occurs, so there's no reason to assert.
We can instead just fall through to where the check for a module is performed
and exit if it is null.
This commit is part of the clean-up after r269655.
Fixes PR27770
Differential Revision: http://reviews.llvm.org/D20371
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272254 91177308-0d34-0410-b5e6-96231b3b80d8
If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead:
  int i(int a) {
    return a & 0xfffffeec;
  }
Used to produce:
  ldr  r1, [CONSTPOOL]
  ands r0, r1
CONSTPOOL: 0xfffffeec
And now produces:
  movs r1, #255
  adds r1, #20    ; Less costly immediate generation
  bics r0, r1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272251 91177308-0d34-0410-b5e6-96231b3b80d8
Without that check it was possible to write test cases where the size
was not specified and we ended up with weird asserts down the road,
because the default value (1) would not make sense.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272226 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Consider the following diamond CFG:
    A
   / \
  B   C
   \ /
    D
Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 81*20%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 81*25%=20.25, and the desired ABDC layout will be generated.
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20989
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272203 91177308-0d34-0410-b5e6-96231b3b80d8
Teach AArch64RegisterBankInfo that G_OR can be mapped on either GPR or
FPR for 64-bit or 32-bit values.
Add test cases demonstrating how this information is used to coalesce a
computation on a single register bank.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272170 91177308-0d34-0410-b5e6-96231b3b80d8
The MSR instructions can write to the CPSR, but we did not model this
fact, so we could emit them in the middle of IT blocks, changing the
condition flags for later instructions in the block.
The tests use two calls to llvm.write_register.i32 because it is valid
to use these instructions at the end of an IT block, which if-conversion
does do in some cases. With two calls, the first clobbers the flags, so
a branch has to be used to make the second one conditional.
Differential Revision: http://reviews.llvm.org/D21139
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272154 91177308-0d34-0410-b5e6-96231b3b80d8
The MachineMemOperand parser lacked the code to handle %stack.X
references (%fixed-stack.X was working).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272082 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The presence of this attribute indicates that VGPR outputs should be computed
in whole quad mode. This will be used by Mesa for prolog pixel shaders, so
that derivatives can be taken of shader inputs computed by the prolog, fixing
a bug.
The generated code could certainly be improved: if a prolog pixel shader is
used (which isn't common in modern OpenGL - they're used for gl_Color, polygon
stipples, and forcing per-sample interpolation), Mesa will use this attribute
unconditionally, because it has to be conservative. So WQM may be used in the
prolog when it isn't really needed, and furthermore a silly back-and-forth
switch is likely to happen at the boundary between prolog and main shader
parts.
Fixing this is a bit involved: we'd first have to add a mechanism by which
LLVM writes the WQM-related input requirements to the main shader part binary,
and then Mesa specializes the prolog part accordingly. At that point, we may
as well just compile a monolithic shader...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20839
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272063 91177308-0d34-0410-b5e6-96231b3b80d8
There are no VEX-encoded versions of the SSE4A instructions, so make sure that AVX targets give the same output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272060 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch adds support for the MSVC buffer security check implementation.
The buffer security check is turned on with the '/GS' compiler switch.
* https://msdn.microsoft.com/en-us/library/8dbf701c.aspx
* To be added to clang here: http://reviews.llvm.org/D20347
Some overview of buffer security check feature and implementation:
* https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx
* http://www.ksyash.com/2011/01/buffer-overflow-protection-3/
* http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html
For the following example:
```
int example(int offset, int index) {
char buffer[10];
memset(buffer, 0xCC, index);
return buffer[index];
}
```
The MSVC compiler adds these instructions to perform the stack integrity check:
```
push ebp
mov ebp,esp
sub esp,50h
[1] mov eax,dword ptr [__security_cookie (01068024h)]
[2] xor eax,ebp
[3] mov dword ptr [ebp-4],eax
push ebx
push esi
push edi
mov eax,dword ptr [index]
push eax
push 0CCh
lea ecx,[buffer]
push ecx
call _memset (010610B9h)
add esp,0Ch
mov eax,dword ptr [index]
movsx eax,byte ptr buffer[eax]
pop edi
pop esi
pop ebx
[4] mov ecx,dword ptr [ebp-4]
[5] xor ecx,ebp
[6] call @__security_check_cookie@4 (01061276h)
mov esp,ebp
pop ebp
ret
```
The instrumentation above is:
* [1] is loading the global security canary,
* [3] is storing the locally computed ([2]) canary to the guard slot,
* [4] is loading the guard slot and ([5]) re-computing the global canary,
* [6] is validating the resulting canary with '__security_check_cookie' and performing error handling.
Overview of the current stack-protection implementation:
* lib/CodeGen/StackProtector.cpp
* There is a default stack-protection implementation applied on intermediate representation.
* The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie.
* An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast).
* Basic Blocks are added to every instrumented function to receive the code for stack guard validation and error handling.
* Guard manipulation and comparison are added directly to the intermediate representation.
* lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
* lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
* There is an implementation that adds instrumentation during instruction selection (for better handling of sibling calls).
* see long comment above 'class StackProtectorDescriptor' declaration.
* The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr).
* 'getSDagStackGuard' returns the appropriate stack guard (security cookie)
* The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'.
* include/llvm/Target/TargetLowering.h
* Contains a function to retrieve the default Guard 'Value'; this should be overridden by each target to select which implementation is used and to provide the Guard 'Value'.
* lib/Target/X86/X86ISelLowering.cpp
* Contains the x86 specialisation; the Guard 'Value' used by the SelectionDAG algorithm.
Function-based Instrumentation:
* MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instruction.
* To support function-based instrumentation, this patch is
* adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h); if provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue,
* modifying SelectionDAGISel.cpp to avoid producing the basic blocks used for inline instrumentation,
* generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp),
* if FastISEL (not SelectionDAG), using the fallback, which relies on the same function-based instrumentation implemented over the intermediate representation (StackProtector.cpp).
Modifications:
* adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp)
* adding support for function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h)
Results:
* IR-generated instrumentation:
```
clang-cl /GS test.cc /Od /c -mllvm -print-isel-input
```
```
*** Final LLVM Code input to ISel ***
; Function Attrs: nounwind sspstrong
define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 {
entry:
%StackGuardSlot = alloca i8* <<<-- Allocated guard slot
%0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value
call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot)
%index.addr = alloca i32, align 4
%offset.addr = alloca i32, align 4
%buffer = alloca [10 x i8], align 1
store i32 %index, i32* %index.addr, align 4
store i32 %offset, i32* %offset.addr, align 4
%arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0
%1 = load i32, i32* %index.addr, align 4
call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false)
%2 = load i32, i32* %index.addr, align 4
%arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2
%3 = load i8, i8* %arrayidx, align 1
%conv = sext i8 %3 to i32
%4 = load volatile i8*, i8** %StackGuardSlot <<<-- Loading Guard slot
call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check
ret i32 %conv
}
```
* SelectionDAG generated instrumentation:
```
clang-cl /GS test.cc /O1 /c /FA
```
```
"?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z"
# BB#0: # %entry
pushl %esi
subl $16, %esp
movl ___security_cookie, %eax <<<-- Loading Stack Guard value
movl 28(%esp), %esi
movl %eax, 12(%esp) <<<-- Store to Guard slot
leal 2(%esp), %eax
pushl %esi
pushl $204
pushl %eax
calll _memset
addl $12, %esp
movsbl 2(%esp,%esi), %esi
movl 12(%esp), %ecx <<<-- Loading Guard slot
calll @__security_check_cookie@4 <<<-- Epilogue function-based check
movl %esi, %eax
addl $16, %esp
popl %esi
retl
```
Reviewers: kcc, pcc, eugenis, rnk
Subscribers: majnemer, llvm-commits, hans, thakis, rnk
Differential Revision: http://reviews.llvm.org/D20346
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272053 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the only way to use the (V)MOVNTDQA nontemporal vector loads instructions is through the int_x86_sse41_movntdqa style builtins.
This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch.
We currently still fold nontemporal loads into suitable instructions; we should probably look at removing this (and nontemporal stores as well), or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction.
There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets.
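For reference, general IR with !nontemporal metadata is what, e.g.,
Clang's __builtin_nontemporal_load produces (hedged example; the
alignment requirements of (V)MOVNTDQA still apply):
```
/* A 16-byte vector nontemporal load; with this patch it can lower to
   (V)MOVNTDQA rather than requiring the SSE4.1 builtin. */
typedef int v4si __attribute__((vector_size(16)));

v4si nt_load(const v4si *p) {
  return __builtin_nontemporal_load(p);
}
```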
Differential Review: http://reviews.llvm.org/D20965
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272010 91177308-0d34-0410-b5e6-96231b3b80d8
Using an LLVM IR aggregate return value type containing three
or more integer values causes an abort in the fast isel pass.
This patch adds two more registers to RetCC_PPC64_ELF_FIS to
allow returning up to four integers with fast isel, just the
same as is currently supported with regular isel (RetCC_PPC).
This is needed for Swift and (possibly) other non-clang frontends.
Fixes PR26190.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272005 91177308-0d34-0410-b5e6-96231b3b80d8
We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened.
This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272003 91177308-0d34-0410-b5e6-96231b3b80d8
A Thumb-2 post-indexed LDR instruction such as:
  ldr.w r0, [r1], #4
can be rewritten as:
  ldm.n r1!, {r0}
LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272002 91177308-0d34-0410-b5e6-96231b3b80d8
If we have an LDM that uses only low registers and doesn't write to its base register:
  ldm.w r0, {r1, r2, r3}
and that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding:
  ldm.n r0!, {r1, r2, r3}
Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does but we don't.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272000 91177308-0d34-0410-b5e6-96231b3b80d8
TLS access requires an offset from the TLS index. The index itself is the
section-relative distance of the symbol. For ARM, the relevant relocation
(IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may
not be an immediate and must be lowered into a constant pool. This offset will
not be base relocated. We were previously emitting the actual address of the
symbol, which would be base relocated and would therefore be the value
offset by the ImageBase + TLS Offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271974 91177308-0d34-0410-b5e6-96231b3b80d8
If we had a constant group address space cast, the queue pointer wasn't
enabled for the function, resulting in a crash on a noreg later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271935 91177308-0d34-0410-b5e6-96231b3b80d8
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
Differential Revision: http://reviews.llvm.org/D20276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271925 91177308-0d34-0410-b5e6-96231b3b80d8
Another step toward unifying the llvm assembler/disassembler with sp3.
Besides this, CodeGen output is a bit improved, hence the changes in the
CodeGen tests.
Assembler/Disassembler tests updated/added.
Differential Revision: http://reviews.llvm.org/D20796
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271900 91177308-0d34-0410-b5e6-96231b3b80d8
Could do this for other types too, but this is what's needed to replace the intrinsic with native IR in clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271828 91177308-0d34-0410-b5e6-96231b3b80d8