Summary:
Just because INC/DEC is a little slow on some processors doesn't mean we shouldn't prefer it when optimizing for size.
This appears to match gcc behavior.
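For illustration, a minimal IR sketch (the function name is hypothetical) where, with this change, isel should now pick the shorter `incl` over `addl $1` when the function is marked for size optimization:

  define i32 @plus_one(i32 %x) optsize {
    %r = add i32 %x, 1
    ret i32 %r
  }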
Reviewers: chandlerc, zvi, RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312866 91177308-0d34-0410-b5e6-96231b3b80d8
Reduced version of 'addr-calc-crash.ll' that was included in D27044 and had already been fixed by D31286/rL298633.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312786 91177308-0d34-0410-b5e6-96231b3b80d8
cover the bitwise operators.
Nothing really exciting here; this just stamps out the rest of the core
operations that can RMW memory and set flags.
Still not implemented here: ADC, SBB. Those will require more
interesting logic to channel the flags *in*, and I'm not currently
planning to try to tackle that. It might be interesting for someone who
wants to improve our code generation for bignum implementations.
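As a sketch of what now folds (names are illustrative): an RMW `or` whose flags result feeds a branch can become a single `orl %reg, (mem)` with the branch reusing the flags.

  define void @rmw_or(i32* %p, i32 %v) {
  entry:
    %old = load i32, i32* %p
    %new = or i32 %old, %v
    store i32 %new, i32* %p
    %cmp = icmp eq i32 %new, 0
    br i1 %cmp, label %if, label %end
  if:
    call void @g()
    br label %end
  end:
    ret void
  }

  declare void @g()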
Differential Revision: https://reviews.llvm.org/D37141
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312768 91177308-0d34-0410-b5e6-96231b3b80d8
operands and used flags to support matching immediate operands.
This is a bit trickier than register operands, and we still want to fall
back on a register operand even for things that appear to be
"immediates" when they won't actually select into the operation's
immediate operand. This also requires us to handle things like selecting
`sub` vs. `add` to minimize the number of bits needed to represent the
immediate, and picking the shortest immediate encoding. In order to do
that, we in turn need to scan to make sure that CF isn't used, as it
will get inverted.
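For example (a minimal sketch; names are illustrative), consider adding 128 to a memory location: `addl $128` needs a 32-bit immediate since 128 is not a signed 8-bit value, but the equivalent `subl $-128` fits in an imm8, so isel can pick the shorter encoding, provided CF is not consumed, since the add->sub flip inverts it:

  define void @add128(i32* %p) {
    %old = load i32, i32* %p
    %new = add i32 %old, 128
    store i32 %new, i32* %p
    ret void
  }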
The end result seems very nice though, and we're now generating
optimal instruction sequences for these patterns IMO.
A follow-up patch will further expand this to other operations with RMW
memory operands. But handling `add` and `sub` gives useful starting points
to flesh out the machinery and make sure interesting and complex cases
can be handled.
Thanks to Craig Topper who provided a few fixes and improvements to this
patch in addition to the review!
Differential Revision: https://reviews.llvm.org/D37139
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312764 91177308-0d34-0410-b5e6-96231b3b80d8
This patch expands lowerInterleavedLoad support to i8 vectors with stride 3 and VF={8|16|32}.
LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3, VF={8|16|32}), and we plan to cover the interleaved-store side as well.
The patch goal is to optimize the following sequence:
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7
into
a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7
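A minimal IR sketch of the VF=8 case (function names are illustrative): one wide load followed by three stride-3 shuffles, which now lower to an efficient AVX2 sequence:

  define void @deinterleave(<24 x i8>* %p) {
    %wide = load <24 x i8>, <24 x i8>* %p
    %a = shufflevector <24 x i8> %wide, <24 x i8> undef, <8 x i32> <i32 0, i32 3, i32 6, i32 9, i32 12, i32 15, i32 18, i32 21>
    %b = shufflevector <24 x i8> %wide, <24 x i8> undef, <8 x i32> <i32 1, i32 4, i32 7, i32 10, i32 13, i32 16, i32 19, i32 22>
    %c = shufflevector <24 x i8> %wide, <24 x i8> undef, <8 x i32> <i32 2, i32 5, i32 8, i32 11, i32 14, i32 17, i32 20, i32 23>
    call void @use(<8 x i8> %a, <8 x i8> %b, <8 x i8> %c)
    ret void
  }

  declare void @use(<8 x i8>, <8 x i8>, <8 x i8>)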
Reviewers: zvi, igor, guyblank, dorit, Ayal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312722 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and its relevant successors/predecessors.
In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that, AFAIK, the critical path
length cannot be computed incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold; it makes it easier to
experiment and to check whether using incremental updates for all basic
blocks has any impact on performance.
Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
Reviewed By: fhahn
Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D36619
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312719 91177308-0d34-0410-b5e6-96231b3b80d8
Adding i8 -> [i16, i32, i64] and i32 -> i64 cases.
This way we can see what the current codegen looks like.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312707 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add patterns for
fptoui <16 x float> to <16 x i8>
fptoui <16 x float> to <16 x i16>
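For reference, the IR the i8 case covers (compiled with AVX-512 enabled):

  define <16 x i8> @fptoui_v16f32_v16i8(<16 x float> %x) {
    %r = fptoui <16 x float> %x to <16 x i8>
    ret <16 x i8> %r
  }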
Reviewers: igorb, delena, craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37505
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312704 91177308-0d34-0410-b5e6-96231b3b80d8
function returns the intrinsic's first argument.
llvm.memcpy/memset/memmove return void, but the libcalls they expand to
return their first argument. Currently, if the parent function has any
return value, llvm.memcpy cannot be turned into a tail call after
expansion.
The patch handles that case in SelectionDAGBuilder, so that when the
caller function returns the same value as the first argument of
llvm.memcpy, a tail call is allowed.
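A minimal example of the case this enables (names are illustrative): the caller returns its first argument, which is also the memcpy destination, so after expansion to the memcpy libcall (which returns dest) the call can be a tail call.

  declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1)

  define i8* @copy(i8* %dst, i8* %src, i64 %n) {
    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i32 1, i1 false)
    ret i8* %dst
  }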
Differential Revision: https://reviews.llvm.org/D37406
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312641 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Most instructions in AVX work "in-lane", that is, each source element is applied only to other
elements of the same lane; thus a cross-lane permutation is costly and needs more than one instruction.
AVX2 includes instructions to perform any-to-any permutation of words over a 256-bit register
and vectorized table lookup.
This should also fix PR34369.
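A representative cross-lane unary shuffle (a sketch, not taken from PR34369) that can now be lowered with a single any-to-any permute such as vpermd/vpermps instead of a multi-instruction lane-juggling sequence:

  define <8 x i32> @cross_lane(<8 x i32> %x) {
    %r = shufflevector <8 x i32> %x, <8 x i32> undef, <8 x i32> <i32 4, i32 0, i32 5, i32 1, i32 6, i32 2, i32 7, i32 3>
    ret <8 x i32> %r
  }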
Differential Revision: https://reviews.llvm.org/D37388
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312608 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.
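A sketch of the usage, assuming the intrinsic is llvm.codeview.annotation (the exact name and metadata shape here are an assumption based on this description):

  declare void @llvm.codeview.annotation(metadata)

  define void @f() {
    ; label carrying one annotation string; not removable as dead code
    call void @llvm.codeview.annotation(metadata !0)
    ret void
  }

  !0 = !{!"my annotation string"}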
Reviewers: majnemer
Subscribers: eraman, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D36904
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312569 91177308-0d34-0410-b5e6-96231b3b80d8
We had already disabled the pattern for SSE4.1 and SSE4.2, but it got re-enabled for AVX and AVX512.
With SSE41 we rely on a separate (v4f32 (X86vzmovl VR128)) pattern to select blendps with an xorps to create the zeroes, and a separate (v4f32 (scalar_to_vector FR32X)) pattern to select a COPY_TO_REGCLASS to move FR32 to VR128.
The same thing can happen for AVX with vblendps, and those separate patterns already exist.
For AVX512, (v4f32 (X86vzmovl VR128)) will select a VMOVSS instruction instead of VBLENDPS due to there not being an EVEX VBLENDPS. This is what we were getting out of the larger pattern anyway, so the larger pattern is unneeded for AVX512 too.
For SSE1-SSSE3 we can rely on (v4f32 (X86vzmovl VR128)) selecting a MOVSS, similar to AVX512. Again, this is what the larger pattern did too.
So the only real change here is that AVX1/2 now properly outputs a VBLENDPS during isel instead of a VMOVSS, matching SSE41. Most tests didn't notice because the two-address instruction pass knows how to turn VMOVSS into VBLENDPS to get an independent destination register.
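One way the X86vzmovl pattern arises at the IR level (a sketch): keep lane 0 of %x and zero the upper lanes; with AVX1/2 this should now select vblendps against a zero vector instead of vmovss.

  define <4 x float> @vzmovl(<4 x float> %x) {
    %r = shufflevector <4 x float> %x, <4 x float> zeroinitializer, <4 x i32> <i32 0, i32 5, i32 6, i32 7>
    ret <4 x float> %r
  }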
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312564 91177308-0d34-0410-b5e6-96231b3b80d8
We don't have this same pattern for AVX2 so I don't believe we should have it for AVX512. We also didn't have it for v16f32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312543 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid use of VPERMPS where we don't need it by instead using the variable mask version of VPERMILPS for unary shuffles.
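Plausibly the kind of case affected (a sketch, not a test from this commit): a unary shuffle whose per-lane pattern differs between the two 128-bit lanes needs a variable mask but never crosses lanes, so variable-mask VPERMILPS suffices.

  define <8 x float> @in_lane(<8 x float> %x) {
    %r = shufflevector <8 x float> %x, <8 x float> undef, <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 4, i32 5, i32 6, i32 7>
    ret <8 x float> %r
  }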
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312486 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a re-roll of D36615, which uses PLT relocations in the back-end
for the call to __xray_CustomEvent() when building in -fPIC and
-fxray-instrument mode.
Reviewers: pcc, djasper, bkramer
Subscribers: sdardis, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D37373
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312466 91177308-0d34-0410-b5e6-96231b3b80d8
Ideally we'd be able to emit the SUBREG_TO_REG without the explicit register->register move, but we'd need to be sure the producing operation would select something that guaranteed the upper bits were already zeroed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312450 91177308-0d34-0410-b5e6-96231b3b80d8
In a future patch, I plan to teach isel to use a small vector move with implicit zeroing of the upper elements when it sees the (insert_subvector zero, X, 0) pattern.
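For context, IR that produces the (insert_subvector zero, X, 0) node (a sketch; names are illustrative): a v4f32 value widened into the low half of a zero v8f32.

  define <8 x float> @insert_into_zero(<4 x float> %x) {
    %wide = shufflevector <4 x float> %x, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef>
    %r = shufflevector <8 x float> %wide, <8 x float> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 12, i32 13, i32 14, i32 15>
    ret <8 x float> %r
  }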
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312448 91177308-0d34-0410-b5e6-96231b3b80d8
As part of an effort to rigorously check CodeGen's behavior on the IR shufflevector instruction, we generated many tests while predicting the best X86 sequence that may be generated.
This is a subset of the generated tests that we think may add value to our set of X86 tests.
Some of the checks are not optimal and will be changed after fixing:
1. PR34394
2. PR34382
3. PR34380
4. PR34359
Differential Revision: https://reviews.llvm.org/D37329
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312442 91177308-0d34-0410-b5e6-96231b3b80d8