archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Xinliang David Li	45f64ad571	Revert r320104: infinite loop profiling bug fix Causes unexpected memory issue with New PM this time. The new PM invalidates BPI but not BFI, leaving the reference to BPI from BFI invalid. Abandon this patch. There is a more general solution which also handles runtime infinite loop (but not statically). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320180 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-08 19:38:07 +00:00
Alexey Bataev	b415523f30	[InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1, &V2)) --> store (, load (select(Cond, load &V1, load &V2))) Summary: If we have the code like this: ``` float a, b; a = std::max(a ,b); ``` it is converted into something like this: ``` %call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %a.addr, float* nonnull dereferenceable(4) %b.addr) %1 = bitcast float* %call to i32* %2 = load i32, i32* %1, align 4 %3 = bitcast float* %a.addr to i32* store i32 %2, i32* %3, align 4 ``` After inlinning this code is converted to the next: ``` %1 = load float, float* %a.addr %2 = load float, float* %b.addr %cmp.i = fcmp fast olt float %1, %2 %__b.__a.i = select i1 %cmp.i, float* %a.addr, float* %b.addr %3 = bitcast float* %__b.__a.i to i32* %4 = load i32, i32* %3, align 4 %5 = bitcast float* %arrayidx to i32* store i32 %4, i32* %5, align 4 ``` This pattern is not recognized as minmax pattern. Patch solves this problem by converting sequence ``` store (bitcast, (load bitcast (select ((cmp V1, V2), &V1, &V2)))) ``` to a sequence ``` store (,load (select((cmp V1, V2), &V1, &V2))) ``` After this the code is recognized as minmax pattern. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40304 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320157 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-08 15:32:10 +00:00
Xinliang David Li	134f1a833f	[PGO] detect infinite loop and form MST properly Differential Revision: http://reviews.llvm.org/D40873 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320104 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 22:23:28 +00:00
Sanjay Patel	1f609ed2b6	[InstCombine] add tests for abs using bit hackery; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320068 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 18:13:33 +00:00
Igor Laevsky	e18338969d	[InstCombine] Don't crash on out of bounds index in the insertelement Differential Revision: https://reviews.llvm.org/D40390 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320049 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 15:00:52 +00:00
Igor Laevsky	ab26b59c06	[InstSimplify] Add tests for the rL319894 Differential Revision: https://reviews.llvm.org/D40650 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320014 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 08:52:24 +00:00
Adam Nemet	d62c5bdaf9	[LV] Interleaved access vectorization: fix computing new alias info As a new access is generated spanning across multiple fields, we need to propagate alias info from all the fields to form the most generic alias info. rdar://35602528 Differential Revision: https://reviews.llvm.org/D40617 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319979 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-06 22:42:24 +00:00
Sanjay Patel	2b68e112a3	[InstCombine] canonicalize constant-minus-boolean to select-of-constants This restores the half of: https://reviews.llvm.org/rL75531 that was reverted at: https://reviews.llvm.org/rL159230 For the x86 case mentioned there, we now produce: leal 1(%rdi), %eax subl %esi, %eax We have target hooks to invert this in DAGCombiner (and x86 is enabled) with: https://reviews.llvm.org/rL296977 https://reviews.llvm.org/rL311731 AArch64 and possibly other targets would probably benefit from enabling those hooks too. See PR30327: https://bugs.llvm.org/show_bug.cgi?id=30327#c2 Differential Revision: https://reviews.llvm.org/D40612 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319964 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-06 21:22:57 +00:00
Zvi Rackover	138b4f2021	InstructionSimplify: 'extractelement' with an undef index is undef Summary: An undef extract index can be arbitrarily chosen to be an out-of-range index value, which would result in the instruction being undef. This change closes a gap identified while working on lowering vector permute intrinsics with variable index vectors to pure LLVM IR. Reviewers: arsenm, spatel, majnemer Reviewed By: arsenm, spatel Subscribers: fhahn, nhaehnle, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D40231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319910 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-06 17:51:46 +00:00
Igor Laevsky	3b06fccc77	[InstSimplify] Fold insertelement into undef if index is out of bounds Differential Revision: https://reviews.llvm.org/D40650 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319894 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-06 14:04:45 +00:00
Hans Wennborg	d5bd8eeafe	Revert r319482 and r319483 "[memcpyopt] Teach memcpyopt to optimize across basic blocks" This caused PR35519. > [memcpyopt] Teach memcpyopt to optimize across basic blocks > > This teaches memcpyopt to make a non-local memdep query when a local query > indicates that the dependency is non-local. This notably allows it to > eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. > > Fixes PR28958. > > Differential Revision: https://reviews.llvm.org/D38374 > > [memcpyopt] Commit file missed in r319482. > > This change was meant to be included with r319482 but was accidentally > omitted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319873 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-06 01:47:55 +00:00
Xinliang David Li	4f7feb41e5	Revert r319794: [PGO] detect infinite loop and form MST properly: memory leak problem git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319841 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 21:54:01 +00:00
Joel Galenson	2bfaf4d2f2	[CVP] Remove some {s\|u}sub.with.overflow checks. This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks. Differential Revision: https://reviews.llvm.org/D40039 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319807 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 18:14:24 +00:00
Xinliang David Li	9a320d9284	[PGO] detect infinite loop and form MST properly Differential Revision: http://reviews.llvm.org/D40702 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319794 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 17:19:41 +00:00
Alexey Bataev	4996301956	[InstCombine] Additional test for PR35354, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319783 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 16:15:55 +00:00
Mikael Holmen	dd0ef7d32c	Bail out of a SimplifyCFG switch table opt at undef values. Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by JesperAntonsson. Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D40639 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319768 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 14:14:00 +00:00
Igor Laevsky	b8aa60d240	[InstCombine] Don't crash on out of bounds shifts Differential Revision: https://reviews.llvm.org/D40649 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319761 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 12:18:15 +00:00
Haicheng Wu	15bf0436ab	[ConstantFold] Support vector index when factoring out GEP index into preceding dimensions Follow-up of r316824. This patch supports the vector type for both current and previous index when factoring out the current one into the previous one. Differential Revision: https://reviews.llvm.org/D39556 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319683 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 19:56:33 +00:00
Sanjoy Das	3745f0675a	[BypassSlowDivision] Improve our handling of divisions by constants (This reapplies r314253. r314253 was reverted on r314482 because of a correctness regression on P100, but that regression was identified to be something else.) Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow . This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319677 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 19:21:58 +00:00
Anna Thomas	7c3eddc7d6	[Loop Predication] Teach LP about reverse loops Summary: Currently, we only support predication for forward loops with step of 1. This patch enables loop predication for reverse or countdownLoops, which satisfy the following conditions: 1. The step of the IV is -1. 2. The loop has a singe latch as B(X) = X <pred> latchLimit with pred as s> or u> 3. The IV of the guard is the decrement IV of the latch condition (Guard is: G(X) = X-1 u< guardLimit). This patch was downstream for a while and is the last series of patches that's from our LP implementation downstream. Reviewers: apilipenko, mkazantsev, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40353 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319659 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 15:11:48 +00:00
Matt Morehouse	79f1f0032c	Revert "[X86] Improvement in CodeGen instruction selection for LEAs." This reverts r319543, due to ASan bot breakage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319591 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 22:20:26 +00:00
Philip Reames	0fed3ad4cb	[IndVars] Fix a bug introduced in r317012 Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant. In this case, there is no loop invariant value contributing and we'd fail an assert. The test case was found by a java fuzzer and reduced. It's a real cornercase. You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate. To my knowledge, only the fuzzer has hit this case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319583 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 20:57:19 +00:00
Adam Nemet	c4f204db88	[opt-remarks] If hotness threshold is set, ignore remarks without hotness These are blocks that haven't not been executed during training. For large projects this could make a significant difference. For the project, I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235. Differential Revision: https://reviews.llvm.org/D40678 Re-commit after fixing the failing testcase in rL319576, rL319577 and rL319578. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319581 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 20:41:38 +00:00
Adam Nemet	53cc245050	Revert "[opt-remarks] If hotness threshold is set, ignore remarks without hotness" This reverts commit r319556. Something is not working with this when used with sample-based profiling. Investigating... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319562 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 18:12:29 +00:00
Adam Nemet	7cb910a453	[opt-remarks] If hotness threshold is set, ignore remarks without hotness These are blocks that haven't not been executed during training. For large projects this could make a significant difference. For the project, I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235. Differential Revision: https://reviews.llvm.org/D40678 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319556 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 17:02:04 +00:00
Hans Wennborg	faed772a25	Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." It causes builds to fail with "Instruction does not dominate all uses" (PR35497). > Patch tries to improve vectorization of the following code: > > void add1(int * __restrict dst, const int * __restrict src) { > dst++ = src++; > dst++ = src++ + 1; > dst++ = src++ + 2; > dst++ = src++ + 3; > } > Allows to vectorize even if the very first operation is not a binary add, but just a load. > > Fixed issues related to previous commit. > > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev > > Reviewed By: ABataev, RKSimon > > Subscribers: llvm-commits, RKSimon > > Differential Revision: https://reviews.llvm.org/D28907 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319550 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 16:17:24 +00:00
Jatin Bhateja	f5040aa4c6	[X86] Improvement in CodeGen instruction selection for LEAs. Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will now look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. 4/ Simplify LEA converts (lea (BASE,1,INDEX,0) --> add (BASE, INDEX) which offers better through put. PR32755 will be taken care of by this pathc. Previous patch revisions : r313343 , r314886 Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja Reviewed By: lsaba, RKSimon, jbhateja Subscribers: jmolloy, spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319543 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 14:07:38 +00:00
Mikael Holmen	4ece91062b	Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values. Broke build bots so reverting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319539 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 13:11:39 +00:00
Florian Hahn	03c5509e66	[InstSimplify] More fcmp cases when comparing against negative constants. Summary: For known positive non-zero value X: fcmp uge X, -C => true fcmp ugt X, -C => true fcmp une X, -C => true fcmp oeq X, -C => false fcmp ole X, -C => false fcmp olt X, -C => false Patch by Paul Walker. Reviewers: majnemer, t.p.northover, spatel, RKSimon Reviewed By: spatel Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D40012 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319538 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 12:34:16 +00:00
Mikael Holmen	8c4503a350	Bail out of a SimplifyCFG switch table opt at undef values. Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by: JesperAntonsson Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40639 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319537 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 12:30:49 +00:00
Dinar Temirbulatov	8caaeced90	[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { dst++ = src++; dst++ = src++ + 1; dst++ = src++ + 2; dst++ = src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319531 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 11:10:47 +00:00
Hiroshi Inoue	e983865629	Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and stores Recommiting once reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319522 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 06:05:05 +00:00
Peter Collingbourne	31b9aa7755	ThinLTOBitcodeWriter: Try harder to discard unused references to the merged module. If the thin module has no references to an internal global in the merged module, we need to make sure to preserve that property if the global is a member of a comdat group, as otherwise promotion can end up adding global symbols to the comdat, which is not allowed. This situation can arise if the external global in the thin module has dead constant users, which would cause use_empty() to return false and would cause us to try to promote it. To prevent this from happening, discard the dead constant users before asking whether a global is empty. Differential Revision: https://reviews.llvm.org/D40593 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319494 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 23:05:52 +00:00
Dan Gohman	7a991ca87c	[memcpyopt] Teach memcpyopt to optimize across basic blocks This teaches memcpyopt to make a non-local memdep query when a local query indicates that the dependency is non-local. This notably allows it to eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%. Fixes PR28958. Differential Revision: https://reviews.llvm.org/D38374 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319482 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 22:10:53 +00:00
Xinliang David Li	64d8eced83	[PGO] Skip counter promotion for infinite loops Differential Revision: http://reviews.llvm.org/D40662 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319462 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 19:16:25 +00:00
Alexey Bataev	06cb6a98c1	[InstCombine] Additional test for PR35354, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319436 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 14:33:58 +00:00
Hiroshi Inoue	50f1f9435a	Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores This reverts commit rL319407 due to failures in some buildbot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319410 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 08:29:51 +00:00
Hiroshi Inoue	39be023c86	[SROA] enable splitting for non-whole-alloca loads and stores Currently, SROA splits loads and stores only when they are accessing the whole alloca. This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint to or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is. The whole-alloca loads and stores meet this new condition and so they are still splittable. Here is a simplified motivating example. struct record { long long a; int b; int c; }; int func(struct record r) { for (int i = 0; i < r.c; i++) r.b++; return r.b; } When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument. With this patch, the above example is compiled into only few instructions without loop. Without the patch, unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers. Differential Revision: https://reviews.llvm.org/D32998 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319407 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 07:44:46 +00:00
Graham Yiu	9d522dc415	- Removed unused lamba (IsReturnBlock) causing build bots to fail for r319398 - Added lit testcases that were supposed to be part of r319398 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319399 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 03:36:57 +00:00
Sanjay Patel	0221a1478f	[InstCombine] add tests for select-of-constants; NFC These are variants of a test that was originally added in: https://reviews.llvm.org/rL75531 ...but removed with: https://reviews.llvm.org/rL159230 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319327 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-29 17:21:39 +00:00
Serguei Katkov	c2659f12f7	[CGP] Fix common type handling in optimizeMemoryInst If common type is different we should bail out due to we will not be able to create a select or Phi of these values. Basically it is done in ExtAddrMode::compare however it does not work if we handle the null first and then two values of different types. so add a check in initializeMap as well. The check in ExtAddrMode::compare is used as earlier bail out. Reviewers: reames, john.brawn Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40479 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319292 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-29 05:51:26 +00:00
Adam Nemet	999c1417c7	Remove this test After r319235, we no longer generate this remark. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319242 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 22:39:38 +00:00
Adam Nemet	1f9846951b	Demote this opt remark to DEBUG. From a random opt-stat output: Top 10 remarks: tailcallelim/tailcall 53% inline/AlwaysInline 13% gvn/LoadClobbered 13% inline/Inlined 8% inline/TooCostly 2% inline/NoDefinition 2% licm/LoadWithLoopInvariantAddressInvalidated 2% licm/Hoisted 1% asm-printer/InstructionCount 1% prologepilog/StackSize 1% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319235 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 22:11:00 +00:00
Alexey Bataev	4ff622660e	[SLP] Additional test for PR35354, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319224 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 20:48:24 +00:00
Sanjay Patel	67b235da2e	[InstCombine] auto-generate complete test checks; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319205 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 19:13:23 +00:00
Sanjay Patel	68e8f78488	[InstCombine] auto-generate complete test checks; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319203 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 19:07:28 +00:00
Hans Wennborg	dd74d55407	EntryExitInstrumenter: set DebugLocs on the inserted call instructions (PR35412) Apparently the verifier requires that inlineable calls in a function with debug info have debug locations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319199 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 18:44:26 +00:00
Sanjay Patel	240ac15972	[InstCombine] add tests from D39421 to show current transforms; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319182 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 16:40:30 +00:00
Chandler Carruth	6efdd0fb59	Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after everything that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319164 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 11:32:31 +00:00
Florian Hahn	04289f6357	[TailRecursionElimination] Skip debug intrinsics. Summary: I think we do not need to analyze debug intrinsics here, as they should not impact codegen. This has 2 benefits: 1) slightly less work to do and 2) avoiding generating optimization remarks for converting calls to debug intrinsics to tail calls, which are not really helpful for users. Based on work by Sander de Smalen. Reviewers: davide, trentxintong, aprantl Reviewed By: aprantl Subscribers: llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D40440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319158 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-28 09:32:25 +00:00

1 2 3 4 5 ...

11090 Commits