archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Nuno Lopes	fe353a0cbf	Merge isKnownNonNull into isKnownNonZero It now knows the tricks of both functions. Also, fix a bug that considered allocas of non-zero address space to be always non null Differential Revision: https://reviews.llvm.org/D37628 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312869 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-09 18:23:11 +00:00
Sanjay Patel	2d82e741fe	[DivRemPairs] split tests per target to account for bots that don't build for all targets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312863 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-09 14:10:59 +00:00
Sanjay Patel	193e898f75	[DivRempairs] add a pass to optimize div/rem pairs (PR31028) This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented as an independent pass, so there's no stretching of scope and feature creep for an existing pass. I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost this same functionality as an addition to CGP in the motivating example of PR31028: https://bugs.llvm.org/show_bug.cgi?id=31028 The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and undo the hoisting that is done here. Decomposing remainder may allow removing some code from the backend (PPC and possibly others). Differential Revision: https://reviews.llvm.org/D37121 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-09 13:38:18 +00:00
Alexey Bataev	4fcc7e8528	[SLP] Support for horizontal min/max reduction. SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions. Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions. Patch fixes PR26956. Differential revision: https://reviews.llvm.org/D27846 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-08 13:49:36 +00:00
Max Kazantsev	2e0950763d	Re-enable "[IRCE] Identify loops with latch comparison against current IV value" Re-applying after the found bug was fixed. Differential Revision: https://reviews.llvm.org/D36215 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312783 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-08 10:15:05 +00:00
Max Kazantsev	cf9cd148bd	diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp index f72a808..9fa49fd 100644 --- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp +++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp @@ -450,20 +450,10 @@ struct LoopStructure { // equivalent to: // // intN_ty inc = IndVarIncreasing ? 1 : -1; - // pred_ty predicate = IndVarIncreasing - // ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT - // : IsSignedPredicate ? ICMP_SGT : ICMP_UGT; + // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT; // - // - // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt); - // iv = IndVarNext) + // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase) // ... body ... - // - // Here IndVarBase is either current or next value of the induction variable. - // in the former case, IsIndVarNext = false and IndVarBase points to the - // Phi node of the induction variable. Otherwise, IsIndVarNext = true and - // IndVarBase points to IV increment instruction. - // Value IndVarBase; Value IndVarStart; @@ -471,13 +461,12 @@ struct LoopStructure { Value LoopExitAt; bool IndVarIncreasing; bool IsSignedPredicate; - bool IsIndVarNext; LoopStructure() : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr), LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr), IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr), - IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {} + IndVarIncreasing(false), IsSignedPredicate(true) {} template <typename M> LoopStructure map(M Map) const { LoopStructure Result; @@ -493,7 +482,6 @@ struct LoopStructure { Result.LoopExitAt = Map(LoopExitAt); Result.IndVarIncreasing = IndVarIncreasing; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; return Result; } @@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, return false; }; - // `ICI` can either be a comparison against IV or a comparison of IV.next. - // Depending on the interpretation, we calculate the start value differently. + // `ICI` is interpreted as taking the backedge if the next* value of the + // induction variable satisfies some constraint. - // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch - // comparisons happens against the IV before or after its value is - // incremented. Two valid combinations for them are: - // - // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false }, - // 2) { iv.next; true }. - // - // The latch comparison happens against IndVarBase which can be either current - // or next value of the induction variable. const SCEVAddRecExpr IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV); bool IsIncreasing = false; bool IsSignedPredicate = true; - bool IsIndVarNext = false; ConstantInt StepCI; if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) { FailureReason = "LHS in icmp not induction variable"; return None; } - const SCEV IndVarStart = nullptr; - // TODO: Currently we only handle comparison against IV, but we can extend - // this analysis to be able to deal with comparison against sext(iv) and such. - if (isa<PHINode>(LeftValue) && - cast<PHINode>(LeftValue)->getParent() == Header) - // The comparison is made against current IV value. - IndVarStart = IndVarBase->getStart(); - else { - // Assume that the comparison is made against next IV value. - const SCEV StartNext = IndVarBase->getStart(); - const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); - IndVarStart = SE.getAddExpr(StartNext, Addend); - IsIndVarNext = true; - } + const SCEV StartNext = IndVarBase->getStart(); + const SCEV Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE)); + const SCEV IndVarStart = SE.getAddExpr(StartNext, Addend); const SCEV Step = SE.getSCEV(StepCI); ConstantInt One = ConstantInt::get(IndVarTy, 1); @@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE, Result.IndVarIncreasing = IsIncreasing; Result.LoopExitAt = RightValue; Result.IsSignedPredicate = IsSignedPredicate; - Result.IsIndVarNext = IsIndVarNext; FailureReason = nullptr; @@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd( BranchToContinuation); NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader); - auto FixupValue = - LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN; - NewPHI->addIncoming(FixupValue, RRI.ExitSelector); + NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch), + RRI.ExitSelector); RRI.PHIValuesAtPseudoExit.push_back(NewPHI); } @@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop L, LPPassManager &LPM) { } LoopStructure LS = MaybeLoopStructure.getValue(); const SCEVAddRecExpr IndVar = - cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase)); - if (LS.IsIndVarNext) - IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar, - SE.getSCEV(LS.IndVarStep))); + cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep))); Optional<InductiveRangeCheck::Range> SafeIterRange; Instruction ExprInsertPt = Preheader->getTerminator(); diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll deleted file mode 100644 index afea0e6..0000000 --- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll +++ /dev/null @@ -1,182 +0,0 @@ -; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 \| FileCheck %s - -; Check that IRCE is able to deal with loops where the latch comparison is -; done against current value of the IV, not the IV.next. - -; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> -; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting> - -; SLT condition for increasing loop from 0 to 100. -define void @test_01(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_01 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp slt i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp slt i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; ULT condition for increasing loop from 0 to 100. -define void @test_02(i32* %arr, i32* %a_len_ptr) #0 { - -; CHECK: test_02 -; CHECK: entry: -; CHECK-NEXT: %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0 -; CHECK-NEXT: [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit -; CHECK: loop: -; CHECK-NEXT: %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ] -; CHECK-NEXT: %idx.next = add nuw nsw i32 %idx, 1 -; CHECK-NEXT: %abc = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 true, label %in.bounds, label %out.of.bounds.loopexit1 -; CHECK: in.bounds: -; CHECK-NEXT: %addr = getelementptr i32, i32* %arr, i32 %idx -; CHECK-NEXT: store i32 0, i32* %addr -; CHECK-NEXT: %next = icmp ult i32 %idx, 100 -; CHECK-NEXT: [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at -; CHECK-NEXT: br i1 [[COND3]], label %loop, label %main.exit.selector -; CHECK: main.exit.selector: -; CHECK-NEXT: %idx.lcssa = phi i32 [ %idx, %in.bounds ] -; CHECK-NEXT: [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100 -; CHECK-NEXT: br i1 [[COND4]], label %main.pseudo.exit, label %exit -; CHECK-NOT: loop.preloop: -; CHECK: loop.postloop: -; CHECK-NEXT: %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ] -; CHECK-NEXT: %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1 -; CHECK-NEXT: %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at -; CHECK-NEXT: br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit - -entry: - %len = load i32, i32* %a_len_ptr, !range !0 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %abc = icmp ult i32 %idx, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_01, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_03(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_03 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp slt i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp slt i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -; Same as test_02, but comparison happens against IV extended to a wider type. -; This test ensures that IRCE rejects it and does not falsely assume that it was -; a comparison against iv.next. -; TODO: We can actually extend the recognition to cover this case. -define void @test_04(i32* %arr, i64* %a_len_ptr) #0 { - -; CHECK: test_04 - -entry: - %len = load i64, i64* %a_len_ptr, !range !1 - br label %loop - -loop: - %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ] - %idx.next = add nsw nuw i32 %idx, 1 - %idx.ext = sext i32 %idx to i64 - %abc = icmp ult i64 %idx.ext, %len - br i1 %abc, label %in.bounds, label %out.of.bounds - -in.bounds: - %addr = getelementptr i32, i32* %arr, i32 %idx - store i32 0, i32* %addr - %next = icmp ult i32 %idx, 100 - br i1 %next, label %loop, label %exit - -out.of.bounds: - ret void - -exit: - ret void -} - -!0 = !{i32 0, i32 50} -!1 = !{i64 0, i64 50} git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312775 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-08 04:26:41 +00:00
Peter Collingbourne	7480eb9933	WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat. This is required when targeting COFF, as the comdat name must match one of the names of the symbols in the comdat. Differential Revision: https://reviews.llvm.org/D37550 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312767 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-08 00:10:53 +00:00
Richard Trieu	2cf792edc2	Revert r312318, r312325, r312424, r312489 r312318 - Debug info for variables whose type is shrinked to bool r312325, r312424, r312489 - Test case for r312318 Revision 312318 introduced a null dereference bug. Details in https://bugs.llvm.org/show_bug.cgi?id=34490 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312758 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-07 23:20:35 +00:00
Michael Zuckerman	563f2fdd92	[X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF{8\|16\|32} stride 3). This patch expands the support of lowerInterleavedload to {8\|16\|32}x8i stride 3. LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=3 VF={8\|16\|32}) and we plan to include the store (deinterleved side). The patch goal is to optimize the following sequence: a0 b0 c0 a1 b1 c1 a2 b2 c2 a3 b3 c3 a4 b4 c4 a5 b5 c5 a6 b6 c6 a7 b7 c7 into a0 a1 a2 a3 a4 a5 a6 a7 b0 b1 b2 b3 b4 b5 b6 b7 c0 c1 c2 c3 c4 c5 c6 c7 Reviewers 1. zvi 2. igor 3. guyblank 4. dorit 5. Ayal git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312722 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-07 14:02:13 +00:00
Matt Arsenault	e0de89287c	InstSimplify: canonicalize is idempotent git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312685 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-07 01:21:43 +00:00
Krzysztof Parzyszek	9ca441aa44	Disable jump threading into loop headers Consider this type of a loop: for (...) { ... if (...) continue; ... } Normally, the "continue" would branch to the loop control code that checks whether the loop should continue iterating and which contains the (often) unique loop latch branch. In certain cases jump threading can "thread" the inner branch directly to the loop header, creating a second loop latch. Loop canonicalization would then transform this loop into a loop nest. The problem with this is that in such a loop nest neither loop is countable even if the original loop was. This may inhibit subsequent loop optimizations and be detrimental to performance. Differential Revision: https://reviews.llvm.org/D36404 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312664 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-06 19:36:58 +00:00
Sanjay Patel	04894a4949	[ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite(): https://bugs.llvm.org/show_bug.cgi?id=27145 In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno with a constant operand. But while looking at those patterns, I realized we were missing a canonicalization for nonzero constants. Rather than limiting to just folds for constants, we're adding a general value tracking method for this based on an existing DAG helper. By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps() and pick up missing vector folds. Differential Revision: https://reviews.llvm.org/D37427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312591 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 23:13:13 +00:00
Sanjay Patel	d92ccb5856	[InstCombine] add nnan tests; NFC As suggested in D37427, we could have a value tracking function and folds that use it to simplify these cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312578 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 21:20:35 +00:00
Reid Kleckner	c86178ea37	Add llvm.codeview.annotation to implement MSVC __annotation Summary: This intrinsic represents a label with a list of associated metadata strings. It is modelled as reading and writing inaccessible memory so that it won't be removed as dead code. I think the intention is that the annotation strings should appear at most once in the debug info, so I marked it noduplicate. We are allowed to inline code with annotations as long as we strip the annotation, but that can be done later. Reviewers: majnemer Subscribers: eraman, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D36904 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312569 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 20:14:58 +00:00
Adam Nemet	a155485803	Split opt-remark YAML and opt output testing on this test This prepares for https://reviews.llvm.org/D33514 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312544 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 18:03:39 +00:00
Craig Topper	203c00ded6	[InstCombine] Add test cases for folding (select (icmp ne/eq (and X, C1), (bitwiseop Y, C2), Y -> (bitwiseop Y, (shl/shr (and X, C1), C3)) or similar. This is possible if C1 and C2 are both powers of 2. Or if binop is 'and' then ~C2 needs to be a power of 2. We already support this for 'or', but we should be able to support 'and' and 'xor'. This will be enhanced by D37274. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312519 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 05:26:38 +00:00
Daniel Berlin	110f9f2e80	NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312510 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 02:17:43 +00:00
Daniel Berlin	660fd0b5be	NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312509 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-05 02:17:42 +00:00
Strahinja Petrovic	a78328c441	Fix test/Transforms/GlobalOpt/integer-bool-dwarf This patch fixes regression related with integer-bool-dwarf test. Patch by Nikola Prica. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312489 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-04 15:14:37 +00:00
Michael Zuckerman	e11eab53ee	Update test for testing avx512 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312487 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-04 14:15:34 +00:00
Zvi Rackover	3438d07f09	LoopVectorize: MaxVF should not be larger than the loop trip count Summary: Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count. Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow: 1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks. 2. Compute MaxVF returned 16 on a target supporting AVX512. 3. OptForSize -> choose VF:=MaxVF. 4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop. With this patch step 2. will choose MaxVF=8 based on TripCount. Reviewers: Ayal, dorit, mkuper, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D37425 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312472 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-04 08:35:13 +00:00
Sam Parker	040fcc3883	[LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop Debug information can be, and was, corrupted when the runtime remainder loop was fully unrolled. This is because a !null node can be created instead of a unique one describing the loop. In this case, the original node gets incorrectly updated with the NewLoopID metadata. In the case when the remainder loop is going to be quickly fully unrolled, there isn't the need to add loop metadata for it anyway. Differential Revision: https://reviews.llvm.org/D37338 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312471 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-04 08:12:16 +00:00
Sanjay Patel	74c232b75f	[InstCombine] add tests for fcmp ord/uno canonicalization; NFC Currently, we canonicalize some cases to use 0.0, but we miss others. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312445 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-03 15:35:10 +00:00
Don Hinton	f158190afe	Fix buildbot failures for new test that requires the X86 target be built. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312424 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-02 22:56:58 +00:00
Sanjay Patel	03b20941fc	[InstSimplify] regenerate checks; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-02 14:38:15 +00:00
Sanjay Patel	3e9b6b5304	[InstCombine] put 2 related tests in the same file; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312412 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-02 14:35:18 +00:00
Sanjay Patel	15de536e6e	[InstSimplify] move fcmp simplification tests from InstCombine These are all tests that result in a constant, so moving the tests over to where they are actually handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312411 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-02 14:27:00 +00:00
Daniel Berlin	026a351310	Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders. Summary: After a discussion with Rekka, i believe this (or a small variant) should fix the remaining phi-of-ops problems. Rekka's algorithm for completeness relies on looking up expressions that should have no leader, and expecting it to fail (IE looking up expressions that can't exist in a predecessor, and expecting it to find nothing). Unfortunately, sometimes these expressions can be simplified to constants, but we need the lookup to fail anyway. Additionally, our simplifier outsmarts this by taking these "not quite right" expressions, and simplifying them into other expressions or walking through phis, etc. In the past, we've sometimes been able to find leaders for these expressions, incorrectly. This change causes us to not to try to phi of ops such expressions. We determine safety by seeing if they depend on a phi node in our block. This is not perfect, we can do a bit better, but this should be a "correctness start" that we can then improve. It also requires a bunch of caching that i'll eventually like to eliminate. The right solution, longer term, to the simplifier issues, is to make the query interface for the instruction simplifier/constant folder have the flags we need, so that we can keep most things going, but turn off the possibly-invalid parts (threading through phis, etc). This is an issue in another wrong code bug as well. Reviewers: davide, mcrosier Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D37175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312401 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-02 02:18:44 +00:00
Craig Topper	85fcd3487c	[InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask. This allows some code to be removed from InstSimplify that was doing this functionality. This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out. There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary. Differential Revision: https://reviews.llvm.org/D37158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312382 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 21:27:34 +00:00
Davide Italiano	e5593530f5	[TTI] Fix getGEPCost() for geps with a single operand. Previously this would sporadically crash as TargetType was never initialized. We special-case the single-operand case returning earlier and trying to mimic the behaviour of isLegalAddressingMode as closely as possible. Differential Revision: https://reviews.llvm.org/D37277 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312357 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 19:54:08 +00:00
Daniel Berlin	29492fbac5	NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context. Reviewers: davide, mcrosier Subscribers: sanjoy, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D37174 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312352 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 19:20:18 +00:00
Strahinja Petrovic	88eda0cc8a	Adding missing test case in rL312318 Adding test for debug info for integer variables whose type is shrinked to bool. Patch by Nikola Prica. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312325 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 11:37:53 +00:00
Clement Courbet	4855d2de9a	Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer Add missing header. This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312322 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 10:56:34 +00:00
Clement Courbet	1a4fd5c74c	Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer" Break build This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312317 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 09:43:08 +00:00
Clement Courbet	930b028c65	[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer comparisons into memcmp. Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==(). For now this is disabled by default until: - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete - Benchmarks show that this is always useful. Differential Revision: https://reviews.llvm.org/D33987 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312315 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-01 09:07:05 +00:00
Akira Hatanaka	fec731bad8	[ObjCARC] Pass the correct BasicBlock to fix assertion failure. The BasicBlock passed to FindPredecessorRetainWithSafePath should be the parent block of Autorelease. This fixes a crash that occurs in FindDependencies when StartInst is not in StartBB. rdar://problem/33866381 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312266 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-31 18:27:47 +00:00
Sanjay Patel	da536d4e17	[InstCombine] improve demanded vector elements analysis of insertelement Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312248 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-31 15:57:17 +00:00
Max Kazantsev	5c5cdb3c21	[IRCE] Identify loops with latch comparison against current IV value Current implementation of parseLoopStructure interprets the latch comparison as a comarison against `iv.next`. If the actual comparison is made against the `iv` current value then the loop may be rejected, because this misinterpretation leads to incorrect evaluation of the latch start value. This patch teaches the IRCE to distinguish this kind of loops and perform the optimization for them. Now we use `IndVarBase` variable which can be either next or current value of the induction variable (previously we used `IndVarNext` which was always the value on next iteration). Differential Revision: https://reviews.llvm.org/D36215 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312221 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-31 07:04:20 +00:00
Sanjay Patel	e6ed9807f6	[InstCombine] add more vector demand examples; NFC See D37236 for discussion. It seems unlikely that we actually want/need to do this kind of folding in InstCombine in the long run, but moving everything will be a bigger follow-up step. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312172 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-30 21:11:46 +00:00
Adrian Prantl	69e607f200	Canonicalize the representation of empty an expression in DIGlobalVariableExpression This change simplifies code that has to deal with DIGlobalVariableExpression and mirrors how we treat DIExpressions in debug info intrinsics. Before this change there were two ways of representing empty expressions on globals, a nullptr and an empty !DIExpression(). If someone needs to upgrade out-of-tree testcases: perl -pi -e 's/(!DIGlobalVariableExpression\(var: ![0-9]*)\)/\1, expr: !DIExpression())/g' <MYTEST.ll> will catch 95%. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312144 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-30 18:06:51 +00:00
Sanjay Patel	17f4bfb8d0	[InstCombine] remove unnecessary vector select fold; NFCI This code is double-dead: 1. We simplify all selects with constant true/false condition in InstSimplify. I've minimized/moved the tests to show that works as expected. 2. All remaining vector selects with a constant condition are canonicalized to shufflevector, so we really can't see this pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312123 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-30 14:04:57 +00:00
Florian Hahn	785320780e	[InstCombine] Fold insert sequence if first ins has multiple users. Summary: If the first insertelement instruction has multiple users and inserts at position 0, we can re-use this instruction when folding a chain of insertelement instructions. As we need to generate the first insertelement instruction anyways, this should be a strict improvement. We could get rid of the restriction of inserting at position 0 by creating a different shufflemask, but it is probably worth to keep the first insertelement instruction with position 0, as this is easier to do efficiently than at other positions I think. Reviewers: grosser, mkuper, fpetrogalli, efriedma Reviewed By: fpetrogalli Subscribers: gareevroman, llvm-commits Differential Revision: https://reviews.llvm.org/D37064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312110 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-30 10:54:21 +00:00
Evgeniy Stepanov	aa1a72db63	[cfi] Avoid branch veneers in jump tables when possible. Summary: When jumptable encoding does not match target code encoding (arm vs thumb), a veneer is inserted by the linker. We can not avoid this in all cases, because entries within one jumptable must have the same encoding, but we can make it less common by selecting the jumptable encoding to match the majority of its targets. This change only covers FullLTO, and not ThinLTO. Reviewers: pcc Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37171 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312054 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 22:40:19 +00:00
Evgeniy Stepanov	21f4a97e9c	[cfi] Build __cfi_check as Thumb when applicable. Summary: Cross-DSO CFI needs all __cfi_check exports to use the same encoding (ARM vs Thumb). Reviewers: pcc Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37243 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312052 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 22:29:15 +00:00
Wei Mi	0488e47901	[LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition fails to find a loop invariant condition. This is wrong and it will disable loop unswitch for select. The patch fixes the bug. Differential Revision: https://reviews.llvm.org/D36985 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312045 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 21:45:11 +00:00
Alexey Bataev	c95fd24a5a	[SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores. Summary: If SimplifyCFG pass is able to merge conditional stores into single one, it loses the alignment. This may lead to incorrect codegen. Patch sets the alignment of the new instruction if it is set in the original one. Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36841 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312030 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 20:06:24 +00:00
Craig Topper	3ff9c137a4	[InstCombine] Support vector splats in transformZExtICmp This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code. One test didn't vectorize optimize properly due to another bug so a TODO has been added. Differential Revision: https://reviews.llvm.org/D37253 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312023 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 18:58:13 +00:00
Davide Italiano	fac36eb3e3	[LoopUnroll] Make the test for PR33437 actually useful. I forgot to specify -unroll-loop-peel, making this test not really effective. While here, adjust some details (naming and run line). Thanks to Sanjoy and Michael Z. for pointing out in their post-commit reviews. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312015 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 17:24:09 +00:00
Dehao Chen	31ec888106	Add null check for promoted direct call Summary: We originally assume that in pgo-icp, the promoted direct call will never be null after strip point casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot cast to Instruction. In this patch, null check is added to ensure null pointer will not be accessed. Reviewers: tejohnson, xur, davidxl, djasper Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D37252 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312005 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 15:28:12 +00:00
Alexey Bataev	a67ad7b80e	[SimplifyCFG] Update initial test for better testing of the fix for PR34219, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312002 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-29 14:37:23 +00:00

1 2 3 4 5 ...

10687 Commits