Commit Graph

10743 Commits

Author SHA1 Message Date
Peter Collingbourne
37911b93a0 WholeProgramDevirt: Add import/export support for targets without absolute symbol constants.
Not all targets support the use of absolute symbols to export
constants. In particular, ARM has a wide variety of constant encodings
that cannot currently be relocated by linkers. So instead of exporting
the constants using symbols, export them directly in the summary.
The values of the constants are left as zeroes on targets that support
symbolic exports.

This may result in more cache misses when targeting those architectures
as a result of arbitrary changes in constant values, but this seems
somewhat unavoidable for now.

Differential Revision: https://reviews.llvm.org/D37407

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312967 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 22:34:42 +00:00
Sanjay Patel
7d56a780c0 [InstSimplify] fix some test names; NFC
Too much division...the quotient is the answer.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 20:38:31 +00:00
Sanjay Patel
c18e8bc0df [InstSimplify] add tests for possible sdiv/srem simplifications; NFC
As noted in PR34517, the handling of signed div/rem is not on par with
unsigned div/rem. Signed is harder to reason about, but it should be
possible to handle at least some of these using the same technique that
we use for unsigned: use icmp logic to see if there's a relationship
between the quotient and divisor.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312938 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 19:42:41 +00:00
Hiroshi Yamauchi
b6b25740b7 Unmerge GEPs to reduce register pressure on IndirectBr edges.
Summary:
GEP merging can sometimes increase the number of live values and register
pressure across control edges and cause performance problems particularly if the
increased register pressure results in spills.

This change implements GEP unmerging around an IndirectBr in certain cases to
mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP
merging has happened.)

With this patch, the Python interpreter loop runs faster by ~5%.

Reviewers: sanjoy, hfinkel

Reviewed By: hfinkel

Subscribers: eastig, junbuml, llvm-commits

Differential Revision: https://reviews.llvm.org/D36772

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312930 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 17:52:08 +00:00
Michael Zuckerman
9942f5143c [Interleved][Stride 3]Adding test for case the VF=64 target with AVX512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312907 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 10:57:15 +00:00
Sanjay Patel
44e68c6bb8 [InstSimplify] refactor udiv/urem code and add tests; NFCI
This removes some duplicated code and makes it easier to support signed div/rem
in a similar way if we want to do that. Note that the existing comments were not
accurate - we don't need a constant divisor to simplify; icmp simplification does
more than that. But as the added tests show, it could go even further.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312885 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-10 17:55:08 +00:00
Nuno Lopes
fe353a0cbf Merge isKnownNonNull into isKnownNonZero
It now knows the tricks of both functions.
Also, fix a bug that considered allocas of non-zero address space to be always non null

Differential Revision: https://reviews.llvm.org/D37628

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312869 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 18:23:11 +00:00
Sanjay Patel
2d82e741fe [DivRemPairs] split tests per target to account for bots that don't build for all targets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312863 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 14:10:59 +00:00
Sanjay Patel
193e898f75 [DivRempairs] add a pass to optimize div/rem pairs (PR31028)
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented 
as an independent pass, so there's no stretching of scope and feature creep for an existing pass. 
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost 
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028

The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow 
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.

Decomposing remainder may allow removing some code from the backend (PPC and possibly others).

Differential Revision: https://reviews.llvm.org/D37121 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 13:38:18 +00:00
Alexey Bataev
4fcc7e8528 [SLP] Support for horizontal min/max reduction.
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.

Differential revision: https://reviews.llvm.org/D27846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 13:49:36 +00:00
Max Kazantsev
2e0950763d Re-enable "[IRCE] Identify loops with latch comparison against current IV value"
Re-applying after the found bug was fixed.

Differential Revision: https://reviews.llvm.org/D36215


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 10:15:05 +00:00
Max Kazantsev
cf9cd148bd diff --git a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
index f72a808..9fa49fd 100644
--- a/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
+++ b/lib/Transforms/Scalar/InductiveRangeCheckElimination.cpp
@@ -450,20 +450,10 @@ struct LoopStructure {
   // equivalent to:
   //
   // intN_ty inc = IndVarIncreasing ? 1 : -1;
-  // pred_ty predicate = IndVarIncreasing
-  //                         ? IsSignedPredicate ? ICMP_SLT : ICMP_ULT
-  //                         : IsSignedPredicate ? ICMP_SGT : ICMP_UGT;
+  // pred_ty predicate = IndVarIncreasing ? ICMP_SLT : ICMP_SGT;
   //
-  //
-  // for (intN_ty iv = IndVarStart; predicate(IndVarBase, LoopExitAt);
-  //      iv = IndVarNext)
+  // for (intN_ty iv = IndVarStart; predicate(iv, LoopExitAt); iv = IndVarBase)
   //   ... body ...
-  //
-  // Here IndVarBase is either current or next value of the induction variable.
-  // in the former case, IsIndVarNext = false and IndVarBase points to the
-  // Phi node of the induction variable. Otherwise, IsIndVarNext = true and
-  // IndVarBase points to IV increment instruction.
-  //
 
   Value *IndVarBase;
   Value *IndVarStart;
@@ -471,13 +461,12 @@ struct LoopStructure {
   Value *LoopExitAt;
   bool IndVarIncreasing;
   bool IsSignedPredicate;
-  bool IsIndVarNext;
 
   LoopStructure()
       : Tag(""), Header(nullptr), Latch(nullptr), LatchBr(nullptr),
         LatchExit(nullptr), LatchBrExitIdx(-1), IndVarBase(nullptr),
         IndVarStart(nullptr), IndVarStep(nullptr), LoopExitAt(nullptr),
-        IndVarIncreasing(false), IsSignedPredicate(true), IsIndVarNext(false) {}
+        IndVarIncreasing(false), IsSignedPredicate(true) {}
 
   template <typename M> LoopStructure map(M Map) const {
     LoopStructure Result;
@@ -493,7 +482,6 @@ struct LoopStructure {
     Result.LoopExitAt = Map(LoopExitAt);
     Result.IndVarIncreasing = IndVarIncreasing;
     Result.IsSignedPredicate = IsSignedPredicate;
-    Result.IsIndVarNext = IsIndVarNext;
     return Result;
   }
 
@@ -841,42 +829,21 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
     return false;
   };
 
-  // `ICI` can either be a comparison against IV or a comparison of IV.next.
-  // Depending on the interpretation, we calculate the start value differently.
+  // `ICI` is interpreted as taking the backedge if the *next* value of the
+  // induction variable satisfies some constraint.
 
-  // Pair {IndVarBase; IsIndVarNext} semantically designates whether the latch
-  // comparisons happens against the IV before or after its value is
-  // incremented. Two valid combinations for them are:
-  //
-  // 1) { phi [ iv.start, preheader ], [ iv.next, latch ]; false },
-  // 2) { iv.next; true }.
-  //
-  // The latch comparison happens against IndVarBase which can be either current
-  // or next value of the induction variable.
   const SCEVAddRecExpr *IndVarBase = cast<SCEVAddRecExpr>(LeftSCEV);
   bool IsIncreasing = false;
   bool IsSignedPredicate = true;
-  bool IsIndVarNext = false;
   ConstantInt *StepCI;
   if (!IsInductionVar(IndVarBase, IsIncreasing, StepCI)) {
     FailureReason = "LHS in icmp not induction variable";
     return None;
   }
 
-  const SCEV *IndVarStart = nullptr;
-  // TODO: Currently we only handle comparison against IV, but we can extend
-  // this analysis to be able to deal with comparison against sext(iv) and such.
-  if (isa<PHINode>(LeftValue) &&
-      cast<PHINode>(LeftValue)->getParent() == Header)
-    // The comparison is made against current IV value.
-    IndVarStart = IndVarBase->getStart();
-  else {
-    // Assume that the comparison is made against next IV value.
-    const SCEV *StartNext = IndVarBase->getStart();
-    const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
-    IndVarStart = SE.getAddExpr(StartNext, Addend);
-    IsIndVarNext = true;
-  }
+  const SCEV *StartNext = IndVarBase->getStart();
+  const SCEV *Addend = SE.getNegativeSCEV(IndVarBase->getStepRecurrence(SE));
+  const SCEV *IndVarStart = SE.getAddExpr(StartNext, Addend);
   const SCEV *Step = SE.getSCEV(StepCI);
 
   ConstantInt *One = ConstantInt::get(IndVarTy, 1);
@@ -1060,7 +1027,6 @@ LoopStructure::parseLoopStructure(ScalarEvolution &SE,
   Result.IndVarIncreasing = IsIncreasing;
   Result.LoopExitAt = RightValue;
   Result.IsSignedPredicate = IsSignedPredicate;
-  Result.IsIndVarNext = IsIndVarNext;
 
   FailureReason = nullptr;
 
@@ -1350,9 +1316,8 @@ LoopConstrainer::RewrittenRangeInfo LoopConstrainer::changeIterationSpaceEnd(
                                       BranchToContinuation);
 
     NewPHI->addIncoming(PN->getIncomingValueForBlock(Preheader), Preheader);
-    auto *FixupValue =
-        LS.IsIndVarNext ? PN->getIncomingValueForBlock(LS.Latch) : PN;
-    NewPHI->addIncoming(FixupValue, RRI.ExitSelector);
+    NewPHI->addIncoming(PN->getIncomingValueForBlock(LS.Latch),
+                        RRI.ExitSelector);
     RRI.PHIValuesAtPseudoExit.push_back(NewPHI);
   }
 
@@ -1735,10 +1700,7 @@ bool InductiveRangeCheckElimination::runOnLoop(Loop *L, LPPassManager &LPM) {
   }
   LoopStructure LS = MaybeLoopStructure.getValue();
   const SCEVAddRecExpr *IndVar =
-      cast<SCEVAddRecExpr>(SE.getSCEV(LS.IndVarBase));
-  if (LS.IsIndVarNext)
-    IndVar = cast<SCEVAddRecExpr>(SE.getMinusSCEV(IndVar,
-                                                  SE.getSCEV(LS.IndVarStep)));
+      cast<SCEVAddRecExpr>(SE.getMinusSCEV(SE.getSCEV(LS.IndVarBase), SE.getSCEV(LS.IndVarStep)));
 
   Optional<InductiveRangeCheck::Range> SafeIterRange;
   Instruction *ExprInsertPt = Preheader->getTerminator();
diff --git a/test/Transforms/IRCE/latch-comparison-against-current-value.ll b/test/Transforms/IRCE/latch-comparison-against-current-value.ll
deleted file mode 100644
index afea0e6..0000000
--- a/test/Transforms/IRCE/latch-comparison-against-current-value.ll
+++ /dev/null
@@ -1,182 +0,0 @@
-; RUN: opt -verify-loop-info -irce-print-changed-loops -irce -S < %s 2>&1 | FileCheck %s
-
-; Check that IRCE is able to deal with loops where the latch comparison is
-; done against current value of the IV, not the IV.next.
-
-; CHECK: irce: in function test_01: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK: irce: in function test_02: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_03: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-; CHECK-NOT: irce: in function test_04: constrained Loop at depth 1 containing: %loop<header><exiting>,%in.bounds<latch><exiting>
-
-; SLT condition for increasing loop from 0 to 100.
-define void @test_01(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK:      test_01
-; CHECK:        entry:
-; CHECK-NEXT:     %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT:     [[COND2:%[^ ]+]] = icmp slt i32 0, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK:        loop:
-; CHECK-NEXT:     %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT:     %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT:     %abc = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK:        in.bounds:
-; CHECK-NEXT:     %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT:     store i32 0, i32* %addr
-; CHECK-NEXT:     %next = icmp slt i32 %idx, 100
-; CHECK-NEXT:     [[COND3:%[^ ]+]] = icmp slt i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK:        main.exit.selector:
-; CHECK-NEXT:     %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT:     [[COND4:%[^ ]+]] = icmp slt i32 %idx.lcssa, 100
-; CHECK-NEXT:     br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK:        loop.postloop:
-; CHECK-NEXT:    %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT:     %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT:     %abc.postloop = icmp slt i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT:     br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
-  %len = load i32, i32* %a_len_ptr, !range !0
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %abc = icmp slt i32 %idx, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp slt i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; ULT condition for increasing loop from 0 to 100.
-define void @test_02(i32* %arr, i32* %a_len_ptr) #0 {
-
-; CHECK:      test_02
-; CHECK:        entry:
-; CHECK-NEXT:     %exit.mainloop.at = load i32, i32* %a_len_ptr, !range !0
-; CHECK-NEXT:     [[COND2:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND2]], label %loop.preheader, label %main.pseudo.exit
-; CHECK:        loop:
-; CHECK-NEXT:     %idx = phi i32 [ %idx.next, %in.bounds ], [ 0, %loop.preheader ]
-; CHECK-NEXT:     %idx.next = add nuw nsw i32 %idx, 1
-; CHECK-NEXT:     %abc = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 true, label %in.bounds, label %out.of.bounds.loopexit1
-; CHECK:        in.bounds:
-; CHECK-NEXT:     %addr = getelementptr i32, i32* %arr, i32 %idx
-; CHECK-NEXT:     store i32 0, i32* %addr
-; CHECK-NEXT:     %next = icmp ult i32 %idx, 100
-; CHECK-NEXT:     [[COND3:%[^ ]+]] = icmp ult i32 %idx, %exit.mainloop.at
-; CHECK-NEXT:     br i1 [[COND3]], label %loop, label %main.exit.selector
-; CHECK:        main.exit.selector:
-; CHECK-NEXT:     %idx.lcssa = phi i32 [ %idx, %in.bounds ]
-; CHECK-NEXT:     [[COND4:%[^ ]+]] = icmp ult i32 %idx.lcssa, 100
-; CHECK-NEXT:     br i1 [[COND4]], label %main.pseudo.exit, label %exit
-; CHECK-NOT: loop.preloop:
-; CHECK:        loop.postloop:
-; CHECK-NEXT:    %idx.postloop = phi i32 [ %idx.copy, %postloop ], [ %idx.next.postloop, %in.bounds.postloop ]
-; CHECK-NEXT:     %idx.next.postloop = add nuw nsw i32 %idx.postloop, 1
-; CHECK-NEXT:     %abc.postloop = icmp ult i32 %idx.postloop, %exit.mainloop.at
-; CHECK-NEXT:     br i1 %abc.postloop, label %in.bounds.postloop, label %out.of.bounds.loopexit
-
-entry:
-  %len = load i32, i32* %a_len_ptr, !range !0
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %abc = icmp ult i32 %idx, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp ult i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; Same as test_01, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_03(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK:      test_03
-
-entry:
-  %len = load i64, i64* %a_len_ptr, !range !1
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %idx.ext = sext i32 %idx to i64
-  %abc = icmp slt i64 %idx.ext, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp slt i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-; Same as test_02, but comparison happens against IV extended to a wider type.
-; This test ensures that IRCE rejects it and does not falsely assume that it was
-; a comparison against iv.next.
-; TODO: We can actually extend the recognition to cover this case.
-define void @test_04(i32* %arr, i64* %a_len_ptr) #0 {
-
-; CHECK:      test_04
-
-entry:
-  %len = load i64, i64* %a_len_ptr, !range !1
-  br label %loop
-
-loop:
-  %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
-  %idx.next = add nsw nuw i32 %idx, 1
-  %idx.ext = sext i32 %idx to i64
-  %abc = icmp ult i64 %idx.ext, %len
-  br i1 %abc, label %in.bounds, label %out.of.bounds
-
-in.bounds:
-  %addr = getelementptr i32, i32* %arr, i32 %idx
-  store i32 0, i32* %addr
-  %next = icmp ult i32 %idx, 100
-  br i1 %next, label %loop, label %exit
-
-out.of.bounds:
-  ret void
-
-exit:
-  ret void
-}
-
-!0 = !{i32 0, i32 50}
-!1 = !{i64 0, i64 50}


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312775 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 04:26:41 +00:00
Peter Collingbourne
7480eb9933 WholeProgramDevirt: When promoting for single-impl devirt, also rename the comdat.
This is required when targeting COFF, as the comdat name must match
one of the names of the symbols in the comdat.

Differential Revision: https://reviews.llvm.org/D37550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312767 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 00:10:53 +00:00
Richard Trieu
2cf792edc2 Revert r312318, r312325, r312424, r312489
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318

Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312758 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 23:20:35 +00:00
Michael Zuckerman
563f2fdd92 [X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF{8|16|32} stride 3).
This patch expands the support of lowerInterleavedload to {8|16|32}x8i stride 3.

LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=3 VF={8|16|32}) and we plan to include the store (deinterleved side).

The patch goal is to optimize the following sequence:
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7

into

a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7

Reviewers
1. zvi
2. igor
3. guyblank
4. dorit
5. Ayal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312722 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 14:02:13 +00:00
Matt Arsenault
e0de89287c InstSimplify: canonicalize is idempotent
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312685 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 01:21:43 +00:00
Krzysztof Parzyszek
9ca441aa44 Disable jump threading into loop headers
Consider this type of a loop:
    for (...) {
      ...
      if (...) continue;
      ...
    }
Normally, the "continue" would branch to the loop control code that
checks whether the loop should continue iterating and which contains
the (often) unique loop latch branch. In certain cases jump threading
can "thread" the inner branch directly to the loop header, creating
a second loop latch. Loop canonicalization would then transform this
loop into a loop nest. The problem with this is that in such a loop
nest neither loop is countable even if the original loop was. This
may inhibit subsequent loop optimizations and be detrimental to
performance.

Differential Revision: https://reviews.llvm.org/D36404


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312664 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 19:36:58 +00:00
Sanjay Patel
04894a4949 [ValueTracking, InstCombine] canonicalize fcmp ord/uno with non-NAN ops to null constants
This is a preliminary step towards solving the remaining part of PR27145 - IR for isfinite():
https://bugs.llvm.org/show_bug.cgi?id=27145

In order to solve that one more generally, we need to add matching for and/or of fcmp ord/uno
with a constant operand.

But while looking at those patterns, I realized we were missing a canonicalization for nonzero
constants. Rather than limiting to just folds for constants, we're adding a general value
tracking method for this based on an existing DAG helper.

By transforming everything to 0.0, we can simplify the existing code in foldLogicOfFCmps()
and pick up missing vector folds.

Differential Revision: https://reviews.llvm.org/D37427


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 23:13:13 +00:00
Sanjay Patel
d92ccb5856 [InstCombine] add nnan tests; NFC
As suggested in D37427, we could have a value tracking function and folds that use
it to simplify these cases. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312578 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 21:20:35 +00:00
Reid Kleckner
c86178ea37 Add llvm.codeview.annotation to implement MSVC __annotation
Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.

Reviewers: majnemer

Subscribers: eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D36904

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312569 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 20:14:58 +00:00
Adam Nemet
a155485803 Split opt-remark YAML and opt output testing on this test
This prepares for https://reviews.llvm.org/D33514

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312544 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 18:03:39 +00:00
Craig Topper
203c00ded6 [InstCombine] Add test cases for folding (select (icmp ne/eq (and X, C1), (bitwiseop Y, C2), Y -> (bitwiseop Y, (shl/shr (and X, C1), C3)) or similar.
This is possible if C1 and C2 are both powers of 2. Or if binop is 'and' then ~C2 needs to be a power of 2.

We already support this for 'or', but we should be able to support 'and' and 'xor'. This will be enhanced by D37274.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312519 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 05:26:38 +00:00
Daniel Berlin
110f9f2e80 NewGVN: Fix PR 34430 - we need to look through predicateinfo copies to detect self-cycles of phi nodes. We also need to not ignore certain types of arguments when testing whether the phi has a backedge or was originally constant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312510 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 02:17:43 +00:00
Daniel Berlin
660fd0b5be NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312509 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 02:17:42 +00:00
Strahinja Petrovic
a78328c441 Fix test/Transforms/GlobalOpt/integer-bool-dwarf
This patch fixes regression related with 
integer-bool-dwarf test.

Patch by Nikola Prica.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312489 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 15:14:37 +00:00
Michael Zuckerman
e11eab53ee Update test for testing avx512
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312487 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 14:15:34 +00:00
Zvi Rackover
3438d07f09 LoopVectorize: MaxVF should not be larger than the loop trip count
Summary:
Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count.

Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow:
1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks.
2. Compute MaxVF returned 16 on a target supporting AVX512.
3. OptForSize -> choose VF:=MaxVF.
4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop.

With this patch step 2. will choose MaxVF=8 based on TripCount.

Reviewers: Ayal, dorit, mkuper, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D37425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312472 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 08:35:13 +00:00
Sam Parker
040fcc3883 [LoopUnroll][DebugInfo] Don't add metadata to unrolled remainder loop
Debug information can be, and was, corrupted when the runtime
remainder loop was fully unrolled. This is because a !null node can
be created instead of a unique one describing the loop. In this case,
the original node gets incorrectly updated with the NewLoopID
metadata.

In the case when the remainder loop is going to be quickly fully
unrolled, there isn't the need to add loop metadata for it anyway.

Differential Revision: https://reviews.llvm.org/D37338


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312471 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 08:12:16 +00:00
Sanjay Patel
74c232b75f [InstCombine] add tests for fcmp ord/uno canonicalization; NFC
Currently, we canonicalize some cases to use 0.0, but we miss others.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312445 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-03 15:35:10 +00:00
Don Hinton
f158190afe Fix buildbot failures for new test that requires the X86 target be built.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312424 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-02 22:56:58 +00:00
Sanjay Patel
03b20941fc [InstSimplify] regenerate checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312413 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-02 14:38:15 +00:00
Sanjay Patel
3e9b6b5304 [InstCombine] put 2 related tests in the same file; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312412 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-02 14:35:18 +00:00
Sanjay Patel
15de536e6e [InstSimplify] move fcmp simplification tests from InstCombine
These are all tests that result in a constant, so moving the tests over to where they are actually handled.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312411 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-02 14:27:00 +00:00
Daniel Berlin
026a351310 Fix PR/33305. caused by trying to simplify expressions in phi of ops that should have no leaders.
Summary:
After a discussion with Rekka, i believe this (or a small variant)
should fix the remaining phi-of-ops problems.

Rekka's algorithm for completeness relies on looking up expressions
that should have no leader, and expecting it to fail (IE looking up
expressions that can't exist in a predecessor, and expecting it to
find nothing).

Unfortunately, sometimes these expressions can be simplified to
constants, but we need the lookup to fail anyway.  Additionally, our
simplifier outsmarts this by taking these "not quite right"
expressions, and simplifying them into other expressions or walking
through phis, etc.  In the past, we've sometimes been able to find
leaders for these expressions, incorrectly.

This change causes us to not to try to phi of ops such expressions.
We determine safety by seeing if they depend on a phi node in our
block.

This is not perfect, we can do a bit better, but this should be a
"correctness start" that we can then improve.  It also requires a
bunch of caching that i'll eventually like to eliminate.

The right solution, longer term, to the simplifier issues, is to make
the query interface for the instruction simplifier/constant folder
have the flags we need, so that we can keep most things going, but
turn off the possibly-invalid parts (threading through phis, etc).
This is an issue in another wrong code bug as well.

Reviewers: davide, mcrosier

Subscribers: sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37175

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312401 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-02 02:18:44 +00:00
Craig Topper
85fcd3487c [InstCombine][InstSimplify] Teach decomposeBitTestICmp to look through truncate instructions
This patch teaches decomposeBitTestICmp to look through truncate instructions on the input to the compare. If a truncate is found it will now return the pre-truncated Value and appropriately extend the APInt mask.

This allows some code to be removed from InstSimplify that was doing this functionality.

This allows InstCombine's bit test combining code to match a pre-truncate Value with the same Value appear with an 'and' on another icmp. Or it allows us to combine a truncate to i16 and a truncate to i8. This also required removing the type check from the beginning of getMaskedTypeForICmpPair, but I believe that's ok because we still have to find two values from the input to each icmp that are equal before we'll do any transformation. So the type check was really just serving as an early out.

There was one user of decomposeBitTestICmp that didn't want to look through truncates, so I've added a flag to prevent that behavior when necessary.

Differential Revision: https://reviews.llvm.org/D37158

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312382 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 21:27:34 +00:00
Davide Italiano
e5593530f5 [TTI] Fix getGEPCost() for geps with a single operand.
Previously this would sporadically crash as TargetType
was never initialized. We special-case the single-operand
case returning earlier and trying to mimic the behaviour of
isLegalAddressingMode as closely as possible.

Differential Revision:  https://reviews.llvm.org/D37277

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312357 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 19:54:08 +00:00
Daniel Berlin
29492fbac5 NewGVN: Make sure we don't incorrectly use PredicateInfo when doing PHI of ops
Summary: When we backtranslate expressions, we can't use the predicateinfo, since we are evaluating them in a different context.

Reviewers: davide, mcrosier

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D37174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312352 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 19:20:18 +00:00
Strahinja Petrovic
88eda0cc8a Adding missing test case in rL312318
Adding test for debug info for integer 
variables whose type is shrinked to bool.

Patch by Nikola Prica.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312325 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 11:37:53 +00:00
Clement Courbet
4855d2de9a Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
Add missing header.

This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312322 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 10:56:34 +00:00
Clement Courbet
1a4fd5c74c Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer"
Break build

This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312317 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 09:43:08 +00:00
Clement Courbet
930b028c65 [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer
comparisons into memcmp.

Thanks to recent improvements in the LLVM codegen, the memcmp is typically
inlined as a chain of efficient hardware comparisons.
This typically benefits C++ member or nonmember operator==().

For now this is disabled by default until:
 - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
 - Benchmarks show that this is always useful.

Differential Revision:
https://reviews.llvm.org/D33987

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312315 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 09:07:05 +00:00
Akira Hatanaka
fec731bad8 [ObjCARC] Pass the correct BasicBlock to fix assertion failure.
The BasicBlock passed to FindPredecessorRetainWithSafePath should be the
parent block of Autorelease. This fixes a crash that occurs in
FindDependencies when StartInst is not in StartBB.

rdar://problem/33866381

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312266 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 18:27:47 +00:00
Sanjay Patel
da536d4e17 [InstCombine] improve demanded vector elements analysis of insertelement
Recurse instead of returning on the first found optimization. Also, return early in the caller
instead of continuing because that allows another round of simplification before we might
potentially lose undef information from a shuffle mask by eliminating the shuffle.

As noted in the review, we could probably do better and be more efficient by moving all of
demanded elements into a separate pass, but this is yet another quick fix to instcombine.

Differential Revision: https://reviews.llvm.org/D37236


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 15:57:17 +00:00
Max Kazantsev
5c5cdb3c21 [IRCE] Identify loops with latch comparison against current IV value
Current implementation of parseLoopStructure interprets the latch comparison as a
comarison against `iv.next`. If the actual comparison is made against the `iv` current value
then the loop may be rejected, because this misinterpretation leads to incorrect evaluation
of the latch start value.

This patch teaches the IRCE to distinguish this kind of loops and perform the optimization
for them. Now we use `IndVarBase` variable which can be either next or current value of the
induction variable (previously we used `IndVarNext` which was always the value on next iteration).

Differential Revision: https://reviews.llvm.org/D36215


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312221 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 07:04:20 +00:00
Sanjay Patel
e6ed9807f6 [InstCombine] add more vector demand examples; NFC
See D37236 for discussion. It seems unlikely that we actually want/need
to do this kind of folding in InstCombine in the long run, but moving
everything will be a bigger follow-up step.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312172 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-30 21:11:46 +00:00
Adrian Prantl
69e607f200 Canonicalize the representation of empty an expression in DIGlobalVariableExpression
This change simplifies code that has to deal with
DIGlobalVariableExpression and mirrors how we treat DIExpressions in
debug info intrinsics. Before this change there were two ways of
representing empty expressions on globals, a nullptr and an empty
!DIExpression().

If someone needs to upgrade out-of-tree testcases:
  perl -pi -e 's/(!DIGlobalVariableExpression\(var: ![0-9]*)\)/\1, expr: !DIExpression())/g' <MYTEST.ll>
will catch 95%.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312144 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-30 18:06:51 +00:00
Sanjay Patel
17f4bfb8d0 [InstCombine] remove unnecessary vector select fold; NFCI
This code is double-dead:
1. We simplify all selects with constant true/false condition in InstSimplify.
   I've minimized/moved the tests to show that works as expected.
2. All remaining vector selects with a constant condition are canonicalized to
   shufflevector, so we really can't see this pattern.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312123 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-30 14:04:57 +00:00
Florian Hahn
785320780e [InstCombine] Fold insert sequence if first ins has multiple users.
Summary:
If the first insertelement instruction has multiple users and inserts at
position 0, we can re-use this instruction when folding a chain of
insertelement instructions. As we need to generate the first
insertelement instruction anyways, this should be a strict improvement.

We could get rid of the restriction of inserting at position 0 by
creating a different shufflemask, but it is probably worth to keep the
first insertelement instruction with position 0, as this is easier to do
efficiently than at other positions I think.

Reviewers: grosser, mkuper, fpetrogalli, efriedma

Reviewed By: fpetrogalli

Subscribers: gareevroman, llvm-commits

Differential Revision: https://reviews.llvm.org/D37064

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312110 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-30 10:54:21 +00:00
Evgeniy Stepanov
aa1a72db63 [cfi] Avoid branch veneers in jump tables when possible.
Summary:
When jumptable encoding does not match target code encoding (arm vs
thumb), a veneer is inserted by the linker. We can not avoid this
in all cases, because entries within one jumptable must have the same
encoding, but we can make it less common by selecting the jumptable
encoding to match the majority of its targets.

This change only covers FullLTO, and not ThinLTO.

Reviewers: pcc

Subscribers: aemerson, mehdi_amini, javed.absar, kristof.beyls, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312054 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 22:40:19 +00:00
Evgeniy Stepanov
21f4a97e9c [cfi] Build __cfi_check as Thumb when applicable.
Summary:
Cross-DSO CFI needs all __cfi_check exports to use the same encoding
(ARM vs Thumb).

Reviewers: pcc

Subscribers: aemerson, srhines, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37243

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312052 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 22:29:15 +00:00
Wei Mi
0488e47901 [LoopUnswitch] Fix a simple bug which disables loop unswitch for select statement
This is to fix PR34257. rL309059 takes an early return when FindLIVLoopCondition
fails to find a loop invariant condition. This is wrong and it will disable loop
unswitch for select. The patch fixes the bug.

Differential Revision: https://reviews.llvm.org/D36985


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312045 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 21:45:11 +00:00
Alexey Bataev
c95fd24a5a [SimplifyCFG] Fix for PR34219: Preserve alignment after merging conditional stores.
Summary:
If SimplifyCFG pass is able to merge conditional stores into single one,
it loses the alignment. This may lead to incorrect codegen. Patch
sets the alignment of the new instruction if it is set in the original
one.

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312030 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 20:06:24 +00:00
Craig Topper
3ff9c137a4 [InstCombine] Support vector splats in transformZExtICmp
This patch adds splat support to transformZExtICmp. The test cases are vector versions of tests that failed when commenting out parts of the existing scalar code.

One test didn't vectorize optimize properly due to another bug so a TODO has been added.

Differential Revision: https://reviews.llvm.org/D37253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312023 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 18:58:13 +00:00
Davide Italiano
fac36eb3e3 [LoopUnroll] Make the test for PR33437 actually useful.
I forgot to specify -unroll-loop-peel, making this test not
really effective. While here, adjust some details (naming and
run line). Thanks to Sanjoy and Michael Z. for pointing out in
their post-commit reviews.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312015 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 17:24:09 +00:00
Dehao Chen
31ec888106 Add null check for promoted direct call
Summary: We originally assume that in pgo-icp, the promoted direct call will never be null after strip point casts. However, stripPointerCasts is so smart that it could possibly return the value of the function call if it knows that the return value is always an argument. In this case, the returned value cannot cast to Instruction. In this patch, null check is added to ensure null pointer will not be accessed.

Reviewers: tejohnson, xur, davidxl, djasper

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D37252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312005 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 15:28:12 +00:00
Alexey Bataev
a67ad7b80e [SimplifyCFG] Update initial test for better testing of the fix for
PR34219, NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312002 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 14:37:23 +00:00
Max Kazantsev
b074309e70 [LSR] Fix Shadow IV in case of integer overflow
When LSR processes code like

  int accumulator = 0;
  for (int i = 0; i < N; i++) {
    accummulator += i;
    use((double) accummulator);
  }

It may decide to replace integer `accumulator` with a double Shadow IV to get rid
of casts.  The problem with that is that the `accumulator`'s value may overflow.
Starting from this moment, the behavior of integer and double accumulators
will differ.

This patch strenghtens up the conditions of Shadow IV mechanism applicability.
We only allow it for IVs that are proved to be `AddRec`s with `nsw`/`nuw` flag.

Differential Revision: https://reviews.llvm.org/D37209


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311986 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 07:32:20 +00:00
Craig Topper
a540c13df1 [InstCombine] Uncomment two test cases that were commented out with a TODO about them not optimizing.
If we can't see the current code how will we ever know if they get fixed or even what the problem is?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311985 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 07:08:39 +00:00
Max Kazantsev
e7580586ab [NFC] Fix indents in test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311982 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 05:30:58 +00:00
Max Kazantsev
768c0b6db4 [NFC] Refactor ShadowIV test to use FileCheck
Also get rid of unnamed values that make the test hard to read.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311980 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 05:20:56 +00:00
Craig Topper
93198099d3 [InstCombine] Teach foldSelectICmpAndOr to handle vector splats
This was pretty close to working already. While I was here I went ahead and passed the ICmpInst pointer from the caller instead of doing a dyn_cast that can never fail.

Differential Revision: https://reviews.llvm.org/D37237

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311960 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 00:13:49 +00:00
Craig Topper
ccba49dfc2 [InstCombine] Teach select01 helper of foldSelectIntoOp to handle vector splats
We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars.

Differential Revision: https://reviews.llvm.org/D37232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 22:00:27 +00:00
Sanjay Patel
cd4a7cd9dc [InstCombine] add tests to show failure of SimplifyDemandedVectorElts + shuffle combining; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311934 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 21:14:26 +00:00
Davide Italiano
5e8dffb156 [LoopUnroll] Properly update loop structure in case of successful peeling.
When peeling kicks in, it updates the loop preheader.
Later, a successful full unroll of the loop needs to update a PHI
which i-th argument comes from the loop preheader, so it'd better look
at the correct block. Fixes PR33437.

Differential Revision:  https://reviews.llvm.org/D37153

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311922 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 20:29:33 +00:00
Taewook Oh
2dc1928441 Create PHI node for the return value only when the return value has uses.
Summary:
Currently, a phi node is created in the normal destination to unify the return values from promoted calls and the original indirect call. This patch makes this phi node to be created only when the return value has uses.

This patch is necessary to generate valid code, as compiler crashes with the attached test case without this patch. Without this patch, an illegal phi node that has no incoming value from `entry`/`catch` is created in `cleanup` block.

I think existing implementation is good as far as there is at least one use of the original indirect call. `insertCallRetPHI` creates a new phi node in the normal destination block only when the original indirect call dominates its use and the normal destination block. Otherwise, `fixupPHINodeForNormalDest` will handle the unification of return values naturally without creating a new phi node. However, if there's no use, `insertCallRetPHI` still creates a new phi node even when the original indirect call does not dominate the normal destination block, because `getCallRetPHINode` returns false.

Reviewers: xur, davidxl, danielcdh

Reviewed By: xur

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311906 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:57:00 +00:00
Craig Topper
a3ced95cbe [InstCombine] Call hasNoSignedWrap instead of hasNoUnsignedWrap to get the NSW flag when handling Add in SimplifyDemandedUseBits.
This is a typo from r311789.

This should fix PR34349.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311902 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:44:28 +00:00
Dehao Chen
3607b8f0f2 revert r310985 which breaks for the following case:
struct string {
  ~string();
};
void f2();
void f1(int) { f2(); }
void run(int c) {
  string body;
  while (true) {
    if (c)
      f1(c);
    else
      f1(c);
  }
}

Will recommit once the issue is fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311864 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 22:22:39 +00:00
Ayal Zaks
a183d6cf2a [LV] Fix PR34248 - recommit D32871 after revert r311304
Original commit r311077 of D32871 was reverted in r311304 due to failures
reported in PR34248.

This recommit fixes PR34248 by restricting the packing of predicated scalars
into vectors only when vectorizing, avoiding doing so when unrolling w/o
vectorizing. Added a test derived from the reproducer of PR34248.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311849 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 12:55:46 +00:00
Daniel Berlin
7824530ee0 NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs for loads, not just when we do it for stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311829 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 07:37:11 +00:00
Craig Topper
3ec9576f37 [InstCombine] Add tests to show missed opportunities to combine bit tests hidden by a sign compare and a truncate. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311784 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 17:14:35 +00:00
Florian Hahn
f360477df5 [LoopInterchange] Skip zext instructions when looking for induction var.
Summary:
SimplifyIndVar may introduce zext instructions to widen arguments of the
loop exit check. They should not prevent us from splitting the loop at
the induction variable, but maybe the check should be more conservative,
e.g. making sure it only extends arguments used by a comparison?

Reviewers: karthikthecool, mcrosier, mzolotukhin

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D34879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 16:52:29 +00:00
Amjad Aboud
513af851dd [InstCombine] Consider more cases where SimplifyDemandedUseBits does not convert AShr to LShr.
There are cases where AShr have better chance to be optimized than LShr, especially when the demanded bits are not known to be Zero, and also known to be similar to the sign bit.

Differential Revision: https://reviews.llvm.org/D36936




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311773 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 11:07:54 +00:00
Gor Nishanov
8970bfadd2 [coroutines] Add support for symmetric control transfer (musttail on coro.resumes followed by a suspend)
Summary:
Add musttail to any resume instructions that is immediately followed by a
suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call
for symmetrical coroutine control transfer (C++ Coroutines TS extension).
This transformation is done only in the resume part of the coroutine that has
identical signature and calling convention as the coro.resume call.

Reviewers: GorNishanov

Reviewed By: GorNishanov

Subscribers: EricWF, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D37125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 02:25:10 +00:00
Mandeep Singh Grang
89c6743f22 [ADT] Enable reverse iteration for DenseMap
Reviewers: mehdi_amini, dexonsmith, dblaikie, davide, chandlerc, davidxl, echristo, efriedma

Reviewed By: dblaikie

Subscribers: rsmith, mgorny, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D35043

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311730 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 23:02:48 +00:00
Xinliang David Li
092c93330a [Profile] backward propagate profile info in JumpThreading
Take-2 after fixing bugs in the original patch.

Differential Revsion: http://reviews.llvm.org/D36864


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311727 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 22:54:01 +00:00
Sanjay Patel
3ff3073768 [InstCombine] fix and enhance udiv/urem narrowing
There are 3 small independent changes here:

  1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count.
  2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p
  3. Enable all folds for vector types.

There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type.

Differential Revision: https://reviews.llvm.org/D36988


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311726 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 22:54:01 +00:00
Dehao Chen
d38687abb5 Move accurate-sample-profile into the function attribute.
Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO.

Reviewers: davidxl, rsmith

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311706 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 21:37:04 +00:00
Michael Zuckerman
9db416111e Adding base lit test for x86interleaved
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311658 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 14:11:28 +00:00
Evgeny Astigeevich
6e59618ef9 [ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes
ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes
for the Thumb1 target. This causes generation of redundant code and affects
performance.

This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106

Differential Revision: https://reviews.llvm.org/D36467


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311649 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 10:00:25 +00:00
Mikael Holmen
6dbfbe1563 [Reassociate] Do not drop debug location if replacement is missing
Summary:
When reassociating an expression, do not drop the instruction's
original debug location in case the replacement location is
missing.

The debug location must at least not be dropped for inlinable
callsites of debug-info-bearing functions in debug-info-bearing
functions. Failing to do so would result in an "inlinable function "
"call in a function with debug info must have a !dbg location"
error in the verifier.

As preserving the original debug location is not expected
to result in overly jumpy debug line information, it is
preserved for all other cases too.

This fixes PR34231:
https://bugs.llvm.org/show_bug.cgi?id=34231

Original patch by David Stenberg

Reviewers: davide, craig.topper, mcrosier, dblaikie, aprantl

Reviewed By: davide, aprantl

Subscribers: aprantl

Differential Revision: https://reviews.llvm.org/D36865

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311642 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:05:00 +00:00
Daniel Berlin
9dc6eddb1d NewGVN: We weren't properly simplifying selects with equal arguments due to a thinko.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311626 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 02:43:17 +00:00
Dehao Chen
4d101c0667 Add test to cover accurate-sample-profile.
Summary: This patch adds test to cover the logic guarded by "accurate-sample-profile" flag.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311618 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 23:19:11 +00:00
Rong Xu
7996242b16 [PGO] Set edge weights for indirectbr instruction with profile counts
Current PGO only annotates the edge weight for branch and switch instructions
with profile counts. We should also annotate the indirectbr instruction as
all the information is there. This patch enables the annotating for indirectbr
instructions. Also uses this annotation in branch probability analysis.

Differential Revision: https://reviews.llvm.org/D37074


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311604 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:36:02 +00:00
Reid Kleckner
a5b2af0eae Parse and print DIExpressions inline to ease IR and MIR testing
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.

This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.

See also PR22780, for making DIExpression not be an MDNode.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311594 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 20:31:27 +00:00
Hans Wennborg
18b8cfa7ee LowerAtomic: Don't skip optnone functions; atomic still need lowering (PR34020)
The lowering isn't really an optimization, so optnone shouldn't make a
difference. ARM relies on the pass running when using "-mthread-model
single", because in that mode, it doesn't run AtomicExpand. See bug for
more details.

Differential Revision: https://reviews.llvm.org/D37040

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 15:43:28 +00:00
Gor Nishanov
1e95aaa810 [coroutines] CoroBegin from inner coroutines should be considered for spills
Summary:
If a coroutine outer calls another coroutine inner and the inner coroutine body is inlined into the outer, coro.begin from the inner coroutine should be considered for spilling if accessed across suspends.

Prior to this change, coroutine frame building code was not considering any coro.begins for spilling.
With this change, we only ignore coro.begin for the current coroutine, but, any coro.begins that were inlined into the current coroutine are eligible for spills.

Fixes PR34267

Reviewers: GorNishanov

Subscribers: qcolombet, llvm-commits, EricWF

Differential Revision: https://reviews.llvm.org/D37062

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311556 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 14:47:52 +00:00
Chad Rosier
37d17304a3 [Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y)..
..if the resulting subtract will be broken up later.  This can cause us to get
into an infinite loop.

x + (-5.0 * y)      -> x - (5.0 * y)       ; Canonicalize neg const
x - (5.0 * y)       -> x + (0 - (5.0 * y)) ; Break up subtract
x + (0 - (5.0 * y)) -> x + (-5.0 * y)      ; Replace 0-X with X*-1.

PR34078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 14:10:06 +00:00
Davide Italiano
b7a48d833a [InstCombine] Fold branches with irrelevant conditions to a constant.
InstCombine folds instructions with irrelevant conditions to undef.
This, as Nuno confirmed is a bug.
(see https://bugs.llvm.org/show_bug.cgi?id=33409#c1 )

Given the original motivation for the change is that of removing an
USE, we now fold to false instead (which reaches the same goal
without undesired side effects).

Fixes PR33409.

Differential Revision:  https://reviews.llvm.org/D36975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311540 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 09:14:37 +00:00
Craig Topper
1bc52fbec6 [InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast
Looks like for 'and' and 'or' we end up performing at least some of the transformations this is bocking in a round about way anyway.

For 'and sext(cmp1), sext(cmp2) we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'.

With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok.

Differential Revision: https://reviews.llvm.org/D36213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311508 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 23:40:15 +00:00
Peter Collingbourne
bb516bcc22 WholeProgramDevirt: Create bitcast to i8* at each virtual call site.
We can't reuse the llvm.assume instruction's bitcast because it may not
dominate every user of the vtable pointer.

Differential Revision: https://reviews.llvm.org/D36994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311491 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 21:41:19 +00:00
Jakub Kuderski
5288e9123b [ADCE][Dominators] Reapply: Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

This is reapplies the original patch r311057 that was reverted in r311381.
The previous version wasn't using the batch update api for updating dominators,
which in vary rare cases caused assertion failures.

This also fixes PR34258.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311467 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 16:30:21 +00:00
Sanjay Patel
6995f18af0 [InstCombine] add udiv/urem tests with constant numerator; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311396 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 22:40:02 +00:00
Sanjay Patel
15e40f1526 [InstCombine] add more tests for udiv/urem narrowing; NFC
We don't currently limit these folds with hasOneUse() or shouldChangeType().


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311390 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 21:57:52 +00:00
Sanjoy Das
4547fffc04 Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators"
Summary: This partially reverts commit r311057 since it breaks ADCE.  See PR34258.

Reviewers: kuhar

Subscribers: mcrosier, david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D36979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311381 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 20:39:18 +00:00
Sanjay Patel
5aca549a9a [LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try)
The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.

Original commit message:

Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimpifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311366 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 19:13:14 +00:00
Craig Topper
bbbb2f573f [InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type
This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.

Differential Revision: https://reviews.llvm.org/D36498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311362 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 19:02:06 +00:00
Sanjay Patel
2e732d6a1b [InstCombine] add tests for memcmp with constant; NFC
This is the baseline (current) version of the tests that would
have been added with the transform in r311333 (reverted at 
r311340 due to inf-looping).

Adding these now to aid in testing and minimize the patch if/when
it is reinstated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311350 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:47:12 +00:00
Sam Elliott
b4d267277b Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311349 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:45:47 +00:00
Craig Topper
1952c98f8b [InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions
Summary:
If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.

Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.

Reviewers: spatel, majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311343 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:04:11 +00:00
Xinliang David Li
3ab3d94ff5 Revert 311208, 311209
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311341 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:00:38 +00:00