Summary:
Just because INC/DEC is a little slow on some processors doesn't mean we shouldn't prefer it when optimizing for size; on x86-64, for example, `inc eax` is two bytes while `add eax, 1` is three.
This appears to match gcc behavior.
Reviewers: chandlerc, zvi, RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312866 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to be a superset of the functionality from D31037 (EarlyCSE) but implemented
as an independent pass, so there's no stretching of scope and feature creep for an existing pass.
I also proposed a weaker version of this for SimplifyCFG in D30910. And I initially had almost
this same functionality as an addition to CGP in the motivating example of PR31028:
https://bugs.llvm.org/show_bug.cgi?id=31028
The advantage of positioning this ahead of SimplifyCFG in the pass pipeline is that it can allow
more flattening. But it needs to be after passes (InstCombine) that could sink a div/rem and
undo the hoisting that is done here.
Decomposing remainder may allow removing some code from the backend (PPC and possibly others).
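For reference, the decomposition rewrites a remainder in terms of the corresponding division; a minimal C++ sketch of the identity (not the pass's actual code):
```
// Given a hoisted division, the remainder can be decomposed as
// x % y == x - (x / y) * y, so no separate remainder instruction is needed.
int decomposedRem(int x, int y) {
  int quot = x / y;
  return x - quot * y; // equivalent to x % y for nonzero y
}
```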
Differential Revision: https://reviews.llvm.org/D37121
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312862 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Once we've done our custom isel for these nodes, I think we should be calling removeDeadNode to prune them out of the DAG. Table-driven isel ultimately either calls morphNodeTo, which modifies a node in place and doesn't leave dead nodes, or emits new nodes and then calls removeDeadNode as part of Opc_CompleteMatch.
If you run a simple multiply test case like this through llc with -debug, you'll see a umul_lohi node printed as part of the dump when instruction selection ends.
```
define i64 @foo(i64 %a, i64 %b) local_unnamed_addr #0 {
entry:
%conv = zext i64 %a to i128
%conv1 = zext i64 %b to i128
%mul = mul nuw nsw i128 %conv1, %conv
%shr = lshr i128 %mul, 64
%conv2 = trunc i128 %shr to i64
ret i64 %conv2
}
```
Reviewers: RKSimon, spatel, zvi, guyblank, niravd
Reviewed By: niravd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312857 91177308-0d34-0410-b5e6-96231b3b80d8
- Use range based for
- Variable names should start with upper case
- Add `const`
- Change class name to match filename
- Fix doxygen comments
- Use MCPhysReg instead of unsigned
- Use references instead of pointers where things cannot be nullptr
- Misc coding style improvements
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312846 91177308-0d34-0410-b5e6-96231b3b80d8
The lxv/stxv instructions require an offset that is a multiple of 16. Previously we were
selecting lxv/stxv for loads and stores to the stack where the offset from the
slot was a multiple of 16, but the stack slot itself was not aligned to 16 bytes or more.
When the frame gets lowered, these transform to r(1|31) + slot + offset.
If the slot is not aligned, slot + offset may not be a multiple of 16; for example,
a slot at offset 8 with an access offset of 16 yields 24.
Now we require 16-byte or greater alignment before selecting lxv/stxv for stack slots.
Includes a testcase that shows both sufficiently and insufficiently aligned
stack slots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312843 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed an issue in printImm64Operand where, if the value is
an expression, the expression was not printed properly. Previously,
it would print
r1 = <MCOperand Expr:(tx_port)>ll
With the patch, the printout will be
r1 = tx_port
Suggested-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312833 91177308-0d34-0410-b5e6-96231b3b80d8
The current TargetTransformInfo supports a throughput cost model and a code-size cost model, but some optimizations also need an instruction latency cost model. Hal suggested we need a single public interface to query the different costs of an instruction, so I proposed the following interface:
```
enum TargetCostKind {
  TCK_RecipThroughput, ///< Reciprocal throughput.
  TCK_Latency,         ///< The latency of instruction.
  TCK_CodeSize         ///< Instruction code size.
};

int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;
```
All clients should mainly use this function to query the cost of an instruction; the <kind> parameter specifies the desired cost model.
This patch also provides a simple default implementation of getInstructionLatency.
The default getInstructionLatency provides latency numbers for only a small number of instruction classes, and those latency numbers are only reasonable for modern out-of-order processors. It can be extended in the following ways:
- Add more detail into this function.
- Add getXXXLatency functions and call them from here.
- Implement target-specific getInstructionLatency functions.
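A minimal usage sketch (assuming a TargetTransformInfo reference `TTI`, an instruction `I`, and that the enum is accessed through TargetTransformInfo; illustrative, not code from this patch):
```
// Query the latency-based cost of a single instruction; the other
// TargetCostKind values are queried the same way.
int LatencyCost =
    TTI.getInstructionCost(&I, TargetTransformInfo::TCK_Latency);
```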
Differential Revision: https://reviews.llvm.org/D37170
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8
We have a lot of operand definition work essentially producing
every valid permutation of operands to work around building
operand lists based on the instruction features. Apparently TableGen
already has a mostly undocumented operator to concatenate dags, which
simplifies this.
Convert one simple place to use this. The BUF instruction definitions
have much more complicated logic that can be totally rewritten now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312822 91177308-0d34-0410-b5e6-96231b3b80d8
The various scalar bit operations set SCC,
so if one is erased or moved, SCC needs to be recomputed.
Not sure why the existing tests don't fail on this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312819 91177308-0d34-0410-b5e6-96231b3b80d8
A coverage segment contains a starting line and column, an execution
count, and some other metadata. Clients of the coverage library use
segments to prepare line-oriented reports.
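As a rough sketch of what a segment carries (field names are hypothetical, not the library's exact definition):
```
#include <cstdint>

// A simplified picture of a coverage segment.
struct Segment {
  unsigned Line;      // starting line of the segment
  unsigned Col;       // starting column of the segment
  uint64_t Count;     // execution count at this point
  bool HasCount;      // whether Count is meaningful for this segment
  bool IsRegionEntry; // whether a new region starts here
};
```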
Users of the coverage library depend on segments being unique and sorted
in source order. Currently this is not guaranteed (this is why the clang
change which introduced deferred regions was reverted).
This commit documents the "unique and sorted" condition and asserts that
it holds. It also fixes the SegmentBuilder so that it produces correct
output in some edge cases.
Testing: I've added unit tests for some edge cases. I've also checked
that the new SegmentBuilder implementation is fully covered. Apart from
running check-profile and the llvm-cov tests, I've successfully used a
stage1 llvm-cov to prepare a coverage report for an instrumented clang
binary.
Differential Revision: https://reviews.llvm.org/D36813
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312817 91177308-0d34-0410-b5e6-96231b3b80d8
Each source region has a start and end location. Report an error when
the end location precedes the start location.
The old lineExecutionCounts.covmapping test actually had a buggy source
region in it. This commit introduces a regenerated copy of the coverage
and moves the old copy to malformedRegions.covmapping, for a test.
Differential Revision: https://reviews.llvm.org/D37387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312814 91177308-0d34-0410-b5e6-96231b3b80d8
rL312641 allowed llvm.memcpy/memset/memmove to be tail calls when the parent
function returns the intrinsic's first argument. However, on the arm-none-eabi
platform, llvm.memcpy will be expanded to __aeabi_memcpy, which doesn't
have a return value. The fix is to check that the libcall name after expansion
matches "memcpy"/"memset"/"memmove" before allowing those intrinsics to be
tail calls.
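A sketch of why the tail call becomes invalid once the libcall changes (signatures per the ARM run-time ABI; illustrative, not the patched code):
```
#include <cstddef>

// memcpy returns its first argument, but the EABI variant returns nothing.
extern "C" void *memcpy(void *dest, const void *src, std::size_t n);
extern "C" void __aeabi_memcpy(void *dest, const void *src, std::size_t n);

void *copyBytes(void *d, const void *s, std::size_t n) {
  // As a tail call, memcpy's return value (dest) doubles as our own return
  // value. Once lowered to __aeabi_memcpy there is no return value to reuse,
  // so this call must not be emitted as a tail call.
  return memcpy(d, s, n);
}
```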
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312799 91177308-0d34-0410-b5e6-96231b3b80d8
The SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. This patch adds support for horizontal min/max reductions.
The function getReductionCost() is split into getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
This fixes PR26956.
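As an illustration, this is the shape of scalar code that can now be matched as a horizontal max reduction (example code, not from the patch):
```
// A select-based horizontal max over four elements; the SLP vectorizer
// can now recognize this reduction tree and vectorize it.
int max4(const int *a) {
  int m0 = a[0] > a[1] ? a[0] : a[1];
  int m1 = a[2] > a[3] ? a[2] : a[3];
  return m0 > m1 ? m0 : m1;
}
```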
Differential revision: https://reviews.llvm.org/D27846
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8
This reuses more of the existing machinery.
It is a follow-up to r312169.
Thanks to Björn Pettersson for the testcase!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312773 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes code-gen for XRay in PPC. The regression wasn't caught by
codegen tests which we add in this change.
What happened was the following:
- For tail exits, we used to unconditionally prepend the returns/exits
with a pseudo-instruction that gets lowered to the instrumentation
sled (and leave the actual return/exit instruction as-is).
- Changes to the XRay instrumentation pass caused tail exits to
  suddenly also emit the tail exit pseudo-instruction, since the check
  for whether a return instruction was also a call instruction now meant
  it was treated as a tail exit instruction.
- None of the tests caught the regression, either because the tests were
  non-existent or because they had been disabled/removed due to continuous breakage.
This change re-introduces some of the basic tests and verifies that
we're back to a state that allows the back-end to generate appropriate
XRay instrumented binaries for PPC in the presence of tail exits.
Reviewers: echristo, timshen
Subscribers: nemanjai, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D37570
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312772 91177308-0d34-0410-b5e6-96231b3b80d8
This extends the previous `add`/`sub` handling to cover the bitwise operators.
Nothing really exciting here, this just stamps out the rest of the core
operations that can RMW memory and set flags.
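A sketch of the kind of source pattern this targets (illustrative, not from the patch):
```
// `x &= m` can lower to an `and` with a memory operand (a single RMW
// instruction), and the comparison against zero can reuse the flags
// that the `and` already set.
bool andAndTest(unsigned &x, unsigned m) {
  x &= m;
  return x == 0;
}
```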
Still not implemented here: ADC, SBB. Those will require more
interesting logic to channel the flags *in*, and I'm not currently
planning to try to tackle that. It might be interesting for someone who
wants to improve our code generation for bignum implementations.
Differential Revision: https://reviews.llvm.org/D37141
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312768 91177308-0d34-0410-b5e6-96231b3b80d8
This extends the handling of RMW memory operands and used flags to support
matching immediate operands. This is a bit trickier than register operands,
and we still want to fall back on a register operand even for things that
appear to be "immediates" when they won't actually select into the operation's
immediate operand. This also requires us to handle things like selecting
`sub` vs. `add` to minimize the number of bits needed to represent the
immediate, and picking the shortest immediate encoding; for example, adding
128 can instead be encoded as subtracting -128, which fits in a sign-extended
8-bit immediate. In order to do that, we in turn need to scan to make sure
that CF isn't used, as it will get inverted.
The end result seems very nice though, and we're now generating
optimal instruction sequences for these patterns IMO.
A follow-up patch will further expand this to other operations with RMW
memory operands. But handling `add` and `sub` provides useful starting points
to flesh out the machinery and make sure interesting and complex cases
can be handled.
Thanks to Craig Topper who provided a few fixes and improvements to this
patch in addition to the review!
Differential Revision: https://reviews.llvm.org/D37139
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312764 91177308-0d34-0410-b5e6-96231b3b80d8
Most callers were not expecting the exit(0) and were trying to exit with a
different value.
This also adds back the call to cl::PrintHelpMessage in llvm-ar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312761 91177308-0d34-0410-b5e6-96231b3b80d8
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312759 91177308-0d34-0410-b5e6-96231b3b80d8
Right now, symbols must be either undefined or defined in a specific
section. However, some symbols have special section indexes, such as SHN_ABS.
This change adds support for outputting symbols that have such section
indexes.
Patch by Jake Ehrlich
Differential Revision: https://reviews.llvm.org/D37391
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312745 91177308-0d34-0410-b5e6-96231b3b80d8
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").
Differential Revision: https://reviews.llvm.org/D37589
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312744 91177308-0d34-0410-b5e6-96231b3b80d8
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.
On the LLVM side, NVPTX added the sm_70 GPU type, which bumps the required
PTX version to 6.0 but is otherwise equivalent to sm_62 at the moment.
Differential Revision: https://reviews.llvm.org/D37576
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312734 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.
This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar-looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements but also adds a few
regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312730 91177308-0d34-0410-b5e6-96231b3b80d8
These don't add any value as they're just compositions of existing
patterns. However, they can confuse the cost logic in ISel, leading to
duplicated vcvt instructions like in PR33199.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312724 91177308-0d34-0410-b5e6-96231b3b80d8