This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments
in this patch to complete this transform.
It's possible that a binop feeding a select-shuffle has been eliminated
by earlier transforms (or the code was just written like this in the 1st
place), so we'll fail to match the patterns that have 2 binops from:
D48401,
D48678,
D48662,
D48485.
In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass
through the original values of the source operand).
I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups.
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).
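A rough sketch of the single-binop case (hypothetical values, not taken from the actual tests):
  %a = add <4 x i32> %x, <i32 1, i32 2, i32 3, i32 4>
  %r = shufflevector <4 x i32> %a, <4 x i32> %y, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
Lanes 1 and 3 just pass %y through, so we can fill those "ghost" lanes of the constant
with the add identity (0) and perform the binop after the shuffle:
  %s = shufflevector <4 x i32> %x, <4 x i32> %y, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
  %r = add <4 x i32> %s, <i32 1, i32 0, i32 3, i32 0>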
Differential Revision: https://reviews.llvm.org/D48830
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).
Consider this example:
%vla = alloca i32, i64 2
call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())
InstCombine will turn it into:
%vla1 = alloca [2 x i32]
%vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla1, i64 0, i64 0
call void @llvm.dbg.declare(metadata i32* %vla1.sub, metadata !19, metadata !DIExpression())
If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
%vla1 = alloca [2 x i32]
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())
The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))
I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid unnecessarily turning a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).
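For example (a hypothetical sketch), if salvaging appends no new operands, we keep the
register location:
  call void @llvm.dbg.value(metadata i32 %x, metadata !1, metadata !DIExpression())
instead of needlessly producing the implicit location:
  call void @llvm.dbg.value(metadata i32 %x, metadata !1, metadata !DIExpression(DW_OP_stack_value))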
Reviewers: aprantl, vsk
Reviewed By: aprantl, vsk
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48837
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336191 91177308-0d34-0410-b5e6-96231b3b80d8
unswitching loops.
Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:
1) We must invalidate any info SCEV has cached before unswitching as we
may change (or destroy) the loop structure by the act of unswitching,
and make it hard to recover everything we want to invalidate within
SCEV.
2) We need to invalidate all of the loops whose CFGs are mutated by the
unswitching. Notably, this isn't the *entire* loop nest, this is
every loop contained by the outermost loop reached by an exit block
relevant to the unswitch.
And we need to do this even when doing trivial unswitching.
I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.
However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.
Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.
Differential Revision: https://reviews.llvm.org/D47624
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336183 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the order of transforms in InstCombineCompares to avoid
performing range-based transforms, which produce complex bit arithmetic,
before simpler things (like folding with constants) are done. See PR37636
for the motivating example.
Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336172 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Comment on Transforms/LoopVersioning/incorrect-phi.ll: with the change,
SCEV is able to prove that the loop doesn't self-wrap (due to the zext from
i16 to i64), which disables the entire loop versioning pass. Removed the
zext and just used i64.
Reviewers: sanjoy
Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits
Differential Revision: https://reviews.llvm.org/D48409
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336140 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: It is common to see the following min/max pattern during the intermediate stages of SLP vectorization since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization:
%1 = extractelement <2 x i32> %a, i32 0
%2 = extractelement <2 x i32> %a, i32 1
%cond = icmp sgt i32 %1, %2
%3 = extractelement <2 x i32> %a, i32 0
%4 = extractelement <2 x i32> %a, i32 1
%select = select i1 %cond, i32 %3, i32 %4
Author: FarhanaAleen
Reviewed By: ABataev, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D47608
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)
Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving
to where other passes could use that functionality.
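A hypothetical sketch of the shuffle case (the shl is only there to make the low bit of %x
known zero):
  %x = shl <4 x i32> %w, <i32 1, i32 1, i32 1, i32 1>
  %a = add <4 x i32> %x, <i32 4, i32 4, i32 4, i32 4>
  %o = or  <4 x i32> %x, <i32 1, i32 1, i32 1, i32 1>
  %r = shufflevector <4 x i32> %a, <4 x i32> %o, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
Since %x and 1 have no common bits set, the 'or' is equivalent to an 'add', so the whole
sequence can become:
  %r = add <4 x i32> %x, <i32 4, i32 1, i32 4, i32 1>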
Differential Revision: https://reviews.llvm.org/D48662
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8
Alternate opcode handling only supports binary operators; these tests demonstrate a missed opportunity to vectorize ceil/floor calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336125 91177308-0d34-0410-b5e6-96231b3b80d8
Due to current limitations in constant analysis, we need flags
on add or mul to show propagation for the potential transform
suggested in these tests (no other binops currently report
identity constants).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336101 91177308-0d34-0410-b5e6-96231b3b80d8
This version contains a fix to add values whose state in ParamState changed
to the worklist, even if the state in ValueState did not change. To avoid adding
the same value multiple times, mergeInValue returns true if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.
Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.
Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.
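A hypothetical example of the kind of fold this enables:
  define internal i1 @callee(i32 %p) {
    %c = icmp slt i32 %p, 100
    ret i1 %c
  }
  define i1 @caller(i1 %s) {
    %v = select i1 %s, i32 2, i32 9
    %r = call i1 @callee(i32 %v)
    ret i1 %r
  }
All call sites pass a value in [2, 9], so the range recorded in ParamState for %p lets the
solver fold %c to true directly, without tryToReplaceWithConstantRange.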
Reviewers: dberlin, mssimpso, davide, efriedma
Reviewed By: mssimpso
Differential Revision: https://reviews.llvm.org/D43762
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336098 91177308-0d34-0410-b5e6-96231b3b80d8
We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case.
This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336095 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch introduces a new intrinsic,
strip.invariant.group, which was described in the
RFC: Devirtualization v2.
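A minimal usage sketch (the mangled name below assumes the usual pointer-type suffix):
  declare i8* @llvm.strip.invariant.group.p0i8(i8*)
  define i8* @strip(i8* %p) {
    %q = call i8* @llvm.strip.invariant.group.p0i8(i8* %p)
    ret i8* %q
  }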
Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar
Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47103
Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336073 91177308-0d34-0410-b5e6-96231b3b80d8
This is a simple implementation of the unroll-and-jam classical loop
optimisation.
The basic idea is that we take an outer loop of the form:
  for i..
    ForeBlocks(i)
    for j..
      SubLoopBlocks(i, j)
    AftBlocks(i)
Instead of doing normal inner or outer unrolling, we unroll as follows:
  for i... i+=2
    ForeBlocks(i)
    ForeBlocks(i+1)
    for j..
      SubLoopBlocks(i, j)
      SubLoopBlocks(i+1, j)
    AftBlocks(i)
    AftBlocks(i+1)
  Remainder Loop
So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.
To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).
Differential Revision: https://reviews.llvm.org/D41953
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336062 91177308-0d34-0410-b5e6-96231b3b80d8
Alternate opcode handling only supports binary operators; these tests demonstrate missed opportunities to vectorize some sitofp/uitofp and fptosi/fptoui style casts as well as some (successful) float bit manipulations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336060 91177308-0d34-0410-b5e6-96231b3b80d8
The mul+shl tests add coverage for the fold enabled with D48678.
The and+or tests are not handled yet; that's D48662.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335984 91177308-0d34-0410-b5e6-96231b3b80d8
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806
If we have 2 different variable values, then we shuffle (select) those lanes,
shuffle (select) the constants, and then perform the binop. This eliminates a binop.
The new shuffle uses the same shuffle mask as the existing shuffle, so there's no
danger of creating a difficult shuffle.
All of the earlier constraints still apply, but we also check for extra uses to
avoid creating more instructions than we'll remove.
Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.
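A hypothetical sketch of the fold:
  %b0 = add <4 x i32> %x, <i32 1, i32 1, i32 1, i32 1>
  %b1 = add <4 x i32> %y, <i32 2, i32 2, i32 2, i32 2>
  %r  = shufflevector <4 x i32> %b0, <4 x i32> %b1, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
becomes one shuffle of the variables, one shuffle of the constants (folded to a constant
vector here), and a single binop:
  %s = shufflevector <4 x i32> %x, <4 x i32> %y, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
  %r = add <4 x i32> %s, <i32 1, i32 2, i32 1, i32 2>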
Differential Revision: https://reviews.llvm.org/D48678
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335974 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
An alternative to D48597.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=37936 | PR37936 ]].
The problem is as follows:
1. `indvars` marks `%dec` as `NUW`.
2. `loop-instsimplify` runs `instsimplify`, which constant-folds `%dec` to -1 (D47908).
3. `loop-reduce` tries to do some further modification, but crashes
with a type assertion in a `cast<>`, because `%dec` is no longer an `Instruction`.
If the runline is split into two, i.e. you first run `-indvars -loop-instsimplify`,
store that into a file, and then run `-loop-reduce`, there is no crash.
So it looks like the problem is due to `-loop-instsimplify` not discarding SCEV.
But in this case we can just not crash if it's not an `Instruction`.
This is just a local fix, unlike D48597, so there may very well be other problems.
Reviewers: mkazantsev, uabelho, sanjoy, silviu.baranga, wmi
Reviewed By: mkazantsev
Subscribers: evstupac, javed.absar, spatel, llvm-commits
Differential Revision: https://reviews.llvm.org/D48599
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335950 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The InlinerFunctionImportStats will collect and dump stats regarding how
many functions inlined into the module were imported by ThinLTO.
Reviewers: wmi, dexonsmith
Subscribers: mehdi_amini, inglorion, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D48729
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335914 91177308-0d34-0410-b5e6-96231b3b80d8
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806
We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other
direction because that's generally better of course). This allows us to remove the shuffle
as we do in the regular opcodes-are-the-same cases.
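A hypothetical sketch:
  %m = mul <4 x i32> %x, <i32 3, i32 3, i32 3, i32 3>
  %s = shl <4 x i32> %x, <i32 2, i32 2, i32 2, i32 2>
  %r = shufflevector <4 x i32> %m, <4 x i32> %s, <4 x i32> <i32 0, i32 5, i32 2, i32 7>
Treating the shl as a multiply by 4 lets us merge everything into one binop:
  %r = mul <4 x i32> %x, <i32 3, i32 4, i32 3, i32 4>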
This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv
Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There
are planned enhancements for opcode transforms such as or -> add.
Note that there's a different fold needed if we've already managed to simplify away a binop
as seen in the test based on PR37806, but we manage to get that one case here because this
fold is positioned above the demanded elements fold currently.
Differential Revision: https://reviews.llvm.org/D48485
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335888 91177308-0d34-0410-b5e6-96231b3b80d8
If a trunc has a user in a block which is not reachable from entry,
we can safely perform trunc elimination as if this user didn't exist.
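A hypothetical shape of the situation (not taken from the actual tests):
  define i64 @f(i64 %x) {
  entry:
    %t = trunc i64 %x to i32
    %e = sext i32 %t to i64     ; given range info proving %x fits in 32 bits, this pair could be removed
    ret i64 %e
  dead:                         ; no predecessors; not reachable from entry
    %u = add i32 %t, 1          ; this user of %t can be ignored
    br label %dead
  }
The user in %dead does not block treating the trunc as eliminable.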
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335816 91177308-0d34-0410-b5e6-96231b3b80d8
I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking built in. When we reach that goal, we should have no intrinsics named "avx512.mask".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335744 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds a custom trunc store lowering for v4i8 vector types.
Since there is no v.4b register, the v4i8 is promoted to v4i16 (v.4h)
and the default action for v4i8 is to extract each element and issue 4
byte stores.
A better strategy would be to extend the promoted v4i16 to v8i16
(with undef elements) and extract and store the word lane which
represents the v4i8 subvector. The construction:
define void @foo(<4 x i16> %x, i8* nocapture %p) {
  %0 = trunc <4 x i16> %x to <4 x i8>
  %1 = bitcast i8* %p to <4 x i8>*
  store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2
  ret void
}
Can be optimized from:
umov w8, v0.h[3]
umov w9, v0.h[2]
umov w10, v0.h[1]
umov w11, v0.h[0]
strb w8, [x0, #3]
strb w9, [x0, #2]
strb w10, [x0, #1]
strb w11, [x0]
ret
To:
xtn v0.8b, v0.8h
str s0, [x0]
ret
The patch also adjusts the memory cost for autovectorization, so the C
code:
void foo (const int *src, int width, unsigned char *dst)
{
  for (int i = 0; i < width; i++)
    *dst++ = *src++;
}
can be vectorized to:
.LBB0_4: // %vector.body
// =>This Inner Loop Header: Depth=1
ldr q0, [x0], #16
subs x12, x12, #4 // =4
xtn v0.4h, v0.4s
xtn v0.8b, v0.8h
st1 { v0.s }[0], [x2], #4
b.ne .LBB0_4
Instead of byte operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335735 91177308-0d34-0410-b5e6-96231b3b80d8
This prevents InstCombine from creating mis-sized dbg.values when
replacing a sequence of casts with a simpler cast. For example, in:
(fptrunc (floor (fpext X))) -> (floorf X)
We no longer emit dbg.value(X) (with a 32-bit float operand) to describe
(fpext X) (which is a 64-bit float).
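Spelled out in IR, the fold looks roughly like this (a hypothetical sketch):
  %e = fpext float %x to double
  %f = call double @floor(double %e)
  %t = fptrunc double %f to float
becomes:
  %t = call float @floorf(float %x)
and we should not describe the removed 64-bit %e with a dbg.value of the 32-bit %x.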
This was diagnosed by the debugify check added in r335682.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335696 91177308-0d34-0410-b5e6-96231b3b80d8
It's not possible to get the fragment size of some dbg.values. Teach the
mis-sized dbg.value diagnostic to detect this scenario and bail out.
Tested with:
$ find test/Transforms -print -exec opt -debugify-each -instcombine {} \;
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335695 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When recording the uses we need to rewrite after cloning a loop, we need to
check if the use is not dominated by the original def. The initial
assumption was that the cloned basic block will introduce a new path and
thus the original def will only dominate the use if they are in the same
BB, but as the reproducer from PR37745 shows, that is not always the case.
This fixes PR37745.
Reviewers: haicheng, Ka-Ka
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48111
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335675 91177308-0d34-0410-b5e6-96231b3b80d8
Not sure why this logic seems to be repeated in 2 different places,
one called by the other.
On AMDGPU, addrspace(3) globals start allocating at address 0, so these
checks will be incorrect (not that real code actually tries
to compare these addresses).
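For example (a hypothetical sketch), with LDS allocation starting at 0:
  @lds = internal addrspace(3) global i32 undef, align 4
folding a compare like 'icmp eq i32 addrspace(3)* @lds, null' to false would be wrong,
because @lds may genuinely live at address 0.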
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335649 91177308-0d34-0410-b5e6-96231b3b80d8
I'm not sure why the code here is skipping calls since
TTI does try to do something for general calls, but it
at least should allow intrinsics.
Skip intrinsics that should not be omitted as calls, which
is by far the most common case on AMDGPU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335645 91177308-0d34-0410-b5e6-96231b3b80d8
Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them.
As SLP works with arrays of values I don't think we can easily use the pattern match helpers here.
Differential Revision: https://reviews.llvm.org/D48214
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335621 91177308-0d34-0410-b5e6-96231b3b80d8
Note: I didn't add a hasOneUse() check because the existing,
related fold doesn't have that check. I suspect that the
improved analysis and codegen make these some of the rare
canonicalization cases where we allow an increase in
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335597 91177308-0d34-0410-b5e6-96231b3b80d8
changeToUnreachable may remove PHI nodes from executable blocks we found values
for, and we would then fail to replace them. By changing dead blocks to unreachable
only after we have replaced constants in all executable blocks, we ensure such PHI
nodes are replaced by their known value first.
Fixes PR37780.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D48421
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335588 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a follow-up to r334830 and r335031.
In the valueCoversEntireFragment check we now also handle
the situation when there is a variable length array (VLA)
involved, and the length of the array has been reduced to
a constant.
The ConvertDebugDeclareToDebugValue functions that are related
to PHI nodes and load instructions now avoid inserting dbg.value
intrinsics when the value does not, for certain, cover the
variable/fragment that should be described.
In r334830 we assumed that the value always covered the entire
var/fragment and we had assertions in the code to check that
assumption. However, those asserts failed when compiling code
with VLAs, so we removed the asserts in r335031. Now that we
know the valueCoversEntireFragment check can also fail for
PHI/Load instructions, we avoid inserting the faulty dbg.value
intrinsic in such situations. Compared to the Store instruction
scenario, we simply drop the dbg.value here (as the variable does
not change its value due to PHI/Load, so an earlier dbg.value
describing the variable should still be valid).
Reviewers: aprantl, vsk, efriedma
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48547
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335580 91177308-0d34-0410-b5e6-96231b3b80d8