Commit Graph

12181 Commits

Author SHA1 Message Date
Sanjay Patel
85678535ba [Reassociate] add test for missing FP constant analysis; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336208 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 15:56:04 +00:00
Sanjay Patel
c810b4e17d [InstCombine] fold shuffle-with-binop and common value
This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments 
in this patch to complete this transform.

It's possible that a binop feeding a select-shuffle has been eliminated 
by earlier transforms (or the code was just written like this in the 1st 
place), so we'll fail to match the patterns that have 2 binops from: 
D48401, 
D48678, 
D48662, 
D48485.

In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass 
through the original values of the source operand).

I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. 
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).

Differential Revision: https://reviews.llvm.org/D48830


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 13:44:22 +00:00
Bjorn Pettersson
13e9d31258 [DebugInfo] Corrections for salvageDebugInfo
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).

Consider this example
  %vla = alloca i32, i64 2
  call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())

Instcombine will turn it into
  %vla1 = alloca [2 x i32]
  %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla, i64 0, i64 0
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())

If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
  %vla1 = alloca [2 x i32]
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())

The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))

I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid to, unneccessarily, turn a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).

Reviewers: aprantl, vsk

Reviewed By: aprantl, vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336191 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 11:29:00 +00:00
Chandler Carruth
ba89ffcade [PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when
unswitching loops.

Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:

1) We must invalidate any info SCEV has cached before unswitching as we
   may change (or destroy) the loop structure by the act of unswitching,
   and make it hard to recover everything we want to invalidate within
   SCEV.

2) We need to invalidate all of the loops whose CFGs are mutated by the
   unswitching. Notably, this isn't the *entire* loop nest, this is
   every loop contained by the outermost loop reached by an exit block
   relevant to the unswitch.

And we need to do this even when doing trivial unswitching.

I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.

However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.

Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.

Differential Revision: https://reviews.llvm.org/D47624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336183 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 09:13:27 +00:00
Max Kazantsev
6ac04e25a3 [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336172 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 06:23:57 +00:00
Tim Shen
7276550b78 [SCEV] Strengthen StrengthenNoWrapFlags (reapply r334428).
Summary:
Comment on Transforms/LoopVersioning/incorrect-phi.ll: With the change
SCEV is able to prove that the loop doesn't wrap-self (due to zext i16
to i64), disabling the entire loop versioning pass. Removed the zext and
just use i64.

Reviewers: sanjoy

Subscribers: jlebar, hiraditya, javed.absar, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D48409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336140 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 20:01:54 +00:00
Farhana Aleen
13f7859c20 [SLP] Recognize min/max pattern using instructions producing same values.
Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.

         %1 = extractelement <2 x i32> %a, i32 0
         %2 = extractelement <2 x i32> %a, i32 1
         %cond = icmp sgt i32 %1, %2
         %3 = extractelement <2 x i32> %a, i32 0
         %4 = extractelement <2 x i32> %a, i32 1
         %select = select i1 %cond, i32 %3, i32 %4

Author: FarhanaAleen

Reviewed By: ABataev, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D47608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:55:31 +00:00
Sanjay Patel
c9a157f7fd [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving 
to where other passes could use that functionality.

Differential Revision: https://reviews.llvm.org/D48662


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:42:29 +00:00
Simon Pilgrim
198bcb65d9 [SLPVectorizer][X86] Begin adding alternate tests for call operators
Alternate opcode handling only supports binary operators, these tests demonstrate a missed opportunity to vectorize ceil/floor calls

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336125 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:23:45 +00:00
Sanjay Patel
cdbffdd9b5 [ValueTracking] allow undef elements when matching vector abs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336111 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 14:43:40 +00:00
Sanjay Patel
736ca3f1be [InstCombine] adjust shuffle tests with IR flags; NFC
Due to current limitations in constant analysis, we need flags
on add or mul to show propagation for the potential transform
suggested in these tests (no other binops currently report 
identity constants).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336101 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 13:40:54 +00:00
Florian Hahn
2a87571b08 Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.
This version contains a fix to add values for which the state in ParamState change
to the worklist if the state in ValueState did not change. To avoid adding the
same value multiple times, mergeInValue returns true, if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.

Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.

Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.

Reviewers: dberlin, mssimpso, davide, efriedma

Reviewed By: mssimpso

Differential Revision: https://reviews.llvm.org/D43762


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336098 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 12:44:04 +00:00
Sanjay Patel
d9fdb86057 [InstCombine] add tests for shuffle-binop; NFC
This is another pattern mentioned in PR37806.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336096 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 12:30:46 +00:00
Simon Pilgrim
386f15c93a [SLPVectorizer] Fix alternate opcode + shuffle cost function to correct handle SK_Select patterns.
We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case.

This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336095 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 11:28:01 +00:00
Max Kazantsev
e52381764a [NFC] Test that shows unprofitability of instcombine with bit ranges
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336078 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 06:55:00 +00:00
Piotr Padlewski
c2f24d9ea8 Implement strip.invariant.group
Summary:
This patch introduce new intrinsic -
strip.invariant.group that was described in the
RFC: Devirtualization v2

Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar

Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47103

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336073 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 04:49:30 +00:00
Sanjay Patel
77442a72e0 [InstCombine] add abs tests with undef elts; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336065 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 17:14:37 +00:00
Sanjay Patel
6fed3d8070 [PatternMatch] allow undef elements in vectors with m_Neg
This is similar to the m_Not change from D44076.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336064 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 13:42:57 +00:00
David Green
e101271f21 [UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

  for i..
    ForeBlocks(i)
    for j..
      SubLoopBlocks(i, j)
    AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

  for i... i+=2
    ForeBlocks(i)
    ForeBlocks(i+1)
    for j..
      SubLoopBlocks(i, j)
      SubLoopBlocks(i+1, j)
    AftBlocks(i)
    AftBlocks(i+1)
  Remainder Loop

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336062 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 12:47:30 +00:00
Simon Pilgrim
243c2fa15c [SLPVectorizer][X86] Add some alternate tests for cast operators
Alternate opcode handling only supports binary operators, these tests demonstrate missed opportunities to vectorize some sitofp/uitofp and fptosi/fptoui style casts as well as some (successful) float bits manipulations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336060 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 11:29:46 +00:00
Eugene Leviant
eaabebdbe0 [Evaluator] Improve evaluation of call instruction
Recommit of r335324 after buildbot failure fix


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336059 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 11:02:07 +00:00
Sanjay Patel
4d00d7aadb [InstCombine] add tests for negate vector with undef elts; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336050 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-30 14:11:46 +00:00
Sanjay Patel
6e44255b70 [InstCombine] add more tests for shuffle-binop folds; NFC
The mul+shl tests add coverage for the fold enabled with D48678.
The and+or tests are not handled yet; that's D48662.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335984 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 15:28:11 +00:00
Sanjay Patel
ce6e09c740 [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, 
shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no 
danger of creating a difficult shuffle.

All of the earlier constraints still apply, but we also check for extra uses to 
avoid creating more instructions than we'll remove.

Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.

Differential Revision: https://reviews.llvm.org/D48678


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335974 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 13:44:06 +00:00
Roman Lebedev
998f860433 SCEVExpander::expandAddRecExprLiterally(): check before casting as Instruction
Summary:
An alternative to D48597.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=37936 | PR37936 ]].

The problem is as follows:
1. `indvars` marks `%dec` as `NUW`.
2. `loop-instsimplify` runs `instsimplify`, which constant-folds `%dec` to -1 (D47908)
3. `loop-reduce` tries to do some further modification, but crashes
    with an type assertion in cast, because `%dec` is no longer an `Instruction`,

If the runline is split into two, i.e. you first run `-indvars -loop-instsimplify`,
store that into a file, and then run `-loop-reduce`, there is no crash.

So it looks like the problem is due to `-loop-instsimplify` not discarding SCEV.
But in this case we can just not crash if it's not an `Instruction`.
This is just a local fix, unlike D48597, so there may very well be other problems.

Reviewers: mkazantsev, uabelho, sanjoy, silviu.baranga, wmi

Reviewed By: mkazantsev

Subscribers: evstupac, javed.absar, spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D48599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335950 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 07:44:20 +00:00
Sanjay Patel
87b704f790 [InstCombine] adjust shuffle tests; NFC
Use xor for the extra uses test because div/rem have other problems.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335924 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 21:14:02 +00:00
Teresa Johnson
1dc0b96afd [ThinLTO] Port InlinerFunctionImportStats handling to new PM
Summary:
The InlinerFunctionImportStats will collect and dump stats regarding how
many function inlined into the module were imported by ThinLTO.

Reviewers: wmi, dexonsmith

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D48729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335914 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 20:07:47 +00:00
Anastasis Grammenos
fbc17042e9 [SROA] Preserve DebugLoc when rewriting alloca partitions
When rewriting an alloca partition copy the DL from the
old alloca over the the new one.

Differential Revision: https://reviews.llvm.org/D48640

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335904 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 18:58:30 +00:00
Sanjay Patel
25be6bdd6f [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other 
direction because that's generally better of course). This allows us to remove the shuffle 
as we do in the regular opcodes-are-the-same cases.

This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv

Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already 
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There 
are planned enhancements for opcode transforms such or -> add.

Note that there's a different fold needed if we've already managed to simplify away a binop 
as seen in the test based on PR37806, but we manage to get that one case here because this 
fold is positioned above the demanded elements fold currently.

Differential Revision: https://reviews.llvm.org/D48485


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335888 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 17:48:04 +00:00
Florian Hahn
e6a8acefb8 [SCCP] Mark CFG as preserved.
SCCP does not change the CFG, so we can mark it as preserved.

Reviewers: dberlin, efriedma, davide

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D47149


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335820 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 09:53:38 +00:00
Max Kazantsev
3eb8221ad9 [IndVarSimplify] Ignore unreachable users of truncs
If a trunc has a user in a block which is not reachable from entry,
we can safely perform trunc elimination as if this user didn't exist.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335816 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 08:20:03 +00:00
Sanjay Patel
45f68e6a2e [InstCombine] add tests for vector-select-of-binops with 2 variables; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335778 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 20:23:47 +00:00
Sanjay Patel
c86651469c [InstCombine] add more tests for shuffle with different binops; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335756 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 17:21:57 +00:00
Craig Topper
480d03dbeb [X86] Rename the autoupgraded of packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name.
I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335744 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 15:57:53 +00:00
Adhemerval Zanella
751c17bfa0 [AArch64] Add custom lowering for v4i8 trunc store
This patch adds a custom trunc store lowering for v4i8 vector types.
Since there is not v.4b register, the v4i8 is promoted to v4i16 (v.4h)
and default action for v4i8 is to extract each element and issue 4
byte stores.

A better strategy would be to extended the promoted v4i16 to v8i16
(with undef elements) and extract and store the word lane which
represents the v4i8 subvectores. The construction:

  define void @foo(<4 x i16> %x, i8* nocapture %p) {
    %0 = trunc <4 x i16> %x to <4 x i8>
    %1 = bitcast i8* %p to <4 x i8>*
    store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2
    ret void
  }

Can be optimized from:

  umov    w8, v0.h[3]
  umov    w9, v0.h[2]
  umov    w10, v0.h[1]
  umov    w11, v0.h[0]
  strb    w8, [x0, #3]
  strb    w9, [x0, #2]
  strb    w10, [x0, #1]
  strb    w11, [x0]
  ret

To:

  xtn     v0.8b, v0.8h
  str     s0, [x0]
  ret

The patch also adjust the memory cost for autovectorization, so the C
code:

  void foo (const int *src, int width, unsigned char *dst)
  {
    for (int i = 0; i < width; i++)
       *dst++ = *src++;
  }

can be vectorized to:

  .LBB0_4:                                // %vector.body
                                          // =>This Inner Loop Header: Depth=1
        ldr     q0, [x0], #16
        subs    x12, x12, #4            // =4
        xtn     v0.4h, v0.4s
        xtn     v0.8b, v0.8h
        st1     { v0.s }[0], [x2], #4
        b.ne    .LBB0_4

Instead of byte operations.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335735 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 13:58:46 +00:00
Vedant Kumar
bde65c4dd9 [InstCombine] Avoid creating mis-sized dbg.values in commonCastTransforms()
This prevents InstCombine from creating mis-sized dbg.values when
replacing a sequence of casts with a simpler cast. For example, in:

  (fptrunc (floor (fpext X))) -> (floorf X)

We no longer emit dbg.value(X) (with a 32-bit float operand) to describe
(fpext X) (which is a 64-bit float).

This was diagnosed by the debugify check added in r335682.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335696 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 00:47:53 +00:00
Vedant Kumar
3dbfeaffe7 [Debugify] Handle failure to get fragment size when checking dbg.values
It's not possible to get the fragment size of some dbg.values. Teach the
mis-sized dbg.value diagnostic to detect this scenario and bail out.

Tested with:
$ find test/Transforms -print -exec opt -debugify-each -instcombine {} \;

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335695 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 00:47:52 +00:00
Michael Zolotukhin
d3c8f20a14 [JumpThreading] Don't try to rewrite a use if it's already valid.
Summary:
When recording uses we need to rewrite after cloning a loop we need to
check if the use is not dominated by the original def. The initial
assumption was that the cloned basic block will introduce a new path and
thus the original def will only dominate the use if they are in the same
BB, but as the reproducer from PR37745 shows it's not always the case.

This fixes PR37745.

Reviewers: haicheng, Ka-Ka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48111

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335675 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 22:19:48 +00:00
Matt Arsenault
737191dd29 ConstantFold: Don't fold global address vs. null for addrspace != 0
Not sure why this logic seems to be repeated in 2 different places,
one called by the other.

On AMDGPU addrspace(3) globals start allocating at 0, so these
checks will be incorrect (not that real code actually tries
to compare these addresses)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335649 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:55:43 +00:00
Matt Arsenault
93ae5a35af LoopUnroll: Allow analyzing intrinsic call costs
I'm not sure why the code here is skipping calls since
TTI does try to do something for general calls, but it
at least should allow intrinsics.

Skip intrinsics that should not be omitted as calls, which
is by far the most common case on AMDGPU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335645 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:51:17 +00:00
Sanjay Patel
5f8dac1929 [InstSimplify] fold shifts by sext bool
https://rise4fun.com/Alive/c3Y


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335633 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 17:31:38 +00:00
Sanjay Patel
84f6f2281a [InstSimplify] add tests for shifts by sext bool; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335631 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 17:15:07 +00:00
Sanjay Patel
e9e5731866 [InstCombine] fold urem with sext bool divisor
Similar to other patches in this series:
https://reviews.llvm.org/rL335512
https://reviews.llvm.org/rL335527
https://reviews.llvm.org/rL335597
https://reviews.llvm.org/rL335616

...this is filling a gap in analysis that is exposed by an unrelated select-of-constants transform.
I didn't see a way to unify the sext cases because each div/rem opcode results in a different fold.

Note that in this case, the backend might want to convert the select into math:
Name: sext urem
%e = sext i1 %x to i32
%r = urem i32 %y, %e
=>
%c = icmp eq i32 %y, -1
%z = zext i1 %c to i32
%r = add i32 %z, %y


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335622 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:30:00 +00:00
Simon Pilgrim
40be0055ae [SLPVectorizer] Recognise non uniform power of 2 constants
Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them.

As SLP works with arrays of values I don't think we can easily use the pattern match helpers here.

Differential Revision: https://reviews.llvm.org/D48214

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335621 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:20:16 +00:00
Sanjay Patel
0c1dedd8ed [InstCombine] add tests for urem with sext bool divisor; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335619 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:01:24 +00:00
Sanjay Patel
7706083ace [InstSimplify] fold srem with sext bool divisor
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335616 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 15:32:54 +00:00
Sanjay Patel
a301ecd20b [InstSimplify] add tests for srem with sext bool divisor; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335609 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 14:47:31 +00:00
Sanjay Patel
86366d2a15 [InstCombine] fold udiv with sext bool divisor
Note: I didn't add a hasOneUse() check because the existing,
related fold doesn't have that check. I suspect that the
improved analysis and codegen make these some of the rare
canonicalization cases where we allow an increase in
instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335597 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 12:41:15 +00:00
Florian Hahn
1766212247 [IPSCCP] Change dead blocks to unreachable after visiting all executable blocks.
changeToUnreachable may remove PHI nodes from executable blocks we found values
for and we would fail to replace them. By changing dead blocks to unreachable after
we replaced constants in all executable blocks, we ensure such PHI nodes are replaced
by their known value before.

Fixes PR37780.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D48421


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335588 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 10:15:02 +00:00
Bjorn Pettersson
2079dc34cb Improve ConvertDebugDeclareToDebugValue
Summary:
This is a follow-up to r334830 and r335031.

In the valueCoversEntireFragment check we now also handle
the situation when there is a variable length array (VLA)
involved, and the length of the array has been reduced to
a constant.

The ConvertDebugDeclareToDebugValue functions that are related
to PHI nodes and load instructions now avoid inserting dbg.value
intrinsics when the value does not, for certain, cover the
variable/fragment that should be described.
In r334830 we assumed that the value always covered the entire
var/fragment and we had assertions in the code to show that
assumption. However, those asserts failed when compiling code
with VLAs, so we removed the asserts in r335031. Now when we
know that the valueCoversEntireFragment check can fail also for
PHI/Load instructions we avoid to insert the faulty dbg.value
intrinsic in such situations. Compared to the Store instruction
scenario we simply drop the dbg.value here (as the variable does
not change its value due to PHI/Load, so an earlier dbg.value
describing the variable should still be valid).

Reviewers: aprantl, vsk, efriedma

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335580 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 06:17:00 +00:00