Commit Graph

20183 Commits

Author SHA1 Message Date
Sanjay Patel
c810b4e17d [InstCombine] fold shuffle-with-binop and common value
This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments 
in this patch to complete this transform.

It's possible that a binop feeding a select-shuffle has been eliminated 
by earlier transforms (or the code was just written like this in the 1st 
place), so we'll fail to match the patterns that have 2 binops from: 
D48401, 
D48678, 
D48662, 
D48485.

In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass 
through the original values of the source operand).

I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. 
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).
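
The identity-constant idea can be checked with a small lane-wise model. This is only a sketch under stated assumptions: 8-bit lanes, Python lists standing in for vectors, and hypothetical helper names (`binop`, `select_shuffle`, `fold_with_identity`) that are not LLVM APIs.

```python
import operator

# Identity constants for the five commutative integer binops named above.
IDENTITY = {"add": 0, "mul": 1, "and": -1, "or": 0, "xor": 0}
OPS = {"add": operator.add, "mul": operator.mul, "and": operator.and_,
       "or": operator.or_, "xor": operator.xor}
MASK = 0xFF  # assumption: model 8-bit lanes


def binop(name, xs, cs):
    """Apply the named binop lane by lane, wrapping to 8 bits."""
    return [OPS[name](x, c) & MASK for x, c in zip(xs, cs)]


def select_shuffle(take_first, a, b):
    """Select-shuffle: lane i comes from a if take_first[i], else from b."""
    return [x if t else y for t, x, y in zip(take_first, a, b)]


def fold_with_identity(name, take_binop, xs, cs):
    """Model shuffle(binop(X, C), X) as binop(X, C'), where C' holds the
    identity constant in every pass-through ("ghost") lane."""
    identity = IDENTITY[name] & MASK
    new_cs = [c if t else identity for t, c in zip(take_binop, cs)]
    return binop(name, xs, new_cs)
```

With `xs = [1, 2, 3, 250]`, `cs = [5, 6, 7, 8]` and lanes 0 and 2 taken from the binop, the shuffled and folded forms agree for all five opcodes.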

Differential Revision: https://reviews.llvm.org/D48830


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 13:44:22 +00:00
Bjorn Pettersson
13e9d31258 [DebugInfo] Corrections for salvageDebugInfo
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).

Consider this example
  %vla = alloca i32, i64 2
  call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())

Instcombine will turn it into
  %vla1 = alloca [2 x i32]
  %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla1, i64 0, i64 0
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())

If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
  %vla1 = alloca [2 x i32]
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())

The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))

I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid unnecessarily turning a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).

Reviewers: aprantl, vsk

Reviewed By: aprantl, vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336191 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 11:29:00 +00:00
Chandler Carruth
ba89ffcade [PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when
unswitching loops.

Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:

1) We must invalidate any info SCEV has cached before unswitching as we
   may change (or destroy) the loop structure by the act of unswitching,
   and make it hard to recover everything we want to invalidate within
   SCEV.

2) We need to invalidate all of the loops whose CFGs are mutated by the
   unswitching. Notably, this isn't the *entire* loop nest, this is
   every loop contained by the outermost loop reached by an exit block
   relevant to the unswitch.

And we need to do this even when doing trivial unswitching.

I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.

However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.

Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.

Differential Revision: https://reviews.llvm.org/D47624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336183 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 09:13:27 +00:00
Max Kazantsev
6ac04e25a3 [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
This patch changes the order of transforms in InstCombineCompares to avoid
performing range-based transforms, which produce complex bit arithmetic,
before simpler ones (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336172 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 06:23:57 +00:00
Alina Sbirlea
89e8753d7b Replace "Replacable" with "Replaceable". [NFC]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336133 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 18:53:40 +00:00
Farhana Aleen
13f7859c20 [SLP] Recognize min/max pattern using instructions producing same values.
Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.

         %1 = extractelement <2 x i32> %a, i32 0
         %2 = extractelement <2 x i32> %a, i32 1
         %cond = icmp sgt i32 %1, %2
         %3 = extractelement <2 x i32> %a, i32 0
         %4 = extractelement <2 x i32> %a, i32 1
         %select = select i1 %cond, i32 %3, i32 %4
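
A rough sketch of what "instructions producing same values" could mean for this pattern. It is illustrative only: the tuple types and the `same_value` and `match_minmax` helpers are hypothetical, and the actual matcher in the patch may differ.

```python
from collections import namedtuple

# Hypothetical miniature IR: just enough to express the pattern above.
Extract = namedtuple("Extract", "name vec idx")
Cmp = namedtuple("Cmp", "pred lhs rhs")
Select = namedtuple("Select", "cond tval fval")

# select(a > b, a, b) is a max; select(a < b, a, b) is a min.
MINMAX = {"sgt": "smax", "slt": "smin", "ugt": "umax", "ult": "umin"}


def same_value(a, b):
    """Two extractelements yield the same value when they read the same
    lane of the same vector, even if they are distinct instructions."""
    return (a.vec, a.idx) == (b.vec, b.idx)


def match_minmax(sel):
    """Return the min/max kind if the select operands match the compare
    operands by value, or None if the pattern does not apply."""
    cmp = sel.cond
    if same_value(cmp.lhs, sel.tval) and same_value(cmp.rhs, sel.fval):
        return MINMAX.get(cmp.pred)
    return None
```

For the IR above, `%1`/`%3` and `%2`/`%4` are distinct instructions that read the same lanes of `%a`, so the select is recognized as a signed max.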

Author: FarhanaAleen

Reviewed By: ABataev, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D47608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:55:31 +00:00
Sanjay Patel
c9a157f7fd [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving 
to where other passes could use that functionality.
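
The arithmetic behind the reversal can be sanity-checked in a few lines. This is a sketch assuming 8-bit values; `no_common_bits` and `or_as_add` are hypothetical names standing in for the value-tracking query and the rewritten operation.

```python
def no_common_bits(x, c):
    """True when x and c never have the same bit set."""
    return (x & c) == 0


def or_as_add(x, c, bits=8):
    """Compute 'or x, c' as 'add x, c'; valid only without common bits,
    because then no bit position produces a carry."""
    if not no_common_bits(x, c):
        raise ValueError("or -> add needs operands with no common bits")
    return (x + c) & ((1 << bits) - 1)
```

For example, `0b1010 | 0b0101` and `0b1010 + 0b0101` are both `0b1111`.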

Differential Revision: https://reviews.llvm.org/D48662


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:42:29 +00:00
Simon Pilgrim
a40b37f909 [SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost
This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336102 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 13:41:29 +00:00
Florian Hahn
2a87571b08 Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.
This version contains a fix to add values for which the state in ParamState changes
to the worklist if the state in ValueState did not change. To avoid adding the
same value multiple times, mergeInValue returns true if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.

Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to using
ValueLatticeElement for all values.

Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.

Reviewers: dberlin, mssimpso, davide, efriedma

Reviewed By: mssimpso

Differential Revision: https://reviews.llvm.org/D43762


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336098 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 12:44:04 +00:00
Simon Pilgrim
386f15c93a [SLPVectorizer] Fix alternate opcode + shuffle cost function to correctly handle SK_Select patterns.
We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case.

This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336095 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 11:28:01 +00:00
Simon Pilgrim
a43dcd4394 [SLPVectorizer] Only Alternate opcodes use ShuffleVector cases for getEntryCost/vectorizeTree. NFCI.
Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336092 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 10:54:19 +00:00
Simon Pilgrim
a634a07771 [SLPVectorizer] Call InstructionsState.isOpcodeOrAlt with Instruction instead of an opcode. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336069 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 20:22:46 +00:00
Simon Pilgrim
b1dbeaa9ce [SLPVectorizer] Replace sameOpcodeOrAlt with InstructionsState.isOpcodeOrAlt helper. NFCI.
This is a basic step towards matching more general instructions types than just opcodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336068 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 20:07:30 +00:00
Simon Pilgrim
4c7a6ba2a4 [SLPVectorizer] Use InstructionsState Op/Alt opcodes directly. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336063 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 13:41:58 +00:00
David Green
e101271f21 [UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

  for i..
    ForeBlocks(i)
    for j..
      SubLoopBlocks(i, j)
    AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

  for i... i+=2
    ForeBlocks(i)
    ForeBlocks(i+1)
    for j..
      SubLoopBlocks(i, j)
      SubLoopBlocks(i+1, j)
    AftBlocks(i)
    AftBlocks(i+1)
  Remainder Loop

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).
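
The reordering can be made concrete by recording the order in which the blocks execute. The sketch below assumes an unroll factor of 2 and uses hypothetical function names; it only models the schedule, not the legality checks.

```python
def original(n, m):
    """Execution order of the untransformed loop nest."""
    trace = []
    for i in range(n):
        trace.append(("fore", i))
        for j in range(m):
            trace.append(("sub", i, j))
        trace.append(("aft", i))
    return trace


def unroll_and_jam(n, m):
    """Execution order after unrolling the outer loop by 2 and jamming
    the two copies of the inner loop into one."""
    trace = []
    i = 0
    while i + 1 < n:
        trace.append(("fore", i))
        trace.append(("fore", i + 1))
        for j in range(m):
            trace.append(("sub", i, j))
            trace.append(("sub", i + 1, j))
        trace.append(("aft", i))
        trace.append(("aft", i + 1))
        i += 2
    # Remainder loop: handles the last iteration when n is odd.
    while i < n:
        trace.append(("fore", i))
        for j in range(m):
            trace.append(("sub", i, j))
        trace.append(("aft", i))
        i += 1
    return trace
```

Both versions execute the same set of blocks; only the order changes, which is why `ForeBlocks(i+1)` has to be provably safe to run before `SubLoopBlocks(i, j)` and `AftBlocks(i)`.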

Differential Revision: https://reviews.llvm.org/D41953



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336062 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 12:47:30 +00:00
Eugene Leviant
eaabebdbe0 [Evaluator] Improve evaluation of call instruction
Recommit of r335324 after buildbot failure fix


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336059 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 11:02:07 +00:00
Chandler Carruth
b2b950d7b1 [instsimplify] Move the instsimplify pass to use more obvious file names
and directory.

Also cleans up all the associated naming to be consistent and removes
the public access to the pass ID which was unused in LLVM.

Also runs clang-format over parts that changed, which generally cleans
up a bunch of formatting.

This is in preparation for doing some internal cleanups to the pass.

Differential Revision: https://reviews.llvm.org/D47352

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336028 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 23:36:03 +00:00
Alex Shlyapnikov
1c3bbb4664 [HWASan] Do not retag allocas before return from the function.
Summary:
Retagging allocas before returning from the function might help
detecting use after return bugs, but it does not work at all in real
life, when instrumented and non-instrumented code is intermixed.
Consider the following code:

F_non_instrumented() {
  T x;
  F1_instrumented(&x);
  ...
}

{
  F_instrumented();
  F_non_instrumented();
}

- F_instrumented call leaves the stack below the current sp tagged
  randomly for UAR detection
- F_non_instrumented allocates its own vars on that tagged stack,
  not generating any tags, that is the address of x has tag 0, but the
  shadow memory still contains tags left behind by F_instrumented on the
  previous step
- F1_instrumented verifies &x before using it and traps on tag mismatch,
  0 vs whatever tag was set by F_instrumented
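
The three steps can be played through with a toy model of shadow memory. Everything here is hypothetical (the dict used as shadow memory, the addresses, the tag values); it only illustrates why a stale tag and a zero pointer tag disagree.

```python
def f_instrumented(shadow, frame, leftover_tag):
    """Instrumented callee: on return its dead frame keeps leftover_tag
    in shadow memory (non-zero models retagging for UAR detection)."""
    for addr in frame:
        shadow[addr] = leftover_tag


def f_non_instrumented(frame):
    """Uninstrumented caller: reuses the same stack slot for x and never
    writes shadow memory, so the pointer it passes carries tag 0."""
    addr_of_x = frame[0]
    pointer_tag = 0
    return addr_of_x, pointer_tag


def f1_instrumented_traps(shadow, addr, pointer_tag):
    """Instrumented check: trap when pointer tag and shadow tag differ."""
    return shadow.get(addr, 0) != pointer_tag


def scenario_traps(leftover_tag):
    """Run the call sequence from the example; True means a trap."""
    shadow = {}
    frame = [0x1000, 0x1010]  # hypothetical stack slots
    f_instrumented(shadow, frame, leftover_tag)
    addr, tag = f_non_instrumented(frame)
    return f1_instrumented_traps(shadow, addr, tag)
```

In this model `scenario_traps(0x5A)` reports a trap, while a leftover tag of 0 does not.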

Reviewers: eugenis

Subscribers: srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D48664

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336011 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 20:20:17 +00:00
Sean Fertile
551913f7e3 Revert "Extend CFGPrinter and CallPrinter with Heat Colors"
This reverts r335996 which broke graph printing in Polly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336000 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 17:48:58 +00:00
Sean Fertile
8b6f52ae8e Extend CFGPrinter and CallPrinter with Heat Colors
Extends the CFGPrinter and CallPrinter with heat colors based on heuristics or
profiling information. The colors are enabled by default and can be toggled
on/off for CFGPrinter by using the option -cfg-heat-colors for both
-dot-cfg[-only] and -view-cfg[-only].  Similarly, the colors can be toggled
on/off for CallPrinter by using the option -callgraph-heat-colors for both
-dot-callgraph and -view-callgraph.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D40425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335996 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 17:13:58 +00:00
Sanjay Patel
ce6e09c740 [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, 
shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no 
danger of creating a difficult shuffle.
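
A lane-wise model of this rewrite, as a sketch only: 8-bit lanes, `add` as the example opcode, and hypothetical helper names.

```python
MASK = 0xFF  # assumption: 8-bit lanes


def add_lanes(xs, cs):
    """Lane-wise add, wrapped to 8 bits."""
    return [(x + c) & MASK for x, c in zip(xs, cs)]


def select_lanes(mask, a, b):
    """Select-shuffle: lane i comes from a if mask[i], else from b."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def before_fold(mask, x, c1, y, c2):
    """Two binops feeding one select-shuffle."""
    return select_lanes(mask, add_lanes(x, c1), add_lanes(y, c2))


def after_fold(mask, x, c1, y, c2):
    """Select the variables and the constants with the same mask,
    then perform a single binop."""
    return add_lanes(select_lanes(mask, x, y), select_lanes(mask, c1, c2))
```

Each lane sees the same operands in both forms, so the results match while one binop disappears.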

All of the earlier constraints still apply, but we also check for extra uses to 
avoid creating more instructions than we'll remove.

Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.

Differential Revision: https://reviews.llvm.org/D48678


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335974 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 13:44:06 +00:00
Sanjay Patel
83bc27f0aa [InstCombine] fix opcode check in shuffle fold
There's no way to expose this difference currently, 
but we should use the updated variable because the
original opcodes can go stale if we transform into
something new.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335920 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 20:52:43 +00:00
Teresa Johnson
1dc0b96afd [ThinLTO] Port InlinerFunctionImportStats handling to new PM
Summary:
The InlinerFunctionImportStats will collect and dump stats regarding how
many functions inlined into the module were imported by ThinLTO.

Reviewers: wmi, dexonsmith

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D48729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335914 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 20:07:47 +00:00
Anastasis Grammenos
fbc17042e9 [SROA] Preserve DebugLoc when rewriting alloca partitions
When rewriting an alloca partition copy the DL from the
old alloca over to the new one.

Differential Revision: https://reviews.llvm.org/D48640

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335904 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 18:58:30 +00:00
Sanjay Patel
25be6bdd6f [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other 
direction because that's generally better of course). This allows us to remove the shuffle 
as we do in the regular opcodes-are-the-same cases.
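
The shift-to-multiply equivalence is easy to confirm in wrapping arithmetic. The sketch assumes 8-bit values and hypothetical helper names, and it does not model poison, which is what the hack mentioned in this message is about.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x, c):
    """Shift left by a constant c, wrapping to 8 bits."""
    return (x << c) & MASK


def mul(x, k):
    """Multiply, wrapping to 8 bits."""
    return (x * k) & MASK


def shl_amount_to_multiplier(c):
    """The multiply constant equivalent to a shift-left by c."""
    return 1 << c
```

For instance, `shl(3, 2)` and `mul(3, shl_amount_to_multiplier(2))` are both 12.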

This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv

Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already 
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There 
are planned enhancements for opcode transforms such as or -> add.

Note that there's a different fold needed if we've already managed to simplify away a binop 
as seen in the test based on PR37806, but we manage to get that one case here because this 
fold is positioned above the demanded elements fold currently.

Differential Revision: https://reviews.llvm.org/D48485


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335888 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 17:48:04 +00:00
Benjamin Kramer
1bbef2cc89 Revert "Add support for generating a call graph profile from Branch Frequency Info."
This reverts commits r335794 and r335797. Breaks ThinLTO+FDO selfhost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 13:15:03 +00:00
Jesper Antonsson
9d41c557cb Comment change to verify commit rights. NFC.
Summary: Just a silly one-character correction.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48709

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335832 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 10:55:04 +00:00
Florian Hahn
e6a8acefb8 [SCCP] Mark CFG as preserved.
SCCP does not change the CFG, so we can mark it as preserved.

Reviewers: dberlin, efriedma, davide

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D47149


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335820 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 09:53:38 +00:00
Max Kazantsev
3eb8221ad9 [IndVarSimplify] Ignore unreachable users of truncs
If a trunc has a user in a block which is not reachable from entry,
we can safely perform trunc elimination as if this user didn't exist.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335816 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 08:20:03 +00:00
Michael J. Spencer
09b856ac45 [CGProfile] Fix unused variable warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335797 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 00:12:04 +00:00
Michael J. Spencer
71b21d8fcc Add support for generating a call graph profile from Branch Frequency Info.
=== Generating the CG Profile ===

The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions.  For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.

After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}

!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```
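
The edge collection described above might be sketched like this. The input shape and the `build_cg_profile` name are hypothetical, and summing the weights of repeated caller/callee pairs is an assumption rather than something this message states.

```python
from collections import defaultdict


def build_cg_profile(functions):
    """functions maps a caller name to a list of basic blocks, each given
    as (block_profile_count, [callee names called in that block])."""
    edges = defaultdict(int)
    for caller, blocks in functions.items():
        for count, callees in blocks:
            for callee in callees:
                # Assumption: repeated edges accumulate their weights.
                edges[(caller, callee)] += count
    return dict(edges)
```

Feeding it a module where `a` calls `b` in a block with count 32, and `freq` calls `a` and `b` in blocks with counts 11 and 20, yields the same three edges as the metadata example.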

Differential Revision: https://reviews.llvm.org/D48105

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335794 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 23:58:08 +00:00
Teresa Johnson
752939e86a [ThinLTO] Print names in function import debug messages when available
Summary:
Rather than just print the GUID, when it is available in the index,
print the global name as well in the function import thin link debug
messages. Names will be available when the combined index is being
built by the same process, e.g. a linker or "llvm-lto2 run".

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits

Differential Revision: https://reviews.llvm.org/D48612

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335760 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 18:03:39 +00:00
Craig Topper
480d03dbeb [X86] Rename the autoupgraded of packed fp compare and fpclass intrinsics that don't take a mask as input to exclude '.mask.' from their name.
I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking builtin. When we reach that goal, we should have no intrinsics named "avx512.mask".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335744 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 15:57:53 +00:00
Vedant Kumar
bde65c4dd9 [InstCombine] Avoid creating mis-sized dbg.values in commonCastTransforms()
This prevents InstCombine from creating mis-sized dbg.values when
replacing a sequence of casts with a simpler cast. For example, in:

  (fptrunc (floor (fpext X))) -> (floorf X)

We no longer emit dbg.value(X) (with a 32-bit float operand) to describe
(fpext X) (which is a 64-bit float).

This was diagnosed by the debugify check added in r335682.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335696 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 00:47:53 +00:00
Evgeniy Stepanov
0ed99e80c7 Revert "[asan] Instrument comdat globals on COFF targets"
Causes false positive ODR violation reports on __llvm_profile_raw_version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335681 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 22:43:48 +00:00
Michael Zolotukhin
d3c8f20a14 [JumpThreading] Don't try to rewrite a use if it's already valid.
Summary:
When recording the uses that we need to rewrite after cloning a loop, we need to
check whether the use is not dominated by the original def. The initial
assumption was that the cloned basic block will introduce a new path and
thus the original def will only dominate the use if they are in the same
BB, but as the reproducer from PR37745 shows it's not always the case.

This fixes PR37745.

Reviewers: haicheng, Ka-Ka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48111

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335675 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 22:19:48 +00:00
Vedant Kumar
8d224a36c1 Use a variable to appease a no-asserts bot, NFC
Failure URL:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/22836

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335648 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:55:26 +00:00
Matt Arsenault
93ae5a35af LoopUnroll: Allow analyzing intrinsic call costs
I'm not sure why the code here is skipping calls since
TTI does try to do something for general calls, but it
at least should allow intrinsics.

Skip intrinsics that should not be omitted as calls, which
is by far the most common case on AMDGPU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335645 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:51:17 +00:00
Vedant Kumar
99917384c8 [Local] Add a convenient insertReplacementDbgValues overload, NFC
Add an overload for the common case where the replacement dbg.values
have the same DIExpressions as the originals.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335643 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:44:53 +00:00
Vedant Kumar
6a78ba2a8c [Local] Sink salvageDI's early exit into helper functions, NFC
salvageDebugInfo() performs a check that allows it to exit early without
doing a DenseMap lookup. It's a bit neater and marginally more useful to
sink this early exit into the findDbg{Addr,Users,Values} helpers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335642 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:44:52 +00:00
Sanjay Patel
41f2034b16 [InstCombine] simplify code for urem fold; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335623 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:39:29 +00:00
Sanjay Patel
e9e5731866 [InstCombine] fold urem with sext bool divisor
Similar to other patches in this series:
https://reviews.llvm.org/rL335512
https://reviews.llvm.org/rL335527
https://reviews.llvm.org/rL335597
https://reviews.llvm.org/rL335616

...this is filling a gap in analysis that is exposed by an unrelated select-of-constants transform.
I didn't see a way to unify the sext cases because each div/rem opcode results in a different fold.

Note that in this case, the backend might want to convert the select into math:
Name: sext urem
%e = sext i1 %x to i32
%r = urem i32 %y, %e
=>
%c = icmp eq i32 %y, -1
%z = zext i1 %c to i32
%r = add i32 %z, %y
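
The replacement above can be checked exhaustively at a smaller width. This sketch assumes 8-bit values instead of i32 and uses hypothetical helper names; the `%x == false` case is excluded because a zero divisor is undefined behavior in the source.

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1  # the unsigned view of -1


def sext_bool(x):
    """sext i1 %x: true becomes all-ones (-1), false becomes 0."""
    return ALL_ONES if x else 0


def urem_source(y, x):
    """urem %y, (sext %x), interpreted as unsigned."""
    divisor = sext_bool(x)
    if divisor == 0:
        raise ZeroDivisionError("urem by zero is undefined")
    return y % divisor


def urem_target(y):
    """add (zext (icmp eq %y, -1)), %y with wrapping."""
    c = 1 if y == ALL_ONES else 0
    return (c + y) & ALL_ONES
```

The only input where `urem` by all-ones changes `%y` is `%y == -1`, which maps to 0 in both forms.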


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335622 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:30:00 +00:00
Simon Pilgrim
40be0055ae [SLPVectorizer] Recognise non uniform power of 2 constants
Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them.

As SLP works with arrays of values I don't think we can easily use the pattern match helpers here.

Differential Revision: https://reviews.llvm.org/D48214

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335621 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:20:16 +00:00
Sanjay Patel
86366d2a15 [InstCombine] fold udiv with sext bool divisor
Note: I didn't add a hasOneUse() check because the existing,
related fold doesn't have that check. I suspect that the
improved analysis and codegen make these some of the rare
canonicalization cases where we allow an increase in
instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335597 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 12:41:15 +00:00
Florian Hahn
1766212247 [IPSCCP] Change dead blocks to unreachable after visiting all executable blocks.
changeToUnreachable may remove PHI nodes from executable blocks we found values
for and we would fail to replace them. By changing dead blocks to unreachable after
we replaced constants in all executable blocks, we ensure such PHI nodes are replaced
by their known value before.

Fixes PR37780.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D48421


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335588 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 10:15:02 +00:00
Bjorn Pettersson
2079dc34cb Improve ConvertDebugDeclareToDebugValue
Summary:
This is a follow-up to r334830 and r335031.

In the valueCoversEntireFragment check we now also handle
the situation when there is a variable length array (VLA)
involved, and the length of the array has been reduced to
a constant.

The ConvertDebugDeclareToDebugValue functions that are related
to PHI nodes and load instructions now avoid inserting dbg.value
intrinsics when the value does not, for certain, cover the
variable/fragment that should be described.
In r334830 we assumed that the value always covered the entire
var/fragment and we had assertions in the code to show that
assumption. However, those asserts failed when compiling code
with VLAs, so we removed the asserts in r335031. Now that we
know that the valueCoversEntireFragment check can also fail for
PHI/Load instructions, we avoid inserting the faulty dbg.value
intrinsic in such situations. Compared to the Store instruction
scenario we simply drop the dbg.value here (as the variable does
not change its value due to PHI/Load, so an earlier dbg.value
describing the variable should still be valid).

Reviewers: aprantl, vsk, efriedma

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335580 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 06:17:00 +00:00
Gil Rapaport
d7d68723b8 [InstCombine] (A + 1) + (B ^ -1) --> A - B
Turn canonicalized subtraction back into (-1 - B) and combine it with (A + 1) into (A - B).
This is similar to the folding already done for (B ^ -1) + Const into (-1 + Const) - B.
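
The identity follows from `B ^ -1 == -1 - B` in two's complement. A quick exhaustive check, assuming 8-bit wrapping arithmetic and hypothetical function names:

```python
MASK = 0xFF  # assumption: 8-bit wrapping arithmetic


def source(a, b):
    """(A + 1) + (B ^ -1), wrapped to 8 bits."""
    return ((a + 1) + (b ^ -1)) & MASK


def target(a, b):
    """A - B, wrapped to 8 bits."""
    return (a - b) & MASK
```

For example, `source(5, 3)` and `target(5, 3)` are both 2.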

Differential Revision: https://reviews.llvm.org/D48535


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335579 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 05:31:18 +00:00
Chandler Carruth
224904bf2c [PM/LoopUnswitch] Teach the new unswitch to handle nontrivial
unswitching of switches.

This works much like trivial unswitching of switches in that it reliably
moves the switch out of the loop. Here we potentially clone the entire
loop into each successor of the switch and re-point the cases at these
clones.

Due to the complexity of actually doing nontrivial unswitching, this
patch doesn't create a dedicated routine for handling switches -- it
would duplicate far too much code. Instead, it generalizes the existing
routine to handle both branches and switches as it largely reduces to
looping in a few places instead of doing something once. This actually
improves the results in some cases with branches due to being much more
careful about how dead regions of code are managed. With branches,
because exactly one clone is created and there are exactly two edges
considered, somewhat sloppy handling of the dead regions of code was
sufficient in most cases. But with switches, there are much more
complicated patterns of dead code and so I've had to move to a more
robust model generally. We still do as much pruning of the dead code
early as possible because that allows us to avoid even cloning the code.

This also surfaced another problem with nontrivial unswitching before
which is that we weren't as precise in reconstructing loops as we could
have been. This seems to have been mostly harmless, but resulted in
pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we
have to get this *right*, and everything benefits from it.

While the testing may seem a bit light here because we only have two
real cases with actual switches, they do a surprisingly good job of
exercising numerous edge cases. Also, because we share the logic with
branches, most of the changes in this patch are reasonably well covered
by existing tests.

The new unswitch now has all of the same fundamental power as the old
one with the exception of the single unsound case of *partial* switch
unswitching -- that really is just loop specialization and not
unswitching at all. It doesn't fit into the canonicalization model in
any way. We can add a loop specialization pass that runs late based on
profile data if important test cases ever come up here.

Differential Revision: https://reviews.llvm.org/D47683

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335553 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 23:32:54 +00:00
Sanjay Patel
d3b9487abb [InstCombine] cleanup udiv folds; NFCI
This removes a "UDivFoldAction" in favor of a simple constant
matcher. In theory, the existing code could do more matching,
but I don't see any evidence or need for it. I've left a TODO
about using ValueTracking in case we see any regressions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335545 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 22:50:26 +00:00
Benjamin Kramer
5c3069979f [Instrumentation] Remove unused include
It's also a layering violation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335528 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 21:43:09 +00:00