8860 Commits

Author SHA1 Message Date
Florian Hahn
35603461c5 [IPSCCP] Run Solve each time we resolved an undef in a function.
Once we resolved an undef in a function we can run Solve, which could
lead to finding a constant return value for the function, which in turn
could turn undefs into constants in other functions that call it, before
resolving undefs there.

Computationally the amount of work we are doing stays the same, just the
order we process things is slightly different and potentially there are
a few less undefs to resolve.

We are still relying on the order of functions in the IR, which means
depending on the order, we are able to resolve the optimal undef first
or not. For example, if @test1 comes before @testf, we find the constant
return value of @testf too late and we cannot use it while solving
@test1.

This on its own does not lead to more constants removed in the
test-suite, probably because currently we have to be very lucky to visit
applicable functions in the right order.

Maybe we manage to come up with a better way of resolving undefs in more
'profitable' functions first.

Reviewers: efriedma, mssimpso, davide

Reviewed By: efriedma, davide

Differential Revision: https://reviews.llvm.org/D49385


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337283 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 14:04:59 +00:00
Tim Shen
c31e75d199 [LSR] If no Use is interesting, early return.
Summary:
By looking at the callers of getUse(), we can see that even though
IVUsers may offer uses, but they may not be interesting to
LSR. It's possible that none of them is interesting.

Reviewers: sanjoy

Subscribers: jlebar, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D49049

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337072 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 23:40:00 +00:00
Eric Christopher
d972c63af4 Temporarily revert "Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters." as it's causing miscompiles.
A testcase was provided in the original review thread.

This reverts commit r336098.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336877 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 01:53:21 +00:00
Craig Topper
99a2c222c2 [LoopIdiomRecognize] Don't convert a do while loop to ctlz.
This commit suppresses turning loops like this into "(bitwidth - ctlz(input))".

unsigned foo(unsigned input) {
  unsigned num = 0;
  do {
    ++num;
    input >>= 1;
  } while (input != 0);
  return num;
}

The loop version returns a value of 1 for both an input of 0 and an input of 1. Converting to a naive ctlz does not preserve that.

Theoretically we could do better if we checked isKnownNonZero or we could insert a select to handle the divergence. But until we have motivating cases for that, this is the easiest solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336864 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 22:35:28 +00:00
Chandler Carruth
bdef81ad96 [PM/Unswitch] Fix unused variable in r336646.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336647 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 08:57:04 +00:00
Chandler Carruth
f0e0135e65 [PM/Unswitch] Fix a collection of closely related issues with trivial
switch unswitching.

The core problem was that the way we handled unswitching trivial exit
edges through the default successor of a switch. For some reason
I thought the right way to do this was to add a block containing
unreachable and point the default successor at this block. In
retrospect, this has an amazing number of problems.

The first issue is the one that this pass has always worked around -- we
have to *detect* such edges and avoid unswitching them again. This
seemed pretty easy really. You juts look for an edge to a block
containing unreachable. However, this pattern is woefully unsound. So
many things can break it. The amazing thing is that I found a test case
where *simple-loop-unswitch itself* breaks this! When we do
a *non-trivial* unswitch of a switch we will end up splitting this exit
edge. The result will be a default successor that is an exit and
terminates in ... a perfectly normal branch. So the first test case that
I started trying to fix is added to the nontrivial test cases. This is
a ridiculous example that did just amazing things previously. With just
unswitch, it would create 10+ copies of this stuff stamped out. But if
you combine it *just right* with a bunch of other passes (like
simplify-cfg, loop rotate, and some LICM) you can get it to do this
infinitely. Or at least, I never got it to finish. =[

This, in turn, uncovered another related issue. When we are manipulating
these switches after doing a trivial unswitch we never correctly updated
PHI nodes to reflect our edits. As soon as I started changing how these
edges were managed, it became obvious there were more issues that
I couldn't realistically leave unaddressed, so I wrote more test cases
around PHI updates here and ensured all of that works now.

And this, in turn, required some adjustment to how we collect and manage
the exit successor when it is the default successor. That showed a clear
bug where we failed to include it in our search for the outer-most loop
reached by an unswitched exit edge. This was actually already tested and
the test case didn't work. I (wrongly) thought that was due to SCEV
failing to analyze the switch. In fact, it was just a simple bug in the
code that skipped the default successor. While changing this, I handled
it correctly and have updated the test to reflect that we now get
precise SCEV analysis of trip counts for the outer loop in one of these
cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336646 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 08:36:05 +00:00
Manoj Gupta
c6da6867a1 llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: efriedma, george.burgess.iv

Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47895

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336613 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 22:27:23 +00:00
Chandler Carruth
1cbf1da4d6 [PM/Unswitch] Fix a nasty bug in the new PM's unswitch introduced in
r335553 with the non-trivial unswitching of switches.

The code correctly updated most aspects of the CFG and analyses, but
missed some crucial aspects:
1) When multiple cases have the same successor, we unswitch that
   a single time and replace the switch with a direct branch. The CFG
   here is correct, but the target of this direct branch may have had
   a PHI node with multiple entries in it.
2) When we still have to clone a successor of the switch into an
   unswitched copy of the loop, we'll delete potentially multiple edges
   entering this successor, not just one.
3) We also have to delete multiple edges entering the successors in the
   original loop when they have to be retained.
4) When the "retained successor" *also* occurs as a case successor, we
   just assert failed everywhere. This doesn't happen very easily
   because its always valid to simply drop the case -- the retained
   successor for switches is always the default successor. However, it
   is likely possible through some contrivance of different loop passes,
   unrolling, and simplifying for this to occur in practice and
   certainly there is nothing "invalid" about the IR so this pass needs
   to handle it.
5) In the case of #4, we also will replace these multiple edges with
   a direct branch much like in #1 and need to collapse the entries in
   any PHI nodes to a single enrty.

All of this stems from the delightful fact that the same successor can
show up in multiple parts of the switch terminator, and each of these
are considered a distinct edge for the purpose of PHI nodes (and
iterating the successors and predecessors) but not for unswitching
itself, the dominator tree, or many other things. For the record,
I intensely dislike this "feature" of the IR in large part because of
the complexity it causes in passes like this. We already have a ton of
logic building sets and handling duplicates, and we just had to add
a bunch more.

I've added a complex test case that covers all five of the above failure
modes. I've also added a variation on it where #4 and #5 occur in loop
exit, adding fun where we have an LCSSA PHI node with "multiple entries"
despite have dedicated exits. There were no additional issues found by
this, but it seems a useful corner case to cover with testing.

One thing that working on all of this code has made painfully clear for
me as well is how amazingly inefficient our PHI node representation is
(in terms of the in-memory data structures and the APIs used to update
them). This code has truly marvelous complexity bounds because every
time we remove an entry from a PHI node we do a linear scan to find it
and then a linear update to the data structure to remove it. We could in
theory batch all of the PHI node updates into a single linear walk of
the operands making this much more efficient, but the APIs fight hard
against this and the fact that we have to handle duplicates in the
peculiar manner we do (removing all but one in some cases) makes even
implementing that very tedious and annoying. Anyways, none of this is
new here or specific to loop unswitching. All code in LLVM that updates
PHI node operands suffers from these problems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336536 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 10:30:48 +00:00
Craig Topper
7d6057ca55 [LoopIdiomRecognize] Support for converting loops that use LSHR to CTLZ.
In the 'detectCTLZIdiom' function support for loops that use LSHR instruction instead of ASHR has been added.

This supports creating ctlz from the following code.

int lzcnt(int x) {
     int count = 0;
     while (x > 0)  {
          count++;
          x = x >> 1;
     }
    return count;
}

Patch by Olga Moldovanova

Differential Revision: https://reviews.llvm.org/D48354

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336509 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-08 01:45:47 +00:00
Chandler Carruth
4498596c19 [PM/LoopUnswitch] Fix PR37889, producing the correct loop nest structure
after trivial unswitching.

This PR illustrates that a fundamental analysis update was not performed
with the new loop unswitch. This update is also somewhat fundamental to
the core idea of the new loop unswitch -- we actually *update* the CFG
based on the unswitching. In order to do that, we need to update the
loop nest in addition to the domtree.

For some reason, when writing trivial unswitching, I thought that the
loop nest structure cannot be changed by the transformation. But the PR
helps illustrate that it clearly can. I've expanded this to a number of
different test cases that try to cover the different cases of this. When
we unswitch, we move an exit edge of a loop out of the loop. If this
exit edge changes which loop reached by an exit is the innermost loop,
it changes the parent of the loop. Essentially, this transformation may
hoist the inner loop up the nest. I've added the simple logic to handle
this reliably in the trivial unswitching case. This just requires
updating LoopInfo and rebuilding LCSSA on the impacted loops. In the
trivial case, we don't even need to handle dedicated exits because we're
only hoisting the one loop and we just split its preheader.

I've also ported all of these tests to non-trivial unswitching and
verified that the logic already there correctly handles the loop nest
updates necessary.

Differential Revision: https://reviews.llvm.org/D48851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336477 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-07 01:12:56 +00:00
Benjamin Kramer
c988f9e6c3 [LoopSink] Make the enforcement of determinism deterministic.
LoopBlockNumber is a DenseMap<BasicBlock*, int>, comparing the result of
find() will compare a pair<BasicBlock*, int>. That's of course depending
on pointer ordering which varies from run to run. Reverse iteration
doesn't find this because we're copying to a vector first.

This bug has been there since 2016 but only recently showed up on clang
selfhost with FDO and ThinLTO, which is also why I didn't manage to get
a reasonable test case for this. Add an assert that would've caught
this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336439 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 14:20:58 +00:00
Michael Zolotukhin
bda51c282f Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading.""
There were a couple of issues reported (PR38047, PR37929) - I'll reland
the patch when I figure out and fix the rootcause.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336393 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 22:10:31 +00:00
Chandler Carruth
ba89ffcade [PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when
unswitching loops.

Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:

1) We must invalidate any info SCEV has cached before unswitching as we
   may change (or destroy) the loop structure by the act of unswitching,
   and make it hard to recover everything we want to invalidate within
   SCEV.

2) We need to invalidate all of the loops whose CFGs are mutated by the
   unswitching. Notably, this isn't the *entire* loop nest, this is
   every loop contained by the outermost loop reached by an exit block
   relevant to the unswitch.

And we need to do this even when doing trivial unswitching.

I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.

However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.

Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.

Differential Revision: https://reviews.llvm.org/D47624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336183 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 09:13:27 +00:00
Alina Sbirlea
89e8753d7b Replace "Replacable" with "Replaceable". [NFC]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336133 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 18:53:40 +00:00
Florian Hahn
2a87571b08 Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters.
This version contains a fix to add values for which the state in ParamState change
to the worklist if the state in ValueState did not change. To avoid adding the
same value multiple times, mergeInValue returns true, if it added the value to
the worklist. The value is added to the worklist depending on its state in
ValueState.

Original message:
For comparisons with parameters, we can use the ParamState lattice
elements which also provide constant range information. This improves
the code for PR33253 further and gets us closer to use
ValueLatticeElement for all values.

Also, as we are using the range information in the solver directly, we
do not need tryToReplaceWithConstantRange afterwards anymore.

Reviewers: dberlin, mssimpso, davide, efriedma

Reviewed By: mssimpso

Differential Revision: https://reviews.llvm.org/D43762


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336098 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 12:44:04 +00:00
David Green
e101271f21 [UnrollAndJam] New Unroll and Jam pass
This is a simple implementation of the unroll-and-jam classical loop
optimisation.

The basic idea is that we take an outer loop of the form:

  for i..
    ForeBlocks(i)
    for j..
      SubLoopBlocks(i, j)
    AftBlocks(i)

Instead of doing normal inner or outer unrolling, we unroll as follows:

  for i... i+=2
    ForeBlocks(i)
    ForeBlocks(i+1)
    for j..
      SubLoopBlocks(i, j)
      SubLoopBlocks(i+1, j)
    AftBlocks(i)
    AftBlocks(i+1)
  Remainder Loop

So we have unrolled the outer loop, then jammed the two inner loops into
one. This can lead to a simpler inner loop if memory accesses can be shared
between the now jammed loops.

To do this we have to prove that this is all safe, both for the memory
accesses (using dependence analysis) and that ForeBlocks(i+1) can move before
AftBlocks(i) and SubLoopBlocks(i, j).

Differential Revision: https://reviews.llvm.org/D41953



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336062 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 12:47:30 +00:00
Chandler Carruth
b2b950d7b1 [instsimplify] Move the instsimplify pass to use more obvious file names
and diretory.

Also cleans up all the associated naming to be consistent and removes
the public access to the pass ID which was unused in LLVM.

Also runs clang-format over parts that changed, which generally cleans
up a bunch of formatting.

This is in preparation for doing some internal cleanups to the pass.

Differential Revision: https://reviews.llvm.org/D47352

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336028 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 23:36:03 +00:00
Sean Fertile
551913f7e3 Revert "Extend CFGPrinter and CallPrinter with Heat Colors"
This reverts r335996 which broke graph printing in Polly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336000 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 17:48:58 +00:00
Sean Fertile
8b6f52ae8e Extend CFGPrinter and CallPrinter with Heat Colors
Extends the CFGPrinter and CallPrinter with heat colors based on heuristics or
profiling information. The colors are enabled by default and can be toggled
on/off for CFGPrinter by using the option -cfg-heat-colors for both
-dot-cfg[-only] and -view-cfg[-only].  Similarly, the colors can be toggled
on/off for CallPrinter by using the option -callgraph-heat-colors for both
-dot-callgraph and -view-callgraph.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D40425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335996 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 17:13:58 +00:00
Anastasis Grammenos
fbc17042e9 [SROA] Preserve DebugLoc when rewriting alloca partitions
When rewriting an alloca partition copy the DL from the
old alloca over the the new one.

Differential Revision: https://reviews.llvm.org/D48640

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335904 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 18:58:30 +00:00
Florian Hahn
e6a8acefb8 [SCCP] Mark CFG as preserved.
SCCP does not change the CFG, so we can mark it as preserved.

Reviewers: dberlin, efriedma, davide

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D47149


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335820 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 09:53:38 +00:00
Michael Zolotukhin
d3c8f20a14 [JumpThreading] Don't try to rewrite a use if it's already valid.
Summary:
When recording uses we need to rewrite after cloning a loop we need to
check if the use is not dominated by the original def. The initial
assumption was that the cloned basic block will introduce a new path and
thus the original def will only dominate the use if they are in the same
BB, but as the reproducer from PR37745 shows it's not always the case.

This fixes PR37745.

Reviewers: haicheng, Ka-Ka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48111

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335675 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 22:19:48 +00:00
Matt Arsenault
93ae5a35af LoopUnroll: Allow analyzing intrinsic call costs
I'm not sure why the code here is skipping calls since
TTI does try to do something for general calls, but it
at least should allow intrinsics.

Skip intrinsics that should not be omitted as calls, which
is by far the most common case on AMDGPU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335645 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 18:51:17 +00:00
Florian Hahn
1766212247 [IPSCCP] Change dead blocks to unreachable after visiting all executable blocks.
changeToUnreachable may remove PHI nodes from executable blocks we found values
for and we would fail to replace them. By changing dead blocks to unreachable after
we replaced constants in all executable blocks, we ensure such PHI nodes are replaced
by their known value before.

Fixes PR37780.

Reviewers: efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D48421


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335588 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 10:15:02 +00:00
Chandler Carruth
224904bf2c [PM/LoopUnswitch] Teach the new unswitch to handle nontrivial
unswitching of switches.

This works much like trivial unswitching of switches in that it reliably
moves the switch out of the loop. Here we potentially clone the entire
loop into each successor of the switch and re-point the cases at these
clones.

Due to the complexity of actually doing nontrivial unswitching, this
patch doesn't create a dedicated routine for handling switches -- it
would duplicate far too much code. Instead, it generalizes the existing
routine to handle both branches and switches as it largely reduces to
looping in a few places instead of doing something once. This actually
improves the results in some cases with branches due to being much more
careful about how dead regions of code are managed. With branches,
because exactly one clone is created and there are exactly two edges
considered, somewhat sloppy handling of the dead regions of code was
sufficient in most cases. But with switches, there are much more
complicated patterns of dead code and so I've had to move to a more
robust model generally. We still do as much pruning of the dead code
early as possible because that allows us to avoid even cloning the code.

This also surfaced another problem with nontrivial unswitching before
which is that we weren't as precise in reconstructing loops as we could
have been. This seems to have been mostly harmless, but resulted in
pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we
have to get this *right*, and everything benefits from it.

While the testing may seem a bit light here because we only have two
real cases with actual switches, they do a surprisingly good job of
exercising numerous edge cases. Also, because we share the logic with
branches, most of the changes in this patch are reasonably well covered
by existing tests.

The new unswitch now has all of the same fundamental power as the old
one with the exception of the single unsound case of *partial* switch
unswitching -- that really is just loop specialization and not
unswitching at all. It doesn't fit into the canonicalization model in
any way. We can add a loop specialization pass that runs late based on
profile data if important test cases ever come up here.

Differential Revision: https://reviews.llvm.org/D47683

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335553 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 23:32:54 +00:00
Craig Topper
92a88d2ee3 [LoopIdiomRecognize] Fix a couple places where it appears we were unintenionally making copies of DebugLoc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335521 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 20:45:45 +00:00
Stanislav Mekhanoshin
17957ec5e5 Fix invariant fdiv hoisting in LICM
FDiv is replaced with multiplication by reciprocal and invariant
reciprocal is hoisted out of the loop, while multiplication remains
even if invariant.

Switch checks for all invariant operands and only invariant
denominator to fix the issue.

Differential Revision: https://reviews.llvm.org/D48447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335411 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-23 04:01:28 +00:00
Eli Friedman
ea67fb768b [LoopReroll] Rewrite induction variable rewriting.
This gets rid of a bunch of weird special cases; instead, just use SCEV
rewriting for everything.  In addition to being simpler, this fixes a
bug where we would use the wrong stride in certain edge cases.

The one bit I'm not quite sure about is the trip count handling,
specifically the FIXME about overflow.  In general, I think we need to
widen the exit condition, but that's probably not profitable if the new
type isn't legal, so we probably need a check somewhere.  That said, I
don't think I'm making the existing problem any worse.

As a followup to this, a bunch of IV-related code in root-finding could
be cleaned up; with SCEV-based rewriting, there isn't any reason to
assume a loop will have exactly one or two PHI nodes.

Differential Revision: https://reviews.llvm.org/D45191



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335400 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-22 22:58:55 +00:00
Alina Sbirlea
53f3ded93d [LoopUnswitch]Fix comparison for DomTree updates.
Summary:
In LoopUnswitch when replacing a branch Parent -> Succ with a conditional
branch Parent -> True & Parent->False, the DomTree updates should insert an edge for
each of True/False if True/False are different than Succ, and delete Parent->Succ edge
if both are different. The comparison with Succ appears to be incorect,
it's comparing with Parent instead.
There is no test failing either before or after this change, but it seems to me this is
the right way to do the update.

Reviewers: chandlerc, kuhar

Subscribers: sanjoy, jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D48457

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335369 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-22 17:14:35 +00:00
Francis Visoiu Mistrih
5119d140c5 Revert r335206 "Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions."
This reverts commit r335206.

As discussed here: https://reviews.llvm.org/rL333740, a fix will come
tomorrow. In the meanwhile, revert this to fix some bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335272 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 19:18:36 +00:00
Florian Hahn
cb4031ebdb Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
r335150 should resolve the issues with the clang-with-thin-lto-ubuntu
and clang-with-lto-ubuntu builders.

Original message:
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335206 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 07:15:08 +00:00
Chandler Carruth
b92fd73274 [PM/LoopUnswitch] Add partial non-trivial unswitching for invariant
conditions feeding a chain of `and`s or `or`s for a branch.

Much like with full non-trivial unswitching, we rely on the pass manager
to handle iterating until all of the profitable unswitches have been
done. This is to allow other more profitable unswitches to fire on any
of the cloned, simpler versions of the loop if viable.

Threading the partial unswiching through the non-trivial unswitching
logic motivated some minor refactorings. If those are too disruptive to
make it reasonable to review this patch, I can separate them out, but
it'll be somewhat timeconsuming so I wanted to send it for initial
review as-is. Feel free to tell me whether it warrants pulling apart.

I've tried to re-use (and factor out) logic form the partial trivial
unswitching, but not as much could be shared as I had haped. Still, this
wasn't as bad as I naively expected.

Some basic testing is added, but I probably need more. Suggestions for
things you'd like to see tested more than welcome. One thing I'd like to
do is add some testing that when we schedule this with loop-instsimplify
it effectively cleans up the cruft created.

Last but not least, this uncovered a bug that has been in loop cloning
the entire time for non-trivial unswitching. Specifically, we didn't
correctly add the outer-most cloned loop to the list of cloned loops.
This meant that LCSSA wouldn't be updated for it hypothetically, and
more significantly that we would never visit it in the loop pass
manager. I noticed this while checking loop-instsimplify by hand. I'll
try to separate this bugfix out into its own patch with a more focused
test. But it is just one line, so shouldn't significantly confuse the
review here.

After this patch, the only missing "feature" in this unswitch I'm aware
of us non-trivial unswitching of switches. I'll try implementing *full*
non-trivial unswitching of switches (which is at least a sound thing to
implement), but *partial* non-trivial unswitching of switches is
something I don't see any sound and principled way to implement. I also
have no interesting test cases for the latter, so I'm not really
worried. The rest of the things that need to be ported are bug-fixes and
more narrow / targeted support for specific issues.

Differential Revision: https://reviews.llvm.org/D47522

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335203 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 06:14:03 +00:00
Alina Sbirlea
8d9ac7ff94 Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
Summary:
Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor

Prior to the patch:
1. MergeBasicBlockIntoOnlyPred
Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)

2. MergeBlockIntoPredecessor
Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:

1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
Updated in this patch. No challenges.

2. lib/CodeGen/CodeGenPrepare.cpp
Updated in this patch.
  i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
  ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
Some interesting aspects:
  - Since Pred is not deleted (BB is), the entry block does not need updating.
  - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
  - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
  - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
  - adding some test owner as subscribers for the interesting tests modified:
    test/CodeGen/X86/avx-cmp.ll
    test/CodeGen/AMDGPU/nested-loop-conditions.ll
    test/CodeGen/AMDGPU/si-annotate-cf.ll
    test/CodeGen/X86/hoist-spill.ll
    test/CodeGen/X86/2006-11-17-IllegalMove.ll

3. lib/Transforms/Scalar/JumpThreading.cpp
Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.

Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar

Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D48202

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335183 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-20 22:01:04 +00:00
Chandler Carruth
224eb0a785 [PM/LoopUnswitch] Support partial trivial unswitching.
The idea of partial unswitching is to take a *part* of a branch's
condition that is loop invariant and just unswitching that part. This
primarily makes sense with i1 conditions of branches as opposed to
switches. When dealing with i1 conditions, we can easily extract loop
invariant inputs to a a branch and unswitch them to test them entirely
outside the loop.

As part of this, we now create much more significant cruft in the loop
body, so this relies on adding cleanup passes to the loop pipeline and
revisiting unswitched loops to do that cleanup before continuing to
process them.

This already appears to be more powerful at unswitching than the old
loop unswitch pass, and so I'd appreciate pretty careful review in case
I'm just missing some correctness checks. The `LIV-loop-condition` test
case is not unswitched by the old unswitch pass, but is with this pass.

Thanks to Sanjoy and Fedor for the review!

Differential Revision: https://reviews.llvm.org/D46706

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335156 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-20 18:57:07 +00:00
David Green
d6cd67b722 [LoopSimplifyCFG] Invalidate SCEV in LoopSimplifyCFG
LoopSimplifyCFG, being a loop pass, needs to preserve scalar
evolution. This invalidates SE for the loops altered during
block merging.

Differential Revision: https://reviews.llvm.org/D48258



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335036 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-19 09:43:36 +00:00
Florian Hahn
6517111aa9 [LoopInterchange] Move PHI handling to adjustLoopBranches.
This patch moves the logic to handle reduction PHI nodes to the end of
adjustLoopBranches. Reduction PHI nodes in the outer loop header can be
moved to the inner loop header and reduction PHI nodes from the inner loop
header can be moved to the outer loop header. In the latter situation,
we have to deal with 1 kind of PHI nodes:

    PHI nodes that are part of inner loop-only reductions.

We can replace the PHI node with the value coming from outside
the inner loop.

Reviewers: mcrosier, efriedma, karthikthecool

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D46198


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335027 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-19 08:03:24 +00:00
Michael Zolotukhin
273aba7517 CorrelatedValuePropagation: Preserve DT.
Summary:
We only modify CFG in a couple of places, and we can preserve DT there
with a little effort.

Reviewers: davide, vsk

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48059

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334895 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-16 18:57:31 +00:00
Simon Pilgrim
ad70f357e8 [EarlyCSE] Fix MSVC build. NFCI.
MSVC doesn't let you assign different lambdas through a ternary operator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334715 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-14 14:22:03 +00:00
Max Kazantsev
5cadc32f5f [EarlyCSE] Propagate conditions of AND and OR instructions
This patches teaches EarlyCSE to figure out that if `and i1 %x, %y` is true then both
`%x` and `%y` are true in the taken branch, and if `or i1 %x, %y` is false then both
`%x` and `%y` are false in non-taken branch. Fix for PR37635.

Differential Revision: https://reviews.llvm.org/D47574
Reviewed By: reames


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334707 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-14 13:02:13 +00:00
Hiroshi Inoue
26570985ef [NFC] fix trivial typos in comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334687 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-14 05:41:49 +00:00
Florian Hahn
b2621b34b7 Use SmallPtrSet explicitly for SmallSets with pointer types (NFC).
Currently SmallSet<PointerTy> inherits from SmallPtrSet<PointerTy>. This
patch replaces such types with SmallPtrSet, because IMO it is slightly
clearer and allows us to get rid of unnecessarily including SmallSet.h

Reviewers: dblaikie, craig.topper

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47836


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334492 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-12 11:16:56 +00:00
Craig Topper
74eba10280 Use SmallPtrSet instead of SmallSet in places where we iterate over the set.
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.

These places were found by hiding the begin/end methods in the SmallSet forwarding

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334343 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-09 05:04:20 +00:00
Daniil Fukalov
a3fe40f2b0 reapply r334209 with fixes for harfbuzz in Chromium
r334209 description:
[LSR] Check yet more intrinsic pointer operands

the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334300 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-08 16:22:52 +00:00
Reid Kleckner
1306190f31 Revert r334209 "[LSR] Check yet more intrinsic pointer operands"
This causes cast failures when compiling harfbuzz in Chromium.
Reproducer on the way.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334254 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-08 00:43:27 +00:00
Daniil Fukalov
9378ab2ca4 [LSR] Check yet more intrinsic pointer operands
the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334209 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-07 17:30:58 +00:00
Michael Zolotukhin
d13f6289f7 SpeculativeExecution Pass: Set PreserveCFG to avoid unnecessary analyses invalidation.
The pass doesn't touch CFG in any way, only moves instructions between
blocks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334150 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-07 00:19:29 +00:00
David Blaikie
8325fb20d4 Move Analysis/Utils/Local.h back to Transforms
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333954 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-04 21:23:21 +00:00
Chandler Carruth
2aaad36203 [PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses.
Summary:
I noticed this issue because we didn't put the primary cloned loop into
the `NonChildClonedLoops` vector and so never iterated on it. Once
I fixed that, it made it clear why I had to do a really complicated and
unnecesasry dance when updating the loops to remain in canonical form --
I was unwittingly working around the fact that the primary cloned loop
wasn't in the expected list of cloned loops. Doh!

Now that we include it in this vector, we don't need to return it and we
can consolidate the update logic as we correctly have a single place
where it can be handled.

I've just added a test for the iteration order aspect as every time
I changed the update logic partially or incorrectly here, an existing
test failed and caught it so that seems well covered (which is also
evidenced by the extensive working around of this missing update).

Reviewers: asbirlea, sanjoy

Subscribers: mcrosier, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47647

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333811 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-02 01:29:01 +00:00
Florian Hahn
b062161990 Revert r333740: IPSCCP] Use PredicateInfo to propagate facts from cmp.
This is breaking the clang-with-thin-lto-ubuntu bot.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333745 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-01 12:58:43 +00:00
Florian Hahn
39b491d6bb Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

Differential Revision: https://reviews.llvm.org/D45330


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333740 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-01 10:48:54 +00:00