20275 Commits

Author SHA1 Message Date
Eric Christopher
d972c63af4 Temporarily revert "Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters." as it's causing miscompiles.
A testcase was provided in the original review thread.

This reverts commit r336098.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336877 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 01:53:21 +00:00
Craig Topper
18d8ba4a18 [X86] Remove and autoupgrade the scalar fma intrinsics with masking.
This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks to work out still. I'll try to address with follow up patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336871 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 00:29:56 +00:00
Craig Topper
99a2c222c2 [LoopIdiomRecognize] Don't convert a do while loop to ctlz.
This commit suppresses turning loops like this into "(bitwidth - ctlz(input))".

unsigned foo(unsigned input) {
  unsigned num = 0;
  do {
    ++num;
    input >>= 1;
  } while (input != 0);
  return num;
}

The loop version returns a value of 1 for both an input of 0 and an input of 1. Converting to a naive ctlz does not preserve that.

Theoretically we could do better if we checked isKnownNonZero or we could insert a select to handle the divergence. But until we have motivating cases for that, this is the easiest solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336864 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 22:35:28 +00:00
Roman Lebedev
2dd26c97a8 [InstCombine] Fold x & (-1 >> y) == x to x u<= (-1 >> y)
Summary:
https://bugs.llvm.org/show_bug.cgi?id=38123

This pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958
https://bugs.llvm.org/show_bug.cgi?id=21530
in unsigned case, therefore it is probably a good idea to improve it.

https://rise4fun.com/Alive/Rny
^ there are more opportunities for folds, i will follow up with them afterwards.

Caveat: this somehow exposes a missing opportunities
in `test/Transforms/InstCombine/icmp-logical.ll`
It seems, the problem is in `foldLogOpOfMaskedICmps()` in `InstCombineAndOrXor.cpp`.
But i'm not quite sure what is wrong, because it calls `getMaskedTypeForICmpPair()`,
which calls `decomposeBitTestICmp()` which should already work for these cases...
As @spatel notes in https://reviews.llvm.org/D49179#1158760,
that code is a rather complex mess, so we'll let it slide.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: yamauchi, majnemer, t.p.northover, llvm-commits

Differential Revision: https://reviews.llvm.org/D49179

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336834 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 19:05:04 +00:00
Simon Pilgrim
33f4d61062 [SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)
We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Reapplied with fix to only accept 2 different casts if they come from the same source type.

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336812 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 15:05:10 +00:00
Simon Pilgrim
191ae9ef3c Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for cast instructions.
Reverting due to buildbot failures

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336806 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 14:08:16 +00:00
Simon Pilgrim
71b0da15d2 [SLPVectorizer] Add initial alternate opcode support for cast instructions.
We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336804 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 13:34:09 +00:00
George Burgess IV
b3a7b8e123 Sort includes + include a missing extern "C" header
If we don't include Initialization.h,
`LLVMInitializeAggressiveInstCombiner` won't see its `extern "C"` decl.
This causes sadness, name mangling, and linker errors.

Reported on the mailing lists by Vladimir Vissoultchev. Thanks!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336736 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 22:48:13 +00:00
Teresa Johnson
7b0ba29ddb [ThinLTO] Use std::map to get determistic imports files
Summary:
I noticed that the .imports files emitted for distributed ThinLTO
backends do not have consistent ordering. This is because StringMap
iteration order is not guaranteed to be deterministic. Since we already
have a std::map with this information, used when emitting the individual
index files (ModuleToSummariesForIndex), use it for the imports files as
well.

This issue is likely causing some unnecessary rebuilds of the ThinLTO
backends in our distributed build system as the imports files are inputs
to those backends.

Reviewers: pcc, steven_wu, mehdi_amini

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336721 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 20:06:04 +00:00
Eugene Leviant
ab6c6cb993 [Evaluator] Examine alias when evaluating function call
This fixes PR38120


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336702 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 16:34:23 +00:00
Sanjay Patel
128882e5b6 [InstCombine] allow flag propagation when using safe constant
This corresponds with the code for the single binop pattern
added in rL336684.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336696 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 16:09:49 +00:00
Ulrich Weigand
8d4bfa624a [gcov] Fix ABI when calling llvm_gcov_... routines from instrumentation code
The llvm_gcov_... routines in compiler-rt are regular C functions that
need to be called using the proper C ABI for the target. The current
code simply calls them using plain LLVM IR types. Since the type are
mostly simple, this happens to just work on certain targets. But other
targets still need special handling; in particular, it may be necessary
to sign- or zero-extended sub-word values to comply with the ABI. This
caused gcov failures on SystemZ in particular.

Now the very same problem was already fixed for the llvm_profile_ calls
here: https://reviews.llvm.org/D21736

This patch uses the same method to fix the llvm_gcov_ calls, in
particular calls to llvm_gcda_start_file, llvm_gcda_emit_function, and
llvm_gcda_emit_arcs.

Reviewed By: marco-c

Differential Revision: https://reviews.llvm.org/D49134



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336692 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 16:05:47 +00:00
Sanjay Patel
57f9ec6739 [InstCombine] safely allow non-commutative binop identity constant folds
This was originally intended with D48893, but as discussed there, we
have to make the folds safe from producing extra poison. This should
give the single binop folds the same capabilities as the existing
folds for 2-binops+shuffle.

LLVM binary opcode review: there are a total of 18 binops. There are 7 
commutative binops (add, mul, and, or, xor, fadd, fmul) which we already 
fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr,
fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't 
bother with sub/fsub with constant operand 1 because those are 
canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336684 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 15:12:31 +00:00
Sanjay Patel
4b88500342 [InstCombine] drop poison flags when shuffle mask undef propagates to constant
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336679 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 14:27:55 +00:00
Sanjay Patel
db58626da5 [InstCombine] allow more shuffle-binop folds with safe constants
The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef 
mask element creates an undef result. 

I'm not aware of any current analysis/transform that recognizes that 
undef propagating to a div/rem/shift, but we have to guard against 
the possibility.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336668 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 13:33:26 +00:00
Anastasis Grammenos
c1175857e2 [DebugInfo][LoopVectorize] Preserve DL in induction PHI and Add
Differential Revision: https://reviews.llvm.org/D48968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336667 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 13:29:50 +00:00
Karl-Johan Karlsson
8008d9214c [LowerSwitch] Fixed faulty PHI nodes
Summary:
Fixed two cases of where PHI nodes need to be updated by lowerswitch.

When lowerswitch find out that the switch default branch is not
reachable it remove the old default and replace it with the most
popular block from the cases, but it forget to update the PHI
nodes in the default block.

The PHI nodes also need to be updated when the switch is replaced
with a single branch.

Reviewers: hans, reames, arsenm

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D47203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336659 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 12:06:16 +00:00
Chandler Carruth
bdef81ad96 [PM/Unswitch] Fix unused variable in r336646.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336647 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 08:57:04 +00:00
Chandler Carruth
f0e0135e65 [PM/Unswitch] Fix a collection of closely related issues with trivial
switch unswitching.

The core problem was that the way we handled unswitching trivial exit
edges through the default successor of a switch. For some reason
I thought the right way to do this was to add a block containing
unreachable and point the default successor at this block. In
retrospect, this has an amazing number of problems.

The first issue is the one that this pass has always worked around -- we
have to *detect* such edges and avoid unswitching them again. This
seemed pretty easy really. You juts look for an edge to a block
containing unreachable. However, this pattern is woefully unsound. So
many things can break it. The amazing thing is that I found a test case
where *simple-loop-unswitch itself* breaks this! When we do
a *non-trivial* unswitch of a switch we will end up splitting this exit
edge. The result will be a default successor that is an exit and
terminates in ... a perfectly normal branch. So the first test case that
I started trying to fix is added to the nontrivial test cases. This is
a ridiculous example that did just amazing things previously. With just
unswitch, it would create 10+ copies of this stuff stamped out. But if
you combine it *just right* with a bunch of other passes (like
simplify-cfg, loop rotate, and some LICM) you can get it to do this
infinitely. Or at least, I never got it to finish. =[

This, in turn, uncovered another related issue. When we are manipulating
these switches after doing a trivial unswitch we never correctly updated
PHI nodes to reflect our edits. As soon as I started changing how these
edges were managed, it became obvious there were more issues that
I couldn't realistically leave unaddressed, so I wrote more test cases
around PHI updates here and ensured all of that works now.

And this, in turn, required some adjustment to how we collect and manage
the exit successor when it is the default successor. That showed a clear
bug where we failed to include it in our search for the outer-most loop
reached by an unswitched exit edge. This was actually already tested and
the test case didn't work. I (wrongly) thought that was due to SCEV
failing to analyze the switch. In fact, it was just a simple bug in the
code that skipped the default successor. While changing this, I handled
it correctly and have updated the test to reflect that we now get
precise SCEV analysis of trip counts for the outer loop in one of these
cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336646 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 08:36:05 +00:00
Sanjay Patel
1cc2d65953 [InstCombine] allow more shuffle folds using safe constants
getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming
that the constant we wanted was operand 1 (RHS). That's wrong, but I
don't think we could expose a bug or even a suboptimal fold from that
because the callers have other guards for any binop that would have
been affected.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336617 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 23:22:47 +00:00
Manoj Gupta
c6da6867a1 llvm: Add support for "-fno-delete-null-pointer-checks"
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.

More details : https://lkml.org/lkml/2018/4/4/601

GCC option description for -fdelete-null-pointer-checks:
This Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.

-fno-delete-null-pointer-checks is the inverse of this implying that
null pointer dereferencing is not undefined.

This feature is implemented in LLVM IR in this CL as the function attribute
"null-pointer-is-valid"="true" in IR (Under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined to not optimize when the "null-pointer-is-valid"="true"
attribute is present.

Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv

Reviewed By: efriedma, george.burgess.iv

Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47895

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336613 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 22:27:23 +00:00
Sanjay Patel
c2d5ed6454 [InstCombine] avoid extra poison when moving shift above shuffle
As discussed in D49047 / D48987, shift-by-undef produces poison,
so we can't use undef vector elements in that case..

Note that we need to extend this for poison-generating flags,
and there's a proposal to create poison from FMF in D47963,



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336562 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 17:20:20 +00:00
Sanjay Patel
fa29fd677f [InstCombine] generalize safe vector constant utility
This is almost NFC, but there could be some case where the original
code had undefs in the constants (rather than just the shuffle mask),
and we'll use safe constants rather than undefs now.

The FIXME noted in foldShuffledBinop() is already visible in existing
tests, so correcting that is the next step.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336558 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 16:16:51 +00:00
Diego Caballero
dc4af2458e [VPlan][LV] Introduce condition bit in VPBlockBase
This patch introduces a VPValue in VPBlockBase to represent the condition
bit that is used as successor selector when a block has multiple successors.
This information wasn't necessary until now, when we are about to introduce
outer loop vectorization support in VPlan code gen.

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48814



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336554 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 15:57:09 +00:00
Xin Tong
955b8d422d [CVP] Handle calls with void return value. No need to create CVPLattice state for it.
Summary:
Tests: 10
Metric: compile_time

Program                                         unpatch-result  patch-result diff

Bullet/bullet                                  32.39           30.54        -5.7%
SPASS/SPASS                                    18.14           17.25        -4.9%
mafft/pairlocalalign                           12.10           11.64        -3.8%
ClamAV/clamscan                                19.21           19.63         2.2%
7zip/7zip-benchmark                            49.55           48.85        -1.4%
kimwitu++/kc                                   15.68           15.87         1.2%
lencod/lencod                                  21.13           21.34         1.0%
consumer-typeset/consumer-typeset              13.65           13.62        -0.2%
tramp3d-v4/tramp3d-v4                          29.88           29.92         0.1%
sqlite3/sqlite3                                18.48           18.46        -0.1%
       unpatch-result  patch-result       diff
count  10.000000       10.000000     10.000000
mean   23.022000       22.712400    -0.011671
std    11.362831       11.094183     0.027338
min    12.104000       11.640000    -0.057298
25%    16.299000       16.214000    -0.032282
50%    18.844000       19.048000    -0.001350
75%    27.689000       27.774000     0.007752
max    49.552000       48.852000     0.021861

I also tested only this pass by concatenating all the code from the
llvm/lib/Analysis/ folder and do clang -g followed by opt. I get close to 20% speedup
for the pass. I expect a majority of the gain come from skipping the dbg intrinsics.

Before patch (opt -time-passes -called-value-propagation):
============
===-------------------------------------------------------------------------===
 ... Pass execution timing report ...
===-------------------------------------------------------------------------===
 Total Execution Time: 3.8303 seconds (3.8279 wall clock)

 ---User Time--- --System Time-- --User+System-- ---Wall Time--- ---
Name ---
 2.0768 ( 57.3%) 0.0990 ( 48.0%) 2.1757 ( 56.8%) 2.1757 ( 56.8%) Bitcode
Writer
 0.8444 ( 23.3%) 0.0600 ( 29.1%) 0.9044 ( 23.6%) 0.9044 ( 23.6%) Called
Value Propagation
 0.7031 ( 19.4%) 0.0472 ( 22.9%) 0.7502 ( 19.6%) 0.7478 ( 19.5%) Module
Verifier
 3.6242 (100.0%) 0.2062 (100.0%) 3.8303 (100.0%) 3.8279 (100.0%) Total

After patch (opt -time-passes -called-value-propagation):
============
===-------------------------------------------------------------------------===
 ... Pass execution timing report ...
===-------------------------------------------------------------------------===
 Total Execution Time: 3.6605 seconds (3.6579 wall clock)

 ---User Time--- --System Time-- --User+System-- ---Wall Time--- ---
Name ---
 2.0716 ( 59.7%) 0.0990 ( 52.5%) 2.1705 ( 59.3%) 2.1706 ( 59.3%) Bitcode
Writer
 0.7144 ( 20.6%) 0.0300 ( 15.9%) 0.7444 ( 20.3%) 0.7444 ( 20.4%) Called
Value Propagation
 0.6859 ( 19.8%) 0.0596 ( 31.6%) 0.7455 ( 20.4%) 0.7429 ( 20.3%) Module
Verifier
 3.4719 (100.0%) 0.1886 (100.0%) 3.6605 (100.0%) 3.6579 (100.0%) Total

Reviewers: davide, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336551 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 14:53:37 +00:00
Sanjay Patel
f03a571cbf [InstCombine] fix shuffle-of-binops transform to avoid poison/undef
As noted in D48987, there are many different ways for this transform to go wrong. 
In particular, the poison potential for shifts means we have to more careful with those ops. 
I added tests to make that behavior visible for all of the different cases that I could find.

This is a partial fix. To make this review easier, I did not make changes for the single binop 
pattern (handled in foldSelectShuffleWith1Binop()). I also left out some potential optimizations 
noted with TODO comments. I'll follow-up once we're confident that things are correct here.

The goal is to correct all marked FIXME tests to either avoid the shuffle transform or do it safely.

Note that distinguishing when the shuffle mask contains undefs and using getBinOpIdentity() allows 
for some improvements to div/rem patterns, so there are wins along with the missed opportunities 
and fixes.

Differential Revision: https://reviews.llvm.org/D49047


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336546 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 13:21:46 +00:00
Chandler Carruth
1cbf1da4d6 [PM/Unswitch] Fix a nasty bug in the new PM's unswitch introduced in
r335553 with the non-trivial unswitching of switches.

The code correctly updated most aspects of the CFG and analyses, but
missed some crucial aspects:
1) When multiple cases have the same successor, we unswitch that
   a single time and replace the switch with a direct branch. The CFG
   here is correct, but the target of this direct branch may have had
   a PHI node with multiple entries in it.
2) When we still have to clone a successor of the switch into an
   unswitched copy of the loop, we'll delete potentially multiple edges
   entering this successor, not just one.
3) We also have to delete multiple edges entering the successors in the
   original loop when they have to be retained.
4) When the "retained successor" *also* occurs as a case successor, we
   just assert failed everywhere. This doesn't happen very easily
   because its always valid to simply drop the case -- the retained
   successor for switches is always the default successor. However, it
   is likely possible through some contrivance of different loop passes,
   unrolling, and simplifying for this to occur in practice and
   certainly there is nothing "invalid" about the IR so this pass needs
   to handle it.
5) In the case of #4, we also will replace these multiple edges with
   a direct branch much like in #1 and need to collapse the entries in
   any PHI nodes to a single enrty.

All of this stems from the delightful fact that the same successor can
show up in multiple parts of the switch terminator, and each of these
are considered a distinct edge for the purpose of PHI nodes (and
iterating the successors and predecessors) but not for unswitching
itself, the dominator tree, or many other things. For the record,
I intensely dislike this "feature" of the IR in large part because of
the complexity it causes in passes like this. We already have a ton of
logic building sets and handling duplicates, and we just had to add
a bunch more.

I've added a complex test case that covers all five of the above failure
modes. I've also added a variation on it where #4 and #5 occur in loop
exit, adding fun where we have an LCSSA PHI node with "multiple entries"
despite have dedicated exits. There were no additional issues found by
this, but it seems a useful corner case to cover with testing.

One thing that working on all of this code has made painfully clear for
me as well is how amazingly inefficient our PHI node representation is
(in terms of the in-memory data structures and the APIs used to update
them). This code has truly marvelous complexity bounds because every
time we remove an entry from a PHI node we do a linear scan to find it
and then a linear update to the data structure to remove it. We could in
theory batch all of the PHI node updates into a single linear walk of
the operands making this much more efficient, but the APIs fight hard
against this and the fact that we have to handle duplicates in the
peculiar manner we do (removing all but one in some cases) makes even
implementing that very tedious and annoying. Anyways, none of this is
new here or specific to loop unswitching. All code in LLVM that updates
PHI node operands suffers from these problems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336536 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 10:30:48 +00:00
Chijun Sima
51d3739fb8 [PGOMemOPSize] Preserve the DominatorTree
Summary:
PGOMemOPSize only modifies CFG in a couple of places; thus we can preserve the DominatorTree with little effort.
When optimizing SQLite with -O3, this patch can decrease 3.8% of the numbers of nodes traversed by DFS and 5.7% of the times DominatorTreeBase::recalculation is called.

Reviewers: kuhar, davide, dmgreen

Reviewed By: dmgreen

Subscribers: mzolotukhin, vsk, llvm-commits

Differential Revision: https://reviews.llvm.org/D48914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336522 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 08:07:21 +00:00
Craig Topper
7d6057ca55 [LoopIdiomRecognize] Support for converting loops that use LSHR to CTLZ.
In the 'detectCTLZIdiom' function support for loops that use LSHR instruction instead of ASHR has been added.

This supports creating ctlz from the following code.

int lzcnt(int x) {
     int count = 0;
     while (x > 0)  {
          count++;
          x = x >> 1;
     }
    return count;
}

Patch by Olga Moldovanova

Differential Revision: https://reviews.llvm.org/D48354

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336509 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-08 01:45:47 +00:00
Chandler Carruth
4498596c19 [PM/LoopUnswitch] Fix PR37889, producing the correct loop nest structure
after trivial unswitching.

This PR illustrates that a fundamental analysis update was not performed
with the new loop unswitch. This update is also somewhat fundamental to
the core idea of the new loop unswitch -- we actually *update* the CFG
based on the unswitching. In order to do that, we need to update the
loop nest in addition to the domtree.

For some reason, when writing trivial unswitching, I thought that the
loop nest structure cannot be changed by the transformation. But the PR
helps illustrate that it clearly can. I've expanded this to a number of
different test cases that try to cover the different cases of this. When
we unswitch, we move an exit edge of a loop out of the loop. If this
exit edge changes which loop reached by an exit is the innermost loop,
it changes the parent of the loop. Essentially, this transformation may
hoist the inner loop up the nest. I've added the simple logic to handle
this reliably in the trivial unswitching case. This just requires
updating LoopInfo and rebuilding LCSSA on the impacted loops. In the
trivial case, we don't even need to handle dedicated exits because we're
only hoisting the one loop and we just split its preheader.

I've also ported all of these tests to non-trivial unswitching and
verified that the logic already there correctly handles the loop nest
updates necessary.

Differential Revision: https://reviews.llvm.org/D48851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336477 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-07 01:12:56 +00:00
Vedant Kumar
c2801b1640 Use Type::isIntOrPtrTy where possible, NFC
It's a bit neater to write T.isIntOrPtrTy() over `T.isIntegerTy() ||
T.isPointerTy()`.

I used Python's re.sub with this regex to update users:

  r'([\w.\->()]+)isIntegerTy\(\)\s*\|\|\s*\1isPointerTy\(\)'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336462 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 20:17:42 +00:00
Vedant Kumar
0f83e1fc48 [Local] replaceAllDbgUsesWith: Update debug values before RAUW
The replaceAllDbgUsesWith utility helps passes preserve debug info when
replacing one value with another.

This improves upon the existing insertReplacementDbgValues API by:

- Updating debug intrinsics in-place, while preventing use-before-def of
  the replacement value.
- Falling back to salvageDebugInfo when a replacement can't be made.
- Moving the responsibiliy for rewriting llvm.dbg.* DIExpressions into
  common utility code.

Along with the API change, this teaches replaceAllDbgUsesWith how to
create DIExpressions for three basic integer and pointer conversions:

- The no-op conversion. Applies when the values have the same width, or
  have bit-for-bit compatible pointer representations.
- Truncation. Applies when the new value is wider than the old one.
- Zero/sign extension. Applies when the new value is narrower than the
  old one.

Testing:

- check-llvm, check-clang, a stage2 `-g -O3` build of clang,
  regression/unit testing.
- This resolves a number of mis-sized dbg.value diagnostics from
  Debugify.

Differential Revision: https://reviews.llvm.org/D48676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336451 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 17:32:39 +00:00
Benjamin Kramer
c988f9e6c3 [LoopSink] Make the enforcement of determinism deterministic.
LoopBlockNumber is a DenseMap<BasicBlock*, int>, comparing the result of
find() will compare a pair<BasicBlock*, int>. That's of course depending
on pointer ordering which varies from run to run. Reverse iteration
doesn't find this because we're copying to a vector first.

This bug has been there since 2016 but only recently showed up on clang
selfhost with FDO and ThinLTO, which is also why I didn't manage to get
a reasonable test case for this. Add an assert that would've caught
this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336439 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 14:20:58 +00:00
Max Kazantsev
ed10960040 Revert "[InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336410 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 04:04:13 +00:00
Michael Zolotukhin
bda51c282f Revert r332168: "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading.""
There were a couple of issues reported (PR38047, PR37929) - I'll reland
the patch when I figure out and fix the rootcause.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336393 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 22:10:31 +00:00
Matt Arsenault
30ba76520a Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN
Better NaN handling for AMDGCN fmed3.

All operands are checked for NaN now. The checks
were moved before the canonicalization to provide
a better mapping from fclamp. Changed the behaviour
of fmed3(x,y,NaN) to return max(x,y) instead of
min(x,y) in light of this. Updated tests as a result
and added some new cases to cover the fix.

Patch by Alan Baker

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336375 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 17:05:36 +00:00
Simon Pilgrim
7029d5a52f [SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI.
This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism.

Differential Revision: https://reviews.llvm.org/D48945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336344 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 12:30:44 +00:00
Craig Topper
5f1cfe90f3 [X86] Remove X86 specific scalar FMA intrinsics and upgrade to tart independent FMA and extractelement/insertelement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336315 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 06:52:55 +00:00
Sanjay Patel
b8f7539938 [InstCombine] allow narrowing of min/max/abs
We have bailout hacks based on min/max in various places in instcombine 
that shouldn't be necessary. The affected test was added for:
D48930 
...which is a consequence of the improvement in:
D48584 (https://reviews.llvm.org/rL336172)

I'm assuming the visitTrunc bailout in this patch was added specifically 
to avoid a change from SimplifyDemandedBits, so I'm just moving that 
below the EvaluateInDifferentType optimization. A narrow min/max is still
a min/max.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336293 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 17:44:04 +00:00
Simon Pilgrim
79d7b5cd91 Fix some irregular whitespace/indentation. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336291 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 17:24:05 +00:00
Anastasis Grammenos
fb31c6481f [DebugInfo][LoopVectorize] Preserve DL in generated phi instruction
When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.

Differential Revision: https://reviews.llvm.org/D48769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336256 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 10:16:55 +00:00
Anastasis Grammenos
baf68707d0 [DebugInfo][InstCombine] Preserve DI after combining zext
When zext is EvaluatedInDifferentType, InstCombine
drops the dbg.value intrinsic. This patch tries to
preserve said DI, by inserting the zext's old DI in the
resulting instruction. (Only for integer type for now)

Differential Revision: https://reviews.llvm.org/D48331

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336254 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 09:55:46 +00:00
Sanjay Patel
c810b4e17d [InstCombine] fold shuffle-with-binop and common value
This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments 
in this patch to complete this transform.

It's possible that a binop feeding a select-shuffle has been eliminated 
by earlier transforms (or the code was just written like this in the 1st 
place), so we'll fail to match the patterns that have 2 binops from: 
D48401, 
D48678, 
D48662, 
D48485.

In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass 
through the original values of the source operand).

I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. 
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).

Differential Revision: https://reviews.llvm.org/D48830


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 13:44:22 +00:00
Bjorn Pettersson
13e9d31258 [DebugInfo] Corrections for salvageDebugInfo
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).

Consider this example
  %vla = alloca i32, i64 2
  call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())

Instcombine will turn it into
  %vla1 = alloca [2 x i32]
  %vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla, i64 0, i64 0
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())

If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
  %vla1 = alloca [2 x i32]
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())

The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
  call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))

I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid to, unneccessarily, turn a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).

Reviewers: aprantl, vsk

Reviewed By: aprantl, vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336191 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 11:29:00 +00:00
Chandler Carruth
ba89ffcade [PM/LoopUnswitch] Fix PR37651 by correctly invalidating SCEV when
unswitching loops.

Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:

1) We must invalidate any info SCEV has cached before unswitching as we
   may change (or destroy) the loop structure by the act of unswitching,
   and make it hard to recover everything we want to invalidate within
   SCEV.

2) We need to invalidate all of the loops whose CFGs are mutated by the
   unswitching. Notably, this isn't the *entire* loop nest, this is
   every loop contained by the outermost loop reached by an exit block
   relevant to the unswitch.

And we need to do this even when doing trivial unswitching.

I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.

However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.

Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.

Differential Revision: https://reviews.llvm.org/D47624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336183 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 09:13:27 +00:00
Max Kazantsev
6ac04e25a3 [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336172 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 06:23:57 +00:00
Alina Sbirlea
89e8753d7b Replace "Replacable" with "Replaceable". [NFC]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336133 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 18:53:40 +00:00
Farhana Aleen
13f7859c20 [SLP] Recognize min/max pattern using instructions producing same values.
Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.

         %1 = extractelement <2 x i32> %a, i32 0
         %2 = extractelement <2 x i32> %a, i32 1
         %cond = icmp sgt i32 %1, %2
         %3 = extractelement <2 x i32> %a, i32 0
         %4 = extractelement <2 x i32> %a, i32 1
         %select = select i1 %cond, i32 %3, i32 %4

Author: FarhanaAleen

Reviewed By: ABataev, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D47608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:55:31 +00:00
Sanjay Patel
c9a157f7fd [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving 
to where other passes could use that functionality.

Differential Revision: https://reviews.llvm.org/D48662


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:42:29 +00:00
Simon Pilgrim
a40b37f909 [SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost
This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336102 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 13:41:29 +00:00