Commit Graph

9997 Commits

Author SHA1 Message Date
Sanjay Patel
d439e709a9 [InstSimplify] fix copy-paste mistake in test comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302251 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 16:24:58 +00:00
Sanjay Patel
5eef1b5eb5 [InstSimplify] add tests for (icmp X, C1 | icmp X, C2); NFC
These are the 'or' counterparts for the tests added with r300493.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 16:12:05 +00:00
Aditya Kumar
978eb7c50b [LoopIdiom] check for safety while expanding
Loop Idiom recognition was generating memset in a case that
would result generating a division operation to an unsafe location.

Differential Revision: https://reviews.llvm.org/D32674

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302238 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 14:49:45 +00:00
Martin Storsjo
ae5b79d34f [ArgPromotion] Add a testcase for PR32917
Differential Revision: https://reviews.llvm.org/D32882

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302216 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 08:40:24 +00:00
Dehao Chen
c6a0731aa0 Update VP prof metadata during inlining.
Summary: r298270 added profile update logic for branch_weights. This patch implements profile update logic for VP prof metadata too.

Reviewers: eraman, tejohnson, davidxl

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32773

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302209 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 00:47:34 +00:00
Sanjay Patel
0eebc2900c [InstSimplify] add folds for or-of-casted-icmps
The sibling folds for 'and' with casts were added with https://reviews.llvm.org/rL273200.
This is a preliminary step for adding the 'or' variants for the folds added with https://reviews.llvm.org/rL301260.

The reason for the strange form with constant LHS in the 1st test is because there's another missing fold in that
case for the inverted predicate. That should be fixed when we add the ConstantRange functionality for 'or-of-icmps' 
that already exists for 'and-of-icmps'.

I'm hoping to share more code for the and/or cases, so we won't have these differences. This will allow us to remove
code from InstCombine. It's also possible that we can remove some code here in InstSimplify. I think we have some 
duplicated folds because patterns are not matched in a general way.

Differential Revision: https://reviews.llvm.org/D32876


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302189 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 19:51:34 +00:00
Sanjay Patel
151fce62d4 [InstSimplify] add tests for or-of-casted-icmps; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302174 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 17:36:53 +00:00
Easwaran Raman
de37aad1ce [PM] Add ProfileSummaryAnalysis as a required pass in the new pipeline.
Differential revision: https://reviews.llvm.org/D32768

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302170 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 16:58:45 +00:00
Adrian Prantl
5013f6327b Cleanup tests to not share a DISubprogram between multiple Functions.
rdar://problem/31926379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302166 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 16:24:31 +00:00
Matthias Braun
bcd731ca9c strlen-1.ll: Fix test
Change test for `strlen(x) == 0 --> *x == 0` to actually test the
pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302094 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 23:32:51 +00:00
Craig Topper
e6aa7aa29e [InstCombine][KnownBits] Use KnownBits better to detect nsw adds
Change checkRippleForAdd from a heuristic to a full check -
if it is provable that the add does not overflow return true, otherwise false.

Patch by Yoav Ben-Shalom

Differential Revision: https://reviews.llvm.org/D32686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302093 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 23:22:46 +00:00
Elad Cohen
ea59a24717 Support arbitrary address space pointers in masked gather/scatter intrinsics.
Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a
non-default address space pointer we fail with a "Calling a function with a
bad singature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type which will determine the address
space.

Differential revision: https://reviews.llvm.org/D31490



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302018 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 12:28:54 +00:00
Anna Thomas
469af3f37f [Loop Deletion] Delete loops that are never executed
Summary:
Currently, loop deletion deletes loop where the only values
that are used outside the loop are loop-invariant.
This patch adds logic to delete loops where the loop is proven to be
never executed (i.e. the only predecessor of the loop preheader has a
constant conditional branch as terminator, and the preheader is not the
taken target). This will remove loops that become dead after
loop-unswitching generates constant conditional branches.

The next steps are:
1. moving the loop deletion implementation to LoopUtils.
2. Add logic in loop-simplifyCFG which will support changing conditional
constant branches to unconditional branches. If loops become unreachable in this
process, they can be removed using `deleteDeadLoop` function.

Reviewers: chandlerc, efriedma, sanjoy, reames

Reviewed by: sanjoy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302015 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 11:47:11 +00:00
Matt Arsenault
921f454dae Replace hardcoded intrinsic list with speculatable attribute.
No change in which intrinsics should be speculated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301995 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 02:26:10 +00:00
Peter Collingbourne
8d8582cf12 Revert r295861, "[ModuleSummaryAnalysis] Don't crash when referencing unnamed globals."
We should always expect values to be named before running the module summary
analysis (see NameAnonGlobals pass), so it's fine if we crash in that case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301991 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 00:18:48 +00:00
Xinliang David Li
1d482c01f6 [PartialInlining] Add more early filtering
This is a follow up to the previous
inline cost patch for quicker filtering.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301959 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 18:43:21 +00:00
Matt Arsenault
5c80457142 SpeculativeExecution: Stop using whitelist for costs
Just let TTI's cost do this instead of arbitrarily restricting
this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301950 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 18:02:18 +00:00
Matt Arsenault
822c8e2ad2 AMDGPU: Make intrinsics speculatable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301937 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 16:57:44 +00:00
Sanjay Patel
dc1b8f4271 [InstCombine] don't use DeMorgan's Law on integer constants (2nd try)
This was originally checked in here:
https://reviews.llvm.org/rL301923

And reverted here:
https://reviews.llvm.org/rL301924

Because there's a clang test that would fail after this. I fixed/removed the
offending CHECK lines in:
https://reviews.llvm.org/rL301928

So let's try this again. Original commit message:

This is the fold that causes the infinite loop in BoringSSL
(https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c)
when we fix instcombine demanded bits to prefer 'not' ops as in https://reviews.llvm.org/D32255.

There are 2 or 3 problems with dyn_castNotVal, and I don't think we can
reinstate https://reviews.llvm.org/D32255 until dyn_castNotVal is completely eliminated.

1. As shown here, it transforms 'not' into random xor. This transform is harmful to SCEV and codegen because 'not' can often be folded while random xor cannot.
2. It does not transform vector constants. This is actually a good thing, but if you don't believe the above argument, then we shouldn't have excluded vectors.
3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't match the greedy nature of instcombine. If we DeMorganize a pattern that has an extra 'not' in it: ~(~(~X) & Y) --> (~X | ~Y)

  That's just another case of DeMorgan, so we should trust that we'll fold that pattern too: (~X | ~ Y) --> ~(X & Y)

Differential Revision: https://reviews.llvm.org/D32665



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301929 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 15:31:40 +00:00
Sanjay Patel
a30d9c6883 revert r301923 : [InstCombine] don't use DeMorgan's Law on integer constants
There's a clang test that is wrongly using -O1 and failing after this commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301924 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 14:48:23 +00:00
Sanjay Patel
9697c664a6 [InstCombine] don't use DeMorgan's Law on integer constants
This is the fold that causes the infinite loop in BoringSSL 
(https://github.com/google/boringssl/blob/master/crypto/cipher/e_rc2.c) 
when we fix instcombine demanded bits to prefer 'not' ops as in D32255.

There are 2 or 3 problems with dyn_castNotVal, and I don't think we can 
reinstate D32255 until dyn_castNotVal is completely eliminated.
1. As shown here, it transforms 'not' into random xor. This transform is 
   harmful to SCEV and codegen because 'not' can often be folded while 
   random xor cannot.
2. It does not transform vector constants. This is actually a good thing, 
   but if you don't believe the above argument, then we shouldn't have 
   excluded vectors.
3. It tries to avoid transforming not(not(X)). That's nice, but it doesn't
   match the greedy nature of instcombine. If we DeMorganize a pattern 
   that has an extra 'not' in it:
   ~(~(~X) & Y) --> (~X | ~Y)

   That's just another case of DeMorgan, so we should trust that we'll fold
   that pattern too:
   (~X | ~ Y) --> ~(X & Y)

Differential Revision: https://reviews.llvm.org/D32665


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301923 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 14:31:30 +00:00
Xinliang David Li
0a9c93cad3 [PartialInlining] Hook up inline cost analysis
Differential Revision: http://reviews.llvm.org/D32666


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301894 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 02:44:14 +00:00
George Burgess IV
4d11ee489b Revert r301880
This change caused buildbot failures, apparently because we're not
passing around types that InstSimplify is used to seeing. I'm not overly
familiar with InstSimplify, so I'm reverting this until I can figure out
what exactly is wrong.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301885 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 23:54:41 +00:00
George Burgess IV
8bd46914df [InstSimplify] Handle selects of GEPs with 0 offset
In particular (since it wouldn't fit nicely in the summary):
(select (icmp eq V 0) P (getelementptr P V)) -> (getelementptr P V)

Differential Revision: https://reviews.llvm.org/D31435


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301880 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 23:12:08 +00:00
Davide Italiano
f1457bf082 [NewGVN] Don't derive incorrect implications.
In the testcase attached,  we believe %tmp1 implies %tmp4.
where:
  br i1 %tmp1, label %bb2, label %bb7
  br i1 %tmp4, label %bb5, label %bb7

because Wwhile looking at PredicateInfo stuffs we end up calling
isImpliedTrueByMatchingCmp() with the arguments backwards.

Differential Revision:  https://reviews.llvm.org/D32718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301849 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 22:26:28 +00:00
Sanjay Patel
5ef20bbe9c [InstCombine] check one-use before applying DeMorgan nor/nand folds
If we have ~(~X & Y), it only makes sense to transform it to (X | ~Y) when we do not need 
the intermediate (~X & Y) value. In that case, we would need an extra instruction to 
generate ~Y + 'or' (as shown in the test changes).

It's ok if we have multiple uses of ~X or Y, however. In those cases, we may not reduce the
instruction count or critical path, but we might improve throughput because we can generate 
~X and ~Y in parallel. Whether that actually makes perf sense or not for a target is something 
we can't answer in IR.

Differential Revision: https://reviews.llvm.org/D32703


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301848 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 22:25:42 +00:00
Xin Tong
7da1589b90 Take indirect branch into account as well when folding.
We may not be able to rewrite indirect branch target, but we also want to take it into
account when folding, i.e. if it and all its successor's predecessors go to the same
destination, we can fold, i.e. no need to thread.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 17:15:37 +00:00
Xin Tong
fc533c3fb3 [JumpThread] Do RAUW in case Cond folds to a constant in the CFG
Summary: [JumpThread] Do RAUW in case Cond folds to a constant in the CFG

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32407

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301804 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 15:34:17 +00:00
Sanjay Patel
217ed17252 [InstCombine] add multi-use variants for DeMorgan folds; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301802 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 14:52:17 +00:00
Sanjay Patel
ae542a60cb [InstCombine] use FileCheck and auto-generate checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301801 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 14:20:30 +00:00
Sanjay Patel
8e3a89da19 [InstCombine] consolidate more DeMorgan tests; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301800 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 14:10:59 +00:00
Sanjay Patel
07255f2be8 [InstCombine] consolidate tests for DeMorgan folds; NFC
I'm proposing to add tests and change behavior in D32665.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-30 18:57:12 +00:00
Zvi Rackover
778f5177f0 InstructionSimplify: Simplify a shuffle with a undef mask to undef
Summary:
Following the discussion in pr32486, adding the simplification:
 shuffle %x, %y, undef -> undef

Reviewers: spatel, RKSimon, andreadb, davide

Reviewed By: spatel

Subscribers: jroelofs, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D32293

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301764 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-30 06:06:26 +00:00
Akira Hatanaka
e85d77f956 [ObjCARC] Do not move a release between a call and a
retainAutoreleasedReturnValue that retains the returned value.

This commit fixes a bug in ARC optimizer where it moves a release
between a call and a retainAutoreleasedReturnValue, causing the returned
object to be released before the retainAutoreleasedReturnValue can
retain it.

This commit accomplishes that by doing a lookahead and checking whether
the call prevents the release from moving upwards. In the long term, we
should treat the region between the retainAutoreleasedReturnValue and
the call as a critical section and disallow moving anything there
(possibly using operand bundles).

rdar://problem/20449878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301724 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-29 00:23:11 +00:00
Davide Italiano
bbc33ba775 [LoopUnswitch] Don't remove instructions with side effects.
This fixes PR32818.

Differential Revision:  https://reviews.llvm.org/D32664

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301722 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-29 00:12:18 +00:00
Sanjay Patel
337942b20a [InstCombine] add tests to show potentially bogus application of DeMorgan (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301714 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 23:14:33 +00:00
Matt Arsenault
754511fb00 InferAddressSpaces: Search constant expressions for addrspacecasts
These are pretty common when using local memory, and the 64-bit generic
addressing is much more expensive to compute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301711 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 22:52:41 +00:00
Matt Arsenault
afc9030f67 InferAddressSpaces: Infer from just addrspacecasts
Eliminates some more cases where some subset of the addressing
computation remains flat. Some cases with addrspacecasts
in nested constant expressions are still left behind however.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301704 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 22:18:08 +00:00
Matt Arsenault
ba657060a1 [ValueTracking] Teach isSafeToSpeculativelyExecute() about the speculatable attribute
Patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301688 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 21:13:09 +00:00
Jun Bum Lim
98e89d3bb4 [InlineCost] Improve the cost heuristic for Switch
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp             w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp             w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp             w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add             w8, w19, w8
        cmp             w8, #9          // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and             w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add             w8, w19, w8
        cmp             w8, #2          // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp             w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
  -inline-generic-switch-cost=false

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301649 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 16:04:03 +00:00
Teresa Johnson
4a6d61ac00 Memory intrinsic value profile optimization: Avoid divide by 0
Summary:
Skip memops if the total value profiled count is 0, we can't correctly
scale up the counts and there is no point anyway.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301645 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 14:30:54 +00:00
Max Kazantsev
6faf1a3c28 [EarlyCSE] Mark the condition of assume intrinsic as true
EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions.

Reviewers: sanjoy, reames, apilipenko, anna, skatkov

Reviewed By: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32482


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301625 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 06:25:39 +00:00
Max Kazantsev
49ccc8d7cb [EarlyCSE] Remove guards with conditions known to be true
If a condition is calculated only once, and there are multiple guards on this condition, we should be able
to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition
of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and
mark its condition as 'true' for future consideration.

Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin

Reviewed By: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32476


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301623 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 06:05:48 +00:00
Sanjay Patel
402d01739e [InstCombine] fix matcher to bind to specific operand (PR32830)
Matching any random value would be very wrong:
https://bugs.llvm.org/show_bug.cgi?id=32830


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301594 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 21:55:03 +00:00
Chandler Carruth
1d4cf6e01f [PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass.
Currently, this pass only focuses on *trivial* loop unswitching. At that
reduced problem it remains significantly better than the current loop
unswitch:
- Old pass is worse than cubic complexity. New pass is (I think) linear.
- New pass is much simpler in its design by focusing on full unswitching. (See
  below for details on this).
- New pass doesn't carry state for thresholds between pass iterations.
- New pass doesn't carry state for correctness (both miscompile and
  infloop) between pass iterations.
- New pass produces substantially better code after unswitching.
- New pass can handle more trivial unswitch cases.
- New pass doesn't recompute the dominator tree for the entire function
  and instead incrementally updates it.

I've ported all of the trivial unswitching test cases from the old pass
to the new one to make sure that major functionality isn't lost in the
process. For several of the test cases I've worked to improve the
precision and rigor of the CHECKs, but for many I've just updated them
to handle the new IR produced.

My initial motivation was the fact that the old pass carried state in
very unreliable ways between pass iterations, and these mechansims were
incompatible with the new pass manager. However, I discovered many more
improvements to make along the way.

This pass makes two very significant assumptions that enable most of these
improvements:

1) Focus on *full* unswitching -- that is, completely removing whatever
   control flow construct is being unswitched from the loop. In the case
   of trivial unswitching, this means removing the trivial (exiting)
   edge. In non-trivial unswitching, this means removing the branch or
   switch itself. This is in opposition to *partial* unswitching where
   some part of the unswitched control flow remains in the loop. Partial
   unswitching only really applies to switches and to folded branches.
   These are very similar to full unrolling and partial unrolling. The
   full form is an effective canonicalization, the partial form needs
   a complex cost model, cannot be iterated, isn't canonicalizing, and
   should be a separate pass that runs very late (much like unrolling).

2) Leverage LLVM's Loop machinery to the fullest. The original unswitch
   dates from a time when a great deal of LLVM's loop infrastructure was
   missing, ineffective, and/or unreliable. As a consequence, a lot of
   complexity was added which we no longer need.

With these two overarching principles, I think we can build a fast and
effective unswitcher that fits in well in the new PM and in the
canonicalization pipeline. Some of the remaining functionality around
partial unswitching may not be relevant today (not many test cases or
benchmarks I can find) but if they are I'd like to add support for them
as a separate layer that runs very late in the pipeline.

Purely to make reviewing and introducing this code more manageable, I've
split this into first a trivial-unswitch-only pass and in the next patch
I'll add support for full non-trivial unswitching against a *fixed*
threshold, exactly like full unrolling. I even plan to re-use the
unrolling thresholds, as these are incredibly similar cost tradeoffs:
we're cloning a loop body in order to end up with simplified control
flow. We should only do that when the total growth is reasonably small.

One of the biggest changes with this pass compared to the previous one
is that previously, each individual trivial exiting edge from a switch
was unswitched separately as a branch. Now, we unswitch the entire
switch at once, with cases going to the various destinations. This lets
us unswitch multiple exiting edges in a single operation and also avoids
numerous extremely bad behaviors, where we would introduce 1000s of
branches to test for thousands of possible values, all of which would
take the exact same exit path bypassing the loop. Now we will use
a switch with 1000s of cases that can be efficiently lowered into
a jumptable. This avoids relying on somehow forming a switch out of the
branches or getting horrible code if that fails for any reason.

Another significant change is that this pass actively updates the CFG
based on unswitching. For trivial unswitching, this is actually very
easy because of the definition of loop simplified form. Doing this makes
the code coming out of loop unswitch dramatically more friendly. We
still should run loop-simplifycfg (at the least) after this to clean up,
but it will have to do a lot less work.

Finally, this pass makes much fewer attempts to simplify instructions
based on the unswitch. Something like loop-instsimplify, instcombine, or
GVN can be used to do increasingly powerful simplifications based on the
now dominating predicate. The old simplifications are things that
something like loop-instsimplify should get today or a very, very basic
loop-instcombine could get. Keeping that logic separate is a big
simplifying technique.

Most of the code in this pass that isn't in the old one has to do with
achieving specific goals:
- Updating the dominator tree as we go
- Unswitching all cases in a switch in a single step.

I think it is still shorter than just the trivial unswitching code in
the old pass despite having this functionality.

Differential Revision: https://reviews.llvm.org/D32409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301576 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 18:45:20 +00:00
Eli Friedman
9dc569fec5 [GlobalOpt] Correctly update metadata when localizing a global.
Just calling dropAllReferences leaves pointers to the ConstantExpr
behind, so we would eventually crash with a null pointer dereference.

Differential Revision: https://reviews.llvm.org/D32551



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301575 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 18:39:08 +00:00
Xinliang David Li
5a24953af4 [PartialInlining]: Improve partial inlining to handle complex conditions
Differential Revision: http://reviews.llvm.org/D32249


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301561 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 16:34:00 +00:00
Chandler Carruth
bde56a9699 Disable GVN Hoist due to still more bugs being found in it. There is
also a discussion about exactly what we should do prior to re-enabling
it.

The current bug is http://llvm.org/PR32821 and the discussion about this
is in the review thread for r300200.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301505 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 00:28:03 +00:00
Matthew Simpson
86197951a0 [LV] Handle external uses of floating-point induction variables
Reference: https://bugs.llvm.org/show_bug.cgi?id=32758
Differential Revision: https://reviews.llvm.org/D32445

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301428 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-26 16:23:02 +00:00
Craig Topper
4a48147c23 [InstCombine] Add test cases for opportunities to improve knownbits handling for cttz and ctlz intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301385 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-26 05:59:19 +00:00