630 Commits

Author SHA1 Message Date
Nikita Popov
e568120da3 [InstCombine] Fold uadd.sat(a, b) == 0 and usub.sat(a, b) == 0
This adds folds for comparing uadd.sat/usub.sat with zero:

 * uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
 * usub.sat(a, b) == 0 => a <= b

And inverted forms for !=.

Differential Revision: https://reviews.llvm.org/D69224

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375374 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-20 20:19:42 +00:00
Roman Lebedev
32d24a3249 [InstCombine] Shift amount reassociation in shifty sign bit test (PR43595)
Summary:
This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
  This is trivial, and easy to do, we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
  and we can simplify-add their shift amounts together,
  then we likely can just perform them as a single shift.
  But this is finicky, has one-use restrictions,
  and shift opcodes must be identical.

But there is a super-pattern where both of these work together.
to produce sign bit test from two shifts + comparison.
We do indeed already handle this in most cases.
But since we get that fold transitively, it has one-use restrictions.
And what's worse, in this case the right-shifts aren't required to be
identical, and we can't handle that transitively:

If the total shift amount is bitwidth-1, only a sign bit will remain
in the output value. But if we look at this from the perspective of
two shifts, we can't fold - we can't possibly know what bit pattern
we'd produce via two shifts, it will be *some* kind of a mask
produced from original sign bit, but we just can't tell it's shape:
https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN

But it will *only* contain sign bit and zeros.
So from the perspective of sign bit test, we're good:
https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU
Superb!

So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
sudo-analysis mode that will ignore extra-uses, and will only check
whether a) those are two right shifts and b) they end up with bitwidth(x)-1
shift amount and return either the original value that we sign-checking,
or null.

This does not have any functionality change for
the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`.

All that being said, as disscussed in the review, this yet again
increases usage of instsimplify in instcombine as utility.
Some day that may need to be reevaluated.

https://bugs.llvm.org/show_bug.cgi?id=43595

Reviewers: spatel, efriedma, vsk

Reviewed By: spatel

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68930

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375371 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-20 19:38:50 +00:00
Roman Lebedev
322375e17e [InstCombine] Move isSignBitCheck(), handle rest of the predicates
True, no test coverage is being added here. But those non-canonical
predicates that are already handled here already have no test coverage
as far as i can tell. I tried to add tests for them, but all the patterns
already get handled elsewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373962 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 20:53:08 +00:00
Roman Lebedev
98b5e22e78 [InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0'
We do indeed already get it right in some cases, but only transitively,
with one-use restrictions. Since we only need to produce a single
comparison, it makes sense to match the pattern directly:
  https://rise4fun.com/Alive/kPg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373802 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-04 22:16:22 +00:00
Bjorn Pettersson
8f1420e04b [InstCombine] Don't assume CmpInst has been visited in getFlippedStrictnessPredicateAndConstant
Summary:
Removing an assumption (assert) that the CmpInst already has been
simplified in getFlippedStrictnessPredicateAndConstant. Solution is
to simply bail out instead of hitting the assertion. Instead we
assume that any profitable rewrite will happen in the next iteration
of InstCombine.

The reason why we can't assume that the CmpInst already has been
simplified is that the worklist does not guarantee such an ordering.

Solves https://bugs.llvm.org/show_bug.cgi?id=43376

Reviewers: spatel, lebedev.ri

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68022

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372972 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-26 12:16:01 +00:00
Roman Lebedev
864d2f9cf8 [InstCombine] Fold (A - B) u>=/u< A --> B u>/u<= A iff B != 0
https://rise4fun.com/Alive/KtL

This also shows that the fold added in D67412 / r372257
was too specific, and the new fold allows those test cases
to be handled more generically, therefore i delete now-dead code.

This is yet again motivated by
D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372912 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-25 19:06:40 +00:00
Sanjay Patel
7c6724540c [InstCombine] allow icmp+binop folds before min/max bailout (PR43310)
This has the potential to uncover missed analysis/folds as shown in the
min/max code comment/test, but fewer restrictions on icmp folds should
be better in general to solve cases like:
https://bugs.llvm.org/show_bug.cgi?id=43310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372510 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-22 14:31:53 +00:00
Sanjay Patel
49121b4df9 [InstCombine] remove unneeded one-use checks for icmp fold
Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372007 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 16:15:25 +00:00
Sanjay Patel
cba3b2c1c3 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371981 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:54:34 +00:00
Sanjay Patel
1a3c0ac4ea [InstCombine] fix comments to match code; NFC
This blob was written before match() existed, so it
could probably be reduced significantly.

But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371978 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 12:12:05 +00:00
Sanjay Patel
3050f70ce7 [InstCombine] remove unneeded one-use checks for icmp fold
This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371940 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 20:56:34 +00:00
Sanjay Patel
3813551fb7 [InstCombine] fold sign-bit compares of srem
(srem X, pow2C) sgt/slt 0 can be reduced using bit hacks by masking
off the sign bit and the module (low) bits:
https://rise4fun.com/Alive/jSO
A '2' divisor allows slightly more folding:
https://rise4fun.com/Alive/tDBM

Any chance to remove an 'srem' use is probably worthwhile, but this is limited
to the one-use improvement case because doing more may expose other missing
folds. That means it does nothing for PR21929 yet:
https://bugs.llvm.org/show_bug.cgi?id=21929

Differential Revision: https://reviews.llvm.org/D67334

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371610 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-11 12:04:26 +00:00
Matt Arsenault
b3ddf6ec6b InstCombine: Fix crash on icmp of gep with addrspacecasted null
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371146 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-05 23:39:21 +00:00
Roman Lebedev
38612e9768 [InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned sub overflow' check
A follow-up for r329011.
This may be changed to produce @llvm.sub.with.overflow in a later patch,
but for now just make things more consistent overall.

A few observations stem from this:
* There does not seem to be a similar one-instruction fold for uadd-overflow
* I'm not sure we'll want to canonicalize `B u> A` as `usub.with.overflow`,
  so since the `icmp` here no longer refers to `sub`,
  reconstructing `usub.with.overflow` will be problematic,
  and will likely require standalone pass (similar to DivRemPairs).

https://rise4fun.com/Alive/Zqs

Name: (A - B) u> A --> B u> A
  %t0 = sub i8 %A, %B
  %r = icmp ugt i8 %t0, %A
=>
  %r = icmp ugt i8 %B, %A

Name: (A - B) u<= A --> B u<= A
  %t0 = sub i8 %A, %B
  %r = icmp ule i8 %t0, %A
=>
  %r = icmp ule i8 %B, %A

Name: C u< (C - D) --> C u< D
  %t0 = sub i8 %C, %D
  %r = icmp ult i8 %C, %t0
=>
  %r = icmp ult i8 %C, %D

Name: C u>= (C - D) --> C u>= D
  %t0 = sub i8 %C, %D
  %r = icmp uge i8 %C, %t0
=>
  %r = icmp uge i8 %C, %D

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371101 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-05 17:41:02 +00:00
Roman Lebedev
a6c7a7dd8d [InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned add overflow' check
A follow-up for r342004.
This will be changed to produce @llvm.add.with.overflow in a later patch,
but for now just make things more consistent overall.

https://rise4fun.com/Alive/qxE

Name: (Op1 + X) u< Op1 --> ~Op1 u< X
  %t0 = add i8 %Op1, %X
  %r = icmp ult i8 %t0, %Op1
=>
  %n = xor i8 %Op1, -1
  %r = icmp ult i8 %n, %X

Name: (Op1 + X) u>= Op1 --> ~Op1 u>= X
  %t0 = add i8 %Op1, %X
  %r = icmp uge i8 %t0, %Op1
=>
  %n = xor i8 %Op1, -1
  %r = icmp uge i8 %n, %X

;-------------------------------------------------------------------------------

Name: Op0 u> (Op0 + X) --> X u> ~Op0
  %t0 = add i8 %Op0, %X
  %r = icmp ugt i8 %Op0, %t0
=>
  %n = xor i8 %Op0, -1
  %r = icmp ugt i8 %X, %n

Name: Op0 u<= (Op0 + X) --> X u<= ~Op0
  %t0 = add i8 %Op0, %X
  %r = icmp ule i8 %Op0, %t0
=>
  %n = xor i8 %Op0, -1
  %r = icmp ule i8 %X, %n

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371100 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-05 17:40:49 +00:00
Roman Lebedev
1bb1b5ebe5 [InstCombine] Fold '((%x * %y) u/ %x) != %y' to '@llvm.umul.with.overflow' + overflow bit extraction
Summary:
`((%x * %y) u/ %x) != %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
$ /repositories/alive2/build-Clang-unknown/alive -root-only ~/llvm-patch1.ll
Processing /home/lebedevri/llvm-patch1.ll..

----------------------------------------
Name: no overflow
  %o0 = mul i4 %y, %x
  %o1 = udiv i4 %o0, %x
  %r = icmp ne i4 %o1, %y
  ret i1 %r
=>
  %n0 = umul_overflow i4 %x, %y
  %o0 = extractvalue {i4, i1} %n0, 0
  %o1 = udiv %o0, %x
  %r = extractvalue {i4, i1} %n0, 1
  ret %r

Done: 1
Optimization is correct!

----------------------------------------
Name: no overflow
  %o0 = mul i4 %y, %x
  %o1 = udiv i4 %o0, %x
  %r = icmp eq i4 %o1, %y
  ret i1 %r
=>
  %n0 = umul_overflow i4 %x, %y
  %o0 = extractvalue {i4, i1} %n0, 0
  %o1 = udiv %o0, %x
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1
  ret i1 %r

Done: 1
Optimization is correct!

```

Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370348 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-29 12:47:20 +00:00
Roman Lebedev
817cb7ab7f [InstCombine] Fold '(-1 u/ %x) u< %y' to '@llvm.umul.with.overflow' + overflow bit extraction
Summary:
`(-1 u/ %x) u< %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
----------------------------------------
Name: no overflow
  %o0 = udiv i4 -1, %x
  %r = icmp ult i4 %o0, %y
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %n0, 1

Done: 1
Optimization is correct!

----------------------------------------
Name: no overflow, swapped
  %o0 = udiv i4 -1, %x
  %r = icmp ugt i4 %y, %o0
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %n0, 1

Done: 1
Optimization is correct!

----------------------------------------
Name: overflow
  %o0 = udiv i4 -1, %x
  %r = icmp uge i4 %o0, %y
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1

Done: 1
Optimization is correct!

----------------------------------------
Name: overflow
  %o0 = udiv i4 -1, %x
  %r = icmp ule i4 %y, %o0
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1

Done: 1
Optimization is correct!
```

As it can be observed from tests, while simply forming the `@llvm.umul.with.overflow`
is easy, if we were looking for the inverted answer, then more work needs to be done
to cleanup the now-pointless control-flow that was guarding against division-by-zero.
This is being addressed in follow-up patches.

Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon

Reviewed By: nikic, xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370347 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-29 12:47:08 +00:00
Roman Lebedev
f2cb8a934d [InstCombine] Shift amount reassociation in bittest: trunc-of-lshr (PR42399)
Summary:
Finally, the fold i was looking forward to :)

The legality check is muddy, i doubt  i've groked the full generalization,
but it handles all the cases i care about, and can come up with:
https://rise4fun.com/Alive/26j

I.e. we can perform the fold if **any** of the following is true:
* The shift amount is either zero or one less than widest bitwidth
* Either of the values being shifted has at most lowest bit set
* The value that is being shifted by `shl` (which is not truncated) should have no less leading zeros than the total shift amount;
* The value that is being shifted by `lshr` (which **is** truncated) should have no less leading zeros than the widest bit width minus total shift amount minus one

I strongly suspect there is some better generalization, but i'm not aware of it as of right now.
For now i also avoided using actual `computeKnownBits()`, but restricted it to constants.

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370324 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-29 10:26:23 +00:00
Simon Pilgrim
63cd984c94 Fix variable set but no used warning on NDEBUG builds. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370317 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-29 09:58:47 +00:00
Craig Topper
220ab2c056 [InstCombine] Disable recursion in foldGEPICmp for vector pointer GEPs
Due to missing vector support in this function, recursion can
generate worse code in some cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370221 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-28 15:40:34 +00:00
Craig Topper
63ba0bd5d7 [InstCombine] Disable some portions of foldGEPICmp for GEPs that return a vector of pointers. Fix other portions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370114 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-27 21:38:56 +00:00
Philip Reames
bcb37b91cc [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null for vectors
Extend the transform introduced in https://reviews.llvm.org/D66608 to work for vector geps as well.

Differential Revision: https://reviews.llvm.org/D66671



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369949 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-26 19:11:49 +00:00
Roman Lebedev
17bb71c533 [InstCombine] matchThreeWayIntCompare(): commutativity awareness
Summary:
`matchThreeWayIntCompare()` looks for
```
   select i1 (a == b),
          i32 Equal,
          i32 (select i1 (a < b), i32 Less, i32 Greater)
```
but both of these selects/compares can be in it's commuted form,
so out of 8 variants, only the two most basic ones is handled.
This fixes regression being introduced in D66232.

Reviewers: spatel, nikic, efriedma, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369841 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-24 06:49:36 +00:00
Roman Lebedev
2f0aeb5b36 [InstCombine] Try to reuse constant from select in leading comparison
Summary:
If we have e.g.:
```
  %t = icmp ult i32 %x, 65536
  %r = select i1 %t, i32 %y, i32 65535
```
the constants `65535` and `65536` are suspiciously close.
We could perform a transformation to deduplicate them:
```
Name: ult
%t = icmp ult i32 %x, 65536
%r = select i1 %t, i32 %y, i32 65535
  =>
%t.inv = icmp ugt i32 %x, 65535
%r = select i1 %t.inv, i32 65535, i32 %y
```
https://rise4fun.com/Alive/avb

While this may seem esoteric, this should certainly be good for vectors
(less constant pool usage) and for opt-for-size - need to have only one constant.

But the real fun part here is that it allows further transformation,
in particular it finishes cleaning up the `clamp` folding,
see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`.
We start with e.g.
```
  %dont_need_to_clamp_positive = icmp sle i32 %X, 32767
  %dont_need_to_clamp_negative = icmp sge i32 %X, -32768
  %clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767
  %dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative
  %R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit
```
without this patch we currently produce
```
  %1 = icmp slt i32 %X, 32768
  %2 = icmp sgt i32 %X, -32768
  %3 = select i1 %2, i32 %X, i32 -32768
  %R = select i1 %1, i32 %3, i32 32767
```
which isn't really a `clamp` - both comparisons are performed on the original value,
this patch changes it into
```
  %1.inv = icmp sgt i32 %X, 32767
  %2 = icmp sgt i32 %X, -32768
  %3 = select i1 %2, i32 %X, i32 -32768
  %R = select i1 %1.inv, i32 32767, i32 %3
```
and then the magic happens! Some further transform finishes polishing it and we finally get:
```
  %t1 = icmp sgt i32 %X, -32768
  %t2 = select i1 %t1, i32 %X, i32 -32768
  %t3 = icmp slt i32 %t2, 32767
  %R = select i1 %t3, i32 %t2, i32 32767
```
which is beautiful and just what we want.

Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization:
https://rise4fun.com/Alive/THl
Proofs for the fold itself: https://rise4fun.com/Alive/THl

Reviewers: spatel, dmgreen, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369840 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-24 06:49:25 +00:00
Philip Reames
dfdde6d06e Fix a bug in just submitted rL369789
Started implementing the vector case and realized the scalar case hadn't handled the GEP producing a different type than the base correctly.  It's entertaining seeing what slips through review when we're focused on the 'hard' parts.  :(

Also adding an extra vector test as it happened to be in workspace and wasn't worth separating.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369795 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-23 18:27:57 +00:00
Philip Reames
16ed7f8f59 [InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null
This generalizes the isGEPKnownNonNull rule from ValueTracking to apply when we do not know if the base is non-null, and thus need to replace one condition with another.

The core notion is that since an inbounds GEP can only form null if the base pointer is null and the offset is zero. However, if the offset is non-zero, the the "inbounds" marker makes the result poison. Thus, we're free to ignore the case where the offset is non-zero. Similarly, there's no case under which a non-null base can result in a null result without generating poison.

Differential Revision: https://reviews.llvm.org/D66608



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369789 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-23 17:58:58 +00:00
Philip Reames
59c28fddaa [instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0
Noticed while looking at pr43028.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369541 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-21 15:51:57 +00:00
Sanjay Patel
fdfa6e2423 [InstCombine] narrow icmp with extended operands of different widths
An intermediate extend is used to widen the narrow operand to the width of
the other (wider) operand. At that point, we have the same logic as the
existing transform that was restricted to folds of equal width zext/sext.

This mostly solves PR42700:
https://bugs.llvm.org/show_bug.cgi?id=42700

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369519 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-21 11:56:08 +00:00
Sanjay Patel
e687343b4c [InstCombine] add helper function for icmp+zext/sext; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369421 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-20 18:15:17 +00:00
Sanjay Patel
f066c9f731 [InstCombine] make fold for icmp with sext more efficient; NFC
We were creating 2 instructions and relying on a subsequent fold
to invert a not(icmp). Create the final icmp directly instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369411 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-20 17:03:22 +00:00
Sanjay Patel
afb27048eb [InstCombine] improve readability for icmp with cast folds; NFC
1. Update function name and stale code comments.
2. Use variable names that are less ambiguous.
3. Move operand checks into the function as early exits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369390 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-20 14:56:44 +00:00
Roman Lebedev
34e84970cc [InstCombine] Cherry-pick NFC cleanups of foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369207 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-18 12:26:33 +00:00
Roman Lebedev
2f2b56c4d7 [InstCombine] Shift amount reassociation in bittest: trunc-of-shl (PR42399)
Summary:
This is continuation of D63829 / https://bugs.llvm.org/show_bug.cgi?id=42399

I thought naive pattern would solve my issue, but nope, it involved truncation,
thus more folds needed.. This isn't really the fold i'm interested in,
i need trunc-of-lshr, but i'we decided to start with `shl` because it's simpler.

In this case, no extra legality checks are needed:
https://rise4fun.com/Alive/CAb

We should be careful about not increasing instruction count,
since we need to produce `zext` because `and` is done in wider type.

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369117 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-16 15:10:41 +00:00
Roman Lebedev
d1ae485811 [InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI
I'd like to use it elsewhere, hopefully without reinventing the wheel.
No functional change intended so far.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368820 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-14 09:57:20 +00:00
Roman Lebedev
fcd00601e2 [InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962)
Instead of matching value and then blindly casting to BinaryOperator
just to get the opcode, just match instruction and do no cast.

Fixes https://bugs.llvm.org/show_bug.cgi?id=42962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368554 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-12 11:28:02 +00:00
Roman Lebedev
8115d87edc [InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, )
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368521 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-10 19:29:10 +00:00
Roman Lebedev
80815a1a90 [InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant
If one of the values being shifted is a constant, since the new shift
amount is known-constant, the new shift will end up being constant-folded
so, we don't need that one-use restriction then.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368519 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-10 19:28:54 +00:00
Roman Lebedev
2a686e81e5 [InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction
That one-use restriction is not needed for correctness - we have already
ensured that one of the shifts will go away, so we know we won't increase
the instruction count. So there is no need for that restriction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368518 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-10 19:28:44 +00:00
Roman Lebedev
64018f7f99 [InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two
Summary:
I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling.

While we did happen to handle this fold in most of the cases,
the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two),
or first turn `x s% -y` to `x u% y`; that does handle most of the cases.
But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`,
and thus we end up being stuck with `(x s% INT_MIN) == 0`.

There is no such restriction for the more general fold:
https://rise4fun.com/Alive/IIeS

To be noted, the fold does not enforce that `y` is a constant,
so it may indeed increase instruction count.
This is consistent with what `x u% y`->`x & (y-1)` already does.
I think it makes sense, it's at most one (simple) extra instruction,
while `rem`ainder is really much more un-simple (and likely **very** costly).

Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367322 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-30 15:28:22 +00:00
Roman Lebedev
e90a51722f [PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::Predicate
As discussed in the original review, this may be useful,
so let's just do it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365652 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-10 16:07:35 +00:00
Sanjay Patel
3bfa6ac3a8 [InstCombine] reduce more checks for power-of-2-or-zero using ctpop
Extends the transform from:
rL364341
...to include another (more common?) pattern that tests whether a
value is a power-of-2 (including or excluding zero).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364856 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 22:00:00 +00:00
Roman Lebedev
458204d465 [InstCombine] Shift amount reassociation in bittest (PR42399)
Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0`  iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364791 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 15:55:15 +00:00
Roman Lebedev
1c1bc8778e [InstCombine] Omit 'urem' where possible
This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364738 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-01 09:41:43 +00:00
Huihui Zhang
ab07556fab [InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0.
Simplify 'shl' inequality test into 'and' equality test.

This pattern happens in the middle-end while simplifying bitfield access,
Exposed in https://reviews.llvm.org/D63505

https://rise4fun.com/Alive/6uz

Reviewers: lebedev.ri, efriedma

Reviewed By: lebedev.ri

Subscribers: spatel, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364348 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 20:44:52 +00:00
Sanjay Patel
08e75cad94 [InstCombine] reduce checks for power-of-2-or-zero using ctpop
This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero.

This is matching the isPowerOf2() patterns used in PR42314:
https://bugs.llvm.org/show_bug.cgi?id=42314

But there's at least 1 instcombine follow-up needed to match the alternate form:

(v & (v - 1)) == 0;

We should have all of the backend expansions handled with:
rL364319
(x86-specific changes still needed for optimal code based on subtarget)

And the larger patterns to exclude zero as a power-of-2 are joining with this change after:
rL364153 ( D63660 )
rL364246

Differential Revision: https://reviews.llvm.org/D63777

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364341 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 18:51:44 +00:00
Huihui Zhang
e4223f994b [InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier.
Summary:
To generate simplified IR, make sure fold
  (X & ~C) ==/!= 0 --> X u</u>= C+1

is scheduled before fold
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.

https://rise4fun.com/Alive/7ZN

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364255 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 00:09:10 +00:00
Sanjay Patel
3575a5c66f [InstCombine] fix typo in comment; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363974 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-20 20:23:32 +00:00
Sanjay Patel
0e1f2dd708 [InstCombine] canonicalize check for power-of-2
The form that compares against 0 is better because:
1. It removes a use of the input value.
2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2
3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS).

This is a root cause for PR42314, but probably doesn't completely answer the codegen request:
https://bugs.llvm.org/show_bug.cgi?id=42314

Alive proof:
https://rise4fun.com/Alive/9kG

  Name: is power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp eq i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp eq i32 %a2, 0

  Name: is not power-of-2
  %neg = sub i32 0, %x
  %a = and i32 %neg, %x
  %r = icmp ne i32 %a, %x
  =>
  %dec = add i32 %x, -1
  %a2 = and i32 %dec, %x
  %r = icmp ne i32 %a2, 0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363956 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-20 17:41:15 +00:00
Huihui Zhang
3104139a5b [InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier
Summary:
To generate simplified IR, make sure fold
```
  (X & signbit) ==/!= 0) -> X s>=/s< 0;
```
is scheduled before fold
```
  ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0.
```

https://rise4fun.com/Alive/fbdh

Reviewers: lebedev.ri, efriedma, spatel, craig.topper

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363845 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 17:31:39 +00:00
Roman Lebedev
1376d1f533 [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles
A precondition 'x != 0' was forgotten by me:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL

These 4 folds with non-constants could be re-enabled,
but for now let's go for the simplest solution.

https://bugs.llvm.org/show_bug.cgi?id=42198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362911 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-09 16:30:42 +00:00