archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Nikita Popov	e568120da3	[InstCombine] Fold uadd.sat(a, b) == 0 and usub.sat(a, b) == 0 This adds folds for comparing uadd.sat/usub.sat with zero: * uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a \| b) == 0 * usub.sat(a, b) == 0 => a <= b And inverted forms for !=. Differential Revision: https://reviews.llvm.org/D69224 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375374 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 20:19:42 +00:00
Roman Lebedev	32d24a3249	[InstCombine] Shift amount reassociation in shifty sign bit test (PR43595) Summary: This problem consists of several parts: * Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`. This is trivial, and easy to do, we have a fold for it. * Shift amount reassociation - if we have two identical shifts, and we can simplify-add their shift amounts together, then we likely can just perform them as a single shift. But this is finicky, has one-use restrictions, and shift opcodes must be identical. But there is a super-pattern where both of these work together. to produce sign bit test from two shifts + comparison. We do indeed already handle this in most cases. But since we get that fold transitively, it has one-use restrictions. And what's worse, in this case the right-shifts aren't required to be identical, and we can't handle that transitively: If the total shift amount is bitwidth-1, only a sign bit will remain in the output value. But if we look at this from the perspective of two shifts, we can't fold - we can't possibly know what bit pattern we'd produce via two shifts, it will be some kind of a mask produced from original sign bit, but we just can't tell it's shape: https://rise4fun.com/Alive/cM0 https://rise4fun.com/Alive/9IN But it will only contain sign bit and zeros. So from the perspective of sign bit test, we're good: https://rise4fun.com/Alive/FRz https://rise4fun.com/Alive/qBU Superb! So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a sudo-analysis mode that will ignore extra-uses, and will only check whether a) those are two right shifts and b) they end up with bitwidth(x)-1 shift amount and return either the original value that we sign-checking, or null. This does not have any functionality change for the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`. All that being said, as disscussed in the review, this yet again increases usage of instsimplify in instcombine as utility. Some day that may need to be reevaluated. https://bugs.llvm.org/show_bug.cgi?id=43595 Reviewers: spatel, efriedma, vsk Reviewed By: spatel Subscribers: xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68930 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375371 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-20 19:38:50 +00:00
Roman Lebedev	322375e17e	[InstCombine] Move isSignBitCheck(), handle rest of the predicates True, no test coverage is being added here. But those non-canonical predicates that are already handled here already have no test coverage as far as i can tell. I tried to add tests for them, but all the patterns already get handled elsewhere. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373962 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-07 20:53:08 +00:00
Roman Lebedev	98b5e22e78	[InstCombine] Fold 'icmp eq/ne (?trunc (lshr/ashr %x, bitwidth(x)-1)), 0' -> 'icmp sge/slt %x, 0' We do indeed already get it right in some cases, but only transitively, with one-use restrictions. Since we only need to produce a single comparison, it makes sense to match the pattern directly: https://rise4fun.com/Alive/kPg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373802 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-04 22:16:22 +00:00
Bjorn Pettersson	8f1420e04b	[InstCombine] Don't assume CmpInst has been visited in getFlippedStrictnessPredicateAndConstant Summary: Removing an assumption (assert) that the CmpInst already has been simplified in getFlippedStrictnessPredicateAndConstant. Solution is to simply bail out instead of hitting the assertion. Instead we assume that any profitable rewrite will happen in the next iteration of InstCombine. The reason why we can't assume that the CmpInst already has been simplified is that the worklist does not guarantee such an ordering. Solves https://bugs.llvm.org/show_bug.cgi?id=43376 Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68022 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372972 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-26 12:16:01 +00:00
Roman Lebedev	864d2f9cf8	[InstCombine] Fold (A - B) u>=/u< A --> B u>/u<= A iff B != 0 https://rise4fun.com/Alive/KtL This also shows that the fold added in D67412 / r372257 was too specific, and the new fold allows those test cases to be handled more generically, therefore i delete now-dead code. This is yet again motivated by D67122 "[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372912 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-25 19:06:40 +00:00
Sanjay Patel	7c6724540c	[InstCombine] allow icmp+binop folds before min/max bailout (PR43310) This has the potential to uncover missed analysis/folds as shown in the min/max code comment/test, but fewer restrictions on icmp folds should be better in general to solve cases like: https://bugs.llvm.org/show_bug.cgi?id=43310 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372510 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-22 14:31:53 +00:00
Sanjay Patel	49121b4df9	[InstCombine] remove unneeded one-use checks for icmp fold Related folds were added in: rL125734 ...the code comment about register pressure is discussed in more detail in: https://bugs.llvm.org/show_bug.cgi?id=2698 But 10 years later, perf testing bzip2 with this change now shows a slight (0.2% average) improvement on Haswell although that's probably within test noise. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. rL371940 and rL371981 are related patches in this series. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372007 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 16:15:25 +00:00
Sanjay Patel	cba3b2c1c3	[InstCombine] remove unneeded one-use checks for icmp fold This fold and several others were added in: rL125734 <https://reviews.llvm.org/rL125734> ...with no explanation for the one-use checks other than the code comments about register pressure. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. rL371940 is a related patch in this series. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371981 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 12:54:34 +00:00
Sanjay Patel	1a3c0ac4ea	[InstCombine] fix comments to match code; NFC This blob was written before match() existed, so it could probably be reduced significantly. But I suspect it isn't well tested, so tests would have to be added to reduce risk from logic changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371978 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-16 12:12:05 +00:00
Sanjay Patel	3050f70ce7	[InstCombine] remove unneeded one-use checks for icmp fold This fold and several others were added in: rL125734 ...with no explanation for the one-use checks other than the code comments about register pressure. Given that this is IR canonicalization, we shouldn't be worried about register pressure though; the backend should be able to adjust for that as needed. There are similar checks as noted with the TODO comments. I'm hoping to remove those restrictions too, but if any of these does cause a regression, it should be easier to correct by making small, individual commits. This is part of solving PR43310 the theoretically right way: https://bugs.llvm.org/show_bug.cgi?id=43310 ...ie, if we don't cripple basic transforms, then we won't need to add special-case code to detect larger patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371940 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-15 20:56:34 +00:00
Sanjay Patel	3813551fb7	[InstCombine] fold sign-bit compares of srem (srem X, pow2C) sgt/slt 0 can be reduced using bit hacks by masking off the sign bit and the module (low) bits: https://rise4fun.com/Alive/jSO A '2' divisor allows slightly more folding: https://rise4fun.com/Alive/tDBM Any chance to remove an 'srem' use is probably worthwhile, but this is limited to the one-use improvement case because doing more may expose other missing folds. That means it does nothing for PR21929 yet: https://bugs.llvm.org/show_bug.cgi?id=21929 Differential Revision: https://reviews.llvm.org/D67334 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371610 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-11 12:04:26 +00:00
Matt Arsenault	b3ddf6ec6b	InstCombine: Fix crash on icmp of gep with addrspacecasted null git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371146 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-05 23:39:21 +00:00
Roman Lebedev	38612e9768	[InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned sub overflow' check A follow-up for r329011. This may be changed to produce @llvm.sub.with.overflow in a later patch, but for now just make things more consistent overall. A few observations stem from this: * There does not seem to be a similar one-instruction fold for uadd-overflow * I'm not sure we'll want to canonicalize `B u> A` as `usub.with.overflow`, so since the `icmp` here no longer refers to `sub`, reconstructing `usub.with.overflow` will be problematic, and will likely require standalone pass (similar to DivRemPairs). https://rise4fun.com/Alive/Zqs Name: (A - B) u> A --> B u> A %t0 = sub i8 %A, %B %r = icmp ugt i8 %t0, %A => %r = icmp ugt i8 %B, %A Name: (A - B) u<= A --> B u<= A %t0 = sub i8 %A, %B %r = icmp ule i8 %t0, %A => %r = icmp ule i8 %B, %A Name: C u< (C - D) --> C u< D %t0 = sub i8 %C, %D %r = icmp ult i8 %C, %t0 => %r = icmp ult i8 %C, %D Name: C u>= (C - D) --> C u>= D %t0 = sub i8 %C, %D %r = icmp uge i8 %C, %t0 => %r = icmp uge i8 %C, %D git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371101 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-05 17:41:02 +00:00
Roman Lebedev	a6c7a7dd8d	[InstCombine] foldICmpBinOp(): consider inverted check in 'unsigned add overflow' check A follow-up for r342004. This will be changed to produce @llvm.add.with.overflow in a later patch, but for now just make things more consistent overall. https://rise4fun.com/Alive/qxE Name: (Op1 + X) u< Op1 --> ~Op1 u< X %t0 = add i8 %Op1, %X %r = icmp ult i8 %t0, %Op1 => %n = xor i8 %Op1, -1 %r = icmp ult i8 %n, %X Name: (Op1 + X) u>= Op1 --> ~Op1 u>= X %t0 = add i8 %Op1, %X %r = icmp uge i8 %t0, %Op1 => %n = xor i8 %Op1, -1 %r = icmp uge i8 %n, %X ;------------------------------------------------------------------------------- Name: Op0 u> (Op0 + X) --> X u> ~Op0 %t0 = add i8 %Op0, %X %r = icmp ugt i8 %Op0, %t0 => %n = xor i8 %Op0, -1 %r = icmp ugt i8 %X, %n Name: Op0 u<= (Op0 + X) --> X u<= ~Op0 %t0 = add i8 %Op0, %X %r = icmp ule i8 %Op0, %t0 => %n = xor i8 %Op0, -1 %r = icmp ule i8 %X, %n git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371100 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-05 17:40:49 +00:00
Roman Lebedev	1bb1b5ebe5	[InstCombine] Fold '((%x * %y) u/ %x) != %y' to '@llvm.umul.with.overflow' + overflow bit extraction Summary: `((%x * %y) u/ %x) != %y` is one of (3?) common ways to check that some unsigned multiplication (will not) overflow. Currently, we don't catch it. We could: ``` $ /repositories/alive2/build-Clang-unknown/alive -root-only ~/llvm-patch1.ll Processing /home/lebedevri/llvm-patch1.ll.. ---------------------------------------- Name: no overflow %o0 = mul i4 %y, %x %o1 = udiv i4 %o0, %x %r = icmp ne i4 %o1, %y ret i1 %r => %n0 = umul_overflow i4 %x, %y %o0 = extractvalue {i4, i1} %n0, 0 %o1 = udiv %o0, %x %r = extractvalue {i4, i1} %n0, 1 ret %r Done: 1 Optimization is correct! ---------------------------------------- Name: no overflow %o0 = mul i4 %y, %x %o1 = udiv i4 %o0, %x %r = icmp eq i4 %o1, %y ret i1 %r => %n0 = umul_overflow i4 %x, %y %o0 = extractvalue {i4, i1} %n0, 0 %o1 = udiv %o0, %x %n1 = extractvalue {i4, i1} %n0, 1 %r = xor %n1, -1 ret i1 %r Done: 1 Optimization is correct! ``` Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65144 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370348 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-29 12:47:20 +00:00
Roman Lebedev	817cb7ab7f	[InstCombine] Fold '(-1 u/ %x) u< %y' to '@llvm.umul.with.overflow' + overflow bit extraction Summary: `(-1 u/ %x) u< %y` is one of (3?) common ways to check that some unsigned multiplication (will not) overflow. Currently, we don't catch it. We could: ``` ---------------------------------------- Name: no overflow %o0 = udiv i4 -1, %x %r = icmp ult i4 %o0, %y => %o0 = udiv i4 -1, %x %n0 = umul_overflow i4 %x, %y %r = extractvalue {i4, i1} %n0, 1 Done: 1 Optimization is correct! ---------------------------------------- Name: no overflow, swapped %o0 = udiv i4 -1, %x %r = icmp ugt i4 %y, %o0 => %o0 = udiv i4 -1, %x %n0 = umul_overflow i4 %x, %y %r = extractvalue {i4, i1} %n0, 1 Done: 1 Optimization is correct! ---------------------------------------- Name: overflow %o0 = udiv i4 -1, %x %r = icmp uge i4 %o0, %y => %o0 = udiv i4 -1, %x %n0 = umul_overflow i4 %x, %y %n1 = extractvalue {i4, i1} %n0, 1 %r = xor %n1, -1 Done: 1 Optimization is correct! ---------------------------------------- Name: overflow %o0 = udiv i4 -1, %x %r = icmp ule i4 %y, %o0 => %o0 = udiv i4 -1, %x %n0 = umul_overflow i4 %x, %y %n1 = extractvalue {i4, i1} %n0, 1 %r = xor %n1, -1 Done: 1 Optimization is correct! ``` As it can be observed from tests, while simply forming the `@llvm.umul.with.overflow` is easy, if we were looking for the inverted answer, then more work needs to be done to cleanup the now-pointless control-flow that was guarding against division-by-zero. This is being addressed in follow-up patches. Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon Reviewed By: nikic, xbolva00 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65143 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370347 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-29 12:47:08 +00:00
Roman Lebedev	f2cb8a934d	[InstCombine] Shift amount reassociation in bittest: trunc-of-lshr (PR42399) Summary: Finally, the fold i was looking forward to :) The legality check is muddy, i doubt i've groked the full generalization, but it handles all the cases i care about, and can come up with: https://rise4fun.com/Alive/26j I.e. we can perform the fold if any of the following is true: * The shift amount is either zero or one less than widest bitwidth * Either of the values being shifted has at most lowest bit set * The value that is being shifted by `shl` (which is not truncated) should have no less leading zeros than the total shift amount; * The value that is being shifted by `lshr` (which is truncated) should have no less leading zeros than the widest bit width minus total shift amount minus one I strongly suspect there is some better generalization, but i'm not aware of it as of right now. For now i also avoided using actual `computeKnownBits()`, but restricted it to constants. Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370324 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-29 10:26:23 +00:00
Simon Pilgrim	63cd984c94	Fix variable set but no used warning on NDEBUG builds. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370317 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-29 09:58:47 +00:00
Craig Topper	220ab2c056	[InstCombine] Disable recursion in foldGEPICmp for vector pointer GEPs Due to missing vector support in this function, recursion can generate worse code in some cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370221 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-28 15:40:34 +00:00
Craig Topper	63ba0bd5d7	[InstCombine] Disable some portions of foldGEPICmp for GEPs that return a vector of pointers. Fix other portions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370114 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-27 21:38:56 +00:00
Philip Reames	bcb37b91cc	[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null for vectors Extend the transform introduced in https://reviews.llvm.org/D66608 to work for vector geps as well. Differential Revision: https://reviews.llvm.org/D66671 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369949 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-26 19:11:49 +00:00
Roman Lebedev	17bb71c533	[InstCombine] matchThreeWayIntCompare(): commutativity awareness Summary: `matchThreeWayIntCompare()` looks for ``` select i1 (a == b), i32 Equal, i32 (select i1 (a < b), i32 Less, i32 Greater) ``` but both of these selects/compares can be in it's commuted form, so out of 8 variants, only the two most basic ones is handled. This fixes regression being introduced in D66232. Reviewers: spatel, nikic, efriedma, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66607 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369841 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-24 06:49:36 +00:00
Roman Lebedev	2f0aeb5b36	[InstCombine] Try to reuse constant from select in leading comparison Summary: If we have e.g.: ``` %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 ``` the constants `65535` and `65536` are suspiciously close. We could perform a transformation to deduplicate them: ``` Name: ult %t = icmp ult i32 %x, 65536 %r = select i1 %t, i32 %y, i32 65535 => %t.inv = icmp ugt i32 %x, 65535 %r = select i1 %t.inv, i32 65535, i32 %y ``` https://rise4fun.com/Alive/avb While this may seem esoteric, this should certainly be good for vectors (less constant pool usage) and for opt-for-size - need to have only one constant. But the real fun part here is that it allows further transformation, in particular it finishes cleaning up the `clamp` folding, see e.g. `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`. We start with e.g. ``` %dont_need_to_clamp_positive = icmp sle i32 %X, 32767 %dont_need_to_clamp_negative = icmp sge i32 %X, -32768 %clamp_limit = select i1 %dont_need_to_clamp_positive, i32 -32768, i32 32767 %dont_need_to_clamp = and i1 %dont_need_to_clamp_positive, %dont_need_to_clamp_negative %R = select i1 %dont_need_to_clamp, i32 %X, i32 %clamp_limit ``` without this patch we currently produce ``` %1 = icmp slt i32 %X, 32768 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1, i32 %3, i32 32767 ``` which isn't really a `clamp` - both comparisons are performed on the original value, this patch changes it into ``` %1.inv = icmp sgt i32 %X, 32767 %2 = icmp sgt i32 %X, -32768 %3 = select i1 %2, i32 %X, i32 -32768 %R = select i1 %1.inv, i32 32767, i32 %3 ``` and then the magic happens! Some further transform finishes polishing it and we finally get: ``` %t1 = icmp sgt i32 %X, -32768 %t2 = select i1 %t1, i32 %X, i32 -32768 %t3 = icmp slt i32 %t2, 32767 %R = select i1 %t3, i32 %t2, i32 32767 ``` which is beautiful and just what we want. Proofs for `getFlippedStrictnessPredicateAndConstant()` for de-canonicalization: https://rise4fun.com/Alive/THl Proofs for the fold itself: https://rise4fun.com/Alive/THl Reviewers: spatel, dmgreen, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66232 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369840 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-24 06:49:25 +00:00
Philip Reames	dfdde6d06e	Fix a bug in just submitted rL369789 Started implementing the vector case and realized the scalar case hadn't handled the GEP producing a different type than the base correctly. It's entertaining seeing what slips through review when we're focused on the 'hard' parts. :( Also adding an extra vector test as it happened to be in workspace and wasn't worth separating. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369795 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-23 18:27:57 +00:00
Philip Reames	16ed7f8f59	[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null This generalizes the isGEPKnownNonNull rule from ValueTracking to apply when we do not know if the base is non-null, and thus need to replace one condition with another. The core notion is that since an inbounds GEP can only form null if the base pointer is null and the offset is zero. However, if the offset is non-zero, the the "inbounds" marker makes the result poison. Thus, we're free to ignore the case where the offset is non-zero. Similarly, there's no case under which a non-null base can result in a null result without generating poison. Differential Revision: https://reviews.llvm.org/D66608 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369789 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-23 17:58:58 +00:00
Philip Reames	59c28fddaa	[instcombine] icmp eq/ne (sub C, Y), C -> icmp eq/ne Y, 0 Noticed while looking at pr43028. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369541 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-21 15:51:57 +00:00
Sanjay Patel	fdfa6e2423	[InstCombine] narrow icmp with extended operands of different widths An intermediate extend is used to widen the narrow operand to the width of the other (wider) operand. At that point, we have the same logic as the existing transform that was restricted to folds of equal width zext/sext. This mostly solves PR42700: https://bugs.llvm.org/show_bug.cgi?id=42700 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369519 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-21 11:56:08 +00:00
Sanjay Patel	e687343b4c	[InstCombine] add helper function for icmp+zext/sext; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369421 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-20 18:15:17 +00:00
Sanjay Patel	f066c9f731	[InstCombine] make fold for icmp with sext more efficient; NFC We were creating 2 instructions and relying on a subsequent fold to invert a not(icmp). Create the final icmp directly instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369411 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-20 17:03:22 +00:00
Sanjay Patel	afb27048eb	[InstCombine] improve readability for icmp with cast folds; NFC 1. Update function name and stale code comments. 2. Use variable names that are less ambiguous. 3. Move operand checks into the function as early exits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369390 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-20 14:56:44 +00:00
Roman Lebedev	34e84970cc	[InstCombine] Cherry-pick NFC cleanups of foldShiftIntoShiftInAnotherHandOfAndInICmp() from D66383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369207 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-18 12:26:33 +00:00
Roman Lebedev	2f2b56c4d7	[InstCombine] Shift amount reassociation in bittest: trunc-of-shl (PR42399) Summary: This is continuation of D63829 / https://bugs.llvm.org/show_bug.cgi?id=42399 I thought naive pattern would solve my issue, but nope, it involved truncation, thus more folds needed.. This isn't really the fold i'm interested in, i need trunc-of-lshr, but i'we decided to start with `shl` because it's simpler. In this case, no extra legality checks are needed: https://rise4fun.com/Alive/CAb We should be careful about not increasing instruction count, since we need to produce `zext` because `and` is done in wider type. Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66057 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369117 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-16 15:10:41 +00:00
Roman Lebedev	d1ae485811	[InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368820 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-14 09:57:20 +00:00
Roman Lebedev	fcd00601e2	[InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962) Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast. Fixes https://bugs.llvm.org/show_bug.cgi?id=42962 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368554 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-12 11:28:02 +00:00
Roman Lebedev	8115d87edc	[InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, ) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368521 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-10 19:29:10 +00:00
Roman Lebedev	80815a1a90	[InstCombine] Shift amount reassociation in bittest: relax one-use check when shifting constant If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded so, we don't need that one-use restriction then. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368519 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-10 19:28:54 +00:00
Roman Lebedev	2a686e81e5	[InstCombine] Shift amount reassociation in bittest: drop pointless one-use restriction That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368518 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-10 19:28:44 +00:00
Roman Lebedev	64018f7f99	[InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two Summary: I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling. While we did happen to handle this fold in most of the cases, the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two), or first turn `x s% -y` to `x u% y`; that does handle most of the cases. But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`, and thus we end up being stuck with `(x s% INT_MIN) == 0`. There is no such restriction for the more general fold: https://rise4fun.com/Alive/IIeS To be noted, the fold does not enforce that `y` is a constant, so it may indeed increase instruction count. This is consistent with what `x u% y`->`x & (y-1)` already does. I think it makes sense, it's at most one (simple) extra instruction, while `rem`ainder is really much more un-simple (and likely very costly). Reviewers: spatel, RKSimon, nikic, xbolva00, craig.topper Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367322 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-30 15:28:22 +00:00
Roman Lebedev	e90a51722f	[PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::Predicate As discussed in the original review, this may be useful, so let's just do it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365652 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-10 16:07:35 +00:00
Sanjay Patel	3bfa6ac3a8	[InstCombine] reduce more checks for power-of-2-or-zero using ctpop Extends the transform from: rL364341 ...to include another (more common?) pattern that tests whether a value is a power-of-2 (including or excluding zero). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364856 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-01 22:00:00 +00:00
Roman Lebedev	458204d465	[InstCombine] Shift amount reassociation in bittest (PR42399) Summary: Given pattern: `icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0` we should move shifts to the same hand of 'and', i.e. rewrite as `icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)` It might be tempting to not restrict this to situations where we know we'd fold two shifts together, but i'm not sure what rules should there be to avoid endless combine loops. We pick the same shift that was originally used to shift the variable we picked to shift: https://rise4fun.com/Alive/6x1v Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 \| PR42399]]. Reviewers: spatel, nikic, RKSimon Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63829 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364791 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-01 15:55:15 +00:00
Roman Lebedev	1c1bc8778e	[InstCombine] Omit 'urem' where possible This was added in D63390 / rL364286 to backend, but it makes sense to also handle it in middle-end. https://rise4fun.com/Alive/Zsln git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364738 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-01 09:41:43 +00:00
Huihui Zhang	ab07556fab	[InstCombine] Simplify icmp ult/uge (shl %x, C2), C1 iff C1 is power of two -> icmp eq/ne (and %x, (lshr -C1, C2)), 0. Simplify 'shl' inequality test into 'and' equality test. This pattern happens in the middle-end while simplifying bitfield access, Exposed in https://reviews.llvm.org/D63505 https://rise4fun.com/Alive/6uz Reviewers: lebedev.ri, efriedma Reviewed By: lebedev.ri Subscribers: spatel, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63675 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364348 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 20:44:52 +00:00
Sanjay Patel	08e75cad94	[InstCombine] reduce checks for power-of-2-or-zero using ctpop This follows up the transform from rL363956 to use the ctpop intrinsic when checking for power-of-2-or-zero. This is matching the isPowerOf2() patterns used in PR42314: https://bugs.llvm.org/show_bug.cgi?id=42314 But there's at least 1 instcombine follow-up needed to match the alternate form: (v & (v - 1)) == 0; We should have all of the backend expansions handled with: rL364319 (x86-specific changes still needed for optimal code based on subtarget) And the larger patterns to exclude zero as a power-of-2 are joining with this change after: rL364153 ( D63660 ) rL364246 Differential Revision: https://reviews.llvm.org/D63777 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364341 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 18:51:44 +00:00
Huihui Zhang	e4223f994b	[InstCombine] Fold icmp eq/ne (and %x, C), 0 iff (-C) is power of two -> %x u</u>= (-C) earlier. Summary: To generate simplified IR, make sure fold (X & ~C) ==/!= 0 --> X u</u>= C+1 is scheduled before fold ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. https://rise4fun.com/Alive/7ZN Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63505 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364255 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-25 00:09:10 +00:00
Sanjay Patel	3575a5c66f	[InstCombine] fix typo in comment; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363974 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-20 20:23:32 +00:00
Sanjay Patel	0e1f2dd708	[InstCombine] canonicalize check for power-of-2 The form that compares against 0 is better because: 1. It removes a use of the input value. 2. It's the more standard form for this pattern: https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2 3. It results in equal or better codegen (tested with x86, AArch64, ARM, PowerPC, MIPS). This is a root cause for PR42314, but probably doesn't completely answer the codegen request: https://bugs.llvm.org/show_bug.cgi?id=42314 Alive proof: https://rise4fun.com/Alive/9kG Name: is power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp eq i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp eq i32 %a2, 0 Name: is not power-of-2 %neg = sub i32 0, %x %a = and i32 %neg, %x %r = icmp ne i32 %a, %x => %dec = add i32 %x, -1 %a2 = and i32 %dec, %x %r = icmp ne i32 %a2, 0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363956 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-20 17:41:15 +00:00
Huihui Zhang	3104139a5b	[InstCombine] Fold icmp eq/ne (and %x, signbit), 0 -> %x s>=/s< 0 earlier Summary: To generate simplified IR, make sure fold ``` (X & signbit) ==/!= 0) -> X s>=/s< 0; ``` is scheduled before fold ``` ((X << Y) & C) == 0 -> (X & (C >> Y)) == 0. ``` https://rise4fun.com/Alive/fbdh Reviewers: lebedev.ri, efriedma, spatel, craig.topper Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63026 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363845 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-19 17:31:39 +00:00
Roman Lebedev	1376d1f533	[InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles A precondition 'x != 0' was forgotten by me: https://rise4fun.com/Alive/JFNP https://rise4fun.com/Alive/jHvL These 4 folds with non-constants could be re-enabled, but for now let's go for the simplest solution. https://bugs.llvm.org/show_bug.cgi?id=42198 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362911 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-09 16:30:42 +00:00

1 2 3 4 5 ...

630 Commits