Commit Graph

183 Commits

Author SHA1 Message Date
Sanjay Patel
9968c706bd [InstCombine] (~X) - (~Y) --> Y - X
llvm-svn: 326660
2018-03-03 17:53:25 +00:00
Sanjay Patel
74dc5c305c [InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI
Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly 
vague. The constant check is redundant in some cases, but it allows 
removing duplication for most of the calls.

llvm-svn: 326329
2018-02-28 16:36:24 +00:00
Sanjay Patel
50e2d76225 [InstCombine] use FMF-copying functions to reduce code; NFCI
llvm-svn: 325923
2018-02-23 17:07:29 +00:00
Sanjay Patel
f0724c4012 [InstCombine] canonicalize constant-minus-boolean to select-of-constants
This restores the half of:
https://reviews.llvm.org/rL75531
that was reverted at:
https://reviews.llvm.org/rL159230

For the x86 case mentioned there, we now produce:
leal 1(%rdi), %eax
subl %esi, %eax

We have target hooks to invert this in DAGCombiner (and x86 is enabled) with:
https://reviews.llvm.org/rL296977
https://reviews.llvm.org/rL311731

AArch64 and possibly other targets would probably benefit from enabling those hooks too. 
See PR30327:
https://bugs.llvm.org/show_bug.cgi?id=30327#c2

Differential Revision: https://reviews.llvm.org/D40612

llvm-svn: 319964
2017-12-06 21:22:57 +00:00
Sanjay Patel
fd69991264 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
Eugene Zelenko
154475b343 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 316503
2017-10-24 21:24:53 +00:00
Sanjay Patel
44f47bb494 [InstCombine] use m_Neg() to reduce code; NFCI
llvm-svn: 315762
2017-10-13 21:28:50 +00:00
Sanjay Patel
a6b70ea7c0 [InstCombine] move code to remove repeated constant check; NFCI
Also, consolidate tests for this fold in one place.

llvm-svn: 315745
2017-10-13 20:29:11 +00:00
Sanjay Patel
f951aa8225 [InstCombine] recycle adds for better efficiency
Also, clean up unnecessary matcher capture variable initializations.

llvm-svn: 315743
2017-10-13 20:12:21 +00:00
Sanjay Patel
79e4224a23 [InstCombine] use local var to reduce code duplication; NFCI
llvm-svn: 315728
2017-10-13 18:32:53 +00:00
Sanjay Patel
d9c40769a5 [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing instructions
llvm-svn: 315718
2017-10-13 17:47:25 +00:00
Sanjay Patel
104c37991f [InstCombine] use AddOne helper to reduce code; NFC
llvm-svn: 315709
2017-10-13 17:00:47 +00:00
Sanjay Patel
d69636831d [InstCombine] rearrange code to remove repeated constant check; NFCI
llvm-svn: 315703
2017-10-13 16:43:58 +00:00
Sanjay Patel
dfbf86bd23 [InstCombine] allow zext(bool) + C --> select bool, C+1, C for vector types
The backend should be prepared for this transform after:
https://reviews.llvm.org/rL311731

llvm-svn: 315701
2017-10-13 16:29:38 +00:00
Quentin Colombet
fe9e785376 [InstCombine] Add select simplifications
In these cases, two selects have constant selectable operands for
both the true and false components and have the same conditional
expression.
We then create two arithmetic operations of the same type and feed a
final select operation using the result of the true arithmetic for the true
operand and the result of the false arithmetic for the false operand and reuse
the original conditionl expression.
The arithmetic operations are naturally folded as a consequence, leaving
only the newly formed select to replace the old arithmetic operation.

Patch by: Michael Berg <michael_c_berg@apple.com>
Differential Revision: https://reviews.llvm.org/D37019

llvm-svn: 313774
2017-09-20 17:32:16 +00:00
Hiroshi Yamauchi
a7d6028861 [InstCombine] Simplify pointer difference subtractions (GEP-GEP) where GEPs have other uses and one non-constant index
Summary:
Pointer difference simplifications currently happen only if input GEPs don't have other uses or their indexes are all constants, to avoid duplicating indexing arithmetic.

This patch enables cases with exactly one non-constant index among input GEPs to happen where there is no duplicated arithmetic or code size increase even if input GEPs have other uses.

For example, this patch allows "(&A[42][i]-&A[42][0])" --> "i", which didn't happen previously, if the input GEP(s) have other uses.

Reviewers: sanjoy, bkramer

Reviewed By: sanjoy

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D35499

llvm-svn: 309304
2017-07-27 18:27:11 +00:00
Hiroshi Yamauchi
4147ce4079 Fix a comment (test commit).
llvm-svn: 309192
2017-07-26 21:54:43 +00:00
Craig Topper
8652178bc5 [IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing isIntegerTy(unsigned), but also works for vectors.
llvm-svn: 307492
2017-07-09 07:04:03 +00:00
Craig Topper
ce5664ff66 [InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere
Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference.

This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do.

llvm-svn: 307451
2017-07-07 23:16:26 +00:00
Sanjay Patel
654de9bf13 [InstCombine] add (sext i1 X), 1 --> zext (not X)
http://rise4fun.com/Alive/i8Q

A narrow bitwise logic op is obviously better than math for value tracking, 
and zext is better than sext. Typically, the 'not' will be folded into an 
icmp predicate.

The IR difference would even survive through codegen for x86, so we would see 
worse code:

https://godbolt.org/g/C14HMF

one_or_zero(int, int):                      # @one_or_zero(int, int)
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setle   %al
        retq

one_or_zero_alt(int, int):                  # @one_or_zero_alt(int, int)
        xorl    %ecx, %ecx
        cmpl    %esi, %edi
        setg    %cl
        movl    $1, %eax
        subl    %ecx, %eax
        retq

llvm-svn: 306243
2017-06-25 14:15:28 +00:00
Craig Topper
bb4fd5bd20 [InstCombine] Pass a proper context instruction to all of the calls into InstSimplify
Summary: This matches the behavior we already had for compares and makes us consistent everywhere.

Reviewers: dberlin, hfinkel, spatel

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33604

llvm-svn: 305049
2017-06-09 03:21:29 +00:00
Craig Topper
bc295eadb7 [InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce compiled code for comparing APInts with 0 and 1. NFC
These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare.

llvm-svn: 304876
2017-06-07 07:40:37 +00:00
Craig Topper
d36a5dface [ValueTracking] Convert most of the calls to computeKnownBits to use the version that returns the KnownBits object.
This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits.

Differential Revision: https://reviews.llvm.org/D33431

llvm-svn: 303773
2017-05-24 16:53:07 +00:00
Craig Topper
8956ad4fc3 [InstCombine] Cleanup the interface for overflow checks
Summary:
Fix naming conventions and const correctness.
This completes the changes made in rL303029.

Patch by Yoav Ben-Shalom.

Reviewers: craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33377

llvm-svn: 303529
2017-05-22 06:25:31 +00:00
Craig Topper
6865450af4 [KnownBits] Use isNegative/isNonNegative to shorten some code. NFC
llvm-svn: 303522
2017-05-22 00:49:33 +00:00
Craig Topper
be2ad5e5e7 [ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits.
This patch finishes off the conversion of ComputeSignBit to computeKnownBits.

Differential Revision: https://reviews.llvm.org/D33166

llvm-svn: 303035
2017-05-15 06:39:41 +00:00
Craig Topper
6fa9bfae0a [InstCombine] Merge duplicate functionality between InstCombine and ValueTracking
Summary:
Merge overflow computation for signed add,
appearing both in InstCombine and ValueTracking.

As part of the merge,
cleanup the interface for overflow checks in InstCombine.

Patch by Yoav Ben-Shalom.

Reviewers: craig.topper, majnemer

Reviewed By: craig.topper

Subscribers: takuto.ikuta, llvm-commits

Differential Revision: https://reviews.llvm.org/D32946

llvm-svn: 303029
2017-05-15 02:44:08 +00:00
Sanjay Patel
28b1842d00 [InstCombine] add (ashr (shl i32 X, 31), 31), 1 --> and (not X), 1
This is another step towards favoring 'not' ops over random 'xor' in IR:
https://bugs.llvm.org/show_bug.cgi?id=32706

This transformation may have occurred in longer IR sequences using computeKnownBits,
but that could be much more expensive to calculate.

As the scalar result shows, we do not currently favor 'not' in all cases. The 'not'
created by the transform is transformed again (unnecessarily). Vectors don't have
this problem because vectors are (wrongly) excluded from several other combines.

llvm-svn: 302659
2017-05-10 13:56:52 +00:00
Sanjay Patel
05c41aae87 [InstCombine] add helper function for add X, C folds; NFCI
llvm-svn: 302605
2017-05-10 00:07:16 +00:00
Craig Topper
f3dc55d476 [InstCombine][KnownBits] Use KnownBits better to detect nsw adds
Change checkRippleForAdd from a heuristic to a full check -
if it is provable that the add does not overflow return true, otherwise false.

Patch by Yoav Ben-Shalom

Differential Revision: https://reviews.llvm.org/D32686

llvm-svn: 302093
2017-05-03 23:22:46 +00:00
Craig Topper
614fcfd8d9 [APInt] Add clearSignBit method. Use it and setSignBit in a few places. NFCI
llvm-svn: 301656
2017-04-28 16:58:05 +00:00
Daniel Berlin
1979d95e05 InstCombine: Use the new SimplifyQuery versions of Simplify*. Use AssumptionCache, DominatorTree, TargetLibraryInfo everywhere.
llvm-svn: 301464
2017-04-26 20:56:07 +00:00
Craig Topper
c5d014c133 [ValueTracking] Introduce a KnownBits struct to wrap the two APInts for computeKnownBits
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.

Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.

I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.

Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.

Differential Revision: https://reviews.llvm.org/D32376

llvm-svn: 301432
2017-04-26 16:39:58 +00:00
Matt Arsenault
c500f69400 InstCombine: Fix assert when reassociating fsub with undef
There is logic to track the expected number of instructions
produced. It thought in this case an instruction would
be necessary to negate the result, but here it folded
into a ConstantExpr fneg when the non-undef value operand
was cancelled out by the second fsub.

I'm not sure why we don't fold constant FP ops with undef currently,
but I think that would also avoid this problem.

llvm-svn: 301199
2017-04-24 17:24:37 +00:00
Artur Pilipenko
2944f44653 Fix for PR32740 - Invalid floating type, unreachable between r300969 and r301029
The bug was introduced by r301018 "[InstCombine] fadd double (sitofp x), y check that the promotion is valid". The patch didn't expect that fadd can be on vectors not necessarily scalars. Add vector support along with the test.

llvm-svn: 301070
2017-04-22 07:24:52 +00:00
Artur Pilipenko
e473bed70c [InstCombine] fadd double (sitofp x), y check that the promotion is valid
Doing these transformations check that the result of integer addition is representable in the FP type.

(fadd double (sitofp x), fpcst) --> (sitofp (add int x, intcst))
(fadd double (sitofp x), (sitofp y)) --> (sitofp (add int x, y))

This is a fix for https://bugs.llvm.org//show_bug.cgi?id=27036

Reviewed By: andrew.w.kaylor, scanon, spatel

Differential Revision: https://reviews.llvm.org/D31182

llvm-svn: 301018
2017-04-21 18:45:25 +00:00
Craig Topper
d8d347a0e2 [APInt] Rename getSignBit to getSignMask
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.

Differential Revision: https://reviews.llvm.org/D32108

llvm-svn: 300856
2017-04-20 16:56:25 +00:00
Craig Topper
c7fb97cbfb [InstCombine] Support folding a subtract with a constant LHS into a phi node
We currently only support folding a subtract into a select but not a PHI. This fixes that.

I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This is based code is similar to what we do for selects.

Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards.

Differential Revision: https://reviews.llvm.org/D31686

llvm-svn: 300363
2017-04-14 19:20:12 +00:00
Craig Topper
bc67402813 Fix spelling compliment->complement. Mostly refering to 2s complement. NFC
llvm-svn: 299970
2017-04-11 18:47:58 +00:00
Craig Topper
36e0adc9fa [InstCombine] Use commutable matchers and m_OneUse in visitSub to shorten code. Add missing test cases.
In one case I removed commute handling for a multiply with a constant since we'll eventually get the constant on the right hand side.

llvm-svn: 299863
2017-04-10 18:09:25 +00:00
Craig Topper
bd8800adbc [InstCombine] Use m_c_Add to shorten some code. Add testcases for this fold since they were missing. NFC
llvm-svn: 299853
2017-04-10 16:59:40 +00:00
Craig Topper
a156f75fb3 [InstCombine] Support folding of add instructions with vector constants into select operations
We currently only fold scalar add of constants into selects. This improves this to support vectors too.

Differential Revision: https://reviews.llvm.org/D31683

llvm-svn: 299847
2017-04-10 16:40:00 +00:00
Craig Topper
8a5c6aa386 [InstCombine] Use commutable and/or/xor matchers to simplify some code
Summary:
This is my first time using the commutable matchers so wanted to make sure I was doing it right.

Are there any other matcher tricks to further shrink this? Can we commute the whole match so we don't have to LHS and RHS separately?

Reviewers: davide, spatel

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31680

llvm-svn: 299840
2017-04-10 07:13:40 +00:00
Craig Topper
61817a804f [InstCombine] Remove testing assert I accidentally left in r299710.
llvm-svn: 299715
2017-04-06 21:29:43 +00:00
Craig Topper
bb70356d71 [InstCombine] When checking to see if we can turn subtracts of 2^n - 1 into xor, we only need to call computeKnownBits on the RHS not the whole subtract. While there use isMask instead of isPowerOf2(C+1)
Calling computeKnownBits on the RHS should allows us to recurse one step further. isMask is equivalent to the isPowerOf2(C+1) except in the case where C is all ones. But that was already handled earlier by creating a not which is an Xor with all ones. So this should be fine.

llvm-svn: 299710
2017-04-06 21:06:03 +00:00
Sanjay Patel
cdd1a7d5f1 [InstCombine] rename variable for easier reading; NFC
We usually give constants a 'C' somewhere in the name...

llvm-svn: 299474
2017-04-04 22:06:03 +00:00
Craig Topper
fdbd9d31f7 [InstCombine] Turn subtract of vectors of i1 into xor like we do for scalar i1. Matches what we already do for add.
llvm-svn: 299472
2017-04-04 21:44:56 +00:00
Craig Topper
c348b0516a [InstCombine] Fix typo last->least. NFC
llvm-svn: 299123
2017-03-30 22:28:55 +00:00
Artur Pilipenko
74c7718b57 NFC. InstCombiner::visitFAdd extract LHSIntVal/RHSIntVal local variables
llvm-svn: 298359
2017-03-21 11:32:15 +00:00
Sanjay Patel
6883df3bf9 [InstCombine] don't try SimplifyDemandedInstructionBits from add/sub because it's slow and unlikely to succeed
Notably, no regression tests change when we remove these calls, and these are expensive calls.

The motivation comes from the general acknowledgement that the compiler is getting slower:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/109188.html
http://lists.llvm.org/pipermail/llvm-dev/2016-December/108279.html

And specifically the test case attached to PR32037:
https://bugs.llvm.org//show_bug.cgi?id=32037

Profiling the middle-end (opt) part of the compile:
$ ./opt -O2 row_common.bc -o /dev/null

...visitAdd and visitSub are near the top of the instcombine list, and the calls to SimplifyDemandedInstructionBits()
are high within each of those. Those calls account for 1%+ of the opt time in either debug or release profiles. And 
that's the rough win I see from this patch when testing opt built release from r295864 on an iMac with Haswell 4GHz
(model 4790K).

It seems unlikely that we'd be able to eliminate add/sub or change their operands given that add/sub normally affect
all bits, and the PR32037 example shows no IR difference after this change using -O2.

Also worth noting - the code comment in visitAdd:
// This handles stuff like (X & 254)+1 -> (X&254)|1
...isn't true. That transform is handled later with a call to haveNoCommonBitsSet().

Differential Revision: https://reviews.llvm.org/D30270

llvm-svn: 295898
2017-02-22 23:01:12 +00:00