122 Commits

Author SHA1 Message Date
Craig Topper
c7bad98e0e [InstCombine] Move portion of SimplifyDemandedUseBits that deals with instructions with multiple uses out to a separate method. NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300082 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 18:05:21 +00:00
Craig Topper
3461e9c2e4 Teach SimplifyDemandedUseBits that adding or subtracting 0s from every bit below the highest demanded bit can be simplified
If we are adding/subtracting 0s below the highest demanded bit we can just use the other operand and remove the operation.

My primary motivation is observing that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to a 0, but the add is not put into the worklist. So we didn't revisit it until the next InstCombine iteration. This caused an IR modification to remove the add and a subsequent iteration to be run.

With this change we bypass the add in the first iteration and prevent the second iteration from changing anything.
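
As a rough illustration of the rule described here, the following Python model (not LLVM code) checks when an add or subtract can be bypassed. The names `can_bypass_add_sub` and `low_bits_through_highest_demanded` are hypothetical, and the sketch assumes that "below the highest demanded bit" includes the highest demanded bit itself.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def low_bits_through_highest_demanded(demanded: int) -> int:
    """Mask covering bit 0 up to and including the highest demanded bit."""
    return (1 << demanded.bit_length()) - 1


def can_bypass_add_sub(demanded: int, rhs_known_zero: int) -> bool:
    """True if RHS is known zero in every bit that can influence a demanded bit."""
    needed = low_bits_through_highest_demanded(demanded)
    return (rhs_known_zero & needed) == needed


demanded = 0b0000_1100         # only bits 2 and 3 of the result are used
rhs = 0b1011_0000              # zero in bits 0..3
rhs_known_zero = ~rhs & MASK   # for a constant, every clear bit is known zero

assert can_bypass_add_sub(demanded, rhs_known_zero)
for lhs in range(1 << WIDTH):
    # The demanded bits of lhs + rhs and lhs - rhs match those of lhs alone.
    assert ((lhs + rhs) & MASK) & demanded == lhs & demanded
    assert ((lhs - rhs) & MASK) & demanded == lhs & demanded
```

Carries and borrows only travel upward, so an operand that is zero in every bit up to the highest demanded one cannot change any demanded bit.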

Differential Revision: https://reviews.llvm.org/D31120



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300075 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 16:49:59 +00:00
Craig Topper
099a6fd775 [InstCombine] Use setAllBits in place of getAllOnesValue since we know the bitwidths are the same. NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299413 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-04 05:03:02 +00:00
Craig Topper
68149f546e [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temporary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.
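
To make the two strategies concrete, here is a small Python model (an illustration only, not the APInt code). `is_mask` and `is_shifted_mask` use the usual single-word bit identities, which I assume mirror the shape of the MathExtras.h helpers. `is_shifted_mask_by_counting` is a hypothetical stand-in for the counting approach described for the multiword case.

```python
def is_mask(x: int) -> bool:
    """Non-empty run of ones starting at bit 0, e.g. 0b0000_0111."""
    return x != 0 and (x & (x + 1)) == 0


def is_shifted_mask(x: int) -> bool:
    """Non-empty contiguous run of ones anywhere, e.g. 0b0011_1000."""
    return x != 0 and is_mask((x - 1) | x)


def is_shifted_mask_by_counting(x: int, width: int) -> bool:
    """Same answer using only population and leading/trailing zero counts."""
    if x == 0:
        return False
    ones = bin(x).count("1")
    leading_zeros = width - x.bit_length()
    trailing_zeros = (x & -x).bit_length() - 1
    # A single run of ones leaves no zero bits between the two zero regions.
    return ones + leading_zeros + trailing_zeros == width


assert is_mask(0b0111) and not is_mask(0b0110)
assert is_shifted_mask(0b0110) and not is_shifted_mask(0b0101)
# Both formulations agree for every 10-bit value.
assert all(is_shifted_mask(v) == is_shifted_mask_by_counting(v, 10)
           for v in range(1 << 10))
```

The counting version needs no intermediate values of the full bit width, which is the property the commit message is after for multiword integers.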

Differential Revision: https://reviews.llvm.org/D31565




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299362 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-03 16:34:59 +00:00
Craig Topper
38017a1fee [APInt] Remove shift functions from APIntOps namespace. Replace the few users with the APInt class methods. NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-31 20:01:16 +00:00
Craig Topper
da174a7f04 [InstCombine] Change the interface of SimplifyDemandedBits so that it takes the instruction and operand instead of the Use.
The first thing it did was get the User for the Use to get the instruction back. This requires looking through the Uses for the User using the waymarking walk. That's pretty fast, but it's probably still better to just pass the Instruction we already had.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298772 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-25 06:52:52 +00:00
Craig Topper
7ebd1797a3 Revert r298711 "[InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits"
Tsan bot is failing.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298745 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-24 22:12:10 +00:00
Craig Topper
bab6d5ee26 [InstCombine] Provide a way to calculate KnownZero/One for Add/Sub in SimplifyDemandedUseBits without recursing into ComputeKnownBits
SimplifyDemandedUseBits for Add/Sub already recursed down LHS and RHS for simplifying bits. If that didn't provide any simplifications we fall back to calling computeKnownBits which will recurse again. Instead just take the known bits for LHS and RHS we already have and call into a new function in ValueTracking that can calculate the known bits given the LHS/RHS bits.
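
One way to derive the known bits of a sum directly from the operands' known bits is sketched below in Python. This is a model under stated assumptions, not the ValueTracking function the patch added: the name `known_bits_add`, the 4-bit `WIDTH`, and the pair-of-masks representation are all hypothetical.

```python
WIDTH = 4
MASK = (1 << WIDTH) - 1


def known_bits_add(lhs_zero, lhs_one, rhs_zero, rhs_one):
    """Return (known_zero, known_one) for lhs + rhs given each operand's known bits."""
    # Largest and smallest possible sums: unknown bits set to one / to zero.
    max_sum = ((~lhs_zero & MASK) + (~rhs_zero & MASK)) & MASK
    min_sum = (lhs_one + rhs_one) & MASK
    # The carry into a bit is known zero if it is zero even in the largest sum,
    # and known one if it is one even in the smallest sum.
    carry_zero = ~(max_sum ^ lhs_zero ^ rhs_zero) & MASK
    carry_one = (min_sum ^ lhs_one ^ rhs_one) & MASK
    # A sum bit is known only where both operand bits and the carry are known.
    known = (lhs_zero | lhs_one) & (rhs_zero | rhs_one) & (carry_zero | carry_one)
    return (~min_sum & known) & MASK, min_sum & known


# lhs has its low two bits known zero; rhs is the constant 0b0001.
known_zero, known_one = known_bits_add(0b0011, 0b0000, 0b1110, 0b0001)
assert (known_zero, known_one) == (0b0010, 0b0001)  # low bits of the sum are 01
```

No recursion into the operands is needed here, since the inputs are the known-bit masks that were already computed while simplifying LHS and RHS.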





git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298711 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-24 16:56:51 +00:00
Craig Topper
2612791ba8 [InstCombine] Teach SimplifyDemandedUseBits to shrink Constants on the left side of subtracts
Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the right hand side.
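
A minimal Python model of why shrinking a left-hand constant is safe, assuming 8-bit values. The helper `shrink_constant` is hypothetical and only loosely mirrors the idea behind ShrinkDemandedConstant.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1


def shrink_constant(constant: int, demanded: int) -> int:
    """Clear the bits of a constant operand that no user asks for."""
    return constant & demanded


demanded_result = 0b0000_1111
# Borrows only propagate upward, so the operands need every bit up to the
# highest demanded result bit and nothing above it.
demanded_operand = (1 << demanded_result.bit_length()) - 1

c = 0b1010_0110
small = shrink_constant(c, demanded_operand)
assert small == 0b0000_0110

for x in range(1 << WIDTH):
    # (c - x) and (small - x) agree on every demanded bit.
    assert ((c - x) & MASK) & demanded_result == ((small - x) & MASK) & demanded_result
```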

Reviewers: davide, majnemer, spatel, sanjoy, hfinkel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298478 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-22 04:03:53 +00:00
Craig Topper
9534108197 [InstCombine] Remove duplicate code in SimplifyDemandedUseBits for URem. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298231 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-19 21:45:57 +00:00
Craig Topper
0a70890b84 [InstCombine] Use setHighBits/setLowBits/setBitsFrom in place of getLowBitsSet/getHighBitsSet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298204 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-19 05:49:16 +00:00
Matt Arsenault
931794f288 AMDGPU: Fix insertion point when reducing load intrinsics
The insertion point may be later than the next instruction,
so it is necessary to set it when replacing the call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297439 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 05:25:49 +00:00
Matt Arsenault
f90265a1b0 AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297408 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 20:34:27 +00:00
Simon Pilgrim
85969be932 Use APInt::getLowBitsSet instead of APInt::getBitsSet for lower bit mask creation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296882 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-03 16:56:33 +00:00
Craig Topper
4c2f2e48dc [AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295294 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-16 07:35:23 +00:00
Sanjay Patel
5a764b03c6 [InstCombine] use m_APInt to allow demanded bits analysis on splat constants
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294628 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 21:43:06 +00:00
Simon Pilgrim
52cd722eec [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now

Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292913 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-24 11:07:41 +00:00
Simon Pilgrim
116ba1a31a [InstCombine][X86] Add MULDQ/MULUDQ undef handling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292627 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 18:20:30 +00:00
Simon Pilgrim
87735961e4 [InstCombine][SSE] Add DemandedElts support for PACKSS/PACKUS instructions
Simplify a packss/packus truncation based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28777

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 09:28:21 +00:00
Simon Pilgrim
c2e261218f [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles
Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292371 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-18 14:47:49 +00:00
Simon Pilgrim
04f56107c5 [InstCombine][X86][AVX] Add DemandedElts support for VPERMILPD/VPERMILPS instructions
Simplify a vpermilvar shuffle mask based on the elements of the mask that are actually demanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292209 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-17 11:35:03 +00:00
Simon Pilgrim
07d3c0f01c [InstCombine][SSE] Add DemandedElts support for PSHUFB instructions
Simplify a pshufb shuffle mask based on the elements of the mask that are actually demanded.

Differential Revision: https://reviews.llvm.org/D28745

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292101 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-16 11:30:41 +00:00
Craig Topper
143fa1a52e [InstCombine] Fix typo in comment. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290706 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-29 05:38:31 +00:00
Craig Topper
084109508e [InstCombine] Use 32 bits instead of 64 bits for storing the number of elements in VectorType for a ShuffleVector. While there, use getVectorNumElements to avoid an explicit cast. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290705 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-29 04:24:32 +00:00
Craig Topper
6a65066873 [InstCombine][X86] If the lowest element of a scalar intrinsic isn't used make sure we add it to the worklist so we can DCE it sooner.
We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since it's now dead. This can allow DCE to find it sooner and remove it. The same was done for InsertElement when the inserted element isn't demanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290704 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-29 03:30:17 +00:00
Craig Topper
b7ae55eb2c [InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

This builds on r290554 which added support for 128 and 256-bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290582 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-27 05:30:09 +00:00
Simon Pilgrim
915b45f09f [InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.
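
The lane relationship can be modelled in a few lines of Python. This is an illustrative model of the unsigned variant only; `pmuludq` and `demanded_input_lanes` are hypothetical helper names, not LLVM functions.

```python
def pmuludq(a, b):
    """Multiply the even-numbered 32-bit lanes as unsigned, giving 64-bit lanes."""
    assert len(a) == len(b) and len(a) % 2 == 0
    return [(a[i] & 0xFFFFFFFF) * (b[i] & 0xFFFFFFFF) for i in range(0, len(a), 2)]


def demanded_input_lanes(num_i32_lanes: int, demanded_i64_lanes: int) -> int:
    """Bitmask of i32 input lanes needed for the demanded i64 result lanes."""
    mask = 0
    for j in range(num_i32_lanes // 2):
        if (demanded_i64_lanes >> j) & 1:
            mask |= 1 << (2 * j)  # result lane j reads input lane 2*j only
    return mask


assert pmuludq([3, 99, 5, 77], [4, 11, 6, 22]) == [12, 30]
# Changing the odd lanes does not change the result.
assert pmuludq([3, 0, 5, 0], [4, 0, 6, 0]) == [12, 30]
assert demanded_input_lanes(4, 0b11) == 0b0101
assert demanded_input_lanes(4, 0b10) == 0b0100
```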

Differential Revision: https://reviews.llvm.org/D28119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290554 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-26 23:28:17 +00:00
Craig Topper
0f5f69acca [InstCombine] Simplify code slightly. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290046 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-17 18:10:04 +00:00
Craig Topper
d941d3f22f [AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289759 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 03:49:45 +00:00
Craig Topper
2eef9bcab6 [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better.
Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289636 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 06:06:58 +00:00
Craig Topper
98bf16eccb [X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts.
Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289632 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 05:43:05 +00:00
Craig Topper
23156f1924 [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0.
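
The split of DemandedElts between the two operands can be sketched with a Python model. It assumes a scalar round takes lane 0 from its second operand and passes the remaining lanes through from the first; `roundss_model` and `split_demanded` are hypothetical names, and `math.floor` stands in for whichever rounding mode the immediate selects.

```python
import math


def roundss_model(a, b):
    """Lane 0 is computed from b[0]; the upper lanes pass through from a."""
    return [float(math.floor(b[0]))] + list(a[1:])


def split_demanded(demanded: int):
    """Return (operand0_demanded, operand1_demanded) as lane bitmasks."""
    op0 = demanded & ~1  # operand 0 never supplies lane 0
    op1 = demanded & 1   # operand 1 only supplies lane 0
    return op0, op1


a = [1.5, 2.5, 3.5, 4.5]
b = [9.7, -1.0, -2.0, -3.0]
assert roundss_model(a, b) == [9.0, 2.5, 3.5, 4.5]
# Lane 0 of a and the upper lanes of b have no effect on the result.
assert roundss_model([0.0] + a[1:], [b[0], 0.0, 0.0, 0.0]) == roundss_model(a, b)
assert split_demanded(0b1111) == (0b1110, 0b0001)
assert split_demanded(0b0110) == (0b0110, 0b0000)
```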

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289629 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 03:17:30 +00:00
Craig Topper
52ed6069ee [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289628 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 03:17:27 +00:00
Craig Topper
f19ce9bb49 [X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly.
Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed.

Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289523 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-13 07:45:45 +00:00
Craig Topper
e25a2790d2 [InstCombine][XOP] The instructions for the scalar frcz intrinsics are defined to put 0 in the upper bits, not pass bits through like other intrinsics. So we should return a zero vector instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289411 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 22:32:38 +00:00
Craig Topper
98435b8bdf [X86][InstCombine] Add support for scalar FMA intrinsics to SimplifyDemandedVectorElts.
This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289377 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 08:54:52 +00:00
Craig Topper
614df99de4 [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
Summary: These intrinsics have been unused by clang for a while. This patch removes them. We auto-upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clang's intrinsic file.

Reviewers: zvi, delena, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26660

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 05:24:10 +00:00
Alexey Bataev
34552649e8 [InstCombine] Fixed bug introduced in r282237
The index of the new insertelement instruction was evaluated incorrectly:
it was treated as the index of the inserted value instead of the index of
the position where the value should be inserted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282401 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 13:18:59 +00:00
Alexey Bataev
b9dfeea817 [InstCombine] Fix for PR29124: reduce insertelements to shufflevector
If inserting more than one constant into a vector:

define <4 x float> @foo(<4 x float> %x) {
  %ins1 = insertelement <4 x float> %x, float 1.0, i32 1
  %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2
  ret <4 x float> %ins2
}

InstCombine could reduce that to a shufflevector:

define <4 x float> @goo(<4 x float> %x) {
  %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
  ret <4 x float> %shuf
}

Also, InstCombine tries to convert a shuffle instruction to a single insertelement if one of the vectors is a constant vector and only a single element from this constant is used in the shuffle, i.e.

shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> ->
insertelement <4 x float> %v, float 1.0, i32 1
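
The equivalence of the two forms can be checked with a tiny Python model of the two IR operations. The list-based `insertelement` and `shufflevector` helpers are illustrative stand-ins, and `None` plays the role of `undef`.

```python
def insertelement(vec, value, index):
    """Return a copy of vec with one lane replaced."""
    out = list(vec)
    out[index] = value
    return out


def shufflevector(v1, v2, mask):
    """Pick lanes from the concatenation of v1 and v2 according to mask."""
    both = list(v1) + list(v2)
    return [both[i] for i in mask]


x = [10.0, 11.0, 12.0, 13.0]

# @foo: a chain of two insertelements.
chain = insertelement(insertelement(x, 1.0, 1), 2.0, 2)
# @goo: one shuffle against a constant vector; indices 4..7 select from it.
shuf = shufflevector(x, [None, 1.0, 2.0, None], [0, 5, 6, 3])

assert chain == shuf == [10.0, 1.0, 2.0, 13.0]
```

Mask indices 0 to 3 select from `%x` and 4 to 7 select from the constant vector, so only the constant's lanes 1 and 2 are ever read.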

Differential Revision: https://reviews.llvm.org/D24182

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282237 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-23 09:14:08 +00:00
Sanjay Patel
2fa0c5869c don't repeat function names in comments; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275470 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-14 20:54:43 +00:00
Benjamin Kramer
36538ffe93 Apply most suggestions of clang-tidy's performance-unnecessary-value-param
Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272190 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-08 19:09:22 +00:00
Simon Pilgrim
30f995aa46 [InstCombine][MMX] Extend SimplifyDemandedUseBits MOVMSK support to MMX
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614

Requires a minor tweak as llvm.x86.mmx.pmovmskb takes an x86_mmx argument - so we have to be explicit about the implied v8i8 vector type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271789 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-04 13:42:46 +00:00
Sanjay Patel
a5edb789b1 [InstCombine] clean up; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268099 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 20:54:56 +00:00
Ahmed Bougacha
c775e31867 [InstCombine] Remove trailing whitespace. NFC.
r267873.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267887 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 14:36:07 +00:00
Simon Pilgrim
fa0eab1450 [InstCombine][SSE] Add MOVMSK support to SimplifyDemandedUseBits
The MOVMSK instructions copy the sign bits of a vector's elements to the low bits of a scalar register and zero the high bits.

This patch adds MOVMSK support to SimplifyDemandedUseBits so that it's aware that the upper bits are known to be zero. It also removes the call to MOVMSK if none of the lower bits are actually required and just returns zero.
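
A Python model of the behaviour described, assuming four 32-bit float lanes and a 32-bit result. `movmskps`, `movmsk_known_zero` and `movmsk_is_dead` are hypothetical names for illustration, not LLVM functions.

```python
import struct


def movmskps(lanes):
    """Bit i of the result is the sign bit of 32-bit float lane i."""
    result = 0
    for i, value in enumerate(lanes):
        bits = struct.unpack("<I", struct.pack("<f", value))[0]
        result |= (bits >> 31) << i
    return result


def movmsk_known_zero(num_lanes: int, result_width: int = 32) -> int:
    """Bits of the result that are always zero: everything above the lane bits."""
    return ((1 << result_width) - 1) & ~((1 << num_lanes) - 1)


def movmsk_is_dead(num_lanes: int, demanded: int) -> bool:
    """True if no lane bit is demanded, so the call can be replaced by zero."""
    return (demanded & ((1 << num_lanes) - 1)) == 0


assert movmskps([-1.0, 2.0, -0.0, 3.0]) == 0b0101
assert movmsk_known_zero(4) == 0xFFFFFFF0
assert movmsk_is_dead(4, 0xFFFFFFF0)
assert not movmsk_is_dead(4, 0b0100)
```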

Differential Revision: http://reviews.llvm.org/D19614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267873 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 12:22:53 +00:00
Simon Pilgrim
fe702865fb Tweak comments to make it clear that these combines are for SSE scalar instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267360 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 19:31:56 +00:00
Simon Pilgrim
a07a9dbeff [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267359 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 18:35:59 +00:00
Simon Pilgrim
6efee72867 [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2)
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).

2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements

3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly

We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).

Differential Revision: http://reviews.llvm.org/D19318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267357 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 18:23:14 +00:00
Sanjay Patel
7d0cdb4a10 function names start with a lowercase letter; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259425 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-01 22:23:39 +00:00
Andrea Di Biagio
b0c38394ff [InstCombine] Teach SimplifyDemandedVectorElts how to handle ConstantVector select masks with ConstantExpr elements (PR24922)
If the mask of a select instruction is a ConstantVector, method
SimplifyDemandedVectorElts iterates over the mask elements to identify which
values are selected from the select inputs.

Before this patch, method SimplifyDemandedVectorElts always used method
Constant::isNullValue() to check if a value in the mask was zero. Unfortunately
that method always returns false when called on a ConstantExpr.

This patch fixes the problem in SimplifyDemandedVectorElts by adding an explicit
check for ConstantExpr values. Now, if a value in the mask is a ConstantExpr, we
avoid calling isNullValue() on it.

Fixes PR24922.

Differential Revision: http://reviews.llvm.org/D13219


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249390 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 10:34:53 +00:00