146 Commits

Author SHA1 Message Date
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Sanjay Patel
a04db6078a [InstCombine] refactor isCheapToScalarize(); NFC
As the FIXME indicates, this has the potential to go
overboard. So I'm not sure if it's even worth keeping 
this vs. iteratively doing simple matches, but we might 
as well clean it up.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349523 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-18 19:07:38 +00:00
Matt Arsenault
a71d0d5fdc InstCombine: Scalarize single use icmp/fcmp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348801 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-10 21:50:54 +00:00
Sanjay Patel
7c4834215a [InstCombine] remove dead code from visitExtractElement
Extracting from a splat constant is always handled by InstSimplify.
Move the test for this from InstCombine to InstSimplify to make
sure that stays true.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348423 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-05 23:09:33 +00:00
Sanjay Patel
810c63b2be [InstCombine] reduce duplication in visitExtractElementInst; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348418 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-05 21:57:51 +00:00
Sanjay Patel
36d678978c [InstCombine] try to turn shuffle into insertelement
shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'

The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of 
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start 
with or instcombine has converted insert/extract to shuffles.

Independent of that, an insertelement is always a simpler op for IR 
analysis vs. a shuffle, so we should transform to insert when possible.

I don't think there's any codegen concern here - if a target can't insert 
a scalar directly to some fixed element in a vector (x86?), then this 
should get expanded to the insert+shuffle that we started with.

Differential Revision: https://reviews.llvm.org/D53507


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345607 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 15:26:39 +00:00
Sanjay Patel
7f64213612 [InstCombine] use 'match' to simplify code; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344855 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-20 17:15:57 +00:00
Sanjay Patel
f521828b26 [InstCombine] make code more flexible with lambda; NFC
I couldn't tell from svn history when these checks were added,
but it pre-dates the split of instcombine into its own directory
at rL92459.

The motivation for changing the check is partly shown by the
code in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724

There are also existing regression tests for SLPVectorizer with
sequences of extract+insert that are likely assumed to become
shuffles by the vectorizer cost models.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344854 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-20 16:58:27 +00:00
Sanjay Patel
04e0dbca23 [InstCombine] add explanatory comment for strange vector logic; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344852 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-20 16:25:55 +00:00
Sanjay Patel
3394148166 [InstCombine] combine a shuffle and an extract subvector shuffle
This is part of the missing IR-level folding noted in D52912.
This should be ok as a canonicalization because the new shuffle mask can't
be any more complicated than the existing shuffle mask. If there's some 
target where the shorter vector shuffle is not legal, it should just end up 
expanding to something like the pair of shuffles that we're starting with here.

Differential Revision: https://reviews.llvm.org/D53037


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344476 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-14 15:25:06 +00:00
Sanjay Patel
9f5daa2df0 revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
This commit accidentally included the diffs from D53057.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344178 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-10 20:39:39 +00:00
Sanjay Patel
1dd3c06445 [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
icmp ne (and X, 1), 0 --> trunc X to N x i1

Ideally, we'd do the same for scalars, but there will likely be 
regressions unless we add more trunc folds as we're doing here 
for vectors.

The motivating vector case is from PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
  %c = fcmp ole <4 x float> %x, %y
  %s = sext <4 x i1> %c to <4 x i32>
  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
  %cond = or <4 x i32> %s1, %s2
  %condtr = trunc <4 x i32> %cond to <4 x i1>
  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
  ret <4 x float> %r
}

Here's a sampling of the vector codegen for that case using 
mask+icmp (current behavior) vs. trunc (with this patch):

AVX before:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vandps	LCPI0_0(%rip), %xmm0, %xmm0
vxorps	%xmm1, %xmm1, %xmm1
vpcmpeqd	%xmm1, %xmm0, %xmm0
vblendvps	%xmm0, %xmm3, %xmm2, %xmm0

AVX after:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vblendvps	%xmm0, %xmm2, %xmm3, %xmm0

AVX512f before:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vpbroadcastd	LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
vptestnmd	%zmm1, %zmm0, %k1
vblendmps	%zmm3, %zmm2, %zmm0 {%k1}

AVX512f after:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vpslld	$31, %xmm0, %xmm0
vptestmd	%zmm0, %zmm0, %k1
vblendmps	%zmm2, %zmm3, %zmm0 {%k1}

AArch64 before:

fcmge	v0.4s, v1.4s, v0.4s
zip1	v1.4s, v0.4s, v0.4s
zip2	v0.4s, v0.4s, v0.4s
orr	v0.16b, v1.16b, v0.16b
movi	v1.4s, #1
and	v0.16b, v0.16b, v1.16b
cmeq	v0.4s, v0.4s, #0
bsl	v0.16b, v3.16b, v2.16b

AArch64 after:

fcmge	v0.4s, v1.4s, v0.4s
zip1	v1.4s, v0.4s, v0.4s
zip2	v0.4s, v0.4s, v0.4s
orr	v0.16b, v1.16b, v0.16b
bsl	v0.16b, v2.16b, v3.16b

PowerPC-le before:

xvcmpgesp 34, 35, 34
vspltisw 0, 1
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxlxor 35, 35, 35
xxland 34, 0, 32
vcmpequw 2, 2, 3
xxsel 34, 36, 37, 34

PowerPC-le after:

xvcmpgesp 34, 35, 34
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxsel 34, 37, 36, 0

Differential Revision: https://reviews.llvm.org/D52747



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344082 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-09 21:26:01 +00:00
Sanjay Patel
dd6a1fbcd5 [InstCombine] make helper function 'static'; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344056 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-09 15:29:26 +00:00
Sanjay Patel
62aab8af14 [InstCombine] allow bitcast to/from FP for vector insert/extract transform
This is a follow-up to rL343482 / D52439.
This was a pattern that initially caused the commit to be reverted because
the transform requires a bitcast as shown here.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343794 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-04 16:25:05 +00:00
Sanjay Patel
f7cc35b1ce [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343482 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 14:40:00 +00:00
Hans Wennborg
f7f9563cf4 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343458 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 12:07:45 +00:00
Sanjay Patel
8594ae0860 [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343407 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 14:34:01 +00:00
Sanjay Patel
22030c13ce [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343406 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 13:50:42 +00:00
Sanjay Patel
bdd68c5514 [InstCombine] fix formatting in vector evaluators; NFC
We need to alter the functionality as shown in D52548.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343379 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 15:05:24 +00:00
Sanjay Patel
331a5ec713 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343329 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 15:24:41 +00:00
Sanjay Patel
5171686681 [InstCombine] add bitcast+extelt helper function; NFC
We can handle patterns where the elements have different 
sizes, so refactoring ahead of trying to add another blob
within these clauses.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342918 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-24 20:41:22 +00:00
Sanjay Patel
c155aff937 [InstCombine] improve variable name and use 'match'; NFC
'width' of a vector usually refers to the bit-width.

https://bugs.llvm.org/show_bug.cgi?id=39016
shows a case where we could extend this fold to handle
a case where the number of elements in the bitcasted
vector is not equal to the resulting value.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342902 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-24 16:39:03 +00:00
Sanjay Patel
a5f1bd0217 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.

Differential Revision: https://reviews.llvm.org/D51496


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341708 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-07 21:03:34 +00:00
Sanjay Patel
a1a7c48dc0 [InstCombine] move declarations closer to uses; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340930 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-29 14:42:12 +00:00
Sanjay Patel
e817129d2b [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340926 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-29 13:24:34 +00:00
Fangrui Song
af7b1832a0 Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338293 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 19:41:25 +00:00
Sanjay Patel
128882e5b6 [InstCombine] allow flag propagation when using safe constant
This corresponds with the code for the single binop pattern
added in rL336684.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336696 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 16:09:49 +00:00
Sanjay Patel
57f9ec6739 [InstCombine] safely allow non-commutative binop identity constant folds
This was originally intended with D48893, but as discussed there, we
have to make the folds safe from producing extra poison. This should
give the single binop folds the same capabilities as the existing
folds for 2-binops+shuffle.

LLVM binary opcode review: there are a total of 18 binops. There are 7 
commutative binops (add, mul, and, or, xor, fadd, fmul) which we already 
fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr,
fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't 
bother with sub/fsub with constant operand 1 because those are 
canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336684 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 15:12:31 +00:00
Sanjay Patel
4b88500342 [InstCombine] drop poison flags when shuffle mask undef propagates to constant
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336679 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 14:27:55 +00:00
Sanjay Patel
db58626da5 [InstCombine] allow more shuffle-binop folds with safe constants
The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef 
mask element creates an undef result. 

I'm not aware of any current analysis/transform that recognizes that 
undef propagating to a div/rem/shift, but we have to guard against 
the possibility.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336668 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 13:33:26 +00:00
Sanjay Patel
1cc2d65953 [InstCombine] allow more shuffle folds using safe constants
getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming
that the constant we wanted was operand 1 (RHS). That's wrong, but I
don't think we could expose a bug or even a suboptimal fold from that
because the callers have other guards for any binop that would have
been affected.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336617 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 23:22:47 +00:00
Sanjay Patel
fa29fd677f [InstCombine] generalize safe vector constant utility
This is almost NFC, but there could be some case where the original
code had undefs in the constants (rather than just the shuffle mask),
and we'll use safe constants rather than undefs now.

The FIXME noted in foldShuffledBinop() is already visible in existing
tests, so correcting that is the next step.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336558 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 16:16:51 +00:00
Sanjay Patel
f03a571cbf [InstCombine] fix shuffle-of-binops transform to avoid poison/undef
As noted in D48987, there are many different ways for this transform to go wrong. 
In particular, the poison potential for shifts means we have to more careful with those ops. 
I added tests to make that behavior visible for all of the different cases that I could find.

This is a partial fix. To make this review easier, I did not make changes for the single binop 
pattern (handled in foldSelectShuffleWith1Binop()). I also left out some potential optimizations 
noted with TODO comments. I'll follow-up once we're confident that things are correct here.

The goal is to correct all marked FIXME tests to either avoid the shuffle transform or do it safely.

Note that distinguishing when the shuffle mask contains undefs and using getBinOpIdentity() allows 
for some improvements to div/rem patterns, so there are wins along with the missed opportunities 
and fixes.

Differential Revision: https://reviews.llvm.org/D49047


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336546 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 13:21:46 +00:00
Sanjay Patel
c810b4e17d [InstCombine] fold shuffle-with-binop and common value
This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments 
in this patch to complete this transform.

It's possible that a binop feeding a select-shuffle has been eliminated 
by earlier transforms (or the code was just written like this in the 1st 
place), so we'll fail to match the patterns that have 2 binops from: 
D48401, 
D48678, 
D48662, 
D48485.

In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass 
through the original values of the source operand).

I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. 
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).

Differential Revision: https://reviews.llvm.org/D48830


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 13:44:22 +00:00
Sanjay Patel
c9a157f7fd [InstCombine] reverse canonicalization of add --> or to allow more shuffle folding
This extends D48485 to allow another pair of binops (add/or) to be combined either
with or without a leading shuffle:
or X, C --> add X, C (when X and C have no common bits set)

Here, we need value tracking to determine that the 'or' can be reversed into an 'add',
and we've added general infrastructure to allow extending to other opcodes or moving 
to where other passes could use that functionality.

Differential Revision: https://reviews.llvm.org/D48662


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:42:29 +00:00
Sanjay Patel
ce6e09c740 [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, 
shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no 
danger of creating a difficult shuffle.

All of the earlier constraints still apply, but we also check for extra uses to 
avoid creating more instructions than we'll remove.

Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.

Differential Revision: https://reviews.llvm.org/D48678


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335974 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 13:44:06 +00:00
Sanjay Patel
83bc27f0aa [InstCombine] fix opcode check in shuffle fold
There's no way to expose this difference currently, 
but we should use the updated variable because the
original opcodes can go stale if we transform into
something new.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335920 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 20:52:43 +00:00
Sanjay Patel
25be6bdd6f [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other 
direction because that's generally better of course). This allows us to remove the shuffle 
as we do in the regular opcodes-are-the-same cases.

This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv

Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already 
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There 
are planned enhancements for opcode transforms such or -> add.

Note that there's a different fold needed if we've already managed to simplify away a binop 
as seen in the test based on PR37806, but we manage to get that one case here because this 
fold is positioned above the demanded elements fold currently.

Differential Revision: https://reviews.llvm.org/D48485


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335888 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 17:48:04 +00:00
Sanjay Patel
bbfb91da0d [InstCombine] rearrange shuffle-of-binops logic; NFC
The commutative matcher makes things more complicated
here, and I'm planning an enhancement where this 
form is more readable.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335343 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-22 12:46:16 +00:00
Sanjay Patel
7fe9364c8b [InstCombine] fix shuffle-of-binops bug
With non-commutative binops, we could be using the same
variable value as operand 0 in 1 binop and operand 1 in 
the other, so we have to check for that possibility and
bail out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335312 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 23:56:59 +00:00
Sanjay Patel
d7f1ecfded [InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806)
This is the simplest case from PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have a common variable operand used in a pair of binops with vector constants 
that are vector selected together, then we can constant shuffle the constant vectors 
to eliminate the shuffle instruction.

This has some tricky parts that are hopefully addressed in the tests and their 
respective comments:

  1. If the shuffle mask contains an undef element, then that lane of the result is 
     undef:
     http://llvm.org/docs/LangRef.html#shufflevector-instruction

     Therefore, we can replace the constant in that lane with an undef value except 
     for div/rem. With div/rem, an undef in the divisor would cause the whole op to 
     be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'.

  2. Intersect the wrapping and FMF of the original binops for the new binop. There 
     should be no extra poison or fast-math potential in the new binop that wasn't 
     possible in the original code.

  3. Disregard other uses. Given that we're eliminating uses (shortening the 
     dependency chain), I think that's always the right IR canonicalization. But 
     I purposely chose the udiv test to demonstrate the scenario where both 
     intermediate values have other uses because that seems likely worse for 
     codegen with an expensive math op. This seems like a very rare possibility to 
     me, so I don't think it requires a backend patch first.

Differential Revision: https://reviews.llvm.org/D48401



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335283 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 20:15:09 +00:00
Simon Pilgrim
415d43cb70 [InstCombine] Gracefully handle out of range extractelement indices
InstSimplify is responsible for handling these, but we shouldn't just assert here.

Reduced from oss-fuzz #4808 test case

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321489 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-27 12:00:18 +00:00
Igor Laevsky
fcf12e077b Reintroduce r320049, r320014 and r319894.
OpenGL issues should be fixed by now.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320568 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 11:21:18 +00:00
Igor Laevsky
535b72d219 Revert r320049, r320014 and r319894
They were causing failures of the piglit OpenGL tests with AMD GPUs using the
Mesa radeonsi driver.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320466 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-12 10:03:39 +00:00
Igor Laevsky
e18338969d [InstCombine] Don't crash on out of bounds index in the insertelement
Differential Revision: https://reviews.llvm.org/D40390



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320049 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 15:00:52 +00:00
Sanjay Patel
c7c532a9fd [InstCombine] use 'auto' with 'dyn_cast'; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319067 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-27 18:19:32 +00:00
Eugene Zelenko
26ee77f253 [Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316503 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 21:24:53 +00:00
Sanjay Patel
963e18e511 [InstCombine] fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315223 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-09 17:54:46 +00:00
Sanjay Patel
00b5bff2c2 [InstCombine] remove extract-of-select vector transform (2nd try)
The 1st attempt at this:
https://reviews.llvm.org/rL314117
was reverted at:
https://reviews.llvm.org/rL314118

because of bot fails for clang tests that were checking optimized IR. That should be fixed with:
https://reviews.llvm.org/rL314144
...so try again. 

Original commit message:

The transform to convert an extract-of-a-select-of-vectors was added at:
https://reviews.llvm.org/rL194013

And a question about the validity of this transform was raised in the review:
https://reviews.llvm.org/D1539:
...but not answered AFAICT>

Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with
the original commit, but they are not regressing even after we remove the transform in this patch.

The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do
those transforms as canonicalizations.

The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301:
https://bugs.llvm.org/show_bug.cgi?id=33301
...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.

Differential Revision: https://reviews.llvm.org/D38006


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314147 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-25 20:30:53 +00:00
Sanjay Patel
26f60ee8f1 revert r314117 because there are bogus clang tests that depend on the optimizer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314118 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-25 17:00:04 +00:00