Commit Graph

165 Commits

Author SHA1 Message Date
Sanjay Patel
a2a6bc5fb1 [InstCombine] fold insertelement into splat of same scalar
Forming the canonical splat shuffle improves analysis and
may allow follow-on transforms (although some possibilities
are missing as shown in the test diffs).

The backend generically turns these patterns into build_vector,
so there should be no codegen regressions. All targets are
expected to be able to lower splats efficiently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365379 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-08 19:48:52 +00:00
Sanjay Patel
ea3ab8c4b7 [InstCombine] canonicalize insert+splat to/from element 0 of vector
We recognize a splat from element 0 in (VectorUtils) llvm::getSplatValue()
and also in ShuffleVectorInst::isZeroEltSplatMask(), so this converts
to that form for better matching.

The backend generically turns these patterns into build_vector,
so there should be no codegen difference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365342 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-08 16:26:48 +00:00
Sanjay Patel
ad6324c46b [InstCombine] allow undef elements when forming splat from chain of insertelements
We allow forming a splat (broadcast) shuffle, but we were conservatively limiting
that to cases where all elements of the vector are specified. It should be safe
from a codegen perspective to allow undefined lanes of the vector because the
expansion of a splat shuffle would become the chain of inserts again.

Forming splat shuffles can reduce IR and help enable further IR transforms.
Motivating bugs:
https://bugs.llvm.org/show_bug.cgi?id=42174
https://bugs.llvm.org/show_bug.cgi?id=16739

Differential Revision: https://reviews.llvm.org/D63848

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365147 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-04 16:45:34 +00:00
Sanjay Patel
ec733559f5 [InstCombine] simplify code for inserts -> splat; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364441 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 15:52:59 +00:00
Sanjay Patel
898c7f2bca [InstCombine] prevent crashing with invalid extractelement index
This was found/reduced from a fuzzer report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361729 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-26 14:03:50 +00:00
Sanjay Patel
4888407568 [InstSimplify] fold insertelement-of-extractelement
This was partly handled in InstCombine (only the constant
index case), so delete that and zap it more generally in
InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361576 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-24 00:13:58 +00:00
Sanjay Patel
3cce33d6aa [InstCombine] remove redundant fold for extractelement; NFC
The out-of-bounds index pattern is handled by InstSimplify,
so the extractelement should be eliminated next time it is
visited.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361570 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-23 23:33:38 +00:00
Sanjay Patel
71b5f9d74a [InstCombine] remove redundant fold for insertelement; NFC
The out-of-bounds index pattern is handled by InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361569 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-23 23:33:34 +00:00
Sanjay Patel
8dc3a075f3 [InstSimplify] insertelement V, undef, ? --> V
This was part of InstCombine, but it's better placed in
InstSimplify. InstCombine also had an unreachable but weaker
fold for insertelement with undef index, so that is deleted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361559 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-23 21:49:47 +00:00
Sanjay Patel
a7df0fa47f [InstCombine] be more careful when transforming a shuffle mask
This is reduced from a fuzzer test:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890

Usually, demanded elements should be able to simplify shuffle
mask elements that are pointing to undef elements of its source
operands, but that doesn't happen in the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361533 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-23 18:46:03 +00:00
Sanjay Patel
e607aa81a5 [InstCombine] fold shuffles of insert_subvectors
This should be a valid exception to the general rule of not creating new shuffle masks in IR...
because we already do it. :)
Also, DAG combining/legalization will undo this by widening the shuffle back out if needed.

Explanation for how we already do this: SLP or vector source can create chains of insert/extract
as shown in 1 of the examples from PR16739:
https://godbolt.org/z/NlK7rA
https://bugs.llvm.org/show_bug.cgi?id=16739

And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles.

Differential Revision: https://reviews.llvm.org/D62024

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361338 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-22 00:32:25 +00:00
Sanjay Patel
81037cf3be [InstCombine] move bitcast after insertelement-with-bitcasted-operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361058 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 18:06:12 +00:00
Sanjay Patel
f2bfd1e2c6 [InstCombine] remove overzealous assert for shuffles (PR41419)
As the TODO indicates, instsimplify could be improved.

Should fix:
https://bugs.llvm.org/show_bug.cgi?id=41419

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357910 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-08 13:28:29 +00:00
Mikael Holmen
74746501d3 [InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder
Summary:
This fixes PR41270.

The recursive function evaluateInDifferentElementOrder expects to be called
on a vector Value, so when we call it on a vector GEP's arguments, we must
first check that the argument is indeed a vector.

Reviewers: reames, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60058

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357389 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-01 14:10:10 +00:00
Mikael Holmen
c45a79becf Revert "[InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder"
This reverts commit 75216a6dbc.

I'll recommit with a better commit message with reference to the
phabricator review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357387 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-01 14:06:45 +00:00
Mikael Holmen
75216a6dbc [InstCombine] Handle vector gep with scalar argument in evaluateInDifferentElementOrder
This fixes PR41270.

The recursive function evaluateInDifferentElementOrder expects to be called
on a vector Value, so when we call it on a vector GEP's arguments, we must
first check that the argument is indeed a vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357385 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-01 13:48:56 +00:00
Sanjay Patel
3a12f8f3d2 [InstCombine] canonicalize select shuffles by commuting
In PR41304:
https://bugs.llvm.org/show_bug.cgi?id=41304
...we have a case where we want to fold a binop of select-shuffle (blended) values.

Rather than try to match commuted variants of the pattern, we can canonicalize the
shuffles and check for mask equality with commuted operands.

We don't produce arbitrary shuffle masks in instcombine, but select-shuffles are a
special case that the backend is required to handle because we already canonicalize
vector select to this shuffle form.

So there should be no codegen difference from this change. It's possible that this
improves CSE in IR though.

Differential Revision: https://reviews.llvm.org/D60016

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357366 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-31 15:01:30 +00:00
Sanjay Patel
0509fb6a5d [InstCombine] move shuffle canonicalizations before other transforms
This may not be NFC, but I'm not sure how to expose any diffs in
tests. In theory, it should be slightly more efficient and possibly
more profitable to do the canonicalizations (which can increase the
undef elements in the mask) ahead of SimplifyDemandedVectorElts().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357272 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 16:49:38 +00:00
Sanjay Patel
0811b200e6 [InstCombine] limit extracting shuffle transform based on uses
As discussed in D53037, this can lead to worse codegen, and we
don't generally expect the backend to be able to optimize
arbitrary shuffles. If there's only one use of the 1st shuffle,
that means it's getting removed, so that should always be
safe.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353235 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-05 22:58:45 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Sanjay Patel
a04db6078a [InstCombine] refactor isCheapToScalarize(); NFC
As the FIXME indicates, this has the potential to go
overboard. So I'm not sure if it's even worth keeping 
this vs. iteratively doing simple matches, but we might 
as well clean it up.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@349523 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-18 19:07:38 +00:00
Matt Arsenault
a71d0d5fdc InstCombine: Scalarize single use icmp/fcmp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348801 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-10 21:50:54 +00:00
Sanjay Patel
7c4834215a [InstCombine] remove dead code from visitExtractElement
Extracting from a splat constant is always handled by InstSimplify.
Move the test for this from InstCombine to InstSimplify to make
sure that stays true.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348423 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-05 23:09:33 +00:00
Sanjay Patel
810c63b2be [InstCombine] reduce duplication in visitExtractElementInst; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348418 91177308-0d34-0410-b5e6-96231b3b80d8
2018-12-05 21:57:51 +00:00
Sanjay Patel
36d678978c [InstCombine] try to turn shuffle into insertelement
shuffle (insert ?, Scalar, IndexC), V1, Mask --> insert V1, Scalar, IndexC'

The motivating case is at least a couple of steps away: I noticed that
SLPVectorizer does not analyze shuffles as well as sequences of 
insert/extract in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724
...so SLP may fail to vectorize when source code has shuffles to start 
with or instcombine has converted insert/extract to shuffles.

Independent of that, an insertelement is always a simpler op for IR 
analysis vs. a shuffle, so we should transform to insert when possible.

I don't think there's any codegen concern here - if a target can't insert 
a scalar directly to some fixed element in a vector (x86?), then this 
should get expanded to the insert+shuffle that we started with.

Differential Revision: https://reviews.llvm.org/D53507


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345607 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-30 15:26:39 +00:00
Sanjay Patel
7f64213612 [InstCombine] use 'match' to simplify code; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344855 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-20 17:15:57 +00:00
Sanjay Patel
f521828b26 [InstCombine] make code more flexible with lambda; NFC
I couldn't tell from svn history when these checks were added,
but it pre-dates the split of instcombine into its own directory
at rL92459.

The motivation for changing the check is partly shown by the
code in PR34724:
https://bugs.llvm.org/show_bug.cgi?id=34724

There are also existing regression tests for SLPVectorizer with
sequences of extract+insert that are likely assumed to become
shuffles by the vectorizer cost models.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344854 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-20 16:58:27 +00:00
Sanjay Patel
04e0dbca23 [InstCombine] add explanatory comment for strange vector logic; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344852 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-20 16:25:55 +00:00
Sanjay Patel
3394148166 [InstCombine] combine a shuffle and an extract subvector shuffle
This is part of the missing IR-level folding noted in D52912.
This should be ok as a canonicalization because the new shuffle mask can't
be any more complicated than the existing shuffle mask. If there's some 
target where the shorter vector shuffle is not legal, it should just end up 
expanding to something like the pair of shuffles that we're starting with here.

Differential Revision: https://reviews.llvm.org/D53037


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344476 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-14 15:25:06 +00:00
Sanjay Patel
9f5daa2df0 revert r344082: [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
This commit accidentally included the diffs from D53057.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344178 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-10 20:39:39 +00:00
Sanjay Patel
1dd3c06445 [InstCombine] reverse 'trunc X to <N x i1>' canonicalization
icmp ne (and X, 1), 0 --> trunc X to N x i1

Ideally, we'd do the same for scalars, but there will likely be 
regressions unless we add more trunc folds as we're doing here 
for vectors.

The motivating vector case is from PR37549:
https://bugs.llvm.org/show_bug.cgi?id=37549

define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) {
  %c = fcmp ole <4 x float> %x, %y
  %s = sext <4 x i1> %c to <4 x i32>
  %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1>
  %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3>
  %cond = or <4 x i32> %s1, %s2
  %condtr = trunc <4 x i32> %cond to <4 x i1>
  %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w
  ret <4 x float> %r
}

Here's a sampling of the vector codegen for that case using 
mask+icmp (current behavior) vs. trunc (with this patch):

AVX before:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vandps	LCPI0_0(%rip), %xmm0, %xmm0
vxorps	%xmm1, %xmm1, %xmm1
vpcmpeqd	%xmm1, %xmm0, %xmm0
vblendvps	%xmm0, %xmm3, %xmm2, %xmm0

AVX after:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vblendvps	%xmm0, %xmm2, %xmm3, %xmm0

AVX512f before:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vpbroadcastd	LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1]
vptestnmd	%zmm1, %zmm0, %k1
vblendmps	%zmm3, %zmm2, %zmm0 {%k1}

AVX512f after:

vcmpleps	%xmm1, %xmm0, %xmm0
vpermilps	$80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1]
vpermilps	$250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3]
vorps	%xmm0, %xmm1, %xmm0
vpslld	$31, %xmm0, %xmm0
vptestmd	%zmm0, %zmm0, %k1
vblendmps	%zmm2, %zmm3, %zmm0 {%k1}

AArch64 before:

fcmge	v0.4s, v1.4s, v0.4s
zip1	v1.4s, v0.4s, v0.4s
zip2	v0.4s, v0.4s, v0.4s
orr	v0.16b, v1.16b, v0.16b
movi	v1.4s, #1
and	v0.16b, v0.16b, v1.16b
cmeq	v0.4s, v0.4s, #0
bsl	v0.16b, v3.16b, v2.16b

AArch64 after:

fcmge	v0.4s, v1.4s, v0.4s
zip1	v1.4s, v0.4s, v0.4s
zip2	v0.4s, v0.4s, v0.4s
orr	v0.16b, v1.16b, v0.16b
bsl	v0.16b, v2.16b, v3.16b

PowerPC-le before:

xvcmpgesp 34, 35, 34
vspltisw 0, 1
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxlxor 35, 35, 35
xxland 34, 0, 32
vcmpequw 2, 2, 3
xxsel 34, 36, 37, 34

PowerPC-le after:

xvcmpgesp 34, 35, 34
vmrglw 3, 2, 2
vmrghw 2, 2, 2
xxlor 0, 35, 34
xxsel 34, 37, 36, 0

Differential Revision: https://reviews.llvm.org/D52747



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344082 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-09 21:26:01 +00:00
Sanjay Patel
dd6a1fbcd5 [InstCombine] make helper function 'static'; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344056 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-09 15:29:26 +00:00
Sanjay Patel
62aab8af14 [InstCombine] allow bitcast to/from FP for vector insert/extract transform
This is a follow-up to rL343482 / D52439.
This was a pattern that initially caused the commit to be reverted because
the transform requires a bitcast as shown here.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343794 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-04 16:25:05 +00:00
Sanjay Patel
f7cc35b1ce [InstCombine] try to convert vector insert+extract to trunc; 2nd try
This was originally committed at rL343407, but reverted at 
rL343458 because it crashed trying to handle a case where
the destination type is FP. This version of the patch adds
a check for that possibility. Tests added at rL343480.

Original commit message:

This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the
extract index doesn't correspond to the LSBs of the scalar, then we
have to shift-right before the truncate. Endian-ness makes this tricky,
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343482 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 14:40:00 +00:00
Hans Wennborg
f7f9563cf4 Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"
This caused Chromium builds to fail with "Illegal Trunc" assertion.
See https://crbug.com/890723 for repro.

> This transform is requested for the backend in:
> https://bugs.llvm.org/show_bug.cgi?id=39016
> ...but I figured it was worth doing in IR too, and it's probably
> easier to implement here, so that's this patch.
>
> In the simplest case, we are just truncating a scalar value. If the
> extract index doesn't correspond to the LSBs of the scalar, then we
> have to shift-right before the truncate. Endian-ness makes this tricky,
> but hopefully the ASCII-art helps visualize the transform.
>
> Differential Revision: https://reviews.llvm.org/D52439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343458 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 12:07:45 +00:00
Sanjay Patel
8594ae0860 [InstCombine] try to convert vector insert+extract to trunc
This transform is requested for the backend in:
https://bugs.llvm.org/show_bug.cgi?id=39016
...but I figured it was worth doing in IR too, and it's probably 
easier to implement here, so that's this patch.

In the simplest case, we are just truncating a scalar value. If the 
extract index doesn't correspond to the LSBs of the scalar, then we 
have to shift-right before the truncate. Endian-ness makes this tricky, 
but hopefully the ASCII-art helps visualize the transform.

Differential Revision: https://reviews.llvm.org/D52439


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343407 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 14:34:01 +00:00
Sanjay Patel
22030c13ce [InstCombine] allow lengthening of insertelement to eliminate shuffles
As noted in post-commit comments for D52548, the limitation on 
increasing vector length can be applied by opcode.
As a first step, this patch only allows insertelement to be
widened because that has no logical downsides for IR and has 
little risk of pessimizing codegen.

This may cause PR39132 to go into hiding during a full compile,
but that bug is not fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343406 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 13:50:42 +00:00
Sanjay Patel
bdd68c5514 [InstCombine] fix formatting in vector evaluators; NFC
We need to alter the functionality as shown in D52548.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343379 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 15:05:24 +00:00
Sanjay Patel
331a5ec713 [InstCombine] don't propagate wider shufflevector arguments to predecessors
InstCombine would propagate shufflevector insts that had wider output vectors onto 
predecessors, which would sometimes push undef's onto the divisor of a div/rem and 
result in bad codegen.

I've fixed this by just banning propagating shufflevector back if the result of 
the shufflevector is wider than the input vectors.

Patch by: @sheredom (Neil Henning)

Differential Revision: https://reviews.llvm.org/D52548



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343329 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 15:24:41 +00:00
Sanjay Patel
5171686681 [InstCombine] add bitcast+extelt helper function; NFC
We can handle patterns where the elements have different 
sizes, so refactoring ahead of trying to add another blob
within these clauses.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342918 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-24 20:41:22 +00:00
Sanjay Patel
c155aff937 [InstCombine] improve variable name and use 'match'; NFC
'width' of a vector usually refers to the bit-width.

https://bugs.llvm.org/show_bug.cgi?id=39016
shows a case where we could extend this fold to handle
a case where the number of elements in the bitcasted
vector is not equal to the resulting value.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342902 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-24 16:39:03 +00:00
Sanjay Patel
a5f1bd0217 [InstCombine] narrow vector select with padded condition and extracted result (PR38691)
shuf (sel (shuf NarrowCond, undef, WideMask), X, Y), undef, NarrowMask) -->
sel NarrowCond, (shuf X, undef, NarrowMask), (shuf Y, undef, NarrowMask)

The motivating case from:
https://bugs.llvm.org/show_bug.cgi?id=38691
...is the last regression test. In that case, we're just left with the narrow select.

Note that if we do create new shuffles, they use the existing extraction identity mask, 
so there's no danger that this transform creates arbitrary shuffles.

Differential Revision: https://reviews.llvm.org/D51496


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341708 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-07 21:03:34 +00:00
Sanjay Patel
a1a7c48dc0 [InstCombine] move declarations closer to uses; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340930 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-29 14:42:12 +00:00
Sanjay Patel
e817129d2b [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340926 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-29 13:24:34 +00:00
Fangrui Song
af7b1832a0 Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338293 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 19:41:25 +00:00
Sanjay Patel
128882e5b6 [InstCombine] allow flag propagation when using safe constant
This corresponds with the code for the single binop pattern
added in rL336684.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336696 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 16:09:49 +00:00
Sanjay Patel
57f9ec6739 [InstCombine] safely allow non-commutative binop identity constant folds
This was originally intended with D48893, but as discussed there, we
have to make the folds safe from producing extra poison. This should
give the single binop folds the same capabilities as the existing
folds for 2-binops+shuffle.

LLVM binary opcode review: there are a total of 18 binops. There are 7 
commutative binops (add, mul, and, or, xor, fadd, fmul) which we already 
fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr,
fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't 
bother with sub/fsub with constant operand 1 because those are 
canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336684 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 15:12:31 +00:00
Sanjay Patel
4b88500342 [InstCombine] drop poison flags when shuffle mask undef propagates to constant
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336679 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 14:27:55 +00:00
Sanjay Patel
db58626da5 [InstCombine] allow more shuffle-binop folds with safe constants
The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef 
mask element creates an undef result. 

I'm not aware of any current analysis/transform that recognizes that 
undef propagating to a div/rem/shift, but we have to guard against 
the possibility.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336668 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 13:33:26 +00:00
Sanjay Patel
1cc2d65953 [InstCombine] allow more shuffle folds using safe constants
getSafeVectorConstantForBinop() was calling getBinOpIdentity() assuming
that the constant we wanted was operand 1 (RHS). That's wrong, but I
don't think we could expose a bug or even a suboptimal fold from that
because the callers have other guards for any binop that would have
been affected.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336617 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 23:22:47 +00:00