317 Commits

Author SHA1 Message Date
Michael Kuperstein
70d8785561 [SLP] Revert everything that has to do with memory access sorting.
This reverts r293386, r294027, r294029 and r296411.

Turns out the SLP tree isn't actually a "tree" and we don't handle
accessing the same packet of loads in several different orders well,
causing miscompiles.

Revert until we can fix this properly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297493 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 18:59:07 +00:00
Michael Kuperstein
48a77b7523 [SLP] Revert r296863 due to miscompiles.
Details and reproducer are on the email thread for r296863.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297103 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-06 23:54:51 +00:00
Alexey Bataev
5645eb34af [SLP] A test for vectorization of users of extractelement instructions,
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297024 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-06 16:26:00 +00:00
Mohammad Shahid
48b84df15d [SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available
for the VectorizeTree() API. This API uses them for proper mask computation to be used in shufflevector IR.
The fix is to compute the mask for out-of-order memory accesses while building the vectorizable tree
instead of during the actual vectorization of the vectorizable tree. It also needs to recompute the proper Lane for
external uses of vectorizable scalars based on the shuffle mask.

Reviewers: mkuper

Differential Revision: https://reviews.llvm.org/D30159

Change-Id: Ide8773ce0ad3562f3cf4d1a0ad0f487e2f60ce5d

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296863 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-03 10:02:47 +00:00
Hans Wennborg
4024478081 Revert r296575 "[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available"
It caused miscompiles, e.g. in Chromium (PR32109).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296654 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-01 18:57:16 +00:00
Alexey Bataev
8d23745f59 [SLP] Preserve IR flags when vectorizing horizontal reductions.
Summary:
The SLP vectorizer should propagate IR-level optimization hints/flags
(nsw, nuw, exact, fast-math) when converting scalar horizontal
reduction instructions into vectors, just like for other vectorized
instructions.
It does not include IR flag propagation for extra arguments; we need to handle
the original scalar operations for extra args to propagate the correct flags.

Reviewers: mkuper, mzolotukhin, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30418

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296614 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-01 12:43:39 +00:00
Alexey Bataev
b25d8fc2fa [SLP] Preserve IR flags for extra args.
Summary:
We should preserve IR flags for extra args. These IR flags should be
taken from original scalar operations, not from the reduction
operations.

Reviewers: mkuper, mzolotukhin, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296613 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-01 12:22:33 +00:00
Alexey Bataev
145ace26f5 [SLP] Fix for PR32038: extra add of PHI node when it is not required.
Summary:
If a horizontal reduction tree starts from a binary operation that is
used in a PHI node, but this PHI is not used in the horizontal reduction, we
may end up with an extra addition of this PHI node after vectorization.
Here is an example:
```
%phi = phi i32 [ %tmp, %end], ...
...
%tmp = add i32 %tmp1, %tmp2
end:
```
after vectorization we always have something like:

```
%phi = phi i32 [ %tmp, %end], ...
...
%red = extractelement <8 x i32> %vec.red, i32 0
%tmp = add i32 %red, %phi
end:
```
even if `%phi` is not used in the reduction tree. The patch treats these PHI
nodes as extra arguments and includes them in the final result iff they
are really used in the reduction.

Reviewers: mkuper, hfinkel, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296606 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-01 10:50:44 +00:00
Mohammad Shahid
b2ec2bd1f6 [SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available
for the VectorizeTree() API. This API uses them for proper mask computation to be used in shufflevector IR.
The fix is to compute the mask for out-of-order memory accesses while building the vectorizable tree
instead of during the actual vectorization of the vectorizable tree.

Reviewers: mkuper

Differential Revision: https://reviews.llvm.org/D30159

Change-Id: Id1e287f073fa4959713ba545fa4254db5da8b40d

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296575 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-01 03:51:54 +00:00
Michael Kuperstein
3ee59b3779 [SLP] Load sorting should not try to sort things that aren't loads.
We may get a VL where the first element is a load, but the others
aren't. Trying to sort such VLs can only lead to sorrow.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296411 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 23:18:11 +00:00
Alexey Bataev
5c641cd1c6 [SLP] Use different flags in tests for reduction ops and extra args.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296376 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 20:22:44 +00:00
Alexey Bataev
724703a79a [SLP] Modify test to check IR flags propagation for extra args.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296369 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 19:16:09 +00:00
Alexey Bataev
d222965b6b [SLP] A test for a fix of PR32038.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296349 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 16:07:10 +00:00
Alexey Bataev
bfa45208ee [SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong
result

Summary:
If the same value is used several times as an extra value, the SLP
vectorizer takes it into account only once instead of the actual number of
uses.
For example:
```
int val = 1;
for (int y = 0; y < 8; y++) {
  for (int x = 0; x < 8; x++) {
    val = val + input[y * 8 + x] + 3;
  }
}
```
We have 2 extra arguments: `1`, the initial value of the horizontal reduction,
and `3`, which is added 8*8 times to the reduction. Before the patch we
added `1` to the reduction value and added `3` only once, though it must be
added 64 times.
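
The arithmetic above can be checked with a small model. This is only a sketch, not the vectorizer's code: `iotaInput`, `scalarReduce` and `reduceWithExtraArgs` are hypothetical names, and the "vectorized" side is modelled as a plain sum of the loaded values followed by the extra arguments.

```cpp
#include <array>
#include <cassert>
#include <numeric>

// Hypothetical helper: input[i] = i, so the sum of all 64 elements is 2016.
std::array<int, 64> iotaInput() {
  std::array<int, 64> in{};
  std::iota(in.begin(), in.end(), 0);
  return in;
}

// Scalar reference: the loop from the commit message.
int scalarReduce(const std::array<int, 64> &input) {
  int val = 1;
  for (int y = 0; y < 8; y++)
    for (int x = 0; x < 8; x++)
      val = val + input[y * 8 + x] + 3;
  return val;
}

// Model of the vectorized form: reduce the loaded values, then add the
// initial value `1` once and the extra argument `3` once per use.
int reduceWithExtraArgs(const std::array<int, 64> &input, int usesOfThree) {
  assert(usesOfThree >= 0);
  int red = std::accumulate(input.begin(), input.end(), 0);
  return red + 1 + 3 * usesOfThree;
}
```

With `usesOfThree == 64` the model agrees with the scalar loop; with `usesOfThree == 1` (the behaviour described as "before the patch") the result is short by `3 * 63`.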

Reviewers: mkuper, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295972 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 13:37:09 +00:00
Alexey Bataev
8188e22176 Revert "[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong"
This reverts commit 7c5141e577d9efd1c8e3087566a38ce6b3a41a84.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295957 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 11:09:35 +00:00
Alexey Bataev
4ef753a118 [SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong
result

Summary:
If the same value is used several times as an extra value, the SLP
vectorizer takes it into account only once instead of the actual number of
uses.
For example:
```
int val = 1;
for (int y = 0; y < 8; y++) {
  for (int x = 0; x < 8; x++) {
    val = val + input[y * 8 + x] + 3;
  }
}
```
We have 2 extra arguments: `1`, the initial value of the horizontal reduction,
and `3`, which is added 8*8 times to the reduction. Before the patch we
added `1` to the reduction value and added `3` only once, though it must be
added 64 times.

Reviewers: mkuper, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295956 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 10:57:15 +00:00
Alexey Bataev
8d04a8701d Revert "[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong"
This reverts commit d83c81ee6a8dea662808ac22b396d1bb0595c89d.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295951 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 09:59:29 +00:00
Alexey Bataev
29965753c8 [SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong
result

Summary:
If the same value is used several times as an extra value, the SLP
vectorizer takes it into account only once instead of the actual number of
uses.
For example:
```
int val = 1;
for (int y = 0; y < 8; y++) {
  for (int x = 0; x < 8; x++) {
    val = val + input[y * 8 + x] + 3;
  }
}
```
We have 2 extra arguments: `1`, the initial value of the horizontal reduction,
and `3`, which is added 8*8 times to the reduction. Before the patch we
added `1` to the reduction value and added `3` only once, though it must be
added 64 times.

Reviewers: mkuper, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295949 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 09:40:38 +00:00
Michael Kuperstein
c6527c8786 Revert r295868 because it breaks a different SLP lit test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295906 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:35:13 +00:00
Alexey Bataev
d6db829b03 [SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong result
Summary:
If the same value is used several times as an extra value, the SLP
vectorizer takes it into account only once instead of the actual number of
uses.
For example:
```
int val = 1;
for (int y = 0; y < 8; y++) {
  for (int x = 0; x < 8; x++) {
    val = val + input[y * 8 + x] + 3;
  }
}
```
We have 2 extra arguments: `1`, the initial value of the horizontal reduction,
and `3`, which is added 8*8 times to the reduction. Before the patch we
added `1` to the reduction value and added `3` only once, though it must be
added 64 times.

Reviewers: mkuper, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295868 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 20:06:40 +00:00
Alexey Bataev
437bff4b03 [SLP] Test with extra argument used several times.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295853 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 17:47:28 +00:00
Alexey Bataev
9e606c16f0 [SLP] Tests for shuffle/blending operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295717 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 13:40:55 +00:00
Alexey Bataev
b0f1c39d24 [SLP] Additional test for vectorization of call/invoke args
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295657 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-20 12:41:16 +00:00
Alexey Bataev
2cc74041bd [SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put
back into a vector

Previously the cost of the existing ExtractElement/ExtractValue
instructions was considered as a dead cost only if it was detected that
they have only one use. But these instructions may be considered
dead also if users of the instructions are also going to be vectorized,
like:
```
%x0 = extractelement <2 x float> %x, i32 0
%x1 = extractelement <2 x float> %x, i32 1
%x0x0 = fmul float %x0, %x0
%x1x1 = fmul float %x1, %x1
%add = fadd float %x0x0, %x1x1
```
This can be transformed to
```
%1 = fmul <2 x float> %x, %x
%2 = extractelement <2 x float> %1, i32 0
%3 = extractelement <2 x float> %1, i32 1
%add = fadd float %2, %3
```
because though `%x0` and `%x1` have 2 users each, these users are
part of the vectorized tree and we can consider these `extractelement`
instructions as dead.

Differential Revision: https://reviews.llvm.org/D29900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295056 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 15:20:48 +00:00
Alexey Bataev
a4e9174165 [SLP] Additional tests for extractelement cost fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295050 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 12:52:05 +00:00
Alexey Bataev
aa5c0a0385 [SLP] Test for extractelement cost fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294980 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-13 19:08:19 +00:00
Alexey Bataev
939e3e27c1 [SLP] Fix for PR31690: Allow using of extra values in horizontal
reductions.

Currently, LLVM supports vectorization of horizontal reduction
instructions with initial value set to 0. Patch supports vectorization
of reduction with non-zero initial values. Also, it supports a
vectorization of instructions with some extra arguments, like:
```
float f(float x[], int a, int b) {
  float p = a % b;
  p += x[0] + 3;
  for (int i = 1; i < 32; i++)
    p += x[i];
  return p;
}
```
Patch allows vectorization of this kind of horizontal reductions.

Differential Revision: https://reviews.llvm.org/D29727

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294934 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-13 08:01:26 +00:00
Alexey Bataev
9ef4102334 [SLP] Additional test to check correct work of horizontal reductions,
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294505 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 19:52:46 +00:00
Michael Kuperstein
1ba9ee9b50 [SLP] Revert "Allow using of extra values in horizontal reductions."
This breaks when one of the extra values is also a scalar that
participates in the same vectorization tree which we'll end up
reducing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294245 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 21:50:59 +00:00
Michael Kuperstein
dce4987e9b [SLP] Use SCEV to sort memory accesses.
This generalizes memory access sorting to use differences between SCEVs,
instead of relying on constant offsets. That allows us to properly do
SLP vectorization of non-sequentially ordered loads within loop bodies.

Differential Revision: https://reviews.llvm.org/D29425


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294027 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-03 19:09:45 +00:00
Alexey Bataev
dfb9a4d840 [SLP] Fix for PR31690: Allow using of extra values in horizontal reductions.
Currently LLVM supports vectorization of horizontal reduction
instructions with initial value set to 0. Patch supports vectorization
of reduction with non-zero initial values. Also it supports a
vectorization of instructions with some extra arguments, like:

float f(float x[], int a, int b) {
  float p = a % b;
  p += x[0] + 3;
  for (int i = 1; i < 32; i++)
    p += x[i];
  return p;
}

Patch allows vectorization of this kind of horizontal reductions.

Differential Revision: https://reviews.llvm.org/D28961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293994 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-03 08:08:50 +00:00
Mohammad Shahid
13e3a7f7a3 [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.
The jumbled scalar loads will be sorted while building the tree, and these accesses will be marked so that a shufflevector with the proper mask is generated after the vectorized load.
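
As an illustration of the idea only (not the code from D26905), the mask computation could be sketched as follows. `jumbledLoadMask` is a hypothetical name, and the sketch assumes each scalar load is already described by an element offset from a common base pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// offsets[lane] is the element offset read by the scalar load in that lane.
// If the offsets are a permutation of a consecutive range, one wide load
// starting at the smallest offset covers them all, and the returned mask
// restores the original lane order: result[lane] = wide[mask[lane]].
// An empty vector means the loads are not a jumbled consecutive group.
std::vector<int> jumbledLoadMask(const std::vector<int> &offsets) {
  assert(!offsets.empty());
  const int lowest = *std::min_element(offsets.begin(), offsets.end());
  const int width = static_cast<int>(offsets.size());
  std::vector<int> mask(offsets.size());
  std::vector<bool> seen(offsets.size(), false);
  for (std::size_t lane = 0; lane < offsets.size(); ++lane) {
    const int idx = offsets[lane] - lowest; // position inside the wide load
    if (idx >= width || seen[idx])
      return {}; // gap or duplicate: not consecutive
    seen[idx] = true;
    mask[lane] = idx;
  }
  return mask;
}
```

For loads at offsets `5, 4, 7, 6` the sketch yields the mask `1, 0, 3, 2`; for strided offsets such as `0, 2, 4, 6` it reports that no single wide load applies.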

Reviewers: hfinkel, mssimpso, mkuper

Differential Revision: https://reviews.llvm.org/D26905

Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293386 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-28 17:59:44 +00:00
Alexey Bataev
69c8d681d3 [SLP] Add one more reduction operation for extra argument test to make
it vectorizable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293162 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-26 09:18:41 +00:00
Alexey Bataev
add597c306 [SLP] Fixed test for extra arguments in horizontal reductions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293153 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-26 06:19:52 +00:00
Alexey Bataev
1a40f3b238 [SLP] Extra test for functionality with extra args.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293076 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-25 17:24:31 +00:00
Alexey Bataev
b6c9077a19 [SLP] Improve horizontal vectorization for non-power-of-2 number of
instructions.

If the number of instructions in a horizontal reduction list is not a power of 2,
then only the last PowerOf2Floor(NumberOfInstructions) elements are actually
vectorized; the other instructions remain scalar. The patch tries to vectorize
the remaining elements as well.
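
One way to picture the change is a greedy split of the reduction list into power-of-2 chunks. The sketch below is an assumption about the general shape, not the patch itself: `powerOf2Floor` is a local stand-in for LLVM's helper, and `chunkWidths` and its `minWidth` parameter are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Largest power of 2 that is <= n (0 when n == 0).
unsigned powerOf2Floor(unsigned n) {
  unsigned p = 0;
  for (unsigned bit = 1; bit != 0 && bit <= n; bit <<= 1)
    p = bit;
  return p;
}

// Split n reduction elements into power-of-2 chunks, largest first, and
// stop once fewer than minWidth elements remain (those stay scalar).
std::vector<unsigned> chunkWidths(unsigned n, unsigned minWidth) {
  assert(minWidth >= 1);
  std::vector<unsigned> widths;
  while (n >= minWidth) {
    const unsigned w = powerOf2Floor(n);
    if (w < minWidth)
      break;
    widths.push_back(w);
    n -= w;
  }
  return widths;
}
```

Under these assumptions, 31 elements with a minimum width of 4 split into chunks of 16, 8 and 4, leaving 3 scalar elements, whereas taking only the first chunk would leave 15 elements scalar.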

Differential Revision: https://reviews.llvm.org/D28959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293042 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-25 09:54:38 +00:00
Alexey Bataev
e414b99879 [SLP] Additional test for checking that instruction with extra args is
not reconstructed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292911 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-24 10:44:00 +00:00
Alexey Bataev
2394101f1d [SLP] Additional test with extra args in horizontal reductions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292821 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-23 19:28:23 +00:00
Alexey Bataev
a1e9c67a8d [SLP] Additional test for SLP vectorizer with 31 reduction elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-23 11:53:16 +00:00
Alexey Bataev
27f0bb5af4 [SLP] Initial test for fix of PR31690.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292631 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 18:40:21 +00:00
Alexey Bataev
4a8de03c17 [SLP] A new test for horizontal vectorization for non-power-of-2
instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292626 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 18:04:29 +00:00
Mohammad Shahid
b60e3766a8 [SLP] Add a base test for jumbled store
Change-Id: I905ce08a02c76a6896dcfd9629547417c99adc4a

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292581 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 06:05:33 +00:00
Alexey Bataev
05153bfa17 [SLP] Add a test for a fix for PR30787.
Add a test for PR30787: Failure to beneficially vectorize 'copyable'
elements in integer binary ops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292416 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-18 18:07:46 +00:00
Michael Kuperstein
8672808b6b [SLP] Remove bogus assert.
The removed assert seems bogus - it's perfectly legal for the roots of the
vectorized subtrees to be equal even if the original scalar values aren't,
if the original scalars happen to be equivalent.

This fixes PR31599.

Differential Revision: https://reviews.llvm.org/D28539


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291692 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 19:23:57 +00:00
Simon Pilgrim
0ce9581271 Revert r290970 [SLPVectorizer] Regenerate test.
The check script will use var names before they are declared, which FileCheck doesn't like.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290971 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 16:12:07 +00:00
Simon Pilgrim
31b667eb67 [SLPVectorizer] Regenerate test.
Missed var name

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290970 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 16:01:55 +00:00
Simon Pilgrim
f9014fd84b Regenerate test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290969 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 15:52:41 +00:00
Michael Kuperstein
04912c8225 [InstCombine] Canonicalize insert splat sequences into an insert + shuffle
This adds a combine that canonicalizes a chain of inserts which broadcasts
a value into a single insert + a splat shufflevector.

This fixes PR31286.

Differential Revision: https://reviews.llvm.org/D27992


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290641 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-28 00:18:08 +00:00
Alexey Bataev
152f85e176 [TEST] Initial commit of tests for minmax horizontal reductions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289817 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 13:21:29 +00:00
Matthew Simpson
953d7312ee [SLP] Fix sign-extends for type-shrinking
This patch ensures the correct minimum bit width during type-shrinking.
Previously when type-shrinking, we always sign-extended values back to their
original width. However, if we are going to sign-extend, and the sign bit is
unknown, we have to increase the minimum bit width by one bit so the
sign-extend will fill the upper bits correctly. If the sign bit is known to be
zero, we can perform a zero-extend instead. This should fix PR31243.
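
The need for the extra bit can be seen on a single value. The following is a minimal sketch using plain integers rather than LLVM's APIs; `truncAndSExt` and `truncAndZExt` are hypothetical helpers that model truncating a 32-bit value to `bits` bits and extending it back.

```cpp
#include <cassert>
#include <cstdint>

// Keep the low `bits` bits of v, then sign-extend back to 32 bits.
int32_t truncAndSExt(int32_t v, unsigned bits) {
  assert(bits >= 1 && bits < 32);
  const uint32_t mask = (1u << bits) - 1;
  const uint32_t low = static_cast<uint32_t>(v) & mask;
  const uint32_t sign = 1u << (bits - 1);
  // Standard trick: flipping and subtracting the sign bit sign-extends.
  return static_cast<int32_t>((low ^ sign) - sign);
}

// Keep the low `bits` bits of v, then zero-extend back to 32 bits.
int32_t truncAndZExt(int32_t v, unsigned bits) {
  assert(bits >= 1 && bits < 32);
  const uint32_t mask = (1u << bits) - 1;
  return static_cast<int32_t>(static_cast<uint32_t>(v) & mask);
}
```

The value 128 fits in 8 bits as an unsigned quantity, but a round trip through 8 bits with sign extension gives -128. Using 9 bits with sign extension, or 8 bits with zero extension when the sign bit is known to be zero, preserves it.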

Reference: https://llvm.org/bugs/show_bug.cgi?id=31243
Differential Revision: https://reviews.llvm.org/D27466

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289470 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-12 21:11:04 +00:00