Commit Graph

13 Commits

Author SHA1 Message Date
Michael Zuckerman
563f2fdd92 [X86][LLVM]Expanding Supports lowerInterleavedLoad() in X86InterleavedAccess (VF{8|16|32} stride 3).
This patch expands the support of lowerInterleavedload to {8|16|32}x8i stride 3.

LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=3 VF={8|16|32}) and we plan to include the store (deinterleved side).

The patch goal is to optimize the following sequence:
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7

into

a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7

Reviewers
1. zvi
2. igor
3. guyblank
4. dorit
5. Ayal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312722 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 14:02:13 +00:00
Michael Zuckerman
e11eab53ee Update test for testing avx512
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312487 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 14:15:34 +00:00
Michael Zuckerman
9db416111e Adding base lit test for x86interleaved
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311658 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 14:11:28 +00:00
Michael Zuckerman
076fb389d7 [InterLeaved] Adding lit test for future work interleaved load strid 3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311320 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 08:56:39 +00:00
Michael Zuckerman
f87ba7b701 [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF16 stride 4).
This patch expands the support of lowerInterleavedStore to 16x8i stride 4.

LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=4 VF=16) and we plan to include more patterns in the future.

The patch goal is to optimize the following sequence:
At the end of the computation, we have ymm2, ymm0, ymm12 and ymm3 holding
each 16 chars:

c0, c1, , c16
m0, m1, , m16
y0, y1, , y16
k0, k1, ., k16

And these need to be transposed/interleaved and stored like so:

c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 ....

Differential Revision: https://reviews.llvm.org/D35829


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310252 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 13:22:39 +00:00
Michael Zuckerman
61a909a34f Expanding the test case for vf8 for stride 4 interleaved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309511 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-30 11:54:57 +00:00
Michael Zuckerman
9d7507a837 [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess.
This patch expands the support of lowerInterleavedStore to 32x8i stride 4.

LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=4 VF=32) and we plan to include more patterns in the future. To reach our goal of "more patterns". We include two mask creators. The first function creates shuffle's mask equivalent to unpacklo/unpackhi instructions. The other creator creates mask equivalent to a concat of two half vectors(high/low).

The patch goal is to optimize the following sequence:
At the end of the computation, we have ymm2, ymm0, ymm12 and ymm3 holding
each 32 chars:

c0, c1, , c31
m0, m1, , m31
y0, y1, , y31
k0, k1, ., k31

And these need to be transposed/interleaved and stored like so:

c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 ....

Reviewers:
dorit
Farhana
RKSimon
guyblank
DavidKreitzer

Differential Revision: https://reviews.llvm.org/D34601



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309086 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-26 08:10:14 +00:00
Michael Zuckerman
52f43a94dd Adding base test for interleave store VF16 and expand the test for AVX512
This patch doesn't modifay any non test file.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308909 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-24 18:29:56 +00:00
Farhana Aleen
7a6e8a3058 X86InterleaveAccess: A fix for bug33826
Reviewers: DavidKreitzer

Differential Revision: https://reviews.llvm.org/D35638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308784 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-21 21:35:00 +00:00
Michael Zuckerman
b76f903b2e [X86][LLVM][test]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess test.
Adding base tast (to trunk) for Store strid=4 vf=32. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306286 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 13:27:32 +00:00
Farhana Aleen
e83d2eccef Supported lowerInterleavedStore() in X86InterleavedAccess.
Reviewers: RKSimon, DavidKreitzer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32658

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306068 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-22 22:59:04 +00:00
Evgeny Stupachenko
fdb83c86c8 Added tests for X86InterleavedStore.
Reviewers: RKSimon, DavidKreitzer

Differential Revision: https://reviews.llvm.org/D33684

Patch by: Aleen Farhana <Farhana.aleen@gmail.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304834 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-06 21:08:00 +00:00
David L Kreitzer
4475acba12 Add a pass to optimize patterns of vectorized interleaved memory accesses for
X86. The pass optimizes as a unit the entire wide load + shuffles pattern
produced by interleaved vectorization. This initial patch optimizes one pattern
(64-bit elements interleaved by a factor of 4). Future patches will generalize
to additional patterns.

Patch by Farhana Aleen

Differential revision: http://reviews.llvm.org/D24681


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284260 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 18:20:41 +00:00