[x86] Fix another miscompile in the new vector shuffle lowering found
via the fuzz tester.

Here I missed an offset when round-tripping a value through a shuffle
mask. I got it right two lines below. See a problem? I do. ;] I'll
probably be adding a little "swap" algorithm which accepts a range and
two values and swaps those values where they occur in the range. I don't
really have a name for it; let me know if you do.
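
A minimal sketch of what such a "swap" helper might look like, for
illustration only. The name swapValuesInRange and the demo function are
hypothetical; nothing here is an existing LLVM API. The idea is that the
caller computes each of the two values exactly once, so an offset cannot
be applied in one branch and forgotten in the other.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (name is an assumption, not part of LLVM):
// in a single sweep, replace every A in the range with B and every B
// with A. Elements equal to neither value are left untouched.
template <typename Range, typename T>
void swapValuesInRange(Range &R, const T &A, const T &B) {
  for (auto &Elt : R) {
    if (Elt == A)
      Elt = B;
    else if (Elt == B)
      Elt = A;
  }
}

// Illustrative use on a small mask: swap the values 6 and 5.
std::vector<int> swapDemo() {
  std::vector<int> HalfMask = {6, 6, 7, 5};
  swapValuesInRange(HalfMask, 6, 5);
  return HalfMask; // now {5, 5, 7, 6}
}
```

With a helper like this, the loop in the fix below could pass
SourceHalfMask[Input - SourceOffset] + SourceOffset and Input as the two
values, which keeps both of them in the same index space.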

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215094 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth 2014-08-07 10:14:27 +00:00
parent b3364512fc
commit 0651861b7b
2 changed files with 19 additions and 1 deletions

@@ -7449,7 +7449,7 @@ static SDValue lowerV8I16SingleInputVectorShuffle(
Input - SourceOffset;
// We have to swap the uses in our half mask in one sweep.
for (int &M : HalfMask)
-if (M == SourceHalfMask[Input - SourceOffset])
+if (M == SourceHalfMask[Input - SourceOffset] + SourceOffset)
M = Input;
else if (M == Input)
M = SourceHalfMask[Input - SourceOffset] + SourceOffset;

@@ -220,6 +220,24 @@ define <8 x i16> @shuffle_v8i16_26401375(<8 x i16> %a, <8 x i16> %b) {
ret <8 x i16> %shuffle
}
+define <8 x i16> @shuffle_v8i16_66751643(<8 x i16> %a, <8 x i16> %b) {
+; SSE2-LABEL: @shuffle_v8i16_66751643
+; SSE2: # BB#0:
+; SSE2-NEXT: pshuflw {{.*}} # xmm0 = xmm0[3,1,2,3,4,5,6,7]
+; SSE2-NEXT: pshufhw {{.*}} # xmm0 = xmm0[0,1,2,3,4,6,5,7]
+; SSE2-NEXT: pshufd {{.*}} # xmm0 = xmm0[2,3,2,0]
+; SSE2-NEXT: pshuflw {{.*}} # xmm0 = xmm0[1,1,3,2,4,5,6,7]
+; SSE2-NEXT: pshufhw {{.*}} # xmm0 = xmm0[0,1,2,3,7,5,4,6]
+; SSE2-NEXT: retq
+;
+; SSSE3-LABEL: @shuffle_v8i16_66751643
+; SSSE3: # BB#0:
+; SSSE3-NEXT: pshufb {{.*}} # xmm0 = xmm0[12,13,12,13,14,15,10,11,2,3,12,13,8,9,6,7]
+; SSSE3-NEXT: retq
+%shuffle = shufflevector <8 x i16> %a, <8 x i16> %b, <8 x i32> <i32 6, i32 6, i32 7, i32 5, i32 1, i32 6, i32 4, i32 3>
+ret <8 x i16> %shuffle
+}
define <8 x i16> @shuffle_v8i16_00444444(<8 x i16> %a, <8 x i16> %b) {
; SSE2-LABEL: @shuffle_v8i16_00444444
; SSE2: # BB#0: