[X86] When legalizing (v64i1 select i8, v64i1, v64i1) make sure not to introduce bitcasts to i64 in 32-bit mode

We legalize selects of masks with scalar conditions using a bitcast to an integer type. But if we are in 32-bit mode we can't convert v64i1 to i64. So instead split the v64i1 to v32i1 and concat it back together. Each half will then be legalized by bitcasting to i32 which is fine.

The test case is a little indirect. If we have the v64i1 select in IR it will get legalized by legalize vector ops which has a run of type legalization after it. That type legalization run is able to fix this i64 bitcast. So in order to avoid that we need a build_vector of a splat which legalize vector ops will ignore. Legalize DAG will then turn that into a select via LowerBUILD_VECTORvXi1. And the select will get legalized. In this case there is no type legalizer run to cleanup the bitcast.

This fixes pr35972.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322724 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Craig Topper 2018-01-17 18:46:01 +00:00
parent e14d811480
commit 42c5a95af4
2 changed files with 32 additions and 0 deletions

View File

@ -18256,6 +18256,18 @@ SDValue X86TargetLowering::LowerSELECT(SDValue Op, SelectionDAG &DAG) const {
return DAG.getNode(X86ISD::SELECTS, DL, VT, Cmp, Op1, Op2);
}
// For v64i1 without 64-bit support we need to split and rejoin.
if (VT == MVT::v64i1 && !Subtarget.is64Bit()) {
assert(Subtarget.hasBWI() && "Expected BWI to be legal");
SDValue Op1Lo = extractSubVector(Op1, 0, DAG, DL, 32);
SDValue Op2Lo = extractSubVector(Op2, 0, DAG, DL, 32);
SDValue Op1Hi = extractSubVector(Op1, 32, DAG, DL, 32);
SDValue Op2Hi = extractSubVector(Op2, 32, DAG, DL, 32);
SDValue Lo = DAG.getSelect(DL, MVT::v32i1, Cond, Op1Lo, Op2Lo);
SDValue Hi = DAG.getSelect(DL, MVT::v32i1, Cond, Op1Hi, Op2Hi);
return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, Lo, Hi);
}
if (VT.isVector() && VT.getVectorElementType() == MVT::i1) {
SDValue Op1Scalar;
if (ISD::isBuildVectorOfConstantSDNodes(Op1.getNode()))

View File

@ -0,0 +1,20 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc -mtriple=i686-unknown-linux-gnu %s -o - -mattr=avx512bw | FileCheck %s
define void @test3(i32 %c, <64 x i1>* %ptr) {
; CHECK-LABEL: test3:
; CHECK: # %bb.0:
; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax
; CHECK-NEXT: cmpl $1, {{[0-9]+}}(%esp)
; CHECK-NEXT: sbbl %ecx, %ecx
; CHECK-NEXT: kmovd %ecx, %k0
; CHECK-NEXT: kunpckdq %k0, %k0, %k0
; CHECK-NEXT: kmovq %k0, (%eax)
; CHECK-NEXT: retl
%cmp = icmp eq i32 %c, 0
%insert = insertelement <64 x i1> undef, i1 %cmp, i32 0
%shuf = shufflevector <64 x i1> %insert, <64 x i1> undef, <64 x i32> zeroinitializer
store <64 x i1> %shuf, <64 x i1>* %ptr
ret void
}