4562 Commits

Author SHA1 Message Date
Igor Breger
253f60a6d8 [AVX512] Fix EXTRACT_VECTOR_ELT for v2i1/v4i1/v32i1/v64i1 with variable index.
Differential Revision: https://reviews.llvm.org/D30189



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295718 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 14:01:25 +00:00
Craig Topper
97823bb7e5 [X86] Fix formatting. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295695 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 06:27:13 +00:00
Sanjoy Das
81f0f4690c Add a wrapper around copy_if in STLExtras; NFC
I will add one more use for this in a later change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295685 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 00:38:44 +00:00
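A range-based wrapper of the kind this commit describes might look roughly like the sketch below. The name `copyIfRange` and the `keepEven` helper are hypothetical and chosen for illustration; this is not the STLExtras code itself.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical range wrapper: forwards begin/end of the range to std::copy_if
// so callers do not have to spell out the iterator pair.
template <typename R, typename OutputIt, typename UnaryPredicate>
OutputIt copyIfRange(R &&Range, OutputIt Out, UnaryPredicate P) {
  return std::copy_if(std::begin(Range), std::end(Range), Out, P);
}

// Illustrative caller: keep only the even values of the input.
inline std::vector<int> keepEven(const std::vector<int> &In) {
  std::vector<int> Out;
  copyIfRange(In, std::back_inserter(Out), [](int V) { return V % 2 == 0; });
  // Sanity check: filtering can never produce more elements than the input.
  assert(Out.size() <= In.size());
  return Out;
}
```

The only point of the wrapper is to shorten call sites that would otherwise pass `begin()` and `end()` explicitly.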
Simon Pilgrim
c8319a4345 [X86] Tidyup combineExtractVectorElt. NFCI.
Pull out repeated code for extraction index operand and source vector value type.

Use isNullConstant helper to check for zero extraction index.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295670 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-20 16:09:45 +00:00
Igor Breger
05a06cba9e [X86] Fix EXTRACT_VECTOR_ELT with variable index from v32i16 and v64i8 vector.
It's more profitable to go through memory (1 cycle throughput)
than to use a VMOVD + VPERMV/PSHUFB sequence (2/3 cycles throughput) to implement EXTRACT_VECTOR_ELT with variable index.
The IACA tool was used to get the performance estimate (https://software.intel.com/en-us/articles/intel-architecture-code-analyzer).
For example, for the var_shuffle_v16i8_v16i8_xxxxxxxxxxxxxxxx_i8 test from vector-shuffle-variable-128.ll I get 26 cycles vs 79 cycles.
Removing the VINSERT node, we don't need it any more.

Differential Revision: https://reviews.llvm.org/D29690



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295660 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-20 14:16:29 +00:00
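The "go through memory" strategy from the commit above can be pictured with a scalar model: store the whole vector to a stack slot, then do a single indexed load. The sketch below is only that model. The function name is hypothetical, and treating an out-of-range index as an assertion failure is an assumption made for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar model of a variable-index extract done via memory.
template <typename T, std::size_t N>
T extractViaMemory(const std::array<T, N> &Vec, std::size_t Idx) {
  // Assumption for this sketch: the index is in range.
  assert(Idx < N);
  T Slot[N];
  std::memcpy(Slot, Vec.data(), sizeof(Slot)); // models the vector store
  return Slot[Idx];                            // models the scalar load
}
```

For a v64i8 source this corresponds to one 64-byte store followed by one byte load, with no shuffle needed to move the selected element into place.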
Simon Pilgrim
03eb1209fc [X86][AVX512] Add support for ASHR v2i64/v4i64 without VLX
Use v8i64 ASHR instructions if we don't have VLX.

Differential Revision: https://reviews.llvm.org/D28537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295656 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-20 12:16:38 +00:00
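The widening idea in the commit above (do the narrow shift with the 8-lane instruction and keep only the low lanes) can be modelled on plain arrays. This is a scalar sketch with hypothetical function names, not the lowering code. The upper lanes are zero-filled here only because the model needs some value there.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Arithmetic shift right that avoids relying on how >> treats negative values.
inline int64_t ashr64(int64_t V, unsigned Amt) {
  assert(Amt < 64);
  return V < 0 ? ~(~V >> Amt) : (V >> Amt);
}

// Model: place the v2i64 input in the low lanes of a v8i64, shift all eight
// lanes, then extract the low two lanes again.
inline std::array<int64_t, 2> ashrV2ViaV8(const std::array<int64_t, 2> &In,
                                          unsigned Amt) {
  std::array<int64_t, 8> Wide{}; // upper six lanes are don't-care
  Wide[0] = In[0];
  Wide[1] = In[1];
  for (std::size_t I = 0; I < Wide.size(); ++I)
    Wide[I] = ashr64(Wide[I], Amt);
  std::array<int64_t, 2> Out = {Wide[0], Wide[1]};
  return Out;
}
```

Because each lane is shifted independently, whatever sits in the upper lanes cannot affect the extracted result.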
Simon Pilgrim
f042c820ef [X86] Use peekThroughOneUseBitcasts helper. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295618 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-19 21:40:51 +00:00
Simon Pilgrim
f8d4b524dd [X86][SSE] Use getTargetConstantBitsFromNode to find zeroable shuffle elements.
Replaces existing approach that could only search BUILD_VECTOR nodes.

Requires getTargetConstantBitsFromNode to discriminate cases with all/partial UNDEF bits in each element - this should also be useful when we get around to supporting getTargetShuffleMaskIndices with UNDEF elements. 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295613 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-19 19:40:31 +00:00
Simon Pilgrim
0c9c0d47aa [X86][SSE] Enable initial support for domain crossing at high shuffle combine depths.
As discussed on D27692, this permits another domain to be used to combine a shuffle at high depths.

We currently set the required depth at 4 or more combined shuffles; this is probably too high for most targets but is a good starting point and already helps avoid a number of costly variable shuffles.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295608 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-19 17:19:38 +00:00
Simon Pilgrim
7754c0ada2 [X86][SSE] Generalize INSERTPS/SHUFPS/SHUFPD combines across domains.
Relax the INSERTPS/SHUFPS/SHUFPD combines to support integer inputs if permitted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295606 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-19 15:15:40 +00:00
Simon Pilgrim
395e4206ab [X86][SSE] Add domain crossing support for target shuffle combines.
Add the infrastructure to flag whether float and/or int domains are permitted.

A future patch will enable domain crossing based off shuffle depth and the value types of the source vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295604 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-19 14:12:25 +00:00
Simon Pilgrim
a21a4863ea Fix signed/unsigned comparison warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295580 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 22:56:17 +00:00
Simon Pilgrim
5ff0a24f6e [X86] Fix enumeral/non-enumeral comparison warning.
gcc only allows you to mix enums / ints if they have the same signedness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295576 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 22:40:58 +00:00
Simon Pilgrim
f9e2c1f957 [X86][SSE] Avoid repeated calls to SDValue::getValueType.
Added assertion to check input type of X86ISD::VZEXT during target known bits calculation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295575 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 22:25:27 +00:00
Sanjay Patel
4c507d5052 [x86] fold sext (xor Bool, -1) --> sub (zext Bool), 1
This is the same transform that is currently used for:
select Bool, 0, -1



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295568 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 21:03:28 +00:00
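The equivalence behind the fold above can be checked exhaustively, since a Bool has only two values. The sketch below models i1 as `bool` and the result as a 32-bit integer; the function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// sext (xor Bool, -1): flip the bit, then sign-extend i1 to i32.
inline int32_t sextOfNot(bool B) {
  bool NotB = !B;
  return NotB ? -1 : 0;
}

// sub (zext Bool), 1: zero-extend i1 to i32, then subtract one.
inline int32_t subOfZext(bool B) { return static_cast<int32_t>(B) - 1; }

// select Bool, 0, -1: the form the commit message says already uses this fold.
inline int32_t selectForm(bool B) { return B ? 0 : -1; }

// Exhaustive check over both Bool values.
inline void checkFold() {
  for (bool B : {false, true}) {
    assert(sextOfNot(B) == subOfZext(B));
    assert(sextOfNot(B) == selectForm(B));
  }
}
```

For `B = true` every form gives 0, and for `B = false` every form gives -1.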
Simon Pilgrim
4231bd0a9d [X86] Simplify by pulling out valuetype. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295502 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-17 22:10:10 +00:00
Simon Pilgrim
d69a69b212 [X86] Remove local areOnlyUsersOf helper and use SDNode::areOnlyUsersOf instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295326 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-16 15:11:49 +00:00
Simon Pilgrim
49096c11cb [X86][SSE] Don't call EltsFromConsecutiveLoads if any element is missing.
Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't bother calling for the remaining elements as EltsFromConsecutiveLoads will fail anyhow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295235 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 21:09:00 +00:00
Simon Pilgrim
be2cd40ad4 [X86][SSE] Propagate undef upper elements from scalar_to_vector during shuffle combining
Only do this for integer types currently - float types (in particular insertps) load folding often fails with this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295208 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 17:41:33 +00:00
Simon Pilgrim
a0d03b22c4 [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise ZERO inputs
Add support for specifying an UNPCK input as ZERO, particularly improves ZEXT cases with non-zero offsets


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295169 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 11:46:15 +00:00
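Why a ZERO input to UNPCK matters for ZEXT can be seen in a byte-level model of the unpack instructions: interleaving a vector with zeros produces zero-extended 16-bit lanes, and the high-half variant does the same for elements starting at a non-zero offset. The sketch below is a scalar model with hypothetical names, assuming little-endian lane layout as on x86.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<uint8_t, 16>;

// Scalar model of PUNPCKLBW (High == false) and PUNPCKHBW (High == true):
// interleave eight bytes of A with eight bytes of B.
inline V16i8 unpackBytes(const V16i8 &A, const V16i8 &B, bool High) {
  V16i8 R{};
  std::size_t Base = High ? 8 : 0;
  for (std::size_t I = 0; I < 8; ++I) {
    R[2 * I] = A[Base + I];
    R[2 * I + 1] = B[Base + I];
  }
  return R;
}

// Reinterpret sixteen bytes as eight little-endian 16-bit lanes.
inline std::array<uint16_t, 8> asV8i16(const V16i8 &V) {
  std::array<uint16_t, 8> R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = static_cast<uint16_t>(V[2 * I] | (V[2 * I + 1] << 8));
  return R;
}
```

With `B` all zeros, lane `I` of the result equals byte `I` of `A` (or byte `8 + I` for the high variant), which is a zero extension from 8 to 16 bits.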
Craig Topper
53bbf700f8 [X86] Don't create VBROADCAST nodes with 256-bit or 512-bit input types
Summary:
We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.

As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.

I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.

This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect but was enough to fix a few test failures that were being caused.

Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0.

Reviewers: delena, RKSimon, zvi

Reviewed By: zvi

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D28747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295155 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 06:58:47 +00:00
Diego Novillo
270ca404ab Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295065 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 16:39:54 +00:00
Simon Pilgrim
2fce16a04e [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputs
Add support for specifying an UNPCK input as UNDEF


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295061 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 16:22:04 +00:00
Simon Pilgrim
ae8ad841a5 [X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295053 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 13:47:17 +00:00
Simon Pilgrim
c2389bbf5b [X86][SSE] Tidyup matchVectorShuffleWithUNPCK helper function call.
Don't bother setting the V1/V2 operands again for unary shuffles.

Don't bother legalizing the value type unless the match succeeds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295051 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 12:54:39 +00:00
Simon Pilgrim
5e5855bdab Fix indentation. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294959 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-13 15:31:08 +00:00
Simon Pilgrim
d9480271fe [X86][SSE] Create matchVectorShuffleWithUNPCK helper function.
Currently only used by target shuffle combining - will use it for lowering as well in a future patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-13 11:52:58 +00:00
Craig Topper
d46db47633 [X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors.
We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operation for 512-bit vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294931 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-13 04:53:29 +00:00
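The conversion described in the commit above amounts to building a two-input shuffle mask. The sketch below shows one way such a mask could be formed, using the usual convention that indices in [0, N) pick from the first input and indices in [N, 2N) pick from the second. The function name and parameters are hypothetical, not the code from the patch.

```cpp
#include <cassert>
#include <vector>

// Mask for insert_subvector(Dst, extract_subvector(Src, ExtIdx), InsIdx),
// expressed as a shuffle of (Dst, Src) with NumElts elements each.
inline std::vector<int> insertOfExtractMask(int NumElts, int SubElts,
                                            int ExtIdx, int InsIdx) {
  assert(SubElts > 0 && ExtIdx >= 0 && InsIdx >= 0);
  assert(ExtIdx + SubElts <= NumElts && InsIdx + SubElts <= NumElts);
  std::vector<int> Mask(NumElts);
  for (int I = 0; I < NumElts; ++I)
    Mask[I] = I; // default: keep the destination element
  for (int I = 0; I < SubElts; ++I)
    Mask[InsIdx + I] = NumElts + ExtIdx + I; // take from the source subvector
  return Mask;
}
```

For a 256-bit vector with 128-bit subvectors, non-zero extract and insert indices both have to name the upper half, so every lane keeps its position and the mask is a blend. With 512-bit vectors the two indices can differ, which would need a lane-moving shuffle instead.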
Craig Topper
1e59ad7d60 [X86] Don't let LowerEXTRACT_SUBVECTOR call getNode for EXTRACT_SUBVECTOR.
This results in the simplifications inside of getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This basically makes a DAG combine like operation occur during this legalize step, but we don't handle something quite the same way. I think we don't recursively add the removed nodes to the DAG combiner worklist.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294929 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-12 23:49:46 +00:00
Simon Pilgrim
7096a2fa24 [X86] Fix typo in function name. NFCI.
convertBitVectorToUnsiged - convertBitVectorToUnsigned

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294914 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-12 20:53:44 +00:00
Simon Pilgrim
a654726bd9 [X86][SSE] Update argument names to match function name. NFCI.
The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294900 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-12 16:46:41 +00:00
Simon Pilgrim
211b30744a [X86][AVX2] Add support for combining target shuffles to VPMOVZX
Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294896 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-12 14:31:23 +00:00
Craig Topper
fe892ae261 [X86] Move code for using blendi for insert_subvector out to an isel pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294876 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 22:57:12 +00:00
Simon Pilgrim
04a335bf92 [X86][SSE] Use VSEXT/VZEXT constant folding for SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG
Preparatory step for PR31712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294874 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 22:47:06 +00:00
Simon Pilgrim
796f9e5e57 [X86][SSE] Improve VSEXT/VZEXT constant folding.
Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 21:55:24 +00:00
Simon Pilgrim
276a1497f1 [X86][SSE] Add early-out when trying to match blend shuffle. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294864 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 18:06:24 +00:00
Amaury Sechet
3f9b6c1139 Fix indentation in X86ISelLowering. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294859 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 17:48:48 +00:00
Simon Pilgrim
dcd6d02210 [X86][SSE] Convert getTargetShuffleMaskIndices to use getTargetConstantBitsFromNode.
Removes duplicate constant extraction code in getTargetShuffleMaskIndices.

getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fails if the caller doesn't support undef bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294856 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 17:27:21 +00:00
Simon Pilgrim
68155852cf [X86] Merge repeated getScalarValueSizeInBits calls. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 16:42:07 +00:00
Ahmed Bougacha
d0491a6b56 [X86] Bitcast subvector before broadcasting it.
Since r274013, we've been looking through bitcasts on broadcast inputs.
In the scalar-folding case (from a load, build_vector, or sc2vec),
the input type didn't matter, as we'd simply bitcast the resulting
scalar back.

However, when broadcasting a 128-bit-lane-aligned element, we create an
EXTRACT_SUBVECTOR.  Use proper types, by creating an extract_subvector
of the original input type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-10 19:51:47 +00:00
Simon Pilgrim
3545163cb2 [X86][SSE] Use SDValue::getConstantOperandVal helper. NFCI.
Also reordered an if statement to test low cost comparisons first

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294748 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-10 14:27:59 +00:00
Simon Pilgrim
c86c991488 [X86][SSE] Add support for extracting target constants from BUILD_VECTOR
In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet

Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294746 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-10 14:04:11 +00:00
Simon Pilgrim
8336fc6508 [X86][SSE] Add missing comment describing combining to SHUFPS. NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294745 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-10 13:16:01 +00:00
Simon Pilgrim
161421e702 [X86] Remove duplicate call to getValueType. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294640 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 22:35:59 +00:00
Simon Pilgrim
06f2b29f82 Convert to for-range loop. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294610 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 18:52:24 +00:00
Simon Pilgrim
0a41afd896 [X86][MMX] Remove the (long time) unused MMX_PINSRW ISD opcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294596 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 17:08:47 +00:00
Pierre Gousseau
e6fa7df85c [X86][btver2] PR31902: Fix a crash in combineOrCmpEqZeroToCtlzSrl under fast math.
In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants.

Differential Revision: https://reviews.llvm.org/D29756


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294588 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 14:43:58 +00:00
Simon Pilgrim
85a8f6df5a [X86][SSE] Attempt to break register dependencies during lowerBuildVector
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into a UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register.

This patch attempts to break the register dependency by either always zeroing the vector beforehand or (if we're inserting to the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))) which lowers to (V)MOVD and performs a similar function. Additionally (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.

On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros to avoid the vector zeroing at the cost of a scalar zero extension, which can probably be brought over to the other cases in a future patch in some cases (load folding etc.)

Differential Revision: https://reviews.llvm.org/D29720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294581 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 11:50:19 +00:00
Craig Topper
4b9bffa31e [X86] Clzero intrinsic and its addition under znver1
This patch does the following.

1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero
2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1)
3. Adds the clzero feature under znver1 architecture.
4. The custom inserter is added in Lowering.
5. A testcase is added to check the intrinsic.
6. The clzero instruction is added to assembler test.

Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me.

Differential revision: https://reviews.llvm.org/D29385

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294558 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 04:27:34 +00:00
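Step 2 of the commit above names the CPUID bit that signals clzero: function 8000_0008, EBX bit 0. A minimal sketch of that bit test follows. The function name is hypothetical, and the EBX value is passed in as a parameter instead of being read with a CPUID instruction so that the example stays portable.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if bit 0 of the EBX value reported by CPUID function
// 8000_0008 is set. Reading CPUID itself is left to the caller.
inline bool hasClzero(uint32_t EbxOfLeaf80000008) {
  return (EbxOfLeaf80000008 & 1u) != 0;
}
```

A caller would first confirm that extended leaf 8000_0008 is available before trusting the value it passes in.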
Simon Pilgrim
965fad0725 [X86][SSE] Tidyup LowerBuildVectorv16i8 and LowerBuildVectorv8i16. NFCI.
Ran clang-format and standardized variable names between functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294456 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 14:44:45 +00:00