Pull out repeated code for extraction index operand and source vector value type.
Use isNullConstant helper to check for zero extraction index.
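For reference, a minimal sketch of the cleanup this enables (hypothetical variable names; isNullConstant is the existing helper from SelectionDAGNodes.h):

  // Before: open-coded check for a zero extraction index.
  if (isa<ConstantSDNode>(Idx) && cast<ConstantSDNode>(Idx)->isNullValue())
    return ExtractedSubVector; // hypothetical fast path
  // After: the helper expresses the same test.
  if (isNullConstant(Idx))
    return ExtractedSubVector; // hypothetical fast path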
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295670 91177308-0d34-0410-b5e6-96231b3b80d8
It's more profitable to go through memory (1 cycle throughput)
than to use a VMOVD + VPERMV/PSHUFB sequence (2/3 cycles throughput) to implement EXTRACT_VECTOR_ELT with a variable index.
The IACA tool was used to get the performance estimates (https://software.intel.com/en-us/articles/intel-architecture-code-analyzer).
For example, for the var_shuffle_v16i8_v16i8_xxxxxxxxxxxxxxxx_i8 test from vector-shuffle-variable-128.ll I get 26 cycles vs 79 cycles.
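Roughly, the new lowering spills the vector to a stack slot and reloads just the selected element, instead of synthesizing a variable shuffle. A hedged sketch (hypothetical variable names, standard SelectionDAG calls):

  // Store the source vector to a fresh stack slot.
  SDValue StackPtr = DAG.CreateStackTemporary(VecVT);
  SDValue Ch = DAG.getStore(DAG.getEntryNode(), dl, Vec, StackPtr,
                            MachinePointerInfo());
  // Compute the element address: StackPtr + Idx * sizeof(element).
  EVT PtrVT = StackPtr.getValueType();
  SDValue Scale = DAG.getConstant(EltVT.getStoreSize(), dl, PtrVT);
  SDValue Off = DAG.getNode(ISD::MUL, dl, PtrVT,
                            DAG.getZExtOrTrunc(Idx, dl, PtrVT), Scale);
  SDValue EltPtr = DAG.getNode(ISD::ADD, dl, PtrVT, StackPtr, Off);
  // One scalar load replaces the VMOVD + VPERMV/PSHUFB sequence.
  SDValue Elt = DAG.getLoad(EltVT, dl, Ch, EltPtr, MachinePointerInfo());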
Also remove the VINSERT node; we don't need it any more.
Differential Revision: https://reviews.llvm.org/D29690
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295660 91177308-0d34-0410-b5e6-96231b3b80d8
Replaces existing approach that could only search BUILD_VECTOR nodes.
Requires getTargetConstantBitsFromNode to discriminate cases with all/partial UNDEF bits in each element - this should also be useful when we get around to supporting getTargetShuffleMaskIndices with UNDEF elements.
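The approximate shape of the interface (as in X86ISelLowering.cpp; the exact parameter list may differ):

  // Extracts the constant bits of each element of Op at EltSizeInBits
  // granularity, and reports which elements are (wholly or partially)
  // UNDEF so callers can reject the undef cases they can't handle.
  static bool getTargetConstantBitsFromNode(SDValue Op,
                                            unsigned EltSizeInBits,
                                            SmallBitVector &UndefElts,
                                            SmallVectorImpl<APInt> &EltBits);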
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295613 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on D27692, this permits another domain to be used to combine a shuffle at high depths.
We currently set the required depth at 4 or more combined shuffles; this is probably too high for most targets but is a good starting point, and it already helps avoid a number of costly variable shuffles.
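A minimal sketch of the depth gate (hypothetical names and threshold placement):

  // Past 4 combined shuffles, let the matchers use either domain: the
  // potential domain-crossing (bypass) delay is cheaper than the variable
  // shuffles we would otherwise keep.
  bool AllowFloatDomain = FloatDomain || (Depth >= 4);
  bool AllowIntDomain = !FloatDomain || (Depth >= 4);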
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295608 91177308-0d34-0410-b5e6-96231b3b80d8
Add the infrastructure to flag whether the float and/or int domains are permissible.
A future patch will enable domain crossing based off shuffle depth and the value types of the source vectors.
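A hypothetical sketch of the shape of that infrastructure (the real parameter lists differ):

  // The shuffle matchers are told which instruction domains they may
  // draw candidates from.
  static bool matchUnaryVectorShuffle(MVT MaskVT, ArrayRef<int> Mask,
                                      bool AllowFloatDomain,
                                      bool AllowIntDomain,
                                      const X86Subtarget &Subtarget,
                                      unsigned &Shuffle, MVT &ShuffleVT);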
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295604 91177308-0d34-0410-b5e6-96231b3b80d8
Minor performance speedup - if any call to getShuffleScalarElt fails to get a result, don't bother calling it for the remaining elements, as EltsFromConsecutiveLoads will fail anyhow.
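A sketch of the early-out (hypothetical names):

  // Give up on the whole collection as soon as one element fails;
  // EltsFromConsecutiveLoads needs every element to succeed anyway.
  SmallVector<SDValue, 16> Elts;
  for (unsigned i = 0; i != NumElems; ++i) {
    SDValue Elt = getShuffleScalarElt(N, i, DAG, 0);
    if (!Elt)
      return SDValue();
    Elts.push_back(Elt);
  }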
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295235 91177308-0d34-0410-b5e6-96231b3b80d8
Only do this for integer types currently - with float types (in particular insertps), load folding often fails.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295208 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for specifying an UNPCK input as ZERO; this particularly improves ZEXT cases with non-zero offsets.
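A hedged illustration: unpacking against an explicit zero vector is a zero-extension, e.g. zero-extending the high half of a v16i8 to v8i16 via PUNPCKHBW:

  // Materialize the zero input and unpack against it (sketch;
  // getZeroVector is the existing X86 helper).
  SDValue Zero = getZeroVector(MVT::v16i8, Subtarget, DAG, DL);
  SDValue ZExtHi = DAG.getNode(X86ISD::UNPCKH, DL, MVT::v16i8, V1, Zero);
  // Reinterpret the interleaved bytes as eight zero-extended i16 lanes.
  ZExtHi = DAG.getBitcast(MVT::v8i16, ZExtHi);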
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295169 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We don't seem to have great rules on what a valid VBROADCAST node looks like. And as a consequence we end up with a lot of patterns to try to catch everything. We have patterns with scalar inputs, 128-bit vector inputs, 256-bit vector inputs, and 512-bit vector inputs.
As you can see from the things improved here we are currently missing patterns for 128-bit loads being extended to 256-bit before the vbroadcast.
I'd like to propose that VBROADCAST should always take a 128-bit vector type as input. As a first step towards that this patch adds an EXTRACT_SUBVECTOR in front of VBROADCAST when the input is 256 or 512-bits. In the future I would like to add scalar_to_vector around all the scalar operations. And maybe we should consider adding a VBROADCAST+load node to avoid separating loads from the broadcasting operation when the load itself isn't foldable.
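A hedged sketch of the canonicalization (hypothetical variable names):

  // If the broadcast source is wider than 128 bits, extract its low
  // 128-bit subvector first, so VBROADCAST only ever sees a 128-bit input.
  MVT SrcVT = Src.getSimpleValueType();
  if (SrcVT.getSizeInBits() > 128) {
    MVT ExtVT = MVT::getVectorVT(SrcVT.getVectorElementType(),
                                 128 / SrcVT.getScalarSizeInBits());
    Src = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, ExtVT, Src,
                      DAG.getIntPtrConstant(0, DL));
  }
  SDValue Bcst = DAG.getNode(X86ISD::VBROADCAST, DL, VT, Src);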
This requires an additional change in target shuffle combining to look for the extract subvector and look through it to find the original operand. I'm sure this change isn't perfect, but it was enough to fix a few test failures that were otherwise being caused.
Another interesting thing I noticed is that the changes in masked_gather_scatter.ll show cases where we don't remove a useless insert into element 1 before broadcasting element 0.
Reviewers: delena, RKSimon, zvi
Reviewed By: zvi
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D28747
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295155 91177308-0d34-0410-b5e6-96231b3b80d8
Don't bother setting the V1/V2 operands again for unary shuffles.
Don't bother legalizing the value type unless the match succeeds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295051 91177308-0d34-0410-b5e6-96231b3b80d8
We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as vshuf operations for 512-bit vectors.
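A hedged sketch of the conversion for a 512-bit case (hypothetical values: inserting V2's second 128-bit lane into V1's third lane of a v8i64):

  // (insert_subvector V1, (extract_subvector V2, 2), 4) becomes a single
  // shuffle; mask indices 8-15 select from V2.
  int Mask[] = {0, 1, 2, 3, 10, 11, 6, 7};
  SDValue Shuf = DAG.getVectorShuffle(MVT::v8i64, DL, V1, V2, Mask);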
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294931 91177308-0d34-0410-b5e6-96231b3b80d8
This results in the simplifications inside getNode running while we're legalizing nodes popped off the worklist during the final DAG combine. This effectively performs a DAG-combine-like operation during the legalize step, but we don't handle everything quite the same way - I think we don't recursively add the removed nodes to the DAG combiner worklist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294929 91177308-0d34-0410-b5e6-96231b3b80d8
The target shuffle match function arguments were using the term 'Ops' but the function names referred to them as 'Inputs' - use 'Inputs' consistently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294900 91177308-0d34-0410-b5e6-96231b3b80d8
Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294896 91177308-0d34-0410-b5e6-96231b3b80d8
Generalize VSEXT/VZEXT constant folding to work with any target constant bits source, not just BUILD_VECTOR.
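A hedged sketch of the generalized fold (hypothetical variable names):

  // Any node whose constant bits we can extract now feeds the fold,
  // not just a literal BUILD_VECTOR.
  SmallBitVector UndefElts;
  SmallVector<APInt, 64> EltBits;
  if (getTargetConstantBitsFromNode(Src, SrcEltSizeInBits, UndefElts,
                                    EltBits)) {
    SmallVector<APInt, 64> ExtBits;
    for (const APInt &Elt : EltBits)
      ExtBits.push_back(IsSigned ? Elt.sext(DstEltSizeInBits)
                                 : Elt.zext(DstEltSizeInBits));
    // ... build the folded constant vector from ExtBits.
  }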
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294873 91177308-0d34-0410-b5e6-96231b3b80d8
Removes duplicate constant extraction code in getTargetShuffleMaskIndices.
getTargetConstantBitsFromNode - adds support for VZEXT_MOVL(SCALAR_TO_VECTOR) and fail if the caller doesn't support undef bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294856 91177308-0d34-0410-b5e6-96231b3b80d8
Since r274013, we've been looking through bitcasts on broadcast inputs.
In the scalar-folding case (from a load, build_vector, or sc2vec),
the input type didn't matter, as we'd simply bitcast the resulting
scalar back.
However, when broadcasting a 128-bit-lane-aligned element, we create an
EXTRACT_SUBVECTOR. Use proper types by creating an extract_subvector
of the original input type.
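A hedged sketch of the fix (hypothetical variable names):

  // Extract the 128-bit lane in the original input type, then bitcast,
  // instead of extracting in the already-bitcasted type.
  MVT SrcVT = V.getSimpleValueType();
  unsigned LaneElts = 128 / SrcVT.getScalarSizeInBits();
  MVT ExtVT = MVT::getVectorVT(SrcVT.getVectorElementType(), LaneElts);
  // ExtractIdx is the lane-aligned index of the broadcast element.
  V = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, ExtVT, V,
                  DAG.getIntPtrConstant(ExtractIdx, DL));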
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294774 91177308-0d34-0410-b5e6-96231b3b80d8
In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet.
Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294746 91177308-0d34-0410-b5e6-96231b3b80d8
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register.
This patch attempts to break the register dependency either by always zeroing the vector beforehand or (if we're inserting to the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.
On pre-SSE41 targets, LowerBuildVectorv16i8 goes a little further and uses VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, avoiding the vector zeroing at the cost of a scalar zero extension; this can probably be brought over to some of the other cases (load folding etc.) in a future patch.
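For the 0'th-element case, a hedged sketch of the node sequence (hypothetical variable names):

  // (V)MOVD both inserts element 0 and zeroes the remaining lanes, so it
  // carries no dependency on the destination register's previous value.
  SDValue Ext = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i32, Elt);
  SDValue V = DAG.getNode(ISD::SCALAR_TO_VECTOR, DL, MVT::v4i32, Ext);
  V = DAG.getNode(X86ISD::VZEXT_MOVL, DL, MVT::v4i32, V);
  V = DAG.getBitcast(MVT::v16i8, V);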
Differential Revision: https://reviews.llvm.org/D29720
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294581 91177308-0d34-0410-b5e6-96231b3b80d8
This patch does the following:
1. Adds an intrinsic int_x86_clzero which works with __builtin_ia32_clzero (see the usage sketch below).
2. Identifies the clzero feature using CPUID info (function 8000_0008, checks if EBX[0] = 1).
3. Adds the clzero feature under the znver1 architecture.
4. Adds the custom inserter in Lowering.
5. Adds a testcase to check the intrinsic.
6. Adds the clzero instruction to the assembler test.
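A hedged usage sketch of the builtin (assuming a target with the clzero feature, e.g. -march=znver1):

  void clear_cache_line(void *Ptr) {
    // Zeroes the entire cache line containing Ptr.
    __builtin_ia32_clzero(Ptr);
  }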
Patch by Ganesh Gopalasubramanian, with a couple of formatting tweaks, a disassembler test, and the use of update_llc_test.py from me.
Differential revision: https://reviews.llvm.org/D29385
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294558 91177308-0d34-0410-b5e6-96231b3b80d8