archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Farhana Aleen	13f7859c20	[SLP] Recognize min/max pattern using instructions producing same values. Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization. %1 = extractelement <2 x i32> %a, i32 0 %2 = extractelement <2 x i32> %a, i32 1 %cond = icmp sgt i32 %1, %2 %3 = extractelement <2 x i32> %a, i32 0 %4 = extractelement <2 x i32> %a, i32 1 %select = select i1 %cond, i32 %3, i32 %4 Author: FarhanaAleen Reviewed By: ABataev, RKSimon, spatel Differential Revision: https://reviews.llvm.org/D47608 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 17:55:31 +00:00
Simon Pilgrim	198bcb65d9	[SLPVectorizer][X86] Begin adding alternate tests for call operators Alternate opcode handling only supports binary operators, these tests demonstrate a missed opportunity to vectorize ceil/floor calls git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336125 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 17:23:45 +00:00
Simon Pilgrim	386f15c93a	[SLPVectorizer] Fix alternate opcode + shuffle cost function to correctly handle SK_Select patterns. We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case. This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336095 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 11:28:01 +00:00
Simon Pilgrim	243c2fa15c	[SLPVectorizer][X86] Add some alternate tests for cast operators Alternate opcode handling only supports binary operators, these tests demonstrate missed opportunities to vectorize some sitofp/uitofp and fptosi/fptoui style casts as well as some (successful) float bits manipulations git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336060 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-01 11:29:46 +00:00
Simon Pilgrim	40be0055ae	[SLPVectorizer] Recognise non uniform power of 2 constants Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them. As SLP works with arrays of values I don't think we can easily use the pattern match helpers here. Differential Revision: https://reviews.llvm.org/D48214 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335621 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-26 16:20:16 +00:00
Simon Pilgrim	5a40cf8639	[SLPVectorizer] Support alternate opcodes in tryToVectorizeList Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree. NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc. Differential Revision: https://reviews.llvm.org/D48488 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335364 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-22 16:37:34 +00:00
Simon Pilgrim	ae9a1a8ee7	[SLPVectorizer] Relax alternate opcodes to accept any BinaryOperator pair SLP currently only accepts (F)Add/(F)Sub alternate counterpart ops to be merged into an alternate shuffle. This patch relaxes this to accept any pair of BinaryOperator opcodes instead, assuming the target's cost model accepts the vectorization+shuffle. Differential Revision: https://reviews.llvm.org/D48477 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335349 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-22 14:04:06 +00:00
Simon Pilgrim	1c5cdb19a6	[SLPVectorizer][X86] Add alternate opcode tests for simple build vector cases git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335348 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-22 13:53:58 +00:00
Simon Pilgrim	444b60212b	[CostModel][AArch64] Add some initial costs for SK_Select and SK_PermuteSingleSrc AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335329 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-22 09:45:31 +00:00
Simon Pilgrim	400b266d8f	[X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882) These were being over cautious for costs for one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335216 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 11:37:13 +00:00
Simon Pilgrim	c660be2252	[SLPVectorizer][X86] Add horizontal add/sub tests Shows PR37882 perf regression git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335215 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 11:16:10 +00:00
Simon Pilgrim	2970a29566	[SLPVectorizer] Relax "alternate" opcode vectorisation to work with any SK_Select shuffle pattern D47985 saw the old SK_Alternate 'alternating' shuffle mask replaced with the SK_Select mask which accepts either input operand for each lane, equivalent to a vector select with a constant condition operand. This patch updates SLPVectorizer to make full use of this SK_Select shuffle pattern by removing the 'isOdd()' limitation. The AArch64 regression will be fixed by D48172. Differential Revision: https://reviews.llvm.org/D48174 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335130 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-20 14:26:28 +00:00
Simon Pilgrim	afe3129d8f	[SLP][X86] Add AVX2 run to POW2 SDIV Tests Non-uniform pow2 tests only make sense on targets with fast (low cost) non-uniform shifts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334821 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-15 10:29:37 +00:00
Simon Pilgrim	b753b18785	[SLP][X86] Regenerate POW2 SDIV Tests Added non-uniform pow2 test as well git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334819 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-15 10:07:03 +00:00
Farhana Aleen	4128fd181f	[SLP] Add testcases of min/max reduction pattern for AMDGPU. Author: FarhanaAleen git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334435 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-11 20:29:31 +00:00
Matt Arsenault	4525054673	AMDGPU: Make v2i16/v2f16 legal on VI This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9. Partially cleans up the mess used for lowering of the d16 operations. Making v4f16 legal will help clean this up more, but this requires additional work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332953 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-22 06:32:10 +00:00
Farhana Aleen	030b9437a7	[AMDGPU] Support horizontal vectorization of min/max. Author: FarhanaAleen Reviewed By: rampitec Subscribers: AMDGPU Differential Revision: https://reviews.llvm.org/D46604 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331920 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-09 21:18:34 +00:00
Shiva Chen	a8a13bc662	[DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label. In order to set breakpoints on labels and list source code around labels, we need to collect debug information for labels, i.e., the label name, the function the label belongs to, the line number in the file, and the address where the label is located. To keep this information in LLVM IR and to allow the backend to generate debug information correctly, we create a new kind of metadata for labels, DILabel. The format of DILabel is !DILabel(scope: !1, name: "foo", file: !2, line: 3) We hope to keep as much debug information as possible even when the code is optimized. So, we create a new kind of intrinsic for label metadata to prevent the metadata from being eliminated along with its basic block. The intrinsic will remain as long as we keep it from being optimized out. The format of the intrinsic is llvm.dbg.label(metadata !1) It has only one argument, which is the DILabel metadata. The intrinsic will follow the label immediately. The backend can get the label metadata through the intrinsic's parameter. We also create a DIBuilder API for labels to be used by the frontend. The frontend can use createLabel() to allocate DILabel objects, and use insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR. Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331841 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-09 02:40:45 +00:00
Farhana Aleen	20a92cda49	[AMDGPU] Support horizontal vectorization. Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46213 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331313 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-01 21:41:12 +00:00
Matthew Simpson	9acd5ab38b	[SLP] Add additional test for transposable binary operations with reuse git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331274 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-01 15:59:26 +00:00
Davide Italiano	44735eb19d	[SLPVectorizer] Debug info shouldn't impact spill cost computation. <rdar://problem/39794738> (Also, PR32761). Differential Revision: https://reviews.llvm.org/D46199 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331199 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-30 16:57:33 +00:00
Benjamin Kramer	937d9c9219	[NVPTX] Turn on Loop/SLP vectorization Since PTX has grown a <2 x half> datatype vectorization has become more important. The late LoadStoreVectorizer intentionally only does loads and stores, but now arithmetic has to be vectorized for optimal throughput too. This is still very limited, SLP vectorization happily creates <2 x half> if it's a legal type but there's still a lot of register moving happening to get that fed into a vectorized store. Overall it's a small performance win by reducing the amount of arithmetic instructions. I haven't really checked what the loop vectorizer does to PTX code, the cost model there might need some more tweaks. I didn't see it causing harm though. Differential Revision: https://reviews.llvm.org/D46130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331035 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-27 13:36:05 +00:00
Matthew Simpson	4965d63ae5	[SLP] Add tests for transposable binary operations These test cases are vectorizable, but we are currently unable to vectorize them effectively. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330945 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-26 14:50:04 +00:00
Craig Topper	891d17ec5e	[X86] Remove unnecessary -mattr to enable avx512bw when the -mcpu already enabled it. NFC This makes the test similar to the arith-sub.ll and arith-mul.ll tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330144 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-16 18:14:19 +00:00
Haicheng Wu	39a435f255	[SLP] Use getExtractWithExtendCost() to compute the scalar cost of extractelement/ext pair We use getExtractWithExtendCost to calculate the cost of extractelement and s|zext together when computing the extract cost after vectorization, but we calculate the cost of extractelement and s|zext separately when computing the scalar cost which is larger than it should be. Differential Revision: https://reviews.llvm.org/D45469 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330143 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-16 18:09:49 +00:00
Haicheng Wu	05d18d68a3	[SLP] update a test case. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329818 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-11 15:09:49 +00:00
Alexey Bataev	e6a456223b	[SLP] Additional tests for reorder reuse vectorization, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329603 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-09 19:02:34 +00:00
Simon Pilgrim	54d7a0223b	[SLPVectorizer][X86] Regenerate some tests. NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329196 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 13:53:51 +00:00
Alexey Bataev	91811bc488	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329085 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 17:14:47 +00:00
Alexey Bataev	0b1a72a7a6	[SLP] Added tests for checks of reordering of the repeated instructions, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329080 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 16:31:26 +00:00
Benjamin Kramer	4832f865cf	Revert "[SLP] Fix PR36481: vectorize reassociated instructions." This reverts commit r328980 and r329046. Makes the vectorizer crash. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329071 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 14:40:33 +00:00
Haicheng Wu	2784c35c0f	[SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329035 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 00:05:10 +00:00
Alexey Bataev	6616787959	[SLP] Fix PR36481: vectorize reassociated instructions. Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328980 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-02 14:51:37 +00:00
Dinar Temirbulatov	09493fff69	[SLPVectorizer] Add tests related to PR30787, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328813 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-29 18:57:03 +00:00
Haicheng Wu	648a6091ec	[SLP] Add more checks to a test case. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328572 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-26 18:59:28 +00:00
Haicheng Wu	b9e7253e39	[SLP] Add a test case. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328546 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-26 16:47:37 +00:00
Matthew Simpson	27f212d583	[SLP] Stop counting cost of gather sequences with multiple uses When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model. Differential Revision: https://reviews.llvm.org/D44742 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328316 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-23 14:18:27 +00:00
Matthew Simpson	63f2cc2aa9	[SLP] Add test case for a gather sequence with multiple uses git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328133 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-21 19:13:14 +00:00
Matthew Simpson	a7fd2c3c2a	[AArch64] Implement getArithmeticReductionCost This patch provides an implementation of getArithmeticReductionCost for AArch64. We can specialize the cost of add reductions since they are computed using the 'addv' instruction. Differential Revision: https://reviews.llvm.org/D44490 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327702 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-16 11:34:15 +00:00
Alexey Bataev	7c6f31a848	[SLP] Additional tests for stores vectorization, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326740 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-05 20:20:12 +00:00
Mohammad Shahid	820fd02e9a	[SLP] Added new tests and updated existing for jumbled load, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326303 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-28 04:19:34 +00:00
Sanjay Patel	dcf9b1dd5e	[AArch64] add SLP test based on TSVC; NFC This is a slight reduction of one of the benchmarks that suffered with D43079. Cost model changes should not cause this test to remain scalarized. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326217 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-27 18:06:15 +00:00
Simon Pilgrim	0eea35a6ef	[X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280) Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark. Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch. Differential Revision: https://reviews.llvm.org/D43733 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326133 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-26 22:10:17 +00:00
Alexey Bataev	b4efe59b69	[SLP] Added new test + fixed some checks, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326117 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-26 20:01:24 +00:00
Simon Pilgrim	b891e74e20	[SLPVectorizer][X86] Add load extend tests (PR36091) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325772 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-22 12:19:34 +00:00
Sanjay Patel	5c377f610d	[AArch64] fix IR names to not be 'tmp' because that gives the CHECK script problems git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325718 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-21 20:48:14 +00:00
Sanjay Patel	b0c13268b8	[AArch64] add SLP test for matmul (PR36280); NFC This is a slight reduction of one of the benchmarks that suffered with D43079. Cost model changes should not cause this test to remain scalarized. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325717 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-21 20:34:16 +00:00
Alexey Bataev	27e6b3dc3f	[SLP] Fix test checks, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325689 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-21 15:32:58 +00:00
Sanjay Patel	1c629279f1	revert r325515: [TTI CostModel] change default cost of FP ops to 1 (PR36280) There are too many perf regressions resulting from this, so we need to investigate (and add tests for) targets like ARM and AArch64 before trying to reinstate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325658 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-21 01:42:52 +00:00
Alexey Bataev	771994be2d	[SLP] Fix tests checks, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325605 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-20 18:11:50 +00:00


475 Commits