The 1st attempt at this modified the cost model in a bad way to avoid the vectorization,
but that caused problems for other users (the loop vectorizer) of the cost model.
I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708
https://bugs.llvm.org/show_bug.cgi?id=43146
We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognize patterns that could be combined later, but not perform the combining itself
(it's not a vector combine anyway, so it's probably out of scope for SLP).
Here, we add a cost-independent bailout with a conservative pattern match for a
multi-instruction sequence that can probably be reduced later.
In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
will produce a single instruction on these tests like:
movbe rax, qword ptr [rdi]
or:
mov rax, qword ptr [rdi]
Not some (half) vector monstrosity as we currently do using SLP:
vpmovzxbq ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
movzx eax, byte ptr [rdi]
movzx ecx, byte ptr [rdi + 5]
shl rcx, 40
movzx edx, byte ptr [rdi + 6]
shl rdx, 48
or rdx, rcx
movzx ecx, byte ptr [rdi + 7]
shl rcx, 56
or rcx, rdx
or rcx, rax
vextracti128 xmm1, ymm0, 1
vpor xmm0, xmm0, xmm1
vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1]
vpor xmm0, xmm0, xmm1
vmovq rax, xmm0
or rax, rcx
vzeroupper
ret
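For reference, a minimal C++ sketch of the kind of pattern this bailout protects (an
assumed shape for illustration, not the exact test case from the bug reports):

  #include <cstdint>

  // Eight adjacent byte loads, zero-extended, shifted, and OR'ed together.
  // SDAG load combining can reduce this to a single 64-bit load (plus a byte
  // swap for the byte-reversed variant), which is the codegen we want to keep.
  uint64_t load_le64(const uint8_t *p) {
    return  (uint64_t)p[0]         | ((uint64_t)p[1] << 8)  |
           ((uint64_t)p[2] << 16)  | ((uint64_t)p[3] << 24) |
           ((uint64_t)p[4] << 32)  | ((uint64_t)p[5] << 40) |
           ((uint64_t)p[6] << 48)  | ((uint64_t)p[7] << 56);
  }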
Differential Revision: https://reviews.llvm.org/D67841
llvm-svn: 375025
Add an extra parameter so the backend can take the alignment into
consideration.
Differential Revision: https://reviews.llvm.org/D68400
llvm-svn: 374763
In loop-vectorize, the interleave count and vectorization factor depend on the number of
target registers. Currently, it does not estimate register pressure separately for different
register classes (in particular for scalar types; float types should not be lumped together
with int types), so the estimate is not accurate. Specifically, it causes too much
interleaving/unrolling, resulting in too many register spills in the loop body and hurting performance.
So we need to classify the register classes at the IR level. Importantly, these are abstract
register classes, not the backend's target register classes provided in the td files. They are
used to establish the mapping between the types of IR values and the number of simultaneous
live ranges we'd like to allow for some set of those types.
For example, the register counts on the POWER target are special when VSX is enabled: the number
of int scalar registers is 32 (GPR) and float scalar registers is 64 (VSR), while int and float
vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is
enabled, and 3 kinds when VSX is NOT enabled.
On the POWER target, this gives a big (~+30%) performance improvement in one specific benchmark
(503.bwaves_r) of SPEC2017 with no other obvious regressions.
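As a rough illustration, a target can now report per-class register counts through TTI hooks
along these lines (a simplified sketch; ExampleTTIImpl, VSXEnabled, and the class-ID names are
hypothetical, loosely modeled on the POWER description above):

  #include "llvm/IR/Type.h"

  // Fragment of a hypothetical TargetTransformInfo implementation. When VSX
  // is enabled, float scalars and all vectors share one 64-register class;
  // otherwise there are three separate 32-register classes.
  struct ExampleTTIImpl {
    bool VSXEnabled = true;
    enum RegisterClassKind { GPRRC, FPRRC, VECRC, VSXRC };

    unsigned getRegisterClassForType(bool Vector, llvm::Type *Ty) const {
      if (Vector)
        return VSXEnabled ? VSXRC : VECRC;
      if (Ty && Ty->isFloatingPointTy())
        return VSXEnabled ? VSXRC : FPRRC;
      return GPRRC;
    }

    unsigned getNumberOfRegisters(unsigned ClassID) const {
      switch (ClassID) {
      case GPRRC: return 32;  // int scalars
      case FPRRC: return 32;  // float scalars without VSX
      case VECRC: return 32;  // vectors without VSX
      case VSXRC: return 64;  // float scalars and vectors with VSX
      }
      return 0;
    }
  };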
Differential revision: https://reviews.llvm.org/D67148
llvm-svn: 374634
This patch adds a moveAfter method to VPRecipeBase, which can be used to
move a recipe after another recipe, across VPBasicBlocks if necessary.
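Illustrative usage (a sketch; hoistAfter and its arguments are made up, only the
moveAfter call reflects the new API):

  // Detach R from its current VPBasicBlock and re-insert it immediately
  // after Other, which may live in a different VPBasicBlock.
  void hoistAfter(VPRecipeBase *R, VPRecipeBase *Other) {
    R->moveAfter(Other);
  }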
Reviewers: dcaballe, hsaito, rengolin, hfinkel
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D46825
llvm-svn: 374565
This is just small refactoring to minimize changes in upcoming patch.
In the next patch I'm going to introduce changes into the heuristic for vectorization of "tiny trip count" loops.
Patch by Evgeniy Brevnov <evgueni.brevnov@gmail.com>
Reviewers: hsaito, Ayal, fhahn, reames
Reviewed By: hsaito
Differential Revision: https://reviews.llvm.org/D67690
llvm-svn: 374338
We failed to account for the target register width (max vector factor)
when vectorizing starting from GEPs. This causes vectorization to
proceed to obviously illegal widths as in:
https://bugs.llvm.org/show_bug.cgi?id=43578
For x86, this also means that SLP can produce rogue AVX or AVX512
code even when the user specifies a narrower vector width.
The AArch64 test in ext-trunc.ll appears to be better using the
narrower width. I'm not exactly sure what getelementptr.ll is trying
to do, but it's testing with "-slp-threshold=-18", so I'm not worried
about those diffs. The x86 test is an over-reduction from SPEC h264;
this patch appears to restore the perf loss caused by SLP when using
-march=haswell.
Differential Revision: https://reviews.llvm.org/D68667
llvm-svn: 374183
When optimising for size and SCEV runtime checks need to be emitted to check
overflow behaviour, the loop vectorizer can run into this assert:
LoopVectorize.cpp:2699: void llvm::InnerLoopVectorizer::emitSCEVChecks(
llvm::Loop *, llvm::BasicBlock *): Assertion `!BB->getParent()->hasOptSize()
&& "Cannot SCEV check stride or overflow when opt
We should not generate predicates while optimising for size because
code will be generated for predicates such as these SCEV overflow runtime
checks.
This should fix PR43371.
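A hypothetical example of the shape that can require such checks (illustrative only,
not the reproducer from the PR):

  // The narrow induction variable and the runtime step mean the vectorizer
  // would need SCEV runtime predicates (overflow/stride checks) to vectorize
  // this loop; under optsize we now refuse to generate them.
  void fill(char *A, unsigned char N, unsigned char Step) {
    for (unsigned char i = 0; i < N; i += Step)
      A[i] = 0;
  }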
Differential Revision: https://reviews.llvm.org/D68082
llvm-svn: 374166
Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"
This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a.
This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04.
The patch is breaking the PowerPC internal build; checked with the author, and reverting
on his behalf for now due to timezone differences.
llvm-svn: 374091
In loop-vectorize, the interleave count and vectorization factor depend on the number of
target registers. Currently, it does not estimate register pressure separately for different
register classes (in particular for scalar types; float types should not be lumped together
with int types), so the estimate is not accurate. Specifically, it causes too much
interleaving/unrolling, resulting in too many register spills in the loop body and hurting performance.
So we need to classify the register classes at the IR level. Importantly, these are abstract
register classes, not the backend's target register classes provided in the td files. They are
used to establish the mapping between the types of IR values and the number of simultaneous
live ranges we'd like to allow for some set of those types.
For example, the register counts on the POWER target are special when VSX is enabled: the number
of int scalar registers is 32 (GPR) and float scalar registers is 64 (VSR), while int and float
vector registers are both 64 (VSR). So there should be 2 kinds of register class when VSX is
enabled, and 3 kinds when VSX is NOT enabled.
On the POWER target, this gives a big (~+30%) performance improvement in one specific benchmark
(503.bwaves_r) of SPEC2017 with no other obvious regressions.
Differential revision: https://reviews.llvm.org/D67148
llvm-svn: 374017
This reverts SVN r373833, as it caused a failed assert "Non-zero loop
cost expected" on building numerous projects, see PR43582 for details
and reproduction samples.
llvm-svn: 373882
I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708
https://bugs.llvm.org/show_bug.cgi?id=43146
We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognize patterns that could be combined later, but not perform the combining itself
(it's not a vector combine anyway, so it's probably out of scope for SLP).
Here, we add a scalar cost model adjustment with a conservative pattern match and cost
summation for a multi-instruction sequence that can probably be reduced later.
This should prevent SLP from creating a vector reduction unless that sequence is
extremely cheap.
In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
will produce a single instruction on these tests like:
movbe rax, qword ptr [rdi]
or:
mov rax, qword ptr [rdi]
Not some (half) vector monstrosity as we currently do using SLP:
vpmovzxbq ymm0, dword ptr [rdi + 1] # ymm0 = mem[0],zero,zero,..
vpsllvq ymm0, ymm0, ymmword ptr [rip + .LCPI0_0]
movzx eax, byte ptr [rdi]
movzx ecx, byte ptr [rdi + 5]
shl rcx, 40
movzx edx, byte ptr [rdi + 6]
shl rdx, 48
or rdx, rcx
movzx ecx, byte ptr [rdi + 7]
shl rcx, 56
or rcx, rdx
or rcx, rax
vextracti128 xmm1, ymm0, 1
vpor xmm0, xmm0, xmm1
vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1]
vpor xmm0, xmm0, xmm1
vmovq rax, xmm0
or rax, rcx
vzeroupper
ret
Differential Revision: https://reviews.llvm.org/D67841
llvm-svn: 373833
Initially the SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. This may break ScalarEvolution and may
cause a crash.
Reworked the SLP vectorizer so that it does not replace vectorized
instructions with UndefValue anymore. Instead, vectorized instructions are
marked for deletion inside the BoUpSLP class and deleted upon class
destruction.
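A condensed sketch of the deferred-deletion idea (the struct and member names here are
illustrative, not the actual BoUpSLP members):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Instruction.h"

  // Vectorized scalars are only recorded while the vectorizer runs; their
  // remaining uses are dropped and the instructions erased when the object
  // is destroyed, instead of rewriting them to undef mid-vectorization.
  struct DeferredDeletion {
    llvm::SmallVector<llvm::Instruction *, 16> DeadInsts;

    void markForDeletion(llvm::Instruction *I) { DeadInsts.push_back(I); }

    ~DeferredDeletion() {
      for (llvm::Instruction *I : DeadInsts)
        I->dropAllReferences();
      for (llvm::Instruction *I : DeadInsts) {
        I->replaceAllUsesWith(llvm::UndefValue::get(I->getType()));
        I->eraseFromParent();
      }
    }
  };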
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
llvm-svn: 373166
The static analyzer is warning about a potential null dereference, but we should be able to use cast<CmpInst> directly, and if not, the assert will fire for us.
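For context, the difference the analyzer cares about (a small sketch; example() and its
argument are made up):

  #include "llvm/IR/InstrTypes.h"
  using namespace llvm;

  void example(Value *V) {
    // dyn_cast<> returns nullptr when V is not a CmpInst, so an unchecked use
    // of its result is what the static analyzer warns about.
    if (CmpInst *C = dyn_cast<CmpInst>(V))
      (void)C->getPredicate();
    // cast<> states that V must already be a CmpInst; in builds with
    // assertions it asserts on a mismatch rather than returning null.
    CmpInst *C2 = cast<CmpInst>(V);
    (void)C2->getPredicate();
  }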
llvm-svn: 372732
When vectorisation is forced with a pragma while we are optimising for minimum
size and runtime memory checks need to be emitted, allow this code growth and
don't run into an assert like we currently do.
This is the result of D65197 and D66803, and was a use-case not really
considered before. If this now happens, we emit an optimisation remark warning
about the code-size expansion, which can be avoided by not forcing
vectorisation or possibly by modifying the source code.
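A hypothetical source pattern that hits this case: vectorisation is forced by the pragma,
the function is built for minimum size, and the pointers may alias, so runtime memory
checks are required. The function and names are made up for illustration:

  // Built at e.g. -Oz: the pragma forces vectorisation, and because A and B
  // may alias, runtime memory checks must be emitted despite the size cost;
  // the new optimisation remark points out the resulting code-size expansion.
  void scale(float *A, const float *B, int N) {
  #pragma clang loop vectorize(enable)
    for (int i = 0; i < N; ++i)
      A[i] = B[i] * 2.0f;
  }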
Differential Revision: https://reviews.llvm.org/D67764
llvm-svn: 372694
Summary:
Initially the SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. This may break ScalarEvolution and may
cause a crash.
Reworked the SLP vectorizer so that it does not replace vectorized
instructions with UndefValue anymore. Instead, vectorized instructions are
marked for deletion inside the BoUpSLP class and deleted upon class
destruction.
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
llvm-svn: 372626
The static analyzer is warning about potential null dereferences of dyn_cast<> results; we can use cast<> directly as we know that these cases should all be CastInst, which is why it's working at the moment, and anyway cast<> will assert if they aren't.
llvm-svn: 372116
The static analyzer is warning about a potential null dereference of the cast_or_null result, I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.
llvm-svn: 371975
The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why it's working at the moment, and anyway cast<> will assert if they aren't.
llvm-svn: 371973
This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=33958
It seems universally true that we would not want to transform this kind of
sequence on any target, but if that's not correct, then we could view this
as a target-specific cost model problem. We could also white-list ConstantInt,
ConstantFP, etc. rather than blacklist Global and ConstantExpr.
Differential Revision: https://reviews.llvm.org/D67362
llvm-svn: 371931
Expose a utility function so that all places which want to suppress speculation (when otherwise legal) due to ordering and/or sanitizer interaction can do so.
llvm-svn: 371556
If we're vectorizing a load in a predicated block, check to see if the load can be speculated rather than predicated. This allows us to generate a normal vector load instead of a masked.load.
To do so, we must prove that all bytes accessed on any iteration of the original loop are dereferenceable, and that all loads (across all iterations) are properly aligned. This is equivalent to proving that hoisting the load into the loop header in the original scalar loop is safe.
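An illustrative example of the target pattern (assumed shape, not one of the actual tests):
the conditional load is from a complete, sufficiently aligned array, so every accessed
element is dereferenceable on every iteration:

  // Because A is a complete 1024-element array, A[i] is dereferenceable for
  // every i in the loop regardless of the mask, so the predicated load can be
  // speculated instead of being turned into an llvm.masked.load.
  int sum_selected(const int (&A)[1024], const unsigned char *M) {
    int S = 0;
    for (int i = 0; i < 1024; ++i)
      if (M[i])
        S += A[i];
    return S;
  }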
Note: There are a couple of code motion todos in the code. My intention is to wait about a day - to be sure this sticks - and then perform the NFC motion without further review.
Differential Revision: https://reviews.llvm.org/D66688
llvm-svn: 371452
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.
This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.
Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
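The typical mechanical update in a legacy function pass is a one-argument change, roughly
(sketch; F stands for the Function being processed):

  // Before this change (one module-wide TLI):
  //   const TargetLibraryInfo &TLI =
  //       getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
  // After (the Function is provided so a per-function TLI can later be built):
  const TargetLibraryInfo &TLI =
      getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);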
There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.
Reviewers: chandlerc, hfinkel
Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66428
llvm-svn: 371284
Summary:
Fold-tail currently supports reduction last-vector-value live-outs,
but has yet to support last-scalar-value live-outs, including
non-header phis. As it relies on AllowedExit in order to detect
them and bail out, we need to add the non-header PHI nodes to
AllowedExit; otherwise we end up with miscompiles.
Solves https://bugs.llvm.org/show_bug.cgi?id=43166
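An illustrative shape of the problem (not the exact reproducer from the PR); the scalar
live-out below is defined by a non-header phi, which fold-tail by masking must detect and
bail out on:

  // 'Last' is merged by a non-header phi at the bottom of the loop and is
  // used after the loop; folding the tail by masking could otherwise leave
  // the wrong last scalar value live-out.
  int last_positive(const int *A, int N) {
    int Last = -1;
    for (int i = 0; i < N; ++i)
      if (A[i] > 0)
        Last = A[i];
    return Last;
  }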
Reviewers: fhahn, Ayal
Reviewed By: fhahn, Ayal
Subscribers: anna, hiraditya, rkruppe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67074
llvm-svn: 370721
Now that we allow tail-folding, not only when we optimise for size, make
sure we do not run into this assert.
Differential revision: https://reviews.llvm.org/D66932
llvm-svn: 370711
The loop vectorizer was running into an assert when it tried to fold the tail and
had to emit runtime memory disambiguation checks.
Differential revision: https://reviews.llvm.org/D66803
llvm-svn: 370707
Allow vectorizing loops that have reductions when tail is folded by masking.
A select is introduced in VPlan, choosing between the last value carried by the
loop-exit/live-out instruction of the reduction, and the penultimate value
carried by the reduction phi, according to the "i < n" mask of fold-tail.
This select replaces the last value as the live-out value of the loop.
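A scalar model of the idea (illustrative only; this is C++ modeling the select, not the
actual VPlan or IR output):

  // Each iteration beyond the real trip count N is masked off by the "i < n"
  // predicate; the select keeps the penultimate value (Red) in masked-off
  // iterations and the newly computed value (Red + A[i]) in active ones, so
  // the live-out sum is unaffected by the padded tail iterations.
  float sum_folded(const float *A, int N, int PaddedN) {
    float Red = 0.0f;
    for (int i = 0; i < PaddedN; ++i) {
      bool Active = i < N;             // fold-tail mask
      Red = Active ? Red + A[i] : Red; // select: last vs. penultimate value
    }
    return Red;
  }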
Differential Revision: https://reviews.llvm.org/D66720
llvm-svn: 370173
We can avoid repetitive calls to getSameOpcode() for already known tree elements by keeping MainOp and AltOp in TreeEntry.
Differential Revision: https://reviews.llvm.org/D64700
llvm-svn: 369315
Summary:
The scheduler's dependence graph gets the use-def dependencies by accessing the operands of the instructions in a bundle. However, buildTree_rec() may change the order of the operands in TreeEntry, and the scheduler is currently not aware of this. This is not causing any functional issues currently, because reordering is restricted to the operands of a single instruction. Once we support operand reordering across multiple TreeEntries, as shown here: http://www.llvm.org/devmtg/2019-04/slides/Poster-Porpodas-Supernode_SLP.pdf , the scheduler will need to get the correct operands from TreeEntry and not from the individual instructions.
In short, this patch:
- Connects the scheduler's bundle with the corresponding TreeEntry. It introduces new TE and Lane fields in ScheduleData (see the sketch after this list).
- Moves the location where the operands of the TreeEntry are initialized. This used to take place in newTreeEntry() setting one operand at a time, but is now moved pre-order just before the recursion of buildTree_rec(). This is required because the scheduler needs to access both operands of the TreeEntry in tryScheduleBundle().
- Updates the scheduler to access the instruction operands through the TreeEntry operands instead of accessing the instruction operands directly.
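A simplified sketch of the new ScheduleData fields (the field names follow the description
above; the surrounding struct is heavily abridged):

  struct TreeEntry;            // SLP tree node, defined elsewhere in BoUpSLP.

  // ScheduleData now remembers which TreeEntry (and which lane within it) the
  // bundled instruction belongs to, so the scheduler can read operands from
  // the TreeEntry instead of from the instruction itself.
  struct ScheduleData {
    // ... existing scheduling state elided ...
    TreeEntry *TE = nullptr;   // TreeEntry this bundle member belongs to.
    int Lane = -1;             // Lane of this instruction within TE.
  };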
Reviewers: ABataev, RKSimon, dtemirbulatov, Ayal, dorit, hfinkel
Reviewed By: ABataev
Subscribers: hiraditya, llvm-commits, lebedev.ri, rcorcs
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62432
llvm-svn: 369131
cppcheck + MSVC analyzer both overzealously warn that we might dereference a null Bundle pointer - add an assertion to check for null to silence the warning, plus it's a good idea to check that we succeeded in finding a schedule bundle anyway...
llvm-svn: 369094