archived-llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2026-01-31 01:35:20 +01:00

Author	SHA1	Message	Date
Sanjay Patel	35167a94b8	[VectorCombine] new IR transform pass for partial vector ops We have several bug reports that could be characterized as "reducing scalarization", and this topic was also raised on llvm-dev recently: http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html ...so I'm proposing that we deal with these patterns in a new, lightweight IR vector pass that runs before/after other vectorization passes. There are 4 alternate options that I can think of to deal with this kind of problem (and we've seen various attempts at all of these), but they all have flaws: InstCombine - can't happen without TTI, but we don't want target-specific folds there. SDAG - too late to assist other vectorization passes; TLI is not equipped for these kind of cost queries; limited to a single basic block. CGP - too late to assist other vectorization passes; would need to re-implement basic cleanups like CSE/instcombine. SLP - doesn't fit with existing transforms; limited to a single basic block. This initial patch/transform is based on existing code in AggressiveInstCombine: we walk backwards through the function looking for a pattern match. But we diverge from that cost-independent IR canonicalization pass by using TTI to decide if the vector alternative is profitable. We probably have at least 10 similar bug reports/patterns (binops, constants, inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements. It's possible that we could iterate on a worklist to fix-point like InstCombine does, but it's safer to start with a most basic case and evolve from there, so I didn't try to do anything fancy with this initial implementation. Differential Revision: https://reviews.llvm.org/D73480	2020-02-09 10:04:41 -05:00
Wei Mi	b78044c97c	[LV] Remove nondeterminacy by changing LoopVectorizationLegality::Reductions from DenseMap to MapVector The iteration order of LoopVectorizationLegality::Reductions matters for the final code generation, so we better use MapVector instead of DenseMap for it to remove the nondeterminacy. reduction-order.ll in the patch is an example reduced from the case we saw. In the output of opt command, the order of the select instructions in the vector.body block keeps changing from run to run currently. Differential Revision: https://reviews.llvm.org/D73490	2020-01-27 16:53:20 -08:00
Florian Hahn	1eeb9c02e5	[LV] Allow assume calls in predicated blocks. The assume intrinsic is intentionally marked as may reading/writing memory, to avoid passes moving them around. When flattening the CFG for predicated blocks, we have to drop the assume calls, as they are control-flow dependent. There are some cases where we can do better (when control flow is preserved), but that is follow-up work. Fixes PR43620. Reviewers: hsaito, rengolin, dcaballe, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D68814	2020-01-16 10:11:35 +00:00
Eric Christopher	7b534b3c55	Temporarily Revert "[SLP] allow forming 2-way reduction patterns" and update testcases. After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit 8a0aa5310bccbb42d16d11db090419fcefdd1376.	2019-11-20 16:00:53 -08:00
Eric Christopher	6d982d0bf7	Temporarily Revert "Temporarily Revert "[SLP] allow forming 2-way reduction patterns"" as there were testcase changes after that need to also be reverted. This reverts commit cd8748a15f2d18861b3548eb26ed2b52e5ee50b4.	2019-11-20 15:39:47 -08:00
Eric Christopher	fd20022682	Temporarily Revert "[SLP] allow forming 2-way reduction patterns" After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit 7ff57705ba196ce649d6034614b3b9df57e1f84f.	2019-11-20 15:19:31 -08:00
Reid Kleckner	b3a7316049	Add missing includes needed to prune LLVMContext.h include, NFC These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280	2019-11-14 15:23:15 -08:00
Alexey Bataev	2158954f3c	Revert "Temporarily Revert:" This reverts commit e511c4b0dff1692c267addf17dce3cebe8f97faa: Temporarily Revert: "[SLP] Generalization of stores vectorization." "[SLP] Fix -Wunused-variable. NFC" "[SLP] Vectorize jumbled stores." after fixing the problem with compile time.	2019-11-14 16:38:20 -05:00
Sanjay Patel	fd4fb656a1	[SLP] allow forming 2-way reduction patterns We have a vector compare reduction problem seen in PR39665 comment 2: https://bugs.llvm.org/show_bug.cgi?id=39665#c2 Or slightly reduced here: define i1 @cmp2(<2 x double> %a0) { %a = fcmp ogt <2 x double> %a0, <double 1.0, double 1.0> %b = extractelement <2 x i1> %a, i32 0 %c = extractelement <2 x i1> %a, i32 1 %d = and i1 %b, %c ret i1 %d } SLP would not attempt to turn this into a vector reduction because there is an artificial lower limit on that transform. We can not completely remove that limit without inducing regressions though, so this patch just hacks an extra attempt at creating a 2-way reduction to the end of the analysis. As shown in the test file, we are still not getting some of the motivating cases, so follow-on patches will be needed to solve those cases. Differential Revision: https://reviews.llvm.org/D59710	2019-11-07 06:08:42 -05:00
Eric Christopher	c052667ccc	Temporarily Revert: "[SLP] Generalization of stores vectorization." "[SLP] Fix -Wunused-variable. NFC" "[SLP] Vectorize jumbled stores." As they're causing significant (10-30x) compile time regressions on vectorizable code. The primary cause of the compile-time regression is f228b5371647f471853c5fb3e6719823a42fe451. This reverts commits: f228b5371647f471853c5fb3e6719823a42fe451 5503455ccb3f5fcedced158332c016c8d3a7fa81 21d498c9c0f32dcab5bc89ac593aa813b533b43a	2019-11-06 16:06:15 -08:00
Alexey Bataev	0b89a0ccd6	[SLP] Generalization of stores vectorization. Stores are vectorized with maximum vectorization factor of 16. Patch tries to improve the situation and use maximal vectorization factor. Reviewers: spatel, RKSimon, mkuper, hfinkel Differential Revision: https://reviews.llvm.org/D43582	2019-10-29 11:46:36 -04:00
Alexey Bataev	4cfaf409b9	[SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!") Initially SLP vectorizer replaced all going-to-be-vectorized instructions with Undef values. It may break ScalarEvaluation and may cause a crash. Reworked SLP vectorizer so that it does not replace vectorized instructions by UndefValue anymore. Instead vectorized instructions are marked for deletion inside if BoUpSLP class and deleted upon class destruction. Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D29641 llvm-svn: 373166	2019-09-29 14:18:06 +00:00
Jordan Rupprecht	6f4086ac07	Revert [SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!") This reverts r372626 (git commit 6a278d9073bdc158d31d4f4b15bbe34238f22c18) llvm-svn: 373019	2019-09-26 22:09:17 +00:00
Alexey Bataev	ef99efec22	[SLP] Fix for PR31847: Assertion failed: (isLoopInvariant(Operands[i], L) && "SCEVAddRecExpr operand is not loop-invariant!") Summary: Initially SLP vectorizer replaced all going-to-be-vectorized instructions with Undef values. It may break ScalarEvaluation and may cause a crash. Reworked SLP vectorizer so that it does not replace vectorized instructions by UndefValue anymore. Instead vectorized instructions are marked for deletion inside if BoUpSLP class and deleted upon class destruction. Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D29641 llvm-svn: 372626	2019-09-23 16:25:03 +00:00
Bjorn Pettersson	e93d54f960	[LV] Fix miscompiles by adding non-header PHI nodes to AllowedExit Summary: Fold-tail currently supports reduction last-vector-value live-out's, but has yet to support last-scalar-value live-outs, including non-header phi's. As it relies on AllowedExit in order to detect them and bail out we need to add the non-header PHI nodes to AllowedExit, otherwise we end up with miscompiles. Solves https://bugs.llvm.org/show_bug.cgi?id=43166 Reviewers: fhahn, Ayal Reviewed By: fhahn, Ayal Subscribers: anna, hiraditya, rkruppe, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67074 llvm-svn: 370721	2019-09-03 09:33:55 +00:00
Dorit Nuzman	04a25a6a3d	[LV] fold-tail predication should be respected even with assume_safety assume_safety implies that loads under "if's" can be safely executed speculatively (unguarded, unmasked). However this assumption holds only for the original user "if's", not those introduced by the compiler, such as the fold-tail "if" that guards us from loading beyond the original loop trip-count. Currently the combination of fold-tail and assume-safety pragmas results in ignoring the fold-tail predicate that guards the loads, generating unmasked loads. This patch fixes this behavior. Differential Revision: https://reviews.llvm.org/D66106 Reviewers: Ayal, hsaito, fhahn llvm-svn: 368973	2019-08-15 07:12:14 +00:00
Hideki Saito	d420e88ebd	[LV][NFC] Share the LV illegality reporting with LoopVectorize. Reviewers: hsaito, fhahn, rengolin Reviewed By: rengolin Patch by psamolysov, thanks! Differential Revision: https://reviews.llvm.org/D62997 llvm-svn: 367980	2019-08-06 06:08:48 +00:00
Sjoerd Meijer	a83e238fd9	[LV] Tail-Loop Folding This allows folding of the scalar epilogue loop (the tail) into the main vectorised loop body when the loop is annotated with a "vector predicate" metadata hint. To fold the tail, instructions need to be predicated (masked), enabling/disabling lanes for the remainder iterations. Differential Revision: https://reviews.llvm.org/D65197 llvm-svn: 367592	2019-08-01 18:21:44 +00:00
Warren Ristow	8897816bcd	[LV] Suppress vectorization in some nontemporal cases When considering a loop containing nontemporal stores or loads for vectorization, suppress the vectorization if the corresponding vectorized store or load with the aligment of the original scaler memory op is not supported with the nontemporal hint on the target. This adds two new functions: bool isLegalNTStore(Type DataType, unsigned Alignment) const; bool isLegalNTLoad(Type DataType, unsigned Alignment) const; to TTI, leaving the target independent default implementation as returning true, but with overriding implementations for X86 that check the legality based on available Subtarget features. This fixes https://llvm.org/PR40759 Differential Revision: https://reviews.llvm.org/D61764 llvm-svn: 363581	2019-06-17 17:20:08 +00:00
Renato Golin	10fb6600cd	[LV] Wrap LV illegality reporting in a function. NFC. A function for loop vectorization illegality reporting has been introduced: void LoopVectorizationLegality::reportVectorizationFailure( const StringRef DebugMsg, const StringRef OREMsg, const StringRef ORETag, Instruction * const I) const; The function prints a debug message when the debug for the compilation unit is enabled as well as invokes the optimization report emitter to generate a message with a specified tag. The function doesn't cover any complicated logic when a custom lambda should be passed to the emitter, only generating a message with a tag is supported. The function always prints the instruction `I` after the debug message whenever the instruction is specified, otherwise the debug message ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.' Patch by Pavel Samolysov <samolisov@gmail.com> llvm-svn: 362736	2019-06-06 19:15:52 +00:00
Alina Sbirlea	6b0ca9752b	[NewPassManager] Add tuning option: SLPVectorization [NFC]. Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61616 llvm-svn: 360276	2019-05-08 17:58:35 +00:00
Alina Sbirlea	2330568f0e	[NewPassManager] Adding pass tuning options: loop vectorize. Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763	2019-04-19 16:11:59 +00:00
Serguei Katkov	e0dd059b59	[NewPM] Add Option handling for LoopVectorize This patch enables passing options to LoopVectorizePass via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60681 llvm-svn: 358647	2019-04-18 08:46:11 +00:00
Hiroshi Yamauchi	a42343417c	[PGO] Profile guided code size optimization. Summary: Enable some of the existing size optimizations for cold code under PGO. A ~5% code size saving in big internal app under PGO. The way it gets BFI/PSI is discussed in the RFC thread http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html Note it doesn't currently touch loop passes. Reviewers: davidxl, eraman Reviewed By: eraman Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59514 llvm-svn: 358422	2019-04-15 16:49:00 +00:00
Mandeep Singh Grang	ce846242c8	[llvm] Fix typo: 's/analsyis/analysis/' [NFC] llvm-svn: 355246	2019-03-02 00:14:10 +00:00
Michael Kruse	54b7706ba0	Refactor setAlreadyUnrolled() and setAlreadyVectorized(). Loop::setAlreadyUnrolled() and LoopVectorizeHints::setLoopAlreadyUnrolled() both add loop metadata that stops the same loop from being transformed multiple times. This patch merges both implementations. In doing so we fix 3 potential issues: * setLoopAlreadyUnrolled() kept the llvm.loop.vectorize/interleave.* metadata even though it will not be used anymore. This already caused problems such as http://llvm.org/PR40546. Change the behavior to the one of setAlreadyUnrolled which deletes this loop metadata. * setAlreadyUnrolled() used to create a new LoopID by calling MDNode::get with nullptr as the first operand, then replacing it by the returned references using replaceOperandWith. It is possible that MDNode::get would instead return an existing node (due to de-duplication) that then gets modified. To avoid, use a fresh TempMDNode that does not get uniqued with anything else before replacing it with replaceOperandWith. * LoopVectorizeHints::matchesHintMetadataName() only compares the suffix of the attribute to set the new value for. That is, when called with "enable", would erase attributes such as "llvm.loop.unroll.enable", "llvm.loop.vectorize.enable" and "llvm.loop.distribute.enable" instead of the one to replace. Fortunately, function was only called with "isvectorized". Differential Revision: https://reviews.llvm.org/D57566 llvm-svn: 353738	2019-02-11 19:45:44 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Michael Kruse	776f1841f2	[LoopVectorize] Rename pass options. NFC. Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513	2018-12-18 17:46:09 +00:00
Michael Kruse	b72e26000d	[LV] Fix signed/unsigned comparison warning. llvm-svn: 348949	2018-12-12 18:07:19 +00:00
Michael Kruse	f48207fec4	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944	2018-12-12 17:32:52 +00:00
Markus Lavin	6da1480917	[PM] Port LoadStoreVectorizer to the new pass manager. Differential Revision: https://reviews.llvm.org/D54848 llvm-svn: 348570	2018-12-07 08:23:37 +00:00
Ayal Zaks	5fb73fd778	[LV] Fold tail by masking to vectorize loops of arbitrary trip count under opt for size When optimizing for size, a loop is vectorized only if the resulting vector loop completely replaces the original scalar loop. This holds if no runtime guards are needed, if the original trip-count TC does not overflow, and if TC is a known constant that is a multiple of the VF. The last two TC-related conditions can be overcome by 1. rounding the trip-count of the vector loop up from TC to a multiple of VF; 2. masking the vector body under a newly introduced "if (i <= TC-1)" condition. The patch allows loops with arbitrary trip counts to be vectorized under -Os, subject to the existing cost model considerations. It also applies to loops with small trip counts (under -O2) which are currently handled as if under -Os. The patch does not handle loops with reductions, live-outs, or w/o a primary induction variable, and disallows interleave groups. (Third, final and main part of -) Differential Revision: https://reviews.llvm.org/D50480 llvm-svn: 344743	2018-10-18 15:03:15 +00:00
Hideki Saito	0ecce5d4da	[VPlan] Implement initial vector code generation support for simple outer loops. Summary: [VPlan] Implement vector code generation support for simple outer loops. Context: Patch Series #1 for outer loop vectorization support in LV using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following: - force vector code generation using explicit vectorize_width - add conservative early returns in cost model and other places for VPlanNativePath - add code for setting up outer loop inductions - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches - support for generating uniform inner branches We plan to add a handful C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal Reviewed By: fhahn, hsaito Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D50820 llvm-svn: 342197	2018-09-14 00:36:00 +00:00
Adrian Prantl	076a6683eb	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Hideki Saito	6adf799873	[NFC][LV][LoopUtil] Move LoopVectorizationLegality to its own file Summary: This is a follow up to D45420 (included here since it is still under review and this change is dependent on that) and D45072 (committed). Actual change for this patch is LoopVectorize* and cmakefile. All others are all from D45420. LoopVectorizationLegality is an analysis and thus really belongs to Analysis tree. It is modular enough and it is reusable enough ---- we can further improve those aspects once uses outside of LV picks up. Hopefully, this will make it easier for people familiar with vectorization theory, but not necessarily LV itself to contribute, by lowering the volume of code they should deal with. We probably should start adding some code in LV to check its own capability (i.e., vectorization is legal but LV is not ready to handle it) and then bail out. Reviewers: rengolin, fhahn, hfinkel, mkuper, aemerson, mssimpso, dcaballe, sguggill Reviewed By: rengolin, dcaballe Subscribers: egarcia, rogfer01, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D45552 llvm-svn: 331139	2018-04-29 07:26:18 +00:00
Diego Caballero	6f85fab6b9	[LV][VPlan] Detect outer loops for explicit vectorization. Patch #2 from VPlan Outer Loop Vectorization Patch Series #1 (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces the basic infrastructure to detect, legality check and process outer loops annotated with hints for explicit vectorization. All these changes are protected under the feature flag -enable-vplan-native-path. This should make this patch NFC for the existing inner loop vectorizer. Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso. Differential Revision: https://reviews.llvm.org/D42447 llvm-svn: 330739	2018-04-24 17:04:17 +00:00
Alexey Bataev	0610dd4c30	[SLP] Take user instructions cost into consideration in insertelement vectorization. Summary: For better vectorization result we should take into consideration the cost of the user insertelement instructions when we try to vectorize sequences that build the whole vector. I.e. if we have the following scalar code: ``` <Scalar code> insertelement <ScalarCode>, ... ``` we should consider the cost of the last `insertelement ` instructions as the cost of the scalar code. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42657 llvm-svn: 324893	2018-02-12 14:54:48 +00:00
Alexey Bataev	d73719cbbe	[SLP] Fix PR35777: Incorrect handling of aggregate values. Summary: Fixes the bug with incorrect handling of InsertValue\|InsertElement instrucions in SLP vectorizer. Currently, we may use incorrect ExtractElement instructions as the operands of the original InsertValue\|InsertElement instructions. Reviewers: mkuper, hfinkel, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41767 llvm-svn: 321994	2018-01-08 14:43:06 +00:00
Guozhi Wei	8c380373f3	[SLPVectorizer] Don't ignore scalar extraction instructions of aggregate value In SLPVectorizer, the vector build instructions (insertvalue for aggregate type) is passed to BoUpSLP.buildTree, it is treated as UserIgnoreList, so later in cost estimation, the cost of these instructions are not counted. For aggregate value, later usage are more likely to be done in scalar registers, either used as individual scalars or used as a whole for function call or return value. Ignore scalar extraction instructions may cause too aggressive vectorization for aggregate values, and slow down performance. So for vectorization of aggregate value, the scalar extraction instructions are required in cost estimation. Differential Revision: https://reviews.llvm.org/D41139 llvm-svn: 320736	2017-12-14 19:35:43 +00:00
Eugene Zelenko	16fc947602	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 315640	2017-10-12 23:30:03 +00:00
Adam Nemet	ec7409d86a	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249	2017-10-09 23:19:02 +00:00
Eugene Zelenko	cbd8f32d28	[Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 312383	2017-09-01 21:37:29 +00:00
Alexey Bataev	9f19665894	[SLP] General improvements of SLP vectorization process. Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does: 1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310260	2017-08-07 15:25:49 +00:00
Alexey Bataev	92afaf2479	Revert "[SLP] General improvements of SLP vectorization process." This reverts commit r310255. llvm-svn: 310257	2017-08-07 14:51:52 +00:00
Alexey Bataev	ec62fc0fc9	[SLP] General improvements of SLP vectorization process. Summary: Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does: 1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts. 2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible. Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits Differential Revision: https://reviews.llvm.org/D29826 llvm-svn: 310255	2017-08-07 14:03:17 +00:00
Alexey Bataev	6c386e33ac	[SLPVectorizer] Generalize interface of functions, NFC. llvm-svn: 309816	2017-08-02 14:38:07 +00:00
Taewook Oh	4440ca5fab	Improve profile-guided heuristics to use estimated trip count. Summary: Existing heuristic uses the ratio between the function entry frequency and the loop invocation frequency to find cold loops. However, even if the loop executes frequently, if it has a small trip count per each invocation, vectorization is not beneficial. On the other hand, even if the loop invocation frequency is much smaller than the function invocation frequency, if the trip count is high it is still beneficial to vectorize the loop. This patch uses estimated trip count computed from the profile metadata as a primary metric to determine coldness of the loop. If the estimated trip count cannot be computed, it falls back to the original heuristics. Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32451 llvm-svn: 305729	2017-06-19 18:48:58 +00:00
Alexey Bataev	fec36b55c8	[SLP] Comment fix, NFC. Fixed comment in function description. llvm-svn: 304940	2017-06-07 20:37:24 +00:00
Adam Nemet	73703d12a4	[SLP] Emit optimization remarks The approach I followed was to emit the remark after getTreeCost concludes that SLP is profitable. I initially tried emitting them after the vectorizeRootInstruction calls in vectorizeChainsInBlock but I vaguely remember missing a few cases for example in HorizontalReduction::tryToReduce. ORE is placed in BoUpSLP so that it's available from everywhere (notably HorizontalReduction::tryToReduce). We use the first instruction in the root bundle as the locator for the remark. In order to get a sense how far the tree is spanning I've include the size of the tree in the remark. This is not perfect of course but it gives you at least a rough idea about the tree. Then you can follow up with -view-slp-tree to really see the actual tree. llvm-svn: 302811	2017-05-11 17:06:17 +00:00
Sanjoy Das	19757d9ec3	Rename WeakVH to WeakTrackingVH; NFC This relands r301424. llvm-svn: 301812	2017-05-01 17:07:49 +00:00

1 2

62 Commits