llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-04-12 20:48:17 +00:00

Author	SHA1	Message	Date
Anna Thomas	92cd76805b	NFC: [LoopUnroll] More meaningful message in tracing llvm-svn: 294017	2017-02-03 17:12:43 +00:00
Michael Kuperstein	9c21240154	Shut up another GCC warning about operator precedence. NFC. llvm-svn: 293812	2017-02-01 21:06:33 +00:00
Florian Hahn	dac390d445	[LoopUnroll] Use addClonedBlockToLoopInfo to add loop header to LI (NFC). Summary: I have a similar patch up for review already (D29173). If you prefer I can squash them both together. Also I think there is more potential for code sharing between LoopUnroll.cpp and LoopUnrollRuntime.cpp. Do you think patches for that would be worthwhile? Reviewers: mkuper, mzolotukhin Reviewed By: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29311 llvm-svn: 293758	2017-02-01 10:39:35 +00:00
Anna Thomas	ee73d81422	NFC: Add debug tracing for more cases where loop unrolling fails. llvm-svn: 293313	2017-01-27 17:57:05 +00:00
Michael Kuperstein	ce7b578d43	[LoopUnroll] Properly update loopinfo for runtime unrolling by 2 Even when we don't create a remainder loop (that is, when we unroll by 2), we may duplicate nested loops into the remainder. This is complicated by the fact the remainder may itself be either inserted into an outer loop, or at the top level. In the latter case, we may need to create new top-level loops. Differential Revision: https://reviews.llvm.org/D29156 llvm-svn: 293124	2017-01-26 01:04:11 +00:00
Michael Kuperstein	147f6c96a5	[LoopUnroll] First form LCSSA, then loop-simplify Running non-LCSSA-preserving LoopSimplify followed by LCSSA on (roughly) the same loop is incorrect, since LoopSimplify may break LCSSA arbitrarily higher in the loop nest. Instead, run LCSSA first, and then run LCSSA-preserving LoopSimplify on the result. This fixes PR31718. Differential Revision: https://reviews.llvm.org/D29055 llvm-svn: 292854	2017-01-23 23:45:42 +00:00
Chandler Carruth	b1b61e803a	[PM] Sink an LCSSA preservation assert from the LoopSimplify pass into the library routine shared with the new PM and other code. This assert checks that when LCSSA preservation is requested we start in LCSSA form. Without this early assert, given very complex test cases we can hit an assert or crash much later on when trying to preserve LCSSA. The new PM's loop simplify doesn't need to (and indeed can't) preserve LCSSA as the new PM doesn't deal in transforms in the dependency graph. But we asked the library to and shockingly, this didn't work very well! Stop doing that. Now the assert will tell us immediately with existing test cases. Before this, it took a pretty convoluted input to trigger this. However, sinking the assert also found a bug in LoopUnroll where we asked simplifyLoop to preserve LCSSA right before we reform it. That's kinda silly and unsurprising that it wasn't available. =D Stop doing that too. We also would assert that the unrolled loop was in LCSSA even if preserving LCSSA was never requested! I don't have a test case or anything here. I spotted it by inspection and it seems quite obvious. No logic change anyways, that's just avoiding a spurious assert. llvm-svn: 292710	2017-01-21 04:16:53 +00:00
Eli Friedman	21d28d5c67	Preserve domtree and loop-simplify for runtime unrolling. Mostly straightforward changes; we just didn't do the computation before. One sort of interesting change in LoopUnroll.cpp: we weren't handling dominance for children of the loop latch correctly, but foldBlockIntoPredecessor hid the problem for complete unrolling. Currently punting on loop peeling; made some minor changes to isolate that problem to LoopUnrollPeel.cpp. Adds a flag -unroll-verify-domtree; it verifies the domtree immediately after we finish updating it. This is on by default for +Asserts builds. Differential Revision: https://reviews.llvm.org/D28073 llvm-svn: 292447	2017-01-18 23:26:37 +00:00
Florian Hahn	740f03ad29	[loop-unroll] Factor out code to update LoopInfo (NFC). Move the code to update LoopInfo for cloned basic blocks to addClonedBlockToLoopInfo, as suggested in https://reviews.llvm.org/D28482. llvm-svn: 291614	2017-01-10 23:24:54 +00:00
Philip Reames	65068167f2	Add a comment for a todo in LoopUnroll post cleanup llvm-svn: 290769	2016-12-30 22:10:19 +00:00
Haicheng Wu	57b0c16d3c	[LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC. Make it clear that TripCount is the upper bound of the iteration on which control exits LatchBlock. Differential Revision: https://reviews.llvm.org/D26675 llvm-svn: 290199	2016-12-20 20:23:48 +00:00
Daniel Jasper	162ffcacd6	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Hal Finkel	f224db75d2	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Michael Kuperstein	c222d94c24	[LoopUnroll] Implement profile-based loop peeling This implements PGO-driven loop peeling. The basic idea is that when the average dynamic trip-count of a loop is known, based on PGO, to be low, we can expect a performance win by peeling off the first several iterations of that loop. Unlike unrolling based on a known trip count, or a trip count multiple, this doesn't save us the conditional check and branch on each iteration. However, it does allow us to simplify the straight-line code we get (constant-folding, etc.). This is important given that we know that we will usually only hit this code, and not the actual loop. This is currently disabled by default. Differential Revision: https://reviews.llvm.org/D25963 llvm-svn: 288274	2016-11-30 21:13:57 +00:00
John Brawn	c944a4af03	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818	2016-10-21 11:08:48 +00:00
Haicheng Wu	5b13afc1d2	Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop" Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049. The original summary: This patch tries to fully unroll loops having a break statement, like this: for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. llvm-svn: 284053	2016-10-12 21:29:38 +00:00
Haicheng Wu	9079316128	Revert "[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop" This reverts commit r284044. llvm-svn: 284051	2016-10-12 21:02:22 +00:00
Haicheng Wu	3e43a84017	[LoopUnroll] Use the upper bound of the loop trip count to fully unroll a loop This patch tries to fully unroll loops having a break statement, like this: for (int i = 0; i < 8; i++) { if (a[i] == value) { found = true; break; } } GCC can fully unroll such loops, but currently LLVM cannot because LLVM only supports loops having exact constant trip counts. The upper bound of the trip count can be obtained from calling ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the refactoring work in SCEV to prevent duplicating code. The feature of using the upper bound is enabled under the same circumstance when runtime unrolling is enabled since both are used to unroll loops without knowing the exact constant trip count. Differential Revision: https://reviews.llvm.org/D24790 llvm-svn: 284044	2016-10-12 20:24:32 +00:00
Adam Nemet	718a6b9aef	[LoopUnroll] Port to the new streaming interface for opt remarks. llvm-svn: 282834	2016-09-30 03:44:16 +00:00
David Majnemer	fe4709f006	[LoopUnroll] Don't clear out the AssumptionCache on each loop Clearing out the AssumptionCache can cause us to rescan the entire function for assumes. If there are many loops, then we are scanning over the entire function many times. Instead of clearing out the AssumptionCache, register all cloned assumes. llvm-svn: 278854	2016-08-16 21:09:46 +00:00
David Majnemer	5423e4bff5	Use range algorithms instead of unpacking begin/end No functionality change is intended. llvm-svn: 278417	2016-08-11 21:15:00 +00:00
Michael Zolotukhin	a6c1800e82	[LoopUnroll] Simplify loops created by unrolling. Summary: Currently loop-unrolling doesn't preserve loop-simplified form. This patch fixes it by resimplifying affected loops. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23148 llvm-svn: 278038	2016-08-08 19:02:15 +00:00
Michael Zolotukhin	878379ae72	[LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value. As agreed in post-commit review of r265388, I'm switching the flag to its original value until the 90% runtime performance regression on SingleSource/Benchmarks/Stanford/Bubblesort is addressed. llvm-svn: 277524	2016-08-02 21:24:14 +00:00
Adam Nemet	3b9497477f	[LoopUnroll] Include hotness of region in opt remark LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend on. The BFI and indirectly BPI used in this pass is computed lazily so no overhead should be observed unless -pass-remarks-with-hotness is used. This is how the patch affects the O3 pipeline: Dominator Tree Construction Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Rotate Loops Loop Invariant Code Motion Unswitch loops Simplify the CFG Dominator Tree Construction Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Combine redundant instructions Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Induction Variable Simplification Recognize loop idioms Delete dead loops Unroll loops ... llvm-svn: 277203	2016-07-29 19:29:47 +00:00
Davide Italiano	c77e3fdff4	[PM] Port LoopSimplify to the new pass manager. While here move simplifyLoop() function to the new header, as suggested by Chandler in the review. Differential Revision: http://reviews.llvm.org/D21404 llvm-svn: 274959	2016-07-09 03:03:01 +00:00
David Majnemer	de242726d7	Reinstate r273711 r273711 was reverted by r273743. The inliner needs to know about any call sites in the inlined function. These were obscured if we replaced a call to undef with an undef but kept the call around. This fixes PR28298. llvm-svn: 273753	2016-06-25 00:04:10 +00:00
Nico Weber	237b6da09c	Revert r273711, it caused PR28298. llvm-svn: 273743	2016-06-24 22:52:39 +00:00
David Majnemer	bd6be5c3a7	SimplifyInstruction does not imply DCE We cannot remove an instruction with no uses just because SimplifyInstruction succeeds. It may have side effects. llvm-svn: 273711	2016-06-24 19:34:46 +00:00
Michael Zolotukhin	50e6f3827e	[LoopUnroll] Check that DT is available before trying to verify it. llvm-svn: 272221	2016-06-08 22:49:59 +00:00
Evgeny Stupachenko	8323ef30a7	The patch refactors unroll pass. Summary: Unroll factor (Count) calculations moved to a new function. Early exits on pragma and "-unroll-count" defined factor added. New type of unrolling "Force" introduced (previously used implicitly). New unroll preference "AllowRemainder" introduced and set "true" by default. (should be set to false for architectures that suffer from it). Reviewers: hfinkel, mzolotukhin, zzheng Differential Revision: http://reviews.llvm.org/D19553 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 271071	2016-05-27 23:15:06 +00:00
Justin Lebar	e84867be95	Minor formatting fixes in LoopUnroll.cpp. llvm-svn: 268995	2016-05-10 00:31:23 +00:00
Michael Zolotukhin	7b4123e034	Follow-up for r265605: don't mutate vector we're iterating. llvm-svn: 265625	2016-04-07 00:09:42 +00:00
Michael Zolotukhin	fa8d1d0bc1	[LoopUnroll] Fix the way we update DT after complete unrolling. Updating dominators for exit-blocks of the unrolled loops is not enough, as shown in PR27157. The proper way is to update dominators for all dominance-children of original loop blocks. llvm-svn: 265605	2016-04-06 21:47:12 +00:00
David L Kreitzer	01b0be98a9	Adds the ability to use an epilog remainder loop during loop unrolling and makes this the default behavior. Patch by Evgeny Stupachenko (evstupac@gmail.com). Differential Revision: http://reviews.llvm.org/D18158 llvm-svn: 265388	2016-04-05 12:19:35 +00:00
Eric Christopher	032166f634	Use some braces to format this a little better. llvm-svn: 263527	2016-03-15 03:01:31 +00:00
Eric Christopher	773d4a559f	Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest parentheses around '&&' within '||' [-Werror=parentheses]. llvm-svn: 263525	2016-03-15 02:19:06 +00:00
Justin Lebar	19453c8511	[LoopUnroll] Respect the convergent attribute. Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have to emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509	2016-03-14 23:15:34 +00:00
Sanjay Patel	0ad59f4b4e	rangify, fix function names; NFCI llvm-svn: 262940	2016-03-08 17:12:32 +00:00
Sanjay Patel	0df69edd43	don't repeat function names in documentation comments; NFC llvm-svn: 262937	2016-03-08 16:26:39 +00:00
Michael Zolotukhin	7219052084	Follow up for r261597: Add the * to the auto. llvm-svn: 261600	2016-02-23 00:57:48 +00:00
Michael Zolotukhin	3da31c17bb	Follow-up for r261595: use range loop. llvm-svn: 261597	2016-02-23 00:48:44 +00:00
Michael Zolotukhin	cb26e1de36	[LoopUnroll] Avoid unnecessary DT recomputation. Summary: When we completely unroll a loop, it's pretty easy to update DT in-place and thus avoid rebuilding it. DT recalculation is one of the most time-consuming tasks in loop-unroll, so avoiding it at least in case of full unroll should be beneficial. On some extreme (but still real-world) tests this patch improves compile time by ~2x. Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc Subscribers: joker.eph, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D17473 llvm-svn: 261595	2016-02-23 00:30:50 +00:00
Michael Zolotukhin	369872c96c	[LoopUnrolling] Fix a bug introduced in r259869 (PR26688). The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enable the same logic for all outer loops, not only immediate parent. llvm-svn: 261575	2016-02-22 21:21:45 +00:00
Michael Zolotukhin	151d484d3e	[LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible. In r255133 (reapplied r253126) we started to avoid redundant recomputation of LCSSA after loop-unrolling. This patch moves one step further in this direction - now we can avoid it for much wider range of loops, as we start to look at IR and try to figure out if the transformation actually breaks LCSSA phis or makes it necessary to insert new ones. Differential Revision: http://reviews.llvm.org/D16838 llvm-svn: 259869	2016-02-05 02:17:36 +00:00
Justin Bogner	879f86bb78	LoopInfo: Simplify ownership of Loop objects It's strange that LoopInfo mostly owns the Loop objects, but that it defers deleting them to the loop pass manager. Instead, change the oddly named "updateUnloop" to "markAsRemoved" and have it queue the Loop object for deletion. We can't delete the Loop immediately when we remove it, since we need its pointer identity still, so we'll mark the object as "invalid" so that clients can see what's going on. llvm-svn: 257191	2016-01-08 19:08:53 +00:00
Justin Bogner	58647df890	LPM: Make callers of LPM.deleteLoopFromQueue update LoopInfo directly. NFC As of r255720, the loop pass manager will DTRT when passes update the loop info for removed loops, so they no longer need to reach into LPPassManager APIs to do this kind of transformation. This change very nearly removes the need for the LPPassManager to even be passed into loop passes - the only remaining pass that uses the LPM argument is LoopUnswitch. llvm-svn: 255797	2015-12-16 18:40:20 +00:00
Justin Bogner	621a2ef540	LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669	2015-12-15 19:40:57 +00:00
Michael Zolotukhin	b39f3c2210	Revert "Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."" The bug in IndVarSimplify was fixed in r254976, r254977, so I'm reapplying the original patch for avoiding redundant LCSSA recomputation. This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466. llvm-svn: 255133	2015-12-09 18:20:28 +00:00
Michael Zolotukhin	3b65beab15	Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible." The change exposed a bug in IndVarSimplify (PR25578), which led to a failure (PR25538). When the bug is fixed, this patch can be reapplied. The tests are kept in tree, as they're useful anyway, and will not break with this revert. llvm-svn: 253596	2015-11-19 20:28:32 +00:00
Michael Zolotukhin	1bc5fae202	[PR25538]: Fix a failure caused by r253126. In r253126 we stopped recomputing LCSSA after loop unrolling in all cases, except when the unrolling is full and at least one of the loop exits is outside the parent loop. In other cases the transformation should not break LCSSA, but it turned out that we also call SimplifyLoop on the parent loop, which might break LCSSA by itself. This fix just triggers LCSSA recomputation in this case as well. I'm committing it without a test case for now, but I'll try to invent one. It's a bit tricky because in an isolated test LoopSimplify would be scheduled before LoopUnroll, and thus will change the test and hide the problem. llvm-svn: 253253	2015-11-16 21:17:26 +00:00
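
The runtime-unrolling example quoted in commit 19453c8511 ("[LoopUnroll] Respect the convergent attribute") is easier to follow when written out. The sketch below is not code from LLVM: it transcribes the two loops from that commit message into plain C++, replacing `convergent_op()` and `foo()` with counters. The names `Counts`, `original`, and `unrolledBy2` are made up for illustration. It shows only that the two forms make the same calls when executed sequentially for N >= 0. The problem the commit describes cannot be seen in a single-threaded run: in the unrolled form the first `convergent_op()` call sits under the new `N % 2 == 1` branch, which is the added control-flow dependency that makes the transformation unsafe for convergent operations.

```cpp
#include <cassert>

// Hypothetical stand-in for the side effects of convergent_op() and foo().
struct Counts {
  int convergent = 0; // number of convergent_op() calls
  int foo = 0;        // number of foo() calls
};

// The original loop from the commit message.
Counts original(int N) {
  Counts c;
  for (int i = 0; i < N; ++i) {
    if (i == 0)
      ++c.convergent; // convergent_op();
    ++c.foo;          // foo();
  }
  return c;
}

// The same loop after runtime unrolling by 2, as quoted in the commit
// message: a prelude peels one iteration when N is odd, then the main
// loop runs two copies of the body per trip.
Counts unrolledBy2(int N) {
  Counts c;
  int i = 0;
  if (N % 2 == 1) {   // prelude: this branch is the new control dependency
    ++c.convergent;   // convergent_op();
    ++c.foo;          // foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0)
      ++c.convergent; // convergent_op();
    ++c.foo;          // foo();
    ++c.foo;          // foo();
  }
  return c;
}
```

When the unroll count divides the trip multiple (here, when N is known to be even), the prelude is never needed, which is the case the commit message identifies as safe.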


132 Commits