llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-05-15 11:36:18 +00:00

Author	SHA1	Message	Date
Sam Parker	471134db57	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance losses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 llvm-svn: 310824	2017-08-14 09:25:26 +00:00
Anna Thomas	d9589bfdfd	[RuntimeUnroll] NFC: Add a profitability function for multiexit loop Separated out the profitability from the safety analysis for multiexit loop unrolling. Currently, this is an NFC because profitability is true only if the unroll-runtime-multi-exit is set to true (off-by-default). This is to ease adding the profitability heuristic up for review at D35380. llvm-svn: 308753	2017-07-21 16:30:38 +00:00
Simon Pilgrim	a1ca6c4d11	Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI. llvm-svn: 307929	2017-07-13 17:10:12 +00:00
Anna Thomas	330876c9e4	[RuntimeUnrolling] Update DomTree correctly when exit blocks have successors Summary: When we runtime unroll with multiple exit blocks, we also need to update the immediate dominators of the immediate successors of the exit blocks. Reviewers: reames, mkuper, mzolotukhin, apilipenko Reviewed by: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35304 llvm-svn: 307909	2017-07-13 13:21:23 +00:00
Anna Thomas	6f931c9a15	[LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843	2017-07-12 20:55:43 +00:00
Anna Thomas	fdf710e920	[LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop wasn't unrolled. llvm-svn: 307705	2017-07-11 20:44:37 +00:00
Anna Thomas	dd85871d59	[LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generation The loop structure for the outer loop does not contain the epilog preheader when we try to unroll inner loop with multiple exits and epilog code is generated. For now, we just bail out in such cases. Added a test case that shows the problem. Without this bailout, we would trip on assert saying LCSSA form is incorrect for outer loop. llvm-svn: 307676	2017-07-11 17:16:33 +00:00
Anna Thomas	5b32b3eca8	[LoopUnrollRuntime] Remove strict assert about VMap requirement When unrolling under multiple exits which is under off-by-default option, the assert that checks for VMap entry in loop exit values is too strong. (assert if VMap entry did not exist, the value should be a constant). However, values derived from constants or from values outside the loop do not have a VMap entry either. Removed the assert and added a testcase showcasing the property for non-constant values. llvm-svn: 307542	2017-07-10 15:29:38 +00:00
Anna Thomas	8705a82685	[LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435	2017-07-07 20:12:32 +00:00
Anna Thomas	3bab1b4bf9	[LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectProlog Minor refactoring to use the preexisting loop exit that's already calculated. We do not need to recompute the loop exit in ConnectProlog. Apart from avoiding redundant computation, this is required for supporting multiple loop exits when Prolog remainder loops are generated. llvm-svn: 307417	2017-07-07 18:05:28 +00:00
Anna Thomas	7fa3eab82b	[LoopUnrollRuntime] Bailout when multiple exiting blocks to the unique latch exit block Currently, we do not support multiple exiting blocks to the latch exit block. However, this bailout wasn't triggered when we had a unique exit block (which is the latch exit), with multiple exiting blocks to that unique exit. Moved the bailout so that it's triggered in both cases and added testcase. llvm-svn: 307291	2017-07-06 18:39:26 +00:00
Anna Thomas	77fdc59c1c	[RuntimeUnrolling] Add logic for loops with multiple exit blocks Summary: Runtime unrolling is done for loops with a single exit block and a single exiting block (and this exiting block should be the latch block). This patch adds logic to support unrolling in the presence of multiple exit blocks (which also means multiple exiting blocks). Currently this is under an off-by-default option and is supported when epilog code is generated. Support in presence of prolog code will be in a future patch (we just need to add more tests, and update comments). This patch is essentially an implementation patch. I have not added any heuristic (in terms of branches added or code size) to decide when this should be enabled. Reviewers: mkuper, sanjoy, reames, evstupac Reviewed by: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33001 llvm-svn: 306846	2017-06-30 17:57:07 +00:00
Anna Thomas	1b7caf4638	[LoopUnrollRuntime] Use SCEV exit count for calculating trip count. NFCI Instead of getBackEdgeTakenCount, use getExitCount on the latch exiting block (which is proven to be the only exiting block in the loop to be unrolled). llvm-svn: 306410	2017-06-27 14:14:35 +00:00
Anna Thomas	59e7ef5085	[RuntimeLoopUnrolling] Rename exit block and move assert earlier. NFC The single exit block allowed in runtime unrolling is guaranteed to be the Latch's successor, so rename it as LatchExitBlock. llvm-svn: 306105	2017-06-23 14:28:01 +00:00
Chandler Carruth	eb66b33867	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Anna Thomas	3c2a815f18	Avoid warning of unused variable in release builds. NFC llvm-svn: 302068	2017-05-03 19:25:04 +00:00
Anna Thomas	33afefac44	Fix PPC64 warning for missing parentheses. NFC. llvm-svn: 302061	2017-05-03 18:25:43 +00:00
Anna Thomas	892ad9babd	[RuntimeLoopUnroller] Add assert that we don't unroll non-rotated loops Summary: Cloning basic blocks in the loop for runtime loop unroller depends on loop being in rotated form (i.e. loop latch target is the exit block). Assert that this is true, so that callers of runtime loop unroller pass in canonical loops. The single caller of this function has that check recently added: https://reviews.llvm.org/rL301239 Reviewers: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32801 llvm-svn: 302058	2017-05-03 17:43:59 +00:00
Florian Hahn	48d1ea22f0	[LoopUnroll] Use addClonedBlockToLoopInfo to clone the top level loop (NFC) Summary: rL293124 added the necessary infrastructure to properly add the cloned top level loop to LoopInfo, which means we do not have to do it manually in CloneLoopBlocks. @mkuper sorry for not pointing this out during my review of D29156, I just realized that today. Reviewers: mzolotukhin, chandlerc, mkuper Reviewed By: mkuper Subscribers: llvm-commits, mkuper Differential Revision: https://reviews.llvm.org/D29173 llvm-svn: 293615	2017-01-31 11:13:44 +00:00
Michael Kuperstein	ce7b578d43	[LoopUnroll] Properly update loopinfo for runtime unrolling by 2 Even when we don't create a remainder loop (that is, when we unroll by 2), we may duplicate nested loops into the remainder. This is complicated by the fact the remainder may itself be either inserted into an outer loop, or at the top level. In the latter case, we may need to create new top-level loops. Differential Revision: https://reviews.llvm.org/D29156 llvm-svn: 293124	2017-01-26 01:04:11 +00:00
Eli Friedman	21d28d5c67	Preserve domtree and loop-simplify for runtime unrolling. Mostly straightforward changes; we just didn't do the computation before. One sort of interesting change in LoopUnroll.cpp: we weren't handling dominance for children of the loop latch correctly, but foldBlockIntoPredecessor hid the problem for complete unrolling. Currently punting on loop peeling; made some minor changes to isolate that problem to LoopUnrollPeel.cpp. Adds a flag -unroll-verify-domtree; it verifies the domtree immediately after we finish updating it. This is on by default for +Asserts builds. Differential Revision: https://reviews.llvm.org/D28073 llvm-svn: 292447	2017-01-18 23:26:37 +00:00
Florian Hahn	e43c2b1c03	[loop-unroll] Properly populate LoopInfo for loops cloned in LoopUnrollRuntime. Summary: This fixes Transforms/LoopUnroll/runtime-loop3.ll which failed with EXTENSIVE_DEBUG, because the cloned basic blocks were not added to the correct sub-loops in LoopUnrollRuntime.cpp. Reviewers: dexonsmith, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28482 llvm-svn: 291619	2017-01-10 23:43:35 +00:00
Michael Zolotukhin	00ccad14c6	Revert "[LoopUnroll] Properly update loop-info when cloning prologues and epilogues." This reverts commit r280901. This caused a bunch of failures, reverting it until I investigate them. llvm-svn: 280905	2016-09-08 03:51:30 +00:00
Michael Zolotukhin	109c858375	[LoopUnroll] Properly update loop-info when cloning prologues and epilogues. Summary: When cloning blocks for prologue/epilogue we need to replicate the loop structure from the original loop. It wasn't a problem for the innermost loops, but it led to an incorrect loop info when we unrolled a loop with a child loop - in this case created prologue-loop had a child loop, but loop info didn't reflect that. This fixes PR28888. Reviewers: chandlerc, sanjoy, hfinkel Subscribers: llvm-commits, silvas Differential Revision: https://reviews.llvm.org/D24203 llvm-svn: 280901	2016-09-08 01:52:26 +00:00
Wei Mi	d25dea67a3	[UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 llvm-svn: 279748	2016-08-25 16:17:18 +00:00
Michael Zolotukhin	7bf05c9357	[LoopUnroll] Ensure we create prolog loops in simplified form. llvm-svn: 277502	2016-08-02 19:19:31 +00:00
Evgeny Stupachenko	9d35c64dc2	The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662	2016-04-27 03:04:54 +00:00
Duncan P. N. Exon Smith	e34a7ba201	Transforms: Fix bootstrap after r266565 Apparently there isn't test coverage for all of these. I'd appreciate if someone with could reproduce and send me something to reduce, but for now I've just looked for users of RemapInstruction and MapValue and ensured they don't accidentally insert nullptr. Here is one of the bootstraps that caught: http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/11494 llvm-svn: 266567	2016-04-17 19:26:49 +00:00
Evgeny Stupachenko	eb50bea9cf	test commit llvm-svn: 265840	2016-04-08 20:20:38 +00:00
Duncan P. N. Exon Smith	5f260975e3	IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC Clarify what this RemapFlag actually means. - Change the flag name to match its intended behaviour. - Clearly document that it's not supposed to affect globals. - Add a host of FIXMEs to indicate how to fix the behaviour to match the intent of the flag. RF_IgnoreMissingLocals should only affect the behaviour of RemapInstruction for function-local operands; namely, for operands of type Argument, Instruction, and BasicBlock. Currently, it is only passed into RemapInstruction calls (and the transitive MapValue calls that it makes). When I split Metadata from Value I didn't understand the flag, and I used it in a bunch of places for "global" metadata. This commit doesn't have any functionality change, but prepares to cleanup MapMetadata and MapValue. llvm-svn: 265628	2016-04-07 00:26:43 +00:00
David L Kreitzer	01b0be98a9	Adds the ability to use an epilog remainder loop during loop unrolling and makes this the default behavior. Patch by Evgeny Stupachenko (evstupac@gmail.com). Differential Revision: http://reviews.llvm.org/D18158 llvm-svn: 265388	2016-04-05 12:19:35 +00:00
David L Kreitzer	f43e26ebd7	Enable non-power-of-2 #pragma unroll counts. Patch by Evgeny Stupachenko. Differential Revision: http://reviews.llvm.org/D18202 llvm-svn: 264407	2016-03-25 14:24:52 +00:00
Junmo Park	6fcf7ae0a7	[SCEVExpander] Make findExistingExpansion smarter Summary: Extending findExistingExpansion can use existing value in ExprValueMap. This patch gives 0.3~0.5% performance improvements on benchmarks(test-suite, spec2000, spec2006, commercial benchmark) Reviewers: mzolotukhin, sanjoy, zzheng Differential Revision: http://reviews.llvm.org/D15559 llvm-svn: 260938	2016-02-16 06:46:58 +00:00
Justin Lebar	c576113aad	Fix typo in comment. llvm-svn: 260731	2016-02-12 21:01:37 +00:00
Sanjay Patel	704c6546a9	rangify; NFC llvm-svn: 260151	2016-02-08 21:32:43 +00:00
Sanjay Patel	98666d19ec	fix typos; NFC llvm-svn: 260130	2016-02-08 19:27:33 +00:00
Junmo Park	4f5a66835c	Minor code formatting cleanup. NFC. llvm-svn: 259010	2016-01-28 01:23:18 +00:00
Justin Bogner	621a2ef540	LPM: Stop threading `Pass *` through all of the loop utility APIs. NFC A large number of loop utility functions take a `Pass *` and reach into it to find out which analyses to preserve. There are a number of problems with this: - The APIs have access to pretty well any Pass state they want, so it's hard to tell what they may or may not do. - Other APIs have copied these and pass around a `Pass *` even though they don't even use it. Some of these just hand a nullptr to the API since the callers don't even have a pass available. - Passes in the new pass manager don't work like the current ones, so the APIs can't be used as is there. Instead, we should explicitly thread the analysis results that we actually care about through these APIs. This is both simpler and more reusable. llvm-svn: 255669	2015-12-15 19:40:57 +00:00
Duncan P. N. Exon Smith	c29917fae7	TransformUtils: Remove implicit ilist iterator conversions, NFC Continuing the work from last week to remove implicit ilist iterator conversions. First related commit was probably r249767, with some more motivation in r249925. This edition gets LLVMTransformUtils compiling without the implicit conversions. No functional change intended. llvm-svn: 250142	2015-10-13 02:39:05 +00:00
Hans Wennborg	7d1f4ff326	Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups. Patch by Eugene Zelenko! Differential Revision: http://reviews.llvm.org/D13321 llvm-svn: 249482	2015-10-06 23:24:35 +00:00
Chandler Carruth	4d1e1851a4	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't really called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo *does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Chandler Carruth	ebae815d81	[PM/AA] Remove all of the dead AliasAnalysis pointers being threaded through APIs that are no longer necessary now that the update API has been removed. This will make changes to the AA interfaces significantly less disruptive (I hope). Either way, it seems like a really nice cleanup. llvm-svn: 242882	2015-07-22 09:52:54 +00:00
David Majnemer	194197c127	[LoopUnroll] Use undef for phis with no value live We would create a phi node with a zero initialized operand instead of undef in the case where no value was originally available. This was problematic for x86_mmx which has no null value. llvm-svn: 241143	2015-07-01 05:38:07 +00:00
Alexey Samsonov	0f73d0bb04	[LoopUnroll] Use IRBuilder to create branch instructions. Use IRBuilder::Create(Cond)?Br instead of constructing instructions manually with BranchInst::Create(). It's consistent with other uses of IRBuilder in this pass, and has an additional important benefit: Using IRBuilder will ensure that new branch instruction will get the same debug location as original terminator instruction it will eventually replace. For now I'm not adding a testcase, as currently original terminator instruction also lacks debug location due to missing debug location propagation in BasicBlock::splitBasicBlock. That is, the testcase will accompany the fix for the latter I'm going to mail soon. llvm-svn: 239550	2015-06-11 18:25:44 +00:00
Sanjoy Das	b9907c45f6	[LoopUnrollRuntime] Avoid high-cost trip count computation. Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive. Depends on D8993. Reviewers: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8994 llvm-svn: 234846	2015-04-14 03:20:38 +00:00
Sanjoy Das	880c33e480	[LoopUnrollRuntime] Clean up a predicate. Clean up a predicate I added in r229731, fix the relevant comment and add a test case. The earlier version is confusing to read and was also buggy (probably not a coincidence) till Alexey fixed it in r233881. llvm-svn: 234701	2015-04-12 01:24:01 +00:00
Alexey Samsonov	03b2851bcd	Fix a bug indicated by -fsanitize=shift-exponent. llvm-svn: 233881	2015-04-02 01:30:10 +00:00
Mehdi Amini	f88efe5f8a	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointers to DataLayout into references, this helped figuring out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independent commits but some later commits were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740	2015-03-10 02:37:25 +00:00
Kevin Qin	4ba876c24d	Revert r231630 - Run LICM pass after loop unrolling pass. As it broke llvm bootstrap. llvm-svn: 231635	2015-03-09 07:26:37 +00:00
Kevin Qin	92a0be0434	Run LICM pass after loop unrolling pass. Runtime unrolling will introduce a runtime check in loop prologue. If the unrolled loop is an inner loop, then the prologue will be inside the outer loop. LICM pass can help to promote the runtime check out if the checked value is loop invariant. llvm-svn: 231630	2015-03-09 06:14:07 +00:00

73 Commits