202 Commits

Author SHA1 Message Date
Haicheng Wu
86f0394337 [LoopUnroll] Check partial unrolling is enabled before initialization. NFC.
Differential Revision: https://reviews.llvm.org/D23891

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285330 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 18:40:02 +00:00
Michael Kuperstein
0e4bd938f4 Fix 80-char violations. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285092 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 18:31:23 +00:00
John Brawn
9e0c61cbeb [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound,
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.
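
A hand-written sketch of the shape of this transformation, assuming a loop whose trip count is known to be either zero or exactly 4. The function names and the bound 4 are hypothetical, and this is source-level illustration rather than actual compiler output:

```cpp
#include <cassert>

// Hypothetical 'max-or-zero' loop: the caller guarantees n is 0 or 4.
int sumLoop(const int *a, int n) {
  assert(n == 0 || n == 4);
  int s = 0;
  for (int i = 0; i < n; ++i)
    s += a[i];
  return s;
}

// The same loop unrolled by hand up to the upper bound. Only the first
// loop test survives, and it handles the zero-iteration case.
int sumUnrolled(const int *a, int n) {
  assert(n == 0 || n == 4);
  int s = 0;
  if (0 < n) {
    s += a[0];
    s += a[1];
    s += a[2];
    s += a[3];
  }
  return s;
}
```

Both functions return the same value for every permitted `n`; the unrolled form simply has no backedge and no tests after the first one.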

Differential Revision: https://reviews.llvm.org/D25682


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 11:08:48 +00:00
Haicheng Wu
b893afb0a5 Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
Reapply r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops that contain a break statement, like this:

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

Using the upper bound is enabled under the same circumstances as runtime
unrolling, since both are used to unroll loops without knowing the exact
constant trip count.
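
To make the effect concrete, here is a hand-unrolled version of the search loop from the summary, written as a sketch. The function names are hypothetical, and the unrolled form only illustrates the idea (eight copies of the body, each keeping its early exit, no backedge); it is not the IR the pass produces:

```cpp
#include <cassert>

// The loop from the summary: exact trip count unknown, upper bound is 8.
bool findLoop(const int *a, int value) {
  assert(a != nullptr);
  bool found = false;
  for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
      found = true;
      break;
    }
  }
  return found;
}

// Fully unrolled by the upper bound: every copy keeps its own exit test.
bool findUnrolled(const int *a, int value) {
  assert(a != nullptr);
  if (a[0] == value) return true;
  if (a[1] == value) return true;
  if (a[2] == value) return true;
  if (a[3] == value) return true;
  if (a[4] == value) return true;
  if (a[5] == value) return true;
  if (a[6] == value) return true;
  if (a[7] == value) return true;
  return false;
}
```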

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284053 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 21:29:38 +00:00
Haicheng Wu
3d05abfa85 Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
This reverts commit r284044.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284051 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 21:02:22 +00:00
Haicheng Wu
6ceda533e5 [LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop
This patch tries to fully unroll loops that contain a break statement, like this:

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

Using the upper bound is enabled under the same circumstances as runtime
unrolling, since both are used to unroll loops without knowing the exact
constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 20:24:32 +00:00
Dehao Chen
9faad5871f Update loop unroller cost model to make sure debug info does not affect optimization decisions.
Summary: Debug info should *not* affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info.
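
A minimal sketch of the principle, assuming a simplified model in which each instruction carries a cost and a flag saying whether it is debug info. The `Inst` type and `loopSize` function are hypothetical and not the actual cost model code:

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified instruction: a size cost plus a debug-info flag.
struct Inst {
  unsigned cost;
  bool isDebugInfo;
};

// Sum the size of a loop body while skipping debug-info instructions, so
// the result is identical with and without -g.
unsigned loopSize(const std::vector<Inst> &body) {
  unsigned size = 0;
  for (const Inst &I : body) {
    if (I.isDebugInfo)
      continue; // debug info must not influence the unrolling decision
    size += I.cost;
  }
  return size;
}
```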

Reviewers: davidxl, mzolotukhin

Subscribers: haicheng, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25098

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282894 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 18:30:04 +00:00
Adam Nemet
4fa3e14023 [LoopUnroll] Port to the new streaming interface for opt remarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282834 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 03:44:16 +00:00
Jonas Paulsson
2db5109945 [SystemZ] Implementation of getUnrollingPreferences().
This commit enables more unrolling for SystemZ by implementing the
SystemZTargetTransformInfo::getUnrollingPreferences() method.

It has been found that it is better to only unroll moderately, so the
DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order
to set this to a lower value for SystemZ (4).
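
A sketch of the pattern described here, assuming a simplified stand-in for the TTI interface. The types below are hypothetical, and the generic default of 8 is an assumption made for illustration; only the SystemZ value of 4 comes from the message above:

```cpp
#include <cassert>

// Hypothetical stand-in for UnrollingPreferences with the moved field.
struct UnrollPrefs {
  unsigned DefaultUnrollRuntimeCount = 8; // assumed generic default
};

// Hypothetical stand-in for a target hook that may adjust the preferences.
struct TargetInfo {
  virtual ~TargetInfo() = default;
  virtual void getUnrollingPreferences(UnrollPrefs &) const {}
};

// A SystemZ-like target that prefers more moderate runtime unrolling.
struct SystemZLikeTarget : TargetInfo {
  void getUnrollingPreferences(UnrollPrefs &UP) const override {
    UP.DefaultUnrollRuntimeCount = 4;
  }
};

// Start from the defaults, then let the target override them.
unsigned runtimeCountFor(const TargetInfo &TI) {
  UnrollPrefs UP;
  TI.getUnrollingPreferences(UP);
  assert(UP.DefaultUnrollRuntimeCount > 0);
  return UP.DefaultUnrollRuntimeCount;
}
```

Keeping the default inside the preferences struct, rather than as a file-level constant, is what lets a single target lower it without affecting the others.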

Reviewers: Evgeny Stupachenko, Ulrich Weigand.
https://reviews.llvm.org/D24451

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282570 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 09:41:38 +00:00
Haicheng Wu
16d3f8855e [LoopUnroll] Correct a debug message. NFC.
Differential Revision: https://reviews.llvm.org/D24299

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280865 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 21:30:16 +00:00
Adam Nemet
d138c45c40 [LoopUnroll] Use OptimizationRemarkEmitter directly not via the analysis pass
We can't mark ORE (a function pass) preserved as required by the loop
passes because that is how we ensure that the required passes like
LazyBFI are all available any time ORE is used.  See the new comments in
the patch.

Instead we use it directly just like the inliner does in D22694.

As expected there is some additional overhead after removing the caching
provided by analysis passes.  The worst case I measured was
LNT/CINT2006_ref/401.bzip2 which regresses by 12%.  As before, this only
affects -Rpass-with-hotness and not default compilation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279829 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-26 15:58:34 +00:00
Michael Zolotukhin
7433358528 [LoopUnroll] By default disable unrolling when optimizing for size.
Summary:
In clang commit r268509 we started to invoke loop-unroll pass from the
driver even under -Os. However, we happen to not initialize optsize
thresholds properly, which is fixed with this change.

r268509 led to some big compile time regressions, because we started to
unroll some loops that we didn't unroll before. With this change I hope
to recover most of the regressions. We still are slightly slower than
before, because we do some checks here and there in loop-unrolling
before we bail out, but at least the slowdown is not that huge now.

Reviewers: hfinkel, chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23388

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279585 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-23 23:13:15 +00:00
Haicheng Wu
50d8d1ea85 [LoopUnroll] Move a simple check earlier. NFC.
Move the check of CallInst earlier to skip expensive recursive operations.

Differential Revision: https://reviews.llvm.org/D23611

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278998 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 22:42:58 +00:00
Sean Silva
a4f9d70f9b Consistently use LoopAnalysisManager
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).

Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.

Besides a general consistency benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278079 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-09 00:28:52 +00:00
Adam Nemet
c4f6d8cd25 [LoopUnroll] Include hotness of region in opt remark
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.

The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.

This is how the patch affects the O3 pipeline:

         Dominator Tree Construction
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Rotate Loops
           Loop Invariant Code Motion
           Unswitch loops
         Simplify the CFG
         Dominator Tree Construction
         Basic Alias Analysis (stateless AA impl)
         Function Alias Analysis Results
         Combine redundant instructions
         Natural Loop Information
         Canonicalize natural loops
         Loop-Closed SSA Form Pass
         Scalar Evolution Analysis
+        Lazy Branch Probability Analysis
+        Lazy Block Frequency Analysis
+        Optimization Remark Emitter
         Loop Pass Manager
           Induction Variable Simplification
           Recognize loop idioms
           Delete dead loops
           Unroll loops
...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-29 19:29:47 +00:00
Sean Silva
d8c90ea6b8 [PM] Port LoopUnroll.
We just set PreserveLCSSA to always true since we don't have an
analogous method `mustPreserveAnalysisID(LCSSA)`.

Also port LoopInfo verifier pass to test LoopUnrollPass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276063 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 23:54:23 +00:00
David Majnemer
3df2af21e9 [LoopUnroll] Don't crash trying to unroll loop with EH pad exit
We do not support splitting cleanuppad or catchswitches.  This is
problematic for passes which assume that a loop is in loop simplify
form (the loop would have a dedicated exit block instead of sharing it).

While it isn't great that we don't support this for cleanups, we still
cannot make loop-simplify form an assertable precondition because
indirectbr will also disable these sorts of CFG cleanups.

This fixes PR28132.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272739 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 00:19:56 +00:00
Evgeny Stupachenko
3cb6afa22a The patch set unroll disable pragma when unroll
with user specified count has been applied.

Summary:
Previously SetLoopAlreadyUnrolled() set the disable pragma only if
there was some loop metadata.
Now it sets the pragma in all cases. This helps to prevent multiple
unroll when -unroll-count=N is given.
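
A rough sketch of the new behaviour, assuming loop metadata is modelled as a plain list of strings. The `LoopMetadata` alias and this free function are hypothetical; dropping the earlier unroll hints before adding the disable marker is also an assumption of the sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a loop's metadata as a list of hint names.
using LoopMetadata = std::vector<std::string>;

// Mark the loop as already unrolled. The disable hint is added even when
// the loop had no metadata at all, so a later run skips the loop.
void setLoopAlreadyUnrolled(LoopMetadata &MD) {
  // Drop earlier unroll hints (assumed behaviour for this sketch).
  MD.erase(std::remove_if(MD.begin(), MD.end(),
                          [](const std::string &S) {
                            return S.rfind("llvm.loop.unroll.", 0) == 0;
                          }),
           MD.end());
  MD.push_back("llvm.loop.unroll.disable");
  assert(!MD.empty());
}
```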

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D20765

From: Evgeny Stupachenko <evstupac@gmail.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272195 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-08 20:21:24 +00:00
Michael Zolotukhin
3c11b7d257 [LoopUnroll] Set correct thresholds for new recently enabled unrolling heuristic.
In r270478, where I enabled the new heuristic, I posted testing results
that I got when I explicitly passed the threshold values via CL options.
However, setting the CL options init-values is not enough to change the
default values of thresholds, so I'm changing them in another place now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271615 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 00:16:46 +00:00
Evgeny Stupachenko
6be2be5352 The patch fixes r271071
Summary:
unused variables in Release mode:
  BasicBlock *Header
  unsigned OrigCount
put under DEBUG

From: Evgeny Stupachenko <evstupac@gmail.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271076 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-28 00:14:58 +00:00
Evgeny Stupachenko
1369b53da1 The patch refactors unroll pass.
Summary:
Unroll factor (Count) calculations moved to a new function.
Early exits on pragma and "-unroll-count" defined factor added.
New type of unrolling "Force" introduced (previously used implicitly).
New unroll preference "AllowRemainder" introduced and set "true" by default.
(should be set to false for architectures that suffer from it).

Reviewers: hfinkel, mzolotukhin, zzheng

Differential Revision: http://reviews.llvm.org/D19553

From: Evgeny Stupachenko <evstupac@gmail.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271071 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 23:15:06 +00:00
Benjamin Kramer
14aae01bc3 Apply clang-tidy's misc-move-constructor-init throughout LLVM.
No functionality change intended, maybe a tiny performance improvement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270997 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 14:27:24 +00:00
Michael Zolotukhin
642359b127 [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.
Condition might be simplified to a Constant, but it doesn't have to be
ConstantInt, so we should dyn_cast, instead of cast.
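
The difference can be sketched with standard C++ `dynamic_cast` as an analogue of `dyn_cast`: the checked cast yields null for the wrong type and the caller bails out. The class hierarchy and `knownBranchTaken` below are hypothetical stand-ins, not LLVM's classes:

```cpp
#include <cassert>

// Hypothetical stand-ins for a small constant hierarchy.
struct Constant {
  virtual ~Constant() = default;
};
struct ConstantInt : Constant {
  long Value;
  explicit ConstantInt(long V) : Value(V) {}
};
struct OtherConstant : Constant {}; // a constant that is not an integer

// Returns true and sets Taken only when the simplified condition is a
// known integer. Any other kind of constant is rejected, not asserted on.
bool knownBranchTaken(const Constant *Cond, bool &Taken) {
  assert(Cond != nullptr);
  const auto *CI = dynamic_cast<const ConstantInt *>(Cond);
  if (!CI)
    return false; // an unconditional cast would misbehave here
  Taken = CI->Value != 0;
  return true;
}
```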

This fixes PR27886.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270924 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-26 21:42:51 +00:00
Michael Zolotukhin
4b1e009d8e Re-enable "[LoopUnroll] Enable advanced unrolling analysis by default" one more time.
This reverts commit r270577.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270630 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 23:00:05 +00:00
Hans Wennborg
79d5d94f04 Revert r270518, which re-enabled "[LoopUnroll] Enable advanced unrolling analysis by default.
Chromium builds are still hitting the assert in PR27874.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270577 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 16:10:12 +00:00
Michael Zolotukhin
01a4a3e2cc Revert "Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default.""
This reverts commit r270512 and reapplies r270478. Originally it caused
PR27847, but it was fixed in r270517.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270518 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 01:22:20 +00:00
Hans Wennborg
f82506b889 Revert r270478 "[LoopUnroll] Enable advanced unrolling analysis by default."
This caused PR27847.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270512 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-23 23:42:35 +00:00
Michael Zolotukhin
4b3e216784 [LoopUnroll] Enable advanced unrolling analysis by default.
Summary:
This patch turns on LoopUnrollAnalyzer by default. To mitigate compile
time regressions, I chose very conservative thresholds for now. Later we
can make them more aggressive, but it might require being smarter in
which loops we're optimizing. E.g. currently the biggest issue is that
with more aggressive thresholds we unroll many cold loops, which
increases compile time for no performance benefit (performance of those
loops is improved, but it doesn't matter since they are cold).

Test results for compile time(using 4 samples to reduce noise):
```
MultiSource/Benchmarks/VersaBench/ecbdes/ecbdes 5.19%
SingleSource/Benchmarks/Polybench/medley/reg_detect/reg_detect  4.19%
MultiSource/Benchmarks/FreeBench/fourinarow/fourinarow  3.39%
MultiSource/Applications/JM/lencod/lencod 1.47%
MultiSource/Benchmarks/Fhourstones-3_1/fhourstones3_1 -6.06%
```

I didn't see any performance changes in the testsuite, but it improves
some internal tests.

Reviewers: hfinkel, chandlerc

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D20482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270478 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-23 19:10:19 +00:00
Michael Zolotukhin
6e6d60d000 [LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands.
Previously, we didn't add their cost or their operands' cost, which could've
resulted in unrolling loops for no actual benefit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269985 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 21:20:12 +00:00
Michael Zolotukhin
2463a66c88 Revert "Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...""
This reverts commit r269395.

Try to reapply with a fix from chapuni.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269486 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 21:23:25 +00:00
Michael Zolotukhin
a934d5cb93 Revert "[Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the..."
This reverts commit r269388.

It caused some bots to fail, I'm reverting it until I investigate the
issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269395 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 06:32:25 +00:00
Michael Zolotukhin
a538be3ab1 [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
keep this from becoming an excessively slow analysis.
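
A much-reduced sketch of the accounting idea, assuming a toy dependency graph in which each node stores a cost, its operand indices, and a "counted" bit. The `Node` type and `addLiveCost` are hypothetical; the real analysis additionally tracks iterations and simplified values:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical instruction node: cost, operands, and a cached 'counted' bit.
struct Node {
  unsigned cost;
  std::vector<int> operands;
  bool counted = false;
  Node(unsigned c, std::vector<int> ops) : cost(c), operands(std::move(ops)) {}
};

// Starting from a sink (a side effect or a value used after the loop), walk
// back through everything feeding it and count each node's cost once.
// Returns the cost newly added, so a running total can only grow.
unsigned addLiveCost(std::vector<Node> &G, int Sink) {
  unsigned Added = 0;
  std::vector<int> Worklist{Sink};
  while (!Worklist.empty()) {
    int Id = Worklist.back();
    Worklist.pop_back();
    assert(Id >= 0 && static_cast<std::size_t>(Id) < G.size());
    if (G[Id].counted)
      continue; // already live and already paid for
    G[Id].counted = true;
    Added += G[Id].cost;
    for (int Op : G[Id].operands)
      Worklist.push_back(Op);
  }
  return Added;
}
```

Nodes that never reach a sink are never counted, which is what keeps the running total a conservatively low estimate.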

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up; I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look like.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrences around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc, would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when, had we continued, the cost would have gone back
below the threshold. These kinds of bugs can cause incredibly hard to
track down random changes to behavior.

We could use a similar (but much simpler) technique within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269388 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 01:42:39 +00:00
Hans Wennborg
9ee5a28c8c Loop unroller: set thresholds for optsize and minsize functions to zero
Before r268509, Clang would disable the loop unroll pass when optimizing
for size. That commit enabled it to be able to support unroll pragmas
in -Os builds. However, this regressed binary size in one of Chromium's
DLLs by ~100 KB.

This restores the original behaviour of no unrolling at -Os, but doing it
in LLVM instead of Clang makes more sense, and also allows the pragmas to
keep working.
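
A sketch of how such a threshold choice could look, assuming pragma-requested unrolling uses its own threshold. The function names are hypothetical and the numeric values in the usage note are placeholders:

```cpp
#include <cassert>

// Pick the size threshold for a loop. Functions optimized for size get a
// zero threshold, but an explicit unroll pragma still uses its own limit.
unsigned effectiveThreshold(bool optForSize, bool hasUnrollPragma,
                            unsigned defaultThreshold,
                            unsigned pragmaThreshold) {
  if (hasUnrollPragma)
    return pragmaThreshold; // pragmas keep working under -Os
  if (optForSize)
    return 0; // no heuristic unrolling when optimizing for size
  return defaultThreshold;
}

// A loop is unrolled only if its unrolled size fits the threshold.
bool shouldUnroll(unsigned unrolledSize, unsigned threshold) {
  assert(unrolledSize > 0);
  return unrolledSize <= threshold;
}
```

For example, with placeholder thresholds of 150 and 16384, a loop of unrolled size 40 would be rejected in an optsize function unless it carries an unroll pragma.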

Differential revision: http://reviews.llvm.org/D20115

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269124 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 21:45:55 +00:00
Dehao Chen
8b3e014d45 clang-format some files in preparation of coming patch reviews.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268583 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-05 00:54:54 +00:00
Andrew Kaylor
1e455c5cfb Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267231 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 22:06:11 +00:00
Vedant Kumar
8866d94a61 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 06:51:37 +00:00
Andrew Kaylor
c852398cbc Initial implementation of optimization bisect support.
This patch implements an optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an additional OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.
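
The counting idea can be sketched as follows. The class name `OptBisectSketch`, the method name, and the convention that a negative limit means "no limit" are assumptions of this sketch, not a copy of the real OptBisect interface:

```cpp
#include <cassert>

// Hypothetical bisect counter: each skippable pass asks before running.
class OptBisectSketch {
  int Limit;       // assumed convention: negative means bisection is off
  int LastNum = 0; // number of passes that have asked so far

public:
  explicit OptBisectSketch(int L) : Limit(L) {}

  // Returns true if the pass asking should run, false if it is skipped.
  bool shouldRunPass() {
    if (Limit < 0)
      return true;
    ++LastNum;
    assert(LastNum > 0);
    return LastNum <= Limit;
  }
};
```

Bisecting a miscompile then amounts to a binary search over the limit: find the smallest limit at which the failure appears, and the pass with that number is the suspect.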

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-21 17:58:54 +00:00
Fiona Glaser
a4b0d0e4db Loop Unroll: add options and tweak to make Partial unrolling more useful
1. Add FullUnrollMaxCount option that works like MaxCount, but also limits
   the unroll count for fully unrolled loops. So if a loop has an iteration
   count over this, it won't fully unroll.
2. Add CLI options for MaxCount and the new option, so they can be tested
   (plus a test).
3. Make partial unrolling obey MaxCount.

An example use-case (the out of tree one this is originally designed for) is
a target’s TTI can analyze a loop and decide on a max unroll count separate
from the size threshold, e.g. based on register pressure, then constrain
LoopUnroll to not exceed that, regardless of the size of the unrolled loop.
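
A small sketch of how the two limits could interact, assuming a simplified preferences struct. The `Prefs` type and helper names are hypothetical; only the option names `MaxCount` and `FullUnrollMaxCount` come from the message above:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical subset of the unrolling preferences.
struct Prefs {
  unsigned MaxCount;           // cap on the unroll count in general
  unsigned FullUnrollMaxCount; // cap on trip counts eligible for full unroll
};

// A loop with a known trip count above FullUnrollMaxCount is not fully
// unrolled, regardless of how small its body is.
bool mayFullyUnroll(unsigned tripCount, const Prefs &P) {
  return tripCount != 0 && tripCount <= P.FullUnrollMaxCount;
}

// Partial unrolling obeys MaxCount.
unsigned clampPartialCount(unsigned count, const Prefs &P) {
  assert(P.MaxCount > 0);
  return std::min(count, P.MaxCount);
}
```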


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265562 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 16:57:25 +00:00
Fiona Glaser
93b72547c0 LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true
Patch by Evgeny Stupachenko <evstupac@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265558 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 16:43:45 +00:00
Zia Ansari
06022d8db1 Enable unroll for constant bound loops when TripCount is not modulo of unroll factor, reducing it to maximum power-of-2 that satisfies threshold limit.
Commit for Evgeny Stupachenko (evstupac@gmail.com)
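
One way to picture the "maximum power-of-2 that satisfies the threshold" rule is the sketch below. It assumes the unrolled size is simply body size times count, which is a simplification, and the function name is hypothetical:

```cpp
#include <cassert>

// Largest power-of-two unroll count that does not exceed maxCount and
// keeps loopSize * count within threshold. Returns 1 if nothing fits,
// meaning the loop is left as it is.
unsigned largestPow2Count(unsigned loopSize, unsigned threshold,
                          unsigned maxCount) {
  assert(loopSize > 0);
  unsigned long long count = 1;
  while (count * 2 <= maxCount &&
         static_cast<unsigned long long>(loopSize) * (count * 2) <= threshold)
    count *= 2;
  return static_cast<unsigned>(count);
}
```

Because the chosen count need not divide the trip count, the leftover iterations have to be handled separately by a remainder.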

Differential Revision: http://reviews.llvm.org/D18290



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265337 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-04 19:24:46 +00:00
David L Kreitzer
88ef819968 Enable non-power-of-2 #pragma unroll counts.
Patch by Evgeny Stupachenko.

Differential Revision: http://reviews.llvm.org/D18202


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264407 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-25 14:24:52 +00:00
Justin Lebar
64d996c3f3 [LoopUnroll] Respect the convergent attribute.
Summary:
Specifically, when we perform runtime loop unrolling of a loop that
contains a convergent op, we can only unroll k times, where k divides
the loop trip multiple.

Without this change, we'll happily unroll e.g. the following loop

  for (int i = 0; i < N; ++i) {
    if (i == 0) convergent_op();
    foo();
  }

into

  int i = 0;
  if (N % 2 == 1) {
    convergent_op();
    foo();
    ++i;
  }
  for (; i < N - 1; i += 2) {
    if (i == 0) convergent_op();
    foo();
    foo();
  }.

This is unsafe, because we've just added a control-flow dependency to
the convergent op in the prelude.

In general, runtime unrolling loops that contain convergent ops is safe
only if we don't have to emit a prelude, which occurs when the unroll count
divides the trip multiple.
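
The divisibility rule can be sketched as a helper that lowers a desired unroll count until it divides the trip multiple. The function name and the "largest divisor not above the desired count" policy are assumptions of this sketch; the patch itself may choose the count differently:

```cpp
#include <cassert>

// For a loop containing a convergent op, return the largest unroll count
// that is no greater than 'desired' and divides 'tripMultiple', so that
// no prelude is needed. A result of 1 means no unrolling.
unsigned convergentSafeCount(unsigned desired, unsigned tripMultiple) {
  assert(tripMultiple >= 1);
  for (unsigned c = desired; c > 1; --c)
    if (tripMultiple % c == 0)
      return c;
  return 1;
}
```

With a trip multiple of 1, as in the `N`-iteration loop above, the only safe count is 1, so that loop is not runtime-unrolled at all.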

Reviewers: resistor

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263509 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-14 23:15:34 +00:00
Sanjay Patel
be9115f49d fix variable name; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262953 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-08 19:07:42 +00:00
Sanjay Patel
57d9dbefb3 use range-based loop; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262952 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-08 19:06:12 +00:00
Michael Zolotukhin
79c196414f [LoopUnrollAnalyzer] Check that we're using SCEV for the same loop we're simulating.
Summary: Check that we're using SCEV for the same loop we're simulating. Otherwise, we might try to use the iteration number of the current loop in SCEV expressions for inner/outer loops IVs, which is clearly incorrect.

Reviewers: chandlerc, hfinkel

Subscribers: sanjoy, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D17632

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261958 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-26 02:57:05 +00:00
Chandler Carruth
eca46e623a [LPM] Factor all of the loop analysis usage updates into a common helper
routine.

We were getting this wrong in small ways and generally being very
inconsistent about it across loop passes. Instead, let's have a common
place where we do this. One minor downside is that this will require
some analyses like SCEV in more places than they are strictly needed.
However, this seems benign as these analyses are complete no-ops, and
without this consistency we can in many cases end up with the legacy
pass manager scheduling deciding to split up a loop pass pipeline in
order to run the function analysis half-way through. It is very, very
annoying to fix these without just being very pedantic across the board.

The only loop passes I've not updated here are ones that use
AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer.
They seemed less relevant.

With this patch, almost all of the problems in PR24804 around loop pass
pipelines are fixed. The one remaining issue is that we run simplify-cfg
and instcombine in the middle of the loop pass pipeline. We've recently
added some loop variants of these passes that would seem substantially
cleaner to use, but this at least gets us much closer to the previous
state. Notably, the seven loop pass managers are down to three.

I've not updated the loop passes using LoopAccessAnalysis because that
analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't
clear that those transforms want to support those forms anyways. They
all run late anyways, so this is harmless. Similarly, LSR is left alone
because it already carefully manages its forms and doesn't need to get
fused into a single loop pass manager with a bunch of other loop passes.

LoopReroll didn't use loop simplified form previously, and I've updated
the test case to match the trivially different output.

Finally, I've also factored all the pass initialization for the passes
that use this technique as well, so that should be done regularly and
reliably.

Thanks to James for the help reviewing and thinking about this stuff,
and Ben for help thinking about it as well!

Differential Revision: http://reviews.llvm.org/D17435

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261316 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-19 10:45:18 +00:00
Michael Zolotukhin
08d1cff7c6 Factor out UnrollAnalyzer to Analysis, and add unit tests for it.
Summary:
Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which makes it possible to use unit tests.
The change itself is supposed to be NFC, except adding a couple of tests.

I plan to add more tests as I add new functionality and find/fix bugs.

Reviewers: chandlerc, hfinkel, sanjoy

Subscribers: zzheng, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D16623

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260169 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-08 23:03:59 +00:00
Justin Bogner
f4afe81203 LoopUnroll: Move the actual unrolling logic to a standalone function. NFC
This is pure code motion - break the actual work out of runOnLoop into
a reusable standalone function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257445 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 05:21:37 +00:00
Justin Bogner
0a729451cb LoopUnroll: Make canUnrollCompletely static - it doesn't use any state. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257427 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 01:06:32 +00:00
Justin Bogner
1ddf854804 LoopUnroll: Clean up the maze of initialization for unroll parameters. NFC
The layering of where the various loop unroll parameters are
initialized and overridden here was very confusing, making it pretty
difficult to tell just how the various sources interacted. Instead, we
put all of the initialization logic together in a single function so
that it's obvious what overrides what.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257426 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 00:55:26 +00:00