llvm/test/Transforms/LoopUnroll
Michael Zolotukhin a538be3ab1 [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
keep this from becoming an excessively slow analysis.
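
The caching and backward walk described above can be sketched with a toy
model. This is only an illustration under stated assumptions, not the code
from the patch: the names `Inst`, `UnrollCostModel`, `mark_live`, the `free`
and `live_out` parameters, and the costs are all hypothetical, and the set of
"free" instructions is taken as an input rather than computed. Each
(iteration, instruction) pair gets a two-bit state, and cost is added only
when a sink (a side effect, or a value used after the loop) pulls the
instruction in.

```python
from dataclasses import dataclass

IS_FREE = 0b01     # hypothetical bit: instruction simplified away in this iteration
IS_COUNTED = 0b10  # hypothetical bit: cost already added to the running total


@dataclass
class Inst:
    name: str
    operands: tuple = ()       # names of operands defined in the same iteration
    backedge: str = ""         # for a PHI: the value it takes from the previous iteration
    side_effect: bool = False  # e.g. a store; acts as a sink that forces liveness
    cost: int = 1


class UnrollCostModel:
    def __init__(self, body, trip_count, free=(), live_out=()):
        self.body = {inst.name: inst for inst in body}
        self.order = [inst.name for inst in body]
        self.index = {name: i for i, name in enumerate(self.order)}
        self.trip_count = trip_count
        self.free = set(free)          # assumed input: (iteration, name) pairs
        self.live_out = tuple(live_out)
        # One small integer of state per (iteration, instruction).
        self.state = [[0] * len(body) for _ in range(trip_count)]
        self.unrolled_cost = 0

    def analyze(self):
        # Forward pass: record free instructions, count only from sinks.
        for it in range(self.trip_count):
            for name in self.order:
                if (it, name) in self.free:
                    self.state[it][self.index[name]] |= IS_FREE
                elif self.body[name].side_effect:
                    self.mark_live(it, name)
        # Values used after the loop are sinks in the last iteration.
        for name in self.live_out:
            self.mark_live(self.trip_count - 1, name)
        return self.unrolled_cost

    def mark_live(self, iteration, name):
        # Walk back through everything feeding the sink, crossing the
        # backedge through PHIs into earlier iterations as needed.
        worklist = [(iteration, name)]
        while worklist:
            it, n = worklist.pop()
            idx = self.index[n]
            if self.state[it][idx] & (IS_FREE | IS_COUNTED):
                continue  # free, or already counted: never charged twice
            self.state[it][idx] |= IS_COUNTED
            inst = self.body[n]
            self.unrolled_cost += inst.cost
            if inst.backedge:
                if it > 0:
                    worklist.append((it - 1, inst.backedge))
            else:
                worklist.extend((it, op) for op in inst.operands)


body = [
    Inst("i", backedge="i.next", cost=0),      # induction PHI
    Inst("acc", backedge="acc.next", cost=0),  # accumulator PHI
    Inst("x", operands=("i",)),                # never used: stays dead
    Inst("acc.next", operands=("acc", "i")),
    Inst("i.next", operands=("i",)),
]
model = UnrollCostModel(body, trip_count=3, live_out=("acc.next",))
total = model.analyze()
```

With three iterations and only `acc.next` used after the loop, `total` is 5:
three copies of `acc.next` plus the two copies of `i.next` that feed later
iterations. The dead `x` and the final `i.next` are never charged. A real
implementation could pack the two bits far more densely than a list of
Python integers.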

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up; I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look like.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrences around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc., would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.
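
To make the top-down half concrete, here is a minimal sketch of forward
propagation across the backedge. It is a toy, not the analyzer from the
patch: `simplify_forward`, the tuple encoding of instructions, and the
`phi_init` map are all assumptions made for illustration, and it only does
constant folding of `add` and `mul`. Known values flow forward one iteration
at a time, and a PHI picks up the previous iteration's backedge value, so no
state space is explored.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}


def simplify_forward(body, phi_init, trip_count):
    """Return the set of (iteration, name) pairs that fold to a constant.

    body: list of (name, opcode, lhs, rhs) rows in program order. Operands
    are instruction names or int literals. A PHI row is
    (name, "phi", backedge_name, None). phi_init maps a PHI name to its
    known starting value; PHIs missing from it start out unknown.
    """
    free = set()
    prev = {}  # constants known at the end of the previous iteration
    for it in range(trip_count):
        known = {}
        for name, opcode, lhs, rhs in body:
            if opcode == "phi":
                value = phi_init.get(name) if it == 0 else prev.get(lhs)
            else:
                a = lhs if isinstance(lhs, int) else known.get(lhs)
                b = rhs if isinstance(rhs, int) else known.get(rhs)
                if a is None or b is None:
                    value = None
                else:
                    value = OPS[opcode](a, b)
            if value is not None:
                known[name] = value
                free.add((it, name))
        prev = known
    return free


body = [
    ("i", "phi", "i.next", None),
    ("acc", "phi", "acc.next", None),     # starts unknown, so it never folds
    ("scaled", "mul", "i", 4),
    ("acc.next", "add", "acc", "scaled"),
    ("i.next", "add", "i", 1),
]
free = simplify_forward(body, phi_init={"i": 0}, trip_count=3)
```

Here `i`, `scaled`, and `i.next` fold in every iteration, while `acc` and
`acc.next` stay unsimplified because the accumulator's starting value is
unknown. Whether the unsimplified instructions are ever charged is left to
the separate bottom-up cost accounting.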

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when, had we continued, the cost would have gone back
below the threshold. These kinds of bugs can cause random changes in
behavior that are incredibly hard to track down.
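
A small illustration of why monotonicity matters for the early exit. The
numbers and function names below are made up for the example and do not come
from the patch. An accounting scheme that charges cost up front and refunds
it later can trip the threshold midway even though its final total is under
it. A scheme that only ever adds cost cannot disagree with itself in that
way.

```python
def early_exit_verdict(deltas, threshold):
    """Bail out as soon as the running cost exceeds the threshold."""
    total = 0
    for delta in deltas:
        total += delta
        if total > threshold:
            return "reject"
    return "accept"


def full_verdict(deltas, threshold):
    """Decide only after every cost change has been applied."""
    return "reject" if sum(deltas) > threshold else "accept"


THRESHOLD = 8  # hypothetical unroll threshold

# Non-monotonic: charge everything eagerly, refund once code is found dead.
eager = [5, 5, -6]
# Monotonic: charge nothing until an instruction is known to be live.
lazy = [0, 4, 0]

eager_early = early_exit_verdict(eager, THRESHOLD)
eager_full = full_verdict(eager, THRESHOLD)
lazy_early = early_exit_verdict(lazy, THRESHOLD)
lazy_full = full_verdict(lazy, THRESHOLD)
```

The eager sequence is rejected by the early exit but accepted when run to
completion, which is exactly the kind of inconsistency described above. The
lazy sequence gets the same verdict both ways.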

We could use a similar (but much simpler) technique within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269388 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 01:42:39 +00:00
..
AArch64 The patch fixes PR27392. 2016-04-27 03:04:54 +00:00
AMDGPU AMDGPU: Remove some old intrinsic uses from tests 2016-02-11 06:02:01 +00:00
PowerPC Loop unroller: set thresholds for optsize and minsize functions to zero 2016-05-10 21:45:55 +00:00
X86 Adds the ability to use an epilog remainder loop during loop unrolling and makes 2016-04-05 12:19:35 +00:00
2004-05-13-DontUnrollTooMuch.ll
2005-03-06-BadLoopInfoUpdate.ll
2006-08-24-MultiBlockLoop.ll
2007-04-16-PhiUpdate.ll
2007-05-05-UnrollMiscomp.ll [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction 2015-02-27 19:29:02 +00:00
2007-05-09-UnknownTripCount.ll
2007-11-05-Crash.ll
2011-08-08-PhiUpdate.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
2011-08-09-IVSimplify.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
2011-08-09-PhiUpdate.ll
2011-10-01-NoopTrunc.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
2012-04-09-unroll-indirectbr.ll
basic.ll
convergent.ll [LoopUnroll] Respect the convergent attribute. 2016-03-14 23:15:34 +00:00
ephemeral.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
full-unroll-bad-cost.ll [LoopUnroll] Fix truncation bug in canUnrollCompletely. 2015-06-06 05:24:10 +00:00
full-unroll-crashers.ll [Unroll] Do not crash trying to propagate a value to vector load. 2015-09-22 22:27:12 +00:00
full-unroll-heuristics-2.ll [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... 2016-05-13 01:42:39 +00:00
full-unroll-heuristics-cmp.ll [LoopUnrollAnalyzer] Don't treat gep-instructions with simplified offset as simplified. 2016-05-13 01:42:34 +00:00
full-unroll-heuristics-dce.ll [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... 2016-05-13 01:42:39 +00:00
full-unroll-heuristics-geps.ll [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... 2016-05-13 01:42:39 +00:00
full-unroll-heuristics-phi-prop.ll [Unroll] Improve the brute force loop unroll estimate by propagating 2015-08-03 20:32:27 +00:00
full-unroll-heuristics.ll [Unroll] Rework the naming and structure of the new unroll heuristics. 2015-06-05 17:01:43 +00:00
high-cost-trip-count-computation.ll Adds the ability to use an epilog remainder loop during loop unrolling and makes 2016-04-05 12:19:35 +00:00
ignore-annotation-intrinsic-cost.ll [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction 2015-02-27 19:29:02 +00:00
loop-remarks.ll
nsw-tripcount.ll [SCEV] Improve Scalar Evolution's use of no {un,}signed wrap flags 2014-10-31 11:40:32 +00:00
partial-unroll-const-bounds.ll LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true 2016-04-06 16:43:45 +00:00
pr10813.ll
pr11361.ll
pr14167.ll
pr18861.ll [Tests] Add one more case to LoopUnroll/pr18861.ll for better coverage. 2015-10-02 19:21:52 +00:00
pr27157.ll [LoopUnroll] Fix the way we update DT after complete unrolling. 2016-04-06 21:47:12 +00:00
rebuild_lcssa.ll [LoopUnrolling] Fix a bug introduced in r259869 (PR26688). 2016-02-22 21:21:45 +00:00
runtime-loop1.ll The patch fixes PR27392. 2016-04-27 03:04:54 +00:00
runtime-loop2.ll Adds the ability to use an epilog remainder loop during loop unrolling and makes 2016-04-05 12:19:35 +00:00
runtime-loop3.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
runtime-loop4.ll Adds the ability to use an epilog remainder loop during loop unrolling and makes 2016-04-05 12:19:35 +00:00
runtime-loop5.ll Adds the ability to use an epilog remainder loop during loop unrolling and makes 2016-04-05 12:19:35 +00:00
runtime-loop.ll The patch fixes PR27392. 2016-04-27 03:04:54 +00:00
scevunroll.ll Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader" 2015-11-03 07:14:39 +00:00
shifted-tripcount.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
tripcount-overflow.ll The patch fixes PR27392. 2016-04-27 03:04:54 +00:00
unloop.ll LoopInfo: Simplify ownership of Loop objects 2016-01-08 19:08:53 +00:00
unroll-cleanup.ll Adds the ability to use an epilog remainder loop during loop unrolling and makes 2016-04-05 12:19:35 +00:00
unroll-cleanuppad.ll [LoopUnroll] Unroll loops which have exit blocks to EH pads 2016-05-03 03:57:40 +00:00
unroll-opt-attribute.ll Loop unroller: set thresholds for optsize and minsize functions to zero 2016-05-10 21:45:55 +00:00
unroll-pragmas-disabled.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
unroll-pragmas.ll Loop unroller: set thresholds for optsize and minsize functions to zero 2016-05-10 21:45:55 +00:00
update-loop-info-in-subloops.ll LoopUnroll: Create sub-loops in LoopInfo 2014-10-07 21:19:00 +00:00