This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.
With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.
Fixes https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43245
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346438 91177308-0d34-0410-b5e6-96231b3b80d8
Inner-loop only reductions require additional checks to make sure they
form a load-phi-store cycle across inner and outer loop. Otherwise the
reduction value is not properly preserved. This patch disables
interchanging such loops for now, as it causes miscompiles in some
cases and it seems to apply only for a tiny amount of loops. Across the
test-suite, SPEC2000 and SPEC2006, 61 instead of 62 loops are
interchange with inner loop reduction support disabled. With
-loop-interchange-threshold=-1000, 3256 instead of 3267.
See the discussion and history of D53027 for an outline of how such legality
checks could look like.
Reviewers: efriedma, mcrosier, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D53027
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345877 91177308-0d34-0410-b5e6-96231b3b80d8
This patch turns LoopInterchange into a loop pass. It now only
considers top-level loops and tries to move the innermost loop to the
optimal position within the loop nest. By only looking at top-level
loops, we might miss a few opportunities the function pass would get
(e.g. if we have a loop nest of 3 loops, in the function pass
we might process loops at level 1 and 2 and move the inner most loop to
level 1, and then we process loops at levels 0, 1, 2 and interchange
again, because we now have a different inner loop). But I think it would
be better to handle such cases by picking the best inner loop from the
start and avoid re-visiting the same loops again.
The biggest advantage of it being a function pass is that it interacts
nicely with the other loop passes. Without this patch, there are some
performance regressions on AArch64 with loop interchanging enabled,
where no loops were interchanged, but we missed out on some other loop
optimizations.
It also removes the SimplifyCFG run. We are just changing branches, so
the CFG should not be more complicated, besides the additional 'unique'
preheaders this pass might create.
Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00
Reviewed By: xbolva00
Differential Revision: https://reviews.llvm.org/D51702
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343308 91177308-0d34-0410-b5e6-96231b3b80d8
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.
An alternative to the manual moving of the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.
Reviewers: efriedma, mcrosier, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D52154
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343132 91177308-0d34-0410-b5e6-96231b3b80d8
As preparation for LoopInterchange becoming a loop pass, it needs to
preserve ScalarEvolution. Even though interchanging should not change
the trip count of the loop, it modifies loop entry, latch and exit
blocks.
I added -verify-scev to some loop interchange tests, but the verification does
not catch problems caused by missing invalidation of SE in loop interchange, as
the trip counts themselves do not change. So there might be potential to
make the SE verification covering more stuff in the future.
Reviewers: mkazantsev, efriedma, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D52026
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342209 91177308-0d34-0410-b5e6-96231b3b80d8
There is no need to create preheaders in the analysis stage, we only
need them when adjusting the branches. Also, the only cases we need to
create our own preheaders is when they have more than 1 predecessors or
PHI nodes (even with only 1 predecessor, we could have an LCSSA phi
node). I have simplified the conditions and added some assertions to be
sure. Because we know the inner and outer loop need to be tightly
nested, it is sufficient to check if the inner loop preheader is the
outer loop header to check if we need to create a new preheader.
Reviewers: efriedma, mcrosier, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D51703
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341533 91177308-0d34-0410-b5e6-96231b3b80d8
The core get and set routines move to the `Instruction` class. These
routines are only valid to call on instructions which are terminators.
The iterator and *generic* range based access move to `CFG.h` where all
the other generic successor and predecessor access lives. While moving
the iterator here, simplify it using the iterator utilities LLVM
provides and updates coding style as much as reasonable. The APIs remain
pointer-heavy when they could better use references, and retain the odd
behavior of `operator*` and `operator->` that is common in LLVM
iterators. Adjusting this API, if desired, should be a follow-up step.
Non-generic range iteration is added for the two instructions where
there is an especially easy mechanism and where there was code
attempting to use the range accessor from a specific subclass:
`indirectbr` and `br`. In both cases, the successors are contiguous
operands and can be easily iterated via the operand list.
This is the first major patch in removing the `TerminatorInst` type from
the IR's instruction type hierarchy. This change was discussed in an RFC
here and was pretty clearly positive:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123407.html
There will be a series of much more mechanical changes following this
one to complete this move.
Differential Revision: https://reviews.llvm.org/D47467
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340698 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves the logic to handle reduction PHI nodes to the end of
adjustLoopBranches. Reduction PHI nodes in the outer loop header can be
moved to the inner loop header and reduction PHI nodes from the inner loop
header can be moved to the outer loop header. In the latter situation,
we have to deal with 1 kind of PHI nodes:
PHI nodes that are part of inner loop-only reductions.
We can replace the PHI node with the value coming from outside
the inner loop.
Reviewers: mcrosier, efriedma, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D46198
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335027 91177308-0d34-0410-b5e6-96231b3b80d8
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8
We currently support LCSSA PHI nodes in the outer loop exit, if their
incoming values do not come from the outer loop latch or if the
outer loop latch has a single predecessor. In that case, the outer loop latch
will be executed only if the inner loop gets executed. If we have multiple
predecessors for the outer loop latch, it may be executed even if the inner
loop does not get executed.
This is a first step to support the case described in
https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43237
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331037 91177308-0d34-0410-b5e6-96231b3b80d8
After D43236, we started interchanging loops with empty dependence
matrices. In isProfitableForVectorization, we try to determine if
interchanging makes the loop dependences more friendly to the
vectorizer. If there are no dependences, we should not interchange,
based on that heuristic.
Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource
Reviewed By: mcrosier
Differential Revision: https://reviews.llvm.org/D45208
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330738 91177308-0d34-0410-b5e6-96231b3b80d8
If a loop with child loops becomes our new inner loop after
interchanging, we only need to update LoopInfo for the blocks defined in
the old outer loop. BBs in child loops will stay there.
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45970
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330653 91177308-0d34-0410-b5e6-96231b3b80d8
LoopInterchange relies on LoopInfo being up-to-date, so we should
preserve it after interchanging. This patch updates restructureLoops to
move the BBs of the interchanged loops to the right place.
Reviewers: davide, efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45278
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329264 91177308-0d34-0410-b5e6-96231b3b80d8
It also updates test/Transforms/LoopInterchange/call-instructions.ll
to use accesses where we can prove dependence after D35430.
Reviewers: sebpop, karthikthecool, blitz.opensource
Reviewed By: sebpop
Differential Revision: https://reviews.llvm.org/D45206
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329111 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes layering - Transforms/Utils shouldn't depend on including a Scalar
or IPO header, because Scalar and IPO depend on Utils.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328717 91177308-0d34-0410-b5e6-96231b3b80d8
It's been quite some time the Dependence Analysis (DA) is broken,
as it uses the GEP representation to "identify" multi-dimensional arrays.
It even wrongly detects multi-dimensional arrays in single nested loops:
from test/Analysis/DependenceAnalysis/Coupled.ll, example @couple6
;; for (long int i = 0; i < 50; i++) {
;; A[i][3*i - 6] = i;
;; *B++ = A[i][i];
DA used to detect two subscripts, which makes no sense in the LLVM IR
or in C/C++ semantics, as there are no guarantees as in Fortran of
subscripts not overlapping into a next array dimension:
maximum nesting levels = 1
SrcPtrSCEV = %A
DstPtrSCEV = %A
using GEPs
subscript 0
src = {0,+,1}<nuw><nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
subscript 1
src = {-6,+,3}<nsw><%for.body>
dst = {0,+,1}<nuw><nsw><%for.body>
class = 1
loops = {1}
Separable = {}
Coupled = {1}
With the current patch, DA will correctly work on only one dimension:
maximum nesting levels = 1
SrcSCEV = {(-2424 + %A)<nsw>,+,1212}<%for.body>
DstSCEV = {%A,+,404}<%for.body>
subscript 0
src = {(-2424 + %A)<nsw>,+,1212}<%for.body>
dst = {%A,+,404}<%for.body>
class = 1
loops = {1}
Separable = {0}
Coupled = {}
This change removes all uses of GEP from DA, and we now only rely
on the SCEV representation.
The patch does not turn on -da-delinearize by default, and so the DA analysis
will be more conservative in the case of multi-dimensional memory accesses in
nested loops.
I disabled some interchange tests, as the DA is not able to disambiguate
the dependence anymore. To make DA stronger, we may need to
compute a bound on the number of iterations based on the access functions
and array dimensions.
The patch cleans up all the CHECKs in test/Transforms/LoopInterchange/*.ll to
avoid checking for snippets of LLVM IR: this form of checking is very hard to
maintain. Instead, we now check for output of the pass that are more meaningful
than dozens of lines of LLVM IR. Some tests now require -debug messages and thus
only enabled with asserts.
Patch written by Sebastian Pop and Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D35430
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326837 91177308-0d34-0410-b5e6-96231b3b80d8
The dependency matrix is only empty if no conflicting load/store
instructions have been found. In that case, it is safe to interchange.
For the LLVM test-suite, after this change around 1900 loops are
interchanged, whereas it is 15 before this change. On cortex-a57,
this gives an improvement of -0.57% on the geomean execution
time of SPEC2006, SPEC2000 and the test-suite. There are a
few small perf regressions, but I think we can improve on those
by making the cost model better.
Reviewers: karthikthecool, mcrosier
Reviewed by: karthikthecool
Differential Revision: https://reviews.llvm.org/D43236
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326077 91177308-0d34-0410-b5e6-96231b3b80d8
In cases where the OuterMostLoopLatchBI only has a single successor,
accessing the second successor will fail.
This fixes a failure when building the test-suite with loop-interchange
enabled.
Reviewers: mcrosier, karthikthecool, davide
Reviewed by: karthikthecool
Differential Revision: https://reviews.llvm.org/D42906
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324994 91177308-0d34-0410-b5e6-96231b3b80d8
We can use SplitBlock for both cases, which makes the code slightly
simpler and updates both LoopInfo and the dominator tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324881 91177308-0d34-0410-b5e6-96231b3b80d8
The way that splitInnerLoopHeader splits blocks requires that
the induction PHI will be the first PHI in the inner loop
header. This makes sure that is actually the case when there
are both IV and reduction phis.
Differential Revision: https://reviews.llvm.org/D38682
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316261 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
SimplifyIndVar may introduce zext instructions to widen arguments of the
loop exit check. They should not prevent us from splitting the loop at
the induction variable, but maybe the check should be more conservative,
e.g. making sure it only extends arguments used by a comparison?
Reviewers: karthikthecool, mcrosier, mzolotukhin
Reviewed By: mcrosier
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D34879
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311783 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.
For now, I think it is only safe to interchange loops with calls marked
as readnone.
With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.
Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper
Reviewed By: mcrosier
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35489
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309547 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The remaining non range-based for loops do not iterate over full ranges,
so leave them as they are.
Reviewers: karthikthecool, blitz.opensource, mcrosier, mkuper, aemerson
Reviewed By: aemerson
Subscribers: aemerson, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308872 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This makes it easier to find out which limitation prevented this pass from doing its work.
Reviewers: karthikthecool, mzolotukhin, efriedma, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D34940
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307035 91177308-0d34-0410-b5e6-96231b3b80d8
It is, in fact, unused. Found while reviewing Danny's new
SSAUpdater and porting passes to it to see how the new API
looked like.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293407 91177308-0d34-0410-b5e6-96231b3b80d8
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289756 91177308-0d34-0410-b5e6-96231b3b80d8
This condition is trivially always true prior to the change. The comment
at the call site makes it clear that we expect *all* of these to be '=',
'S', or 'I' so fix the code.
We have a bug I will update to track the fact that Clang doesn't warn on
this: http://llvm.org/PR13101
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285930 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, we give up on loop interchange if we encounter a flow dependency
anywhere in the loop list. Worse yet, we don't even track output dependencies.
This patch updates the dependency matrix computation to track flow and output
dependencies in the same way we track anti dependencies.
This improves an internal workload by 2.2x.
Note the loop interchange pass is off by default and it can be enabled with
'-mllvm -enable-loopinterchange'
Differential Revision: https://reviews.llvm.org/D24564
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282101 91177308-0d34-0410-b5e6-96231b3b80d8