Summary:
LAA uses the PredicatedScalarEvolution interface, so it can produce
forward/backward dependences having SCEVs that are AddRecExprs only after being
transformed by PredicatedScalarEvolution.
Use PredicatedScalarEvolution to get the expected expressions.
Reviewers: anemet
Subscribers: llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D15382
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255241 91177308-0d34-0410-b5e6-96231b3b80d8
ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform
and Analysis modules:
[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the knowledge statically deduced
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewriting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
  P1: {a,+,b} has nsw
  P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, thereby
avoiding this issue.
The SCEVPredicatedLayer maintains a cache of previous SCEV rewriting results.
This also has the benefit of reducing the overall number of expression rewrites.
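The ordering problem can be reproduced with a toy model. The sketch below is
not LLVM's SCEV API: Expr, applyP1, applyP2 and PredicatedLayer are hypothetical
names invented for illustration, and an expression is reduced to an add
recurrence with an optional sext wrapper. It only shows the idea of feeding each
cached result into the next predicate.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for a SCEV: {Start,+,Step}, optionally wrapped in sext.
struct Expr {
  bool SExt;
  std::string Start, Step;
  bool operator==(const Expr &O) const {
    return std::tie(SExt, Start, Step) == std::tie(O.SExt, O.Start, O.Step);
  }
  bool operator<(const Expr &O) const {
    return std::tie(SExt, Start, Step) < std::tie(O.SExt, O.Start, O.Step);
  }
};

// P1: "{a,+,b} has nsw". Folds sext({a,+,b}) to {a,+,b}; it matches only
// that exact recurrence.
Expr applyP1(Expr E) {
  if (E.SExt && E.Start == "a" && E.Step == "b")
    E.SExt = false;
  return E;
}

// P2: "b = 1". Replaces the symbolic step b with the constant 1.
Expr applyP2(Expr E) {
  if (E.Step == "b")
    E.Step = "1";
  return E;
}

// Hypothetical layer: remembers the predicates in the order they were added
// and caches the rewritten form of every expression it has been asked about.
class PredicatedLayer {
  std::vector<Expr (*)(Expr)> Preds;
  std::map<Expr, Expr> Cache;

public:
  // A new predicate is applied to the cached results, not to the originals.
  void addPredicate(Expr (*P)(Expr)) {
    Preds.push_back(P);
    for (auto &Entry : Cache)
      Entry.second = P(Entry.second);
  }

  // Returns the cached rewrite, or rewrites with all predicates in order.
  Expr get(const Expr &E) {
    auto It = Cache.find(E);
    if (It != Cache.end())
      return It->second;
    Expr R = E;
    for (auto P : Preds)
      R = P(R);
    Cache.emplace(E, R);
    return R;
  }
};
```

In this model, applyP1(applyP2(E)) leaves the sext in place for
E = sext {a,+,b}, while a layer that received P1 before P2 returns {a,+,1}
regardless of whether E was first queried before or after P2 was added.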
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255122 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the
usage of SCEV predicates. The SCEVPredicatedLayer takes the knowledge statically deduced
by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is
that both LAA and LV should use this interface everywhere.
This also solves a problem involving the result of SCEV expression rewriting when
the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates
  P1: {a,+,b} has nsw
  P2: b = 1.
Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us
sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies).
The SCEVPredicatedLayer maintains the order of transformations by feeding back
the results of previous transformations into new transformations, thereby
avoiding this issue.
The SCEVPredicatedLayer maintains a cache of previous SCEV rewriting results.
This also has the benefit of reducing the overall number of expression rewrites.
Reviewers: mzolotukhin, anemet
Subscribers: jmolloy, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14296
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255115 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
LAA currently generates a set of SCEV predicates that must be checked by users.
In the case of Loop Distribute/Loop Load Elimination, no such predicates could have
been emitted, since we don't allow stride versioning. However, in the future there
could be SCEV predicates that will need to be checked.
This change adds support for SCEV predicate versioning in Loop Distribute, Loop
Load Elimination, and the loop versioning infrastructure.
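As a source-level illustration of what checking a predicate before entering a
versioned loop means, here is a minimal hand-written sketch. It assumes a
predicate of the form "Stride == 1"; the function name scaleStrided and the
loop body are hypothetical and are not taken from the patch.

```cpp
#include <cstddef>

// Doubles N elements of A, visiting every Stride-th element.
// The loop is versioned on the run-time predicate "Stride == 1".
void scaleStrided(int *A, std::size_t N, std::size_t Stride) {
  if (Stride == 1) {
    // Predicate holds: this copy may assume consecutive accesses.
    for (std::size_t I = 0; I < N; ++I)
      A[I] *= 2;
  } else {
    // Predicate fails: fall back to the original, unspecialized loop.
    for (std::size_t I = 0; I < N; ++I)
      A[I * Stride] *= 2;
  }
}
```

Both copies compute the same result whenever the predicate holds; the check
only selects which copy runs.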
Reviewers: anemet
Subscribers: mssimpso, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14240
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252467 91177308-0d34-0410-b5e6-96231b3b80d8
Some implicit ilist iterator conversions have crept back into Analysis,
Transforms, Hexagon, and llvm-stress. This removes them.
I'll commit a patch immediately after this to disallow them (in a
separate patch so that it's easy to revert if necessary).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252371 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop. E.g.:
  for (i)
     A[i + 1] = A[i] + B[i]
  =>
  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T
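The same transformation written out as compilable C++, as a sketch only. The
function names are hypothetical, and the code assumes that A and B are distinct
vectors and that B has at least A.size() - 1 elements.

```cpp
#include <cstddef>
#include <vector>

// Original loop: every iteration reloads A[i], which the previous
// iteration stored as A[i + 1].
void recurrenceWithReload(std::vector<int> &A, const std::vector<int> &B) {
  for (std::size_t I = 0; I + 1 < A.size(); ++I)
    A[I + 1] = A[I] + B[I];
}

// After forwarding: the stored value is carried across the backedge in T,
// so the loop body no longer loads from A.
void recurrenceForwarded(std::vector<int> &A, const std::vector<int> &B) {
  if (A.empty())
    return;
  int T = A[0];
  for (std::size_t I = 0; I + 1 < A.size(); ++I) {
    T = T + B[I];
    A[I + 1] = T;
  }
}
```

Both functions leave A in the same state; the second makes the loop-carried
dependence a register dependence on T.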
The pass relies on loop dependence analysis via LoopAccessAnalysis to
find loop-carried dependences with a distance of one
between a store and a load. Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.
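To make "versioning away" an intervening store concrete, the following is a
hand-written sketch of what a versioned loop could look like at the source
level. It is not the code the pass emits: rangesDisjoint and versionedLoop are
hypothetical names, and the extra store to C is an assumed example of a
may-aliasing intervening store.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical run-time check: true when [P, P+N) and [Q, Q+M) do not overlap.
bool rangesDisjoint(const int *P, std::size_t N, const int *Q, std::size_t M) {
  auto PB = reinterpret_cast<std::uintptr_t>(P);
  auto QB = reinterpret_cast<std::uintptr_t>(Q);
  return PB + N * sizeof(int) <= QB || QB + M * sizeof(int) <= PB;
}

// Source loop:  for (i) { A[i + 1] = A[i] + B[i]; C[i] = 0; }
// The store to C[i] may overwrite A[i + 1] before the next iteration loads it.
void versionedLoop(int *A, const int *B, int *C, std::size_t N) {
  if (N == 0)
    return;
  if (rangesDisjoint(A, N + 1, C, N)) {
    // Check passed: the store to C cannot clobber A, so forward through T.
    int T = A[0];
    for (std::size_t I = 0; I < N; ++I) {
      T = T + B[I];
      A[I + 1] = T;
      C[I] = 0;
    }
  } else {
    // Check failed: run the original loop unchanged.
    for (std::size_t I = 0; I < N; ++I) {
      A[I + 1] = A[I] + B[I];
      C[I] = 0;
    }
  }
}
```

Only the A/C pair needs a check in this sketch, because B[i] is loaded at the
same point relative to the stores in both copies.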
This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning. As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here. Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.
In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-independent redundant
loads and stores that *require* versioning due to may-aliasing
intervening stores/loads. I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.
The main motivation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop. Being
able to promote the memory dependence into a register dependence (even
though the HW does perform store-to-load forwarding as well) results in a
major gain (~20%). This gain also transfers over to x86: it's
around 8-10%.
Right now the pass is off by default and can be enabled
with -enable-loop-load-elim. On the LNT testsuite, there are two
performance changes (negative number -> improvement):
1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
critical paths is reduced
2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
outside of LNT
The pass is scheduled after the loop vectorizer (which is after loop
distribution). The rationale is to try to reuse LAA state, rather than
recomputing it. The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would prevent the CPU's st->ld forwarding from kicking in.
LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences). LAA is known to omit loop-independent
dependences in certain situations. The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.
Reviewers: dberlin, hfinkel
Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits
Differential Revision: http://reviews.llvm.org/D13259
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252017 91177308-0d34-0410-b5e6-96231b3b80d8