We were inappropriately using 32-bit types to account for quantities
that can be far larger.
Fixed in PR28443.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274737 91177308-0d34-0410-b5e6-96231b3b80d8
Note that require<domtree> and require<loops> aren't needed because they
come in implicitly via the loop pass manager.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274712 91177308-0d34-0410-b5e6-96231b3b80d8
It did not handle correctly cases without GEP.
The following loop wasn't vectorized:
for (int i=0; i<len; i++)
*to++ = *from++;
I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.
Re-commit rL273257 - revision: http://reviews.llvm.org/D20789
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273864 91177308-0d34-0410-b5e6-96231b3b80d8
It did not handle correctly cases without GEP.
The following loop wasn't vectorized:
for (int i=0; i<len; i++)
*to++ = *from++;
I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.
Differential revision: http://reviews.llvm.org/D20789
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273257 91177308-0d34-0410-b5e6-96231b3b80d8
This is a functional change for LLE and LDist. The other clients (LV,
LVerLICM) already had this explicitly enabled.
The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides. This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter. This makes the
interface more friendly to the new Pass Manager.
The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273064 91177308-0d34-0410-b5e6-96231b3b80d8
This is still NFCI, so the list of clients that allow symbolic stride
speculation does not change (yes: LV and LoopVersioningLICM, no: LLE,
LDist). However since the symbolic strides are now managed by LAA
rather than passed by client a new bool parameter is used to enable
symbolic stride speculation.
The existing test Transforms/LoopVectorize/version-mem-access.ll checks
that stride speculation is performed for LV.
The previously added test Transforms/LoopLoadElim/symbolic-stride.ll
ensures that no speculation is performed for LLE.
The next patch will change the functionality and turn on symbolic stride
speculation in all of LAA's clients and remove the bool parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272970 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the order in which we attempt to prove the independence of
strided accesses. We previously did this after we knew the dependence distance
was positive. With this change, we check for independence before handling the
negative distance case. The patch prevents LAA from reporting forward
dependences for independent strided accesses.
This change was requested in the review of D19984.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270072 91177308-0d34-0410-b5e6-96231b3b80d8
This patch renames the option enabling the store-to-load forwarding conflict
detection optimization. This change was requested in the review of D20241.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269668 91177308-0d34-0410-b5e6-96231b3b80d8
Also s/Cycles/Iters/ in NumCyclesForStoreLoadThroughMemory to make it
clear that this is not about clock cycles but loop cycles/iterations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269667 91177308-0d34-0410-b5e6-96231b3b80d8
This removes a redundant stride versioning step (we already
do it in getPtrStride, so it has no effect) and uses PSE to
get the SCEV expressions for the source and destination
(this might have changed when getPtrStride was called).
I discovered this through code inspection, and couldn't
produce a regression test for it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269052 91177308-0d34-0410-b5e6-96231b3b80d8
When we encounter unsafe memory dependencies, loop distribution could
help.
Even though, the diagnostics is in LAA, it's only currently emitted in
the vectorizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268987 91177308-0d34-0410-b5e6-96231b3b80d8
This message used to be correct, when all we cared about was whether the
dependence was safe (i.e. NoDep) or unsafe. With the current more
precise characterization, this is a forward dep.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268695 91177308-0d34-0410-b5e6-96231b3b80d8
The functionality contained within getIntrinsicIDForCall is two-fold: it
checks if a CallInst's callee is a vectorizable intrinsic. If it isn't
an intrinsic, it attempts to map the call's target to a suitable
intrinsic.
Move the mapping functionality into getIntrinsicForCallSite and rename
getIntrinsicIDForCall to getVectorIntrinsicIDForCall while
reimplementing it in terms of getIntrinsicForCallSite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266801 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add a print method to Predicated Scalar Evolution which prints all interesting
transformations done by PSE.
Loop Access Analysis will now print this as part of the analysis output.
We now use this to check the exact expression transformations that were done
by PSE in LAA.
The additional checking also acts as white-box testing for the getAsAddRec method.
Reviewers: anemet, sanjoy
Subscribers: sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18792
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266334 91177308-0d34-0410-b5e6-96231b3b80d8
This re-commits r265535 which was reverted in r265541 because it
broke the windows bots. The problem was that we had a PointerIntPair
which took a pointer to a struct allocated with new. The problem
was that new doesn't provide sufficient alignment guarantees.
This pattern was already present before r265535 and it just happened
to work. To fix this, we now separate the PointerToIntPair from the
ExitNotTakenInfo struct into a pointer and a bool.
Original commit message:
Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.
However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.
In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.
We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.
Reviewers: anemet, mzolotukhin, hfinkel, sanjoy
Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17201
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265786 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.
However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.
In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.
We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.
Reviewers: anemet, mzolotukhin, hfinkel, sanjoy
Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D17201
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265535 91177308-0d34-0410-b5e6-96231b3b80d8
We used to only allow SCEVAddRecExpr for pointer expressions in order to
be able to compute the bounds. However this is also trivially possible
for loop-invariant addresses (scUnknown) since then the bounds are the
address itself.
Interestingly, we used allow this for the special case when the
loop-invariant address happens to also be an SCEVAddRecExpr (in an outer
loop).
There are a couple more loops that are vectorized in SPEC after this.
My guess is that the main reason we don't see more because for example a
loop-invariant load is vectorized into a splat vector with several
vector-inserts. This is likely to make the vectorization unprofitable.
I.e. we don't notice that a later LICM will move all of this out of the
loop so the cost estimate should really be 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264243 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This changes the conversion functions from SCEV * to SCEVAddRecExpr from
ScalarEvolution and PredicatedScalarEvolution to return a SCEVAddRecExpr*
instead of a SCEV* (which removes the need of most clients to do a
dyn_cast right after calling these functions).
We also don't add new predicates if the transformation was not successful.
This is not entirely a NFC (as it can theoretically remove some predicates
from LAA when we have an unknown dependece), but I couldn't find an obvious
regression test for it.
Reviewers: sanjoy
Subscribers: sanjoy, mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D18368
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264161 91177308-0d34-0410-b5e6-96231b3b80d8
sanitizer issue. The PredicatedScalarEvolution's copy constructor
wasn't copying the Generation value, and was leaving it un-initialized.
Original commit message:
[SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection
Summary:
This change adds no wrap SCEV predicates with:
- support for runtime checking
- support for expression rewriting:
(sext ({x,+,y}) -> {sext(x),+,sext(y)}
(zext ({x,+,y}) -> {zext(x),+,sext(y)}
Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.
We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
Reviewers: mzolotukhin, sanjoy, anemet
Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel
Differential Revision: http://reviews.llvm.org/D15412
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260112 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change adds no wrap SCEV predicates with:
- support for runtime checking
- support for expression rewriting:
(sext ({x,+,y}) -> {sext(x),+,sext(y)}
(zext ({x,+,y}) -> {zext(x),+,sext(y)}
Note that we are sign extending the increment of the SCEV, even for
the zext case. This is needed to cover the fairly common case where y would
be a (small) negative integer. In order to do this, this change adds two new
flags: nusw and nssw that are applicable to AddRecExprs and permit the
transformations above.
We also change isStridedPtr in LAA to be able to make use of
these predicates. With this feature we should now always be able to
work around overflow issues in the dependence analysis.
Reviewers: mzolotukhin, sanjoy, anemet
Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel
Differential Revision: http://reviews.llvm.org/D15412
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260085 91177308-0d34-0410-b5e6-96231b3b80d8
This is a recommit of r258620 which causes PR26293.
The original message:
Now LIR can turn following codes into memset:
typedef struct foo {
int a;
int b;
} foo_t;
void bar(foo_t *f, unsigned n) {
for (unsigned i = 0; i < n; ++i) {
f[i].a = 0;
f[i].b = 0;
}
}
void test(foo_t *f, unsigned n) {
for (unsigned i = 0; i < n; i += 2) {
f[i] = 0;
f[i+1] = 0;
}
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258777 91177308-0d34-0410-b5e6-96231b3b80d8
The early return seems to be missed. This causes a radical and wrong loop
optimization on powerpc. It isn't reproducible on x86_64, because
"UseInterleaved" is false.
Patch by Tim Shen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257134 91177308-0d34-0410-b5e6-96231b3b80d8