Summary:
The current LICM allows sinking an instruction only when it is exposed to exit
blocks through a trivially replacable PHI of which all incoming values are the
same instruction. This change enhance LICM to sink a sinkable instruction
through non-trivially replacable PHIs by spliting predecessors of loop
exits.
Reviewers: hfinkel, majnemer, davidxl, bmakam, mcrosier, danielcdh, efriedma, jtony
Reviewed By: efriedma
Subscribers: nemanjai, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D37163
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317335 91177308-0d34-0410-b5e6-96231b3b80d8
When going to explain this to someone else, I got tripped up by the complicated meaning of IsKnownNonEscapingObject in load-store promotion. Extract a helper routine and clarify naming/scopes to make this a bit more obvious.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316699 91177308-0d34-0410-b5e6-96231b3b80d8
Sinking of unordered atomic load into loop must be disallowed because it turns
a single load into multiple loads. The relevant section of the documentation
is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for
Optimizers section. Here is the full text of this section:
> Notes for optimizers
> In terms of the optimizer, this **prohibits any transformation that
> transforms a single load into multiple loads**, transforms a store into
> multiple stores, narrows a store, or stores a value which would not be
> stored otherwise. Some examples of unsafe optimizations are narrowing
> an assignment into a bitfield, rematerializing a load, and turning loads
> and stores into a memcpy call. Reordering unordered operations is safe,
> though, and optimizers should take advantage of that because unordered
> operations are common in languages that need them.
Patch by Daniil Suchkov!
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D38392
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315438 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The current promoteLoopAccessesToScalars method receives an AliasSet, but
the information used is in fact a list of Value*, known to must alias.
Create the list ahead of time to make this method independent of the AliasSet class.
While there is no functionality change, this adds overhead for creating
a set of Value*, when promotion would normally exit earlier.
This is meant to be as a first refactoring step in order to start replacing
AliasSetTracker with MemorySSA.
And while the end goal is to redesign LICM, the first few steps will focus on
adding MemorySSA as an alternative to the AliasSetTracker using most of the
existing functionality.
Reviewers: mkuper, danielcdh, dberlin
Subscribers: sanjoy, chandlerc, gberry, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D35439
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313075 91177308-0d34-0410-b5e6-96231b3b80d8
Large CFGs can cause us to blow up the stack because we would have a
recursive step for each basic block in a region.
Instead, create a worklist and iterate it. This limits the stack usage
to something more manageable.
Differential Revision: https://reviews.llvm.org/D35609
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308582 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of keeping a variable indicating whether there are early exits
in the loop. We keep all the early exits. This improves LICM's ability to
move instructions out of the loop based on is-guaranteed-to-execute.
I am going to update compilation time as well soon.
Reviewers: hfinkel, sanjoy, efriedma, mkuper
Reviewed By: hfinkel
Subscribers: llvm-commits, mzolotukhin
Differential Revision: https://reviews.llvm.org/D32433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301196 91177308-0d34-0410-b5e6-96231b3b80d8
Extend our store promotion code to deal with unordered atomic accesses. Ordered atomics continue to be unhandled.
Most of the change is straight-forward, the only complicated bit is in the reasoning around mixing of atomic and non-atomic memory access. Rather than trying to reason about the complex semantics in these cases, I simply disallowed promotion when both atomic and non-atomic accesses are present. This is conservatively correct.
It seems really tempting to just promote all access to atomics, but the original accesses might have been conditional. Since we can't lower an arbitrary atomic type, it might not be safe to promote all access to atomic. Consider a loop like the following:
while(b) {
load i128 ...
if (can lower i128 atomic)
store atomic i128 ...
else
store i128
}
It could be there's no race on the location and thus the code is perfectly well defined even if we can't lower a i128 atomically.
It's not clear we need to be this conservative - arguably the program above is brocken since it can't be lowered unless the branch is folded - but I didn't want to have to fix any fallout which might result.
Differential Revision: https://reviews.llvm.org/D15592
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295015 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We can hoist out loads that are dominated by invariant.start, to the preheader.
We conservatively assume the load is variant, if we see a corresponding
use of invariant.start (it could be an invariant.end or an escaping
call).
Reviewers: mkuper, sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29331
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293887 91177308-0d34-0410-b5e6-96231b3b80d8
skip sub-subloops.
The logic to skip subloops dated from when this code was shared with the
cached case. Once it was factored out to only run in the case of
recomputed subloops it became a dangerous bug. If a subsubloop contained
an interfering instruction it would be silently skipped from the alias
sets for LICM.
With the old pass manager this was extremely hard to trigger as it would
require failing to visit these subloops with the LICM pass but then
visiting the outer loop somehow. I've not yet contrived any test case
that actually manages to trigger this.
But with the new pass manager we don't do the cross-loop caching hack
that the old PM does and so we recompute alias set information from
first principles. While this seems much cleaner and simpler it exposed
this bug and would subtly miscompile code due to failing to correctly
model the aliasing constraints of deeply nested loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293273 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In case of non-alloca pointers, we check for whether it is a pointer
from malloc-like calls and it is not captured. In such case, we can
promote the pointer, as the caller will have no way to access this pointer
even if there is unwinding in middle of the loop.
Reviewers: hfinkel, sanjoy, reames, eli.friedman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28834
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292510 91177308-0d34-0410-b5e6-96231b3b80d8
a function's CFG when that CFG is unchanged.
This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.
I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.
Differential Revision: https://reviews.llvm.org/D28627
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292054 91177308-0d34-0410-b5e6-96231b3b80d8
the latter to the Transforms library.
While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.
Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.
We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.
This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.
I haven't split the unittest though because testing one component
without the other seems nearly intractable.
Differential Revision: https://reviews.llvm.org/D28452
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291662 91177308-0d34-0410-b5e6-96231b3b80d8
arguments much like the CGSCC pass manager.
This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.
An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.
This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.
While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.
I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.
Differential Revision: https://reviews.llvm.org/D28292
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291651 91177308-0d34-0410-b5e6-96231b3b80d8
These are interesting again because the user may not be aware that this
is a common reason preventing LICM.
A const is removed from an instruction pointer declaration in order to
pass it to ORE.
Differential Revision: https://reviews.llvm.org/D27940
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291649 91177308-0d34-0410-b5e6-96231b3b80d8
Promotion is always legal when a store within the loop is guaranteed to execute.
However, this is not a necessary condition - for promotion to be memory model
semantics-preserving, it is enough to have a store that dominates every exit
block. This is because if the store dominates every exit block, the fact the
exit block was executed implies the original store was executed as well.
Differential Revision: https://reviews.llvm.org/D28147
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291171 91177308-0d34-0410-b5e6-96231b3b80d8
"Changed" doesn't actually change within the loop, so there's
no reason to keep track of it - we always return false during
analysis and true after the transformation is made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290735 91177308-0d34-0410-b5e6-96231b3b80d8
This moves the exit block and insertion point computation to be eager,
instead of after seeing the first scalar we can promote.
The cost is relatively small (the computation happens anyway, see discussion
on D28147), and the code is easier to follow, and can bail out earlier
if there's a catchswitch present.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290729 91177308-0d34-0410-b5e6-96231b3b80d8
We would check whether we have a prehader *or* dedicated exit blocks,
and go into the promotion loop. Then, for each alias set we'd check
if we have a preheader *and* dedicated exit blocks, and bail if not.
Instead, bail immediately if we don't have both.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290728 91177308-0d34-0410-b5e6-96231b3b80d8
We want to recompute LCSSA only when we actually promoted a value.
This means we only need to look at changes made by promotion when
deciding whether to recompute it or not, not at regular sinking/hoisting.
(This was what the code was documented as doing, just not what it did)
Hopefully NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290726 91177308-0d34-0410-b5e6-96231b3b80d8
The pass creates some state which expects to be cleaned up by
a later instance of the same pass. opt-bisect happens to expose
this not ideal design because calling skipLoop() will result in
this state not being cleaned up at times and an assertion firing
in `doFinalization()`. Chandler tells me the new pass manager will
give us options to avoid these design traps, but until it's not ready,
we need a workaround for the current pass infrastructure. Fix provided
by Andy Kaylor, see the review for a complete discussion.
Differential Revision: https://reviews.llvm.org/D25848
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290427 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency.
Reviewers: hfinkel, davidxl, chandlerc
Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D22778
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285308 91177308-0d34-0410-b5e6-96231b3b80d8
r280425 | dehao | 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) | 9 lines
Refactor LICM pass in preparation for LoopSink pass.
Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).
r280429 | dehao | 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) | 9 lines
Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.
Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280453 91177308-0d34-0410-b5e6-96231b3b80d8
One exception here is LoopInfo which must forward-declare it (because
the typedef is in LoopPassManager.h which depends on LoopInfo).
Also, some includes for LoopPassManager.h were needed since that file
provides the typedef.
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278079 91177308-0d34-0410-b5e6-96231b3b80d8
Just because we can constant fold the result of an instruction does not
imply that we can delete the instruction. It may have side effects.
This fixes PR28655.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276389 91177308-0d34-0410-b5e6-96231b3b80d8
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.
This also removes the dependence of functionattrs on TLI altogether.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274455 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We can avoid repeating the check `isGuaranteedToExecute` when it's already called once while checking if the alignment can be widened for the load/store being hoisted.
The function is invariant for the same instruction `UI` in `isGuaranteedToExecute(*UI, DT, CurLoop, SafetyInfo);`
Reviewers: hfinkel, eli.friedman
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D21672
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273671 91177308-0d34-0410-b5e6-96231b3b80d8