Summary:
JumpThreading for guards feature has been reverted at https://reviews.llvm.org/rL295200
due to the following problem: the feature used the following algorithm for detection of
diamond patters:
1. Find a block with 2 predecessors;
2. Check that these blocks have a common single parent;
3. Check that the parent's terminator is a branch instruction.
The problem is that these checks are insufficient. They may pass for a non-diamond
construction in case if those two predecessors are actually the same block. This may
happen if parent's terminator is a br (either conditional or unconditional) to a block
that ends with "switch" instruction with exactly two branches going to one block.
This patch re-enables the JumpThreading for guards and fixes this issue by adding the
check that those found predecessors are actually different blocks. This guarantees that
parent's terminator is a conditional branch with exactly 2 different successors, which
is now ensured by assertions. It also adds two more tests for this situation (with parent's
terminator being a conditional and an unconditional branch).
Patch by Max Kazantsev!
Reviewers: anna, sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30036
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295410 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r294617.
We fail on an assert while trying to get a condition from an
unconditional branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295200 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch allows JumpThreading also thread through guards.
Virtually, guard(cond) is equivalent to the following construction:
if (cond) { do something } else {deoptimize}
Yet it is not explicitly converted into IFs before lowering.
This patch enables early threading through guards in simple cases.
Currently it covers the following situation:
if (cond1) {
// code A
} else {
// code B
}
// code C
guard(cond2)
// code D
If there is implication cond1 => cond2 or !cond1 => cond2, we can transform
this construction into the following:
if (cond1) {
// code A
// code C
} else {
// code B
// code C
guard(cond2)
}
// code D
Thus, removing the guard from one of execution branches.
Patch by Max Kazantsev!
Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29620
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294617 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor.
Reviewers: mcrosier, rengolin, reames, davidxl, haicheng
Reviewed By: rengolin
Subscribers: zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D29200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293896 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: No need to try to ease BB from LoopHeaders as we already know that BB is not in LoopHeaders.
Reviewers: hsung, majnemer, mcrosier, haicheng, rengolin
Reviewed By: rengolin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29232
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293802 91177308-0d34-0410-b5e6-96231b3b80d8
invalidation of deleted functions in GlobalDCE.
This was always testing a bug really triggered in GlobalDCE. Right now
we have analyses with asserting value handles into IR. As long as those
remain, when *deleting* an IR unit, we cannot wait for the normal
invalidation scheme to kick in even though it was designed to work
correctly in the face of these kinds of deletions. Instead, the pass
needs to directly handle invalidating the analysis results pointing at
that IR unit.
I've tought the Inliner about this and this patch teaches GlobalDCE.
This will handle the asserting VH case in the existing test as well as
other issues of the same fundamental variety. I've moved the test into
the GlobalDCE directory and added a comment explaining what is going on.
Note that we cannot simply require LVI here because LVI is too lazy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292773 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Unfolding selects was previously done with the help of a vector
of pointers that was then sorted to be able to remove duplicates.
As this sorting depends on the memory addresses, it was
non-deterministic. A SetVector is used now so that duplicates are
removed without the need of sorting first.
Reviewers: mgrang, efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26450
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286807 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Patch by James Molloy and Pablo Barrio.
Reviewers: rengolin, haicheng, sebpop
Subscribers: jojo, jmolloy, llvm-commits
Differential Revision: https://reviews.llvm.org/D26391
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286236 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Patch by James Molloy and Pablo Barrio.
Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop
Subscribers: jojo, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D25477
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284971 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the pass updates branch weights in the IR if the function has
any PGO info (entry frequency is set). However we could still have
regions of the CFG that does not have branch weights collected (e.g. a
cold region). In this case we'd use static estimates. Since static
estimates for branches are determined independently, they are
inconsistent. Updating them can "randomly" inflate block frequencies.
I've run into this in a completely cold loop of h264ref from
SPEC. -Rpass-with-hotness showed the loop to be completely cold during
inlining (before JT) but completely hot during vectorization (after JT).
The new testcase demonstrate the problem. We check array elements
against 1, 2 and 3 in a loop. The check against 3 is the loop-exiting
check. The block names should be self-explanatory.
In this example, jump threading incorrectly updates the weight of the
loop-exiting branch to 0, drastically inflating the frequency of the
loop (in the range of billions).
There is no run-time profile info for edges inside the loop, so branch
probabilities are estimated. These are the resulting branch and block
frequencies for the loop body:
check_1 (16)
(8) / |
eq_1 | (8)
\ |
check_2 (16)
(8) / |
eq_2 | (8)
\ |
check_3 (16)
(1) / |
(loop exit) | (15)
|
(back edge)
First we thread eq_1 -> check_2 to check_3. Frequencies are updated to
remove the frequency of eq_1 from check_2 and then from the false edge
leaving check_2. Changed frequencies are highlighted with * *:
check_1 (16)
(8) / |
eq_1~ | (8)
/ |
/ check_2 (*8*)
/ (8) / |
\ eq_2 | (*0*)
\ \ |
` --- check_3 (16)
(1) / |
(loop exit) | (15)
|
(back edge)
Next we thread eq_1 -> check_3 and eq_2 -> check_3 to check_1 as new
back edges. Frequencies are updated to remove the frequency of eq_1 and
eq_3 from check_3 and then the false edge leaving check_3 (changed
frequencies are highlighted with * *):
check_1 (16)
(8) / |
eq_1~ | (8)
/ |
/ check_2 (*8*)
/ (8) / |
/-- eq_2~ | (*0*)
(back edge) |
check_3 (*0*)
(*0*) / |
(loop exit) | (*0*)
|
(back edge)
As a result, the loop exit edge ends up with 0 frequency which in turn makes
the loop header to have maximum frequency.
There are a few potential problems here:
1. The profile data seems odd. There is a single profile sample of the
loop being entered. On the other hand, there are no weights inside the
loop.
2. Based on static estimation we shouldn't set edges to "extreme"
values, i.e. extremely likely or unlikely.
3. We shouldn't create profile metadata that is calculated from static
estimation. I am not sure what policy is but it seems to make sense to
treat profile metadata as something that is known to originate from
profiling. Estimated probabilities should only be reflected in BPI/BFI.
Any one of these would probably fix the immediate problem. I went for 3
because I think it's a good policy to have and added a FIXME about 2.
Differential Revision: https://reviews.llvm.org/D24118
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280713 91177308-0d34-0410-b5e6-96231b3b80d8
If the result of the find is only used to compare against end(), just
use is_contained instead.
No functionality change is intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278433 91177308-0d34-0410-b5e6-96231b3b80d8
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.
Thanks to David for the suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278077 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The correctness fix here is that when we CSE a load with another load,
we need to combine the metadata on the two loads. This matches the
behavior of other passes, like instcombine and GVN.
There's also a minor optimization improvement here: for load PRE, the
aliasing metadata on the inserted load should be the same as the
metadata on the original load. Not sure why the old code was throwing
it away.
Issue found by inspection.
Differential Revision: http://reviews.llvm.org/D21460
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277977 91177308-0d34-0410-b5e6-96231b3b80d8
Just because we can constant fold the result of an instruction does not
imply that we can delete the instruction. It may have side effects.
This fixes PR28655.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276389 91177308-0d34-0410-b5e6-96231b3b80d8
We were still crashing in the "no change" case because LVI was not
getting invalidated.
See the thread "Should analyses be able to hold AssertingVH to IR?
(related to PR28400)" for more discussion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274656 91177308-0d34-0410-b5e6-96231b3b80d8
PR28400 seems to be not an isolated issue, but a general problem related
to caching analyses. We will need to discuss on llvm-dev.
A test case is in the PR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274457 91177308-0d34-0410-b5e6-96231b3b80d8
r273711 was reverted by r273743. The inliner needs to know about any
call sites in the inlined function. These were obscured if we replaced
a call to undef with an undef but kept the call around.
This fixes PR28298.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273753 91177308-0d34-0410-b5e6-96231b3b80d8
We cannot remove an instruction with no uses just because
SimplifyInstruction succeeds. It may have side effects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273711 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r272603 and adds a fix.
Big thanks to Davide for pointing me at r216244 which gives some insight
into how to fix this VS2013 issue. VS2013 can't synthesize a move
constructor. So the fix here is to add one explicitly to the
JumpThreadingPass class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272607 91177308-0d34-0410-b5e6-96231b3b80d8
This follows the approach in r263208 (for GVN) pretty closely:
- move the bulk of the body of the function to the new PM class.
- expose a runImpl method on the new-PM class that takes the IRUnitT and
pointers/references to any analyses and use that to implement the
old-PM class.
- use a private namespace in the header for stuff that used to be file
scope
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272597 91177308-0d34-0410-b5e6-96231b3b80d8
This is a bit gnarly since LVI is maintaining its own cache.
I think this port could be somewhat cleaner, but I'd rather not spend
too much time on it while we still have the old pass hanging around and
limiting how much we can clean things up.
Once the old pass is gone it will be easier (less time spent) to clean
it up anyway.
This is the last dependency needed for porting JumpThreading which I'll
do in a follow-up commit (there's no printer pass for LVI or anything to
test it, so porting a pass that depends on it seems best).
I've been mostly following:
r269370 / D18834 which ported Dependence Analysis
r268601 / D19839 which ported BPI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272593 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.
The bisection is enabled using a new command line option (-opt-bisect-limit). Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit. A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.
The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check. Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute. A new function call has been added for module and SCC passes that behaves in a similar way.
Differential Revision: http://reviews.llvm.org/D19172
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267022 91177308-0d34-0410-b5e6-96231b3b80d8
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264697 91177308-0d34-0410-b5e6-96231b3b80d8
When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes
is currently used to recognize potential loops of which the block is the header and keep the block.
However, the current algorithm fails if the loops' exit condition is evaluated only with volatile
values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested
loop, the loop is collapsed into a single loop which prevent later optimizations from being
applied (e.g., transforming nested loops into simplified forms and loop vectorization).
The patch augments the existing PHI node-based check by adding a pre-test if the BB actually
belongs to a set of loop headers and not eliminating it if yes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264596 91177308-0d34-0410-b5e6-96231b3b80d8