analysis of the alloca. We don't need to visit all the users twice for
this. We build up a kill list during the analysis and then just process
it afterward. This recovers the tiny bit of performance lost by moving
to the visitor based analysis system as it removes one entire use-list
walk from mem2reg. In some cases, this is now faster than mem2reg was
previously.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187296 91177308-0d34-0410-b5e6-96231b3b80d8
robust. It now uses an InstVisitor and worklist to actually walk the
uses of the Alloca transitively and detect the pattern which we can
directly promote: loads & stores of the whole alloca and instructions we
can completely ignore.
Also, with this new implementation teach both the predicate for testing
whether we can promote and the promotion engine itself to use the same
code so we no longer have strange divergence between the two code paths.
I've added some silly test cases to demonstrate that we can handle
slightly more degenerate code patterns now. See the below for why this
is even interesting.
Performance impact: roughly 1% regression in the performance of SROA or
ScalarRepl on a large C++-ish test case where most of the allocas are
basically ready for promotion. The reason is because of silly redundant
work that I've left FIXMEs for and which I'll address in the next
commit. I wanted to separate this commit as it changes the behavior.
Once the redundant work in removing the dead uses of the alloca is
fixed, this code appears to be faster than the old version. =]
So why is this useful? Because the previous requirement for promotion
required a *specific* visit pattern of the uses of the alloca to verify:
we *had* to look for no more than 1 intervening use. The end goal is to
have SROA automatically detect when an alloca is already promotable and
directly hand it to the mem2reg machinery rather than trying to
partition and rewrite it. This is a 25% or more performance improvement
for SROA, and a significant chunk of the delta between it and
ScalarRepl. To get there, we need to make mem2reg actually capable of
promoting allocas which *look* promotable to SROA without have SROA do
tons of work to massage the code into just the right form.
This is actually the tip of the iceberg. There are tremendous potential
savings we can realize here by de-duplicating work between mem2reg and
SROA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187191 91177308-0d34-0410-b5e6-96231b3b80d8
helper function. This leaves both trivial cases handled entirely in
helper functions and merely manages the list of allocas to process in
the run method.
The next step will be to handle all of the trivial promotion work prior
to even creating the core class and the subsequent simplifications that
enables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186784 91177308-0d34-0410-b5e6-96231b3b80d8
a single block into the helper routine. This takes advantage of the fact
that we can directly replace uses prior to any store with undef to
simplify matters and unconditionally promote allocas only used within
one block.
I've removed the special handling for the case of no stores existing.
This has no semantic effect but might slow things down. I'll fix that in
a later patch when I refactor this entire thing to be easier to manage
the different cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186783 91177308-0d34-0410-b5e6-96231b3b80d8
handles the general cases.
The hope is to refactor this so that we don't end up building the entire
class for the trivial cases. I also want to lift a lot of the early
pre-processing in the initial segment of run() into a separate routine,
and really none of it needs to happen inside the primary promotion
class.
These routines in particular used none of the actual state in the
promotion class, so they don't really make sense as members.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186781 91177308-0d34-0410-b5e6-96231b3b80d8
This struct is nicely independent of everything else, and we already
needed a foward declaration here. It's simpler to just define it
immediately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186780 91177308-0d34-0410-b5e6-96231b3b80d8
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.
There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.
The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.
I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).
I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171366 91177308-0d34-0410-b5e6-96231b3b80d8
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.
Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169131 91177308-0d34-0410-b5e6-96231b3b80d8
deterministic, replace it with a DenseMap<std::pair<unsigned, unsigned>,
PHINode*> (we already have a map from BasicBlock to unsigned).
<rdar://problem/12541389>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166435 91177308-0d34-0410-b5e6-96231b3b80d8
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.
The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159312 91177308-0d34-0410-b5e6-96231b3b80d8
function. This seems to be about a 1.5% speedup of -scalarrepl on test-suite
with SPEC2000 and SPEC2006.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123731 91177308-0d34-0410-b5e6-96231b3b80d8
checks enabled:
1) Use '<' to compare integers in a comparison function rather than '<='.
2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.
The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123662 91177308-0d34-0410-b5e6-96231b3b80d8
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.
Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123609 91177308-0d34-0410-b5e6-96231b3b80d8