Commit Graph

257 Commits

Author SHA1 Message Date
David Majnemer
e64e7c4634 SROA: Don't insert instructions before a PHI
SROA may decide that it needs to insert a bitcast and would set it's
insertion point before a PHI.  This will create an invalid module
right quick.

Instead, choose the first insertion point in the basic block that holds
our PHI.

This fixes PR20822.

Differential Revision: http://reviews.llvm.org/D5141

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 21:20:14 +00:00
Jingyue Wu
8be5600f0a [SROA] Fold a PHI node if all its incoming values are the same
Summary:
Fixes PR20425.

During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.

Test Plan: Added three more tests in phi-and-select.ll

Reviewers: nlewycky, eliben, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D4659

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216299 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 22:45:57 +00:00
Reid Kleckner
2c0e02e21b SROA: Handle a case of store size being smaller than allocation size
In this case, we are creating an x86_fp80 slice for a union from C where
the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes,
and that's just fine. We can't, however, use regular loads and stores to
access the slice, because the store size is only 10 bytes / 80 bits.
Instead, use memcpy and memset.

Fixes PR18726.

Reviewed By: chandlerc

Differential Revision: http://reviews.llvm.org/D5012

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 00:09:56 +00:00
Craig Topper
431bdfc4c1 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216158 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 05:55:13 +00:00
Craig Topper
db77b82ed5 Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."
Getting a weird buildbot failure that I need to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 00:24:38 +00:00
Craig Topper
f06c7072c2 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 23:47:00 +00:00
Owen Anderson
d4748bbd49 Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In
this case, the code path dealing with vector promotion was missing the explicit
checks for lifetime intrinsics that were present on the corresponding integer
promotion path.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215148 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-07 21:07:35 +00:00
Hal Finkel
2c7c54c86c AA metadata refactoring (introduce AAMDNodes)
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).

This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-24 12:16:19 +00:00
Hal Finkel
a739834446 Allow isDereferenceablePointer to look through some bitcasts
isDereferenceablePointer should not give up upon encountering any bitcast. If
we're casting from a pointer to a larger type to a pointer to a small type, we
can continue by examining the bitcast's operand. This missing capability
was noted in a comment in the function.

In order for this to work, isDereferenceablePointer now takes an optional
DataLayout pointer (essentially all callers already had such a pointer
available). Most code uses isDereferenceablePointer though
isSafeToSpeculativelyExecute (which already took an optional DataLayout
pointer), and to enable the LICM test case, LICM needs to actually provide its DL
pointer to isSafeToSpeculativelyExecute (which it was not doing previously).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212686 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-10 05:27:53 +00:00
Duncan P. N. Exon Smith
0dee67560f SROA: Only split loads on byte boundaries
r199771 accidently broke the logic that makes sure that SROA only splits
load on byte boundaries.  If such a split happens, some bits get lost
when reassembling loads of wider types, causing data corruption.

Move the width check up to reject such splits early, avoiding the
corruption.  Fixes PR19250.

Patch by: Björn Steinbrink <bsteinbr@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211082 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-17 00:19:35 +00:00
Craig Topper
8d7221ccf5 [C++] Use 'nullptr'. Transforms edition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207196 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-25 05:29:35 +00:00
Chandler Carruth
7962dbdc65 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Transforms/...
edition.

This one is tricky for two reasons. We again have a couple of passes
that define something else before the includes as well. I've sunk their
name macros with the DEBUG_TYPE.

Also, InstCombine contains headers that need DEBUG_TYPE, so now those
headers #define and #undef DEBUG_TYPE around their code, leaving them
well formed modular headers. Fixing these headers was a large motivation
for all of these changes, as "leaky" macros of this form are hard on the
modules implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 02:55:47 +00:00
Chandler Carruth
36b699f2b1 [C++11] Add range based accessors for the Use-Def chain of a Value.
This requires a number of steps.
1) Move value_use_iterator into the Value class as an implementation
   detail
2) Change it to actually be a *Use* iterator rather than a *User*
   iterator.
3) Add an adaptor which is a User iterator that always looks through the
   Use to the User.
4) Wrap these in Value::use_iterator and Value::user_iterator typedefs.
5) Add the range adaptors as Value::uses() and Value::users().
6) Update *all* of the callers to correctly distinguish between whether
   they wanted a use_iterator (and to explicitly dig out the User when
   needed), or a user_iterator which makes the Use itself totally
   opaque.

Because #6 requires churning essentially everything that walked the
Use-Def chains, I went ahead and added all of the range adaptors and
switched them to range-based loops where appropriate. Also because the
renaming requires at least churning every line of code, it didn't make
any sense to split these up into multiple commits -- all of which would
touch all of the same lies of code.

The result is still not quite optimal. The Value::use_iterator is a nice
regular iterator, but Value::user_iterator is an iterator over User*s
rather than over the User objects themselves. As a consequence, it fits
a bit awkwardly into the range-based world and it has the weird
extra-dereferencing 'operator->' that so many of our iterators have.
I think this could be fixed by providing something which transforms
a range of T&s into a range of T*s, but that *can* be separated into
another patch, and it isn't yet 100% clear whether this is the right
move.

However, this change gets us most of the benefit and cleans up
a substantial amount of code around Use and User. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-09 03:16:01 +00:00
Chandler Carruth
67f6bf70d2 [Layering] Move InstVisitor.h into the IR library as it is pretty
obviously coupled to the IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203064 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-06 03:23:41 +00:00
Chandler Carruth
f4ec8bfaec [Layering] Move DebugInfo.h into the IR library where its implementation
already lives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203046 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-06 00:46:21 +00:00
Chandler Carruth
7cf9764966 [Layering] Move DIBuilder.h into the IR library where its implementation
already lives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203038 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-06 00:22:06 +00:00
Craig Topper
7b62be28cb [C++11] Add 'override' keyword to virtual methods that override their base class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202953 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-05 09:10:37 +00:00
Chandler Carruth
0550e93e89 [C++11] Remove the completely unnecessary requirement on SetVector's
remove_if that its predicate is adaptable. We don't actually need this,
we can write a generic adapter for any predicate.

This lets us remove some very wrong std::function usages. We should
never be using std::function for predicates to algorithms. This incurs
an *indirect* call overhead for every evaluation of the predicate, and
makes it very hard to inline through.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202742 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-03 19:28:52 +00:00
Chandler Carruth
3dfabcb249 [C++11] Add two range adaptor views to User: operands and
operand_values. The first provides a range view over operand Use
objects, and the second provides a range view over the Value*s being
used by those operands.

The naming is "STL-style" rather than "LLVM-style" because we have
historically named iterator methods STL-style, and range methods seem to
have far more in common with their iterator counterparts than with
"normal" APIs. Feel free to bikeshed on this one if you want, I'm happy
to change these around if people feel strongly.

I've switched code in SROA and LCG to exercise these mostly to ensure
they work correctly -- we don't really have an easy way to unittest this
and they're trivial.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202687 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-03 10:42:58 +00:00
Benjamin Kramer
a4f0aad951 [C++11] Replace llvm::tie with std::tie.
The old implementation is no longer needed in C++11.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202644 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-02 13:30:33 +00:00
Benjamin Kramer
d628f19f5d [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202636 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-02 12:27:27 +00:00
Benjamin Kramer
ee5e607355 Now that we have C++11, turn simple functors into lambdas and remove a ton of boilerplate.
No intended functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202588 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-01 11:47:00 +00:00
Chandler Carruth
2b442bffcb [SROA] Use the correct index integer size in GEPs through non-default
address spaces.

This isn't really a correctness issue (the values are truncated) but its
much cleaner.

Patch by Matt Arsenault!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202252 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 10:08:16 +00:00
Chandler Carruth
5b95cec37c [SROA] Teach SROA how to handle pointers from address spaces other than
the default.

Based on the patch by Matt Arsenault, D1764!

I switched one place to use the more direct pointer type to compute the
desired address space, and I reworked the memcpy rewriting section to
reflect significant refactorings that this patch helped inspire.

Thanks to several of the folks who helped review and improve the patch
as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202247 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 08:25:02 +00:00
Chandler Carruth
38e90e3de1 [SROA] Split the alignment computation complete for the memcpy rewriting
to work independently for the slice side and the other side.

This allows us to only compute the minimum of the two when we actually
rewrite to a memcpy that needs to take the minimum, and preserve higher
alignment for one side or the other when rewriting to loads and stores.

This fix was inspired by seeing the result of some refactoring that
makes addrspace handling better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 07:29:54 +00:00
Chandler Carruth
c1c37734ad [SROA] The original refactoring inspired by the addrspace patch in
D1764, which in turn set off the other refactorings to make
'getSliceAlign()' a sensible thing.

There are two possible inputs to the required alignment of a memory
transfer intrinsic: the alignment constraints of the source and the
destination. If we are *only* introducing a (potentially new) offset
onto one side of the transfer, we don't need to consider the alignment
constraints of the other side. Use this to simplify the logic feeding
into alignment computation for unsplit transfers.

Also, hoist the clamp of the magical zero alignment for these intrinsics
to the more customary one alignment early. This lets several other
conditions melt away.

No functionality changed. There is a further improvement this exposes
which *will* change functionality, but that's arriving in a separate
patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202232 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 05:33:36 +00:00
Chandler Carruth
dd080790eb [SROA] Yet another slight refactoring that simplifies an API in the
rewriting logic: don't pass custom offsets for the adjusted pointer to
the new alloca.

We always passed NewBeginOffset here. Sometimes we spelled it
BeginOffset, but only when they were in fact equal. Whats worse, the API
is set up so that you can't reasonably call it with anything else -- it
assumes that you're passing it an offset relative to the *original*
alloca that happens to fall within the new one. That's the whole point
of NewBeginOffset, it's the clamped beginning offset.

No functionality changed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 05:12:43 +00:00
Chandler Carruth
c9166f098c [SROA] Simplify the computing of alignment: we only ever need the
alignment of the slice being rewritten, not any arbitrary offset.

Every caller is really just trying to compute the alignment for the
whole slice, never for some arbitrary alignment. They are also just
passing a type when they have one to see if we can skip an explicit
alignment in the IR by using the type's alignment. This makes for a much
simpler interface.

Another refactoring inspired by the addrspace patch for SROA, although
only loosely related.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202230 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 05:02:19 +00:00
Chandler Carruth
f28c057c42 [SROA] Use NewOffsetBegin in the unsplit case for memset merely for
consistency with memcpy rewriting, and fix a latent bug in the alignment
management for memset.

The alignment issue is that getAdjustedAllocaPtr is computing the
*relative* offset into the new alloca, but the alignment isn't being set
to the relative offset, it was using the the absolute offset which is
into the old alloca.

I don't think its possible to write a test case that actually reaches
this code where the resulting alignment would be observably different,
but the intent was clearly to use the relative offset within the new
alloca.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202229 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 04:45:24 +00:00
Chandler Carruth
f11f7a49b9 [SROA] Use the members for New{Begin,End}Offset in the rewrite helpers
rather than passing them as arguments.

While I generally prefer actual arguments, in this case the readability
loss is substantial. By using members we avoid repeatedly calculating
the offsets, and once we're using members it is useful to ensure that
those names *always* refer to the original-alloca-relative new offset
for a rewritten slice.

No functionality changed. Follow-up refactoring, all toward getting the
address space patch merged.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202228 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 04:25:04 +00:00
Chandler Carruth
abd2555e36 [SROA] Compute the New{Begin,End}Offset values once for each alloca
slice being rewritten.

We had the same code scattered across most of the visits. Instead,
compute the new offsets and the slice size once when we start to visit
a particular slice, and use the member variables from then on. This
reduces quite a bit of code duplication.

No functionality changed. Refactoring inspired to make it easier to
apply the address space patch to SROA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202227 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 04:20:00 +00:00
Chandler Carruth
50bc165c54 [SROA] Fix PR18615 with some long overdue simplifications to the bounds
checking in SROA.

The primary change is to just rely on uge for checking that the offset
is within the allocation size. This removes the explicit checks against
isNegative which were terribly error prone (including the reversed logic
that led to PR18615) and prevented us from supporting stack allocations
larger than half the address space.... Ok, so maybe the latter isn't
*common* but it's a silly restriction to have.

Also, we used to try to support a PHI node which loaded from before the
start of the allocation if any of the loaded bytes were within the
allocation. This doesn't make any sense, we have never really supported
loading or storing *before* the allocation starts. The simplified logic
just doesn't care.

We continue to allow loading past the end of the allocation in part to
support cases where there is a PHI and some loads are larger than others
and the larger ones reach past the end of the allocation. We could solve
this a different and more conservative way, but I'm still somewhat
paranoid about this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202224 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-26 03:14:14 +00:00
Chandler Carruth
495e40121b [SROA] Add an off-by-default *strict* inbounds check to SROA. I had SROA
implemented this way a long time ago and due to the overwhelming bugs
that surfaced, moved to a much more relaxed variant. Richard Smith would
like to understand the magnitude of this problem and it seems fairly
harmless to keep some flag-controlled logic to get the extremely strict
behavior here. I'll remove it if it doesn't prove useful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 21:24:45 +00:00
Rafael Espindola
57edc9d4ff Make DataLayout a plain object, not a pass.
Instead, have a DataLayoutPass that holds one. This will allow parts of LLVM
don't don't handle passes to also use DataLayout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 17:30:31 +00:00
Chandler Carruth
a2ff58121a [SROA] Use the original load name with the SROA-prefixed IRB rather than
just "load". This helps avoid pointless de-duping with order-sensitive
numbers as we already have unique names from the original load. It also
makes the resulting IR quite a bit easier to read.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202140 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 11:21:48 +00:00
Chandler Carruth
f5701e4282 [SROA] Thread the ability to add a pointer-specific name prefix through
the pointer adjustment code. This is the primary code path that creates
totally new instructions in SROA and being able to lump them based on
the pointer value's name for which they were created causes
*significantly* fewer name collisions and general noise in the debug
output. This is particularly significant because it is making it much
harder to track down instability in the output of SROA, as name
de-duplication is a totally harmless form of instability that gets in
the way of seeing real problems.

The new fancy naming scheme tries to dig out the root "pre-SROA" name
for pointer values and associate that all the way through the pointer
formation instructions. Digging out the root is important to prevent the
multiple iterative rounds of SROA from just layering too much cruft on
top of cruft here. We already track the layers of SROAs iteration in the
alloca name prefix. We don't need to duplicate it here.

Should have no functionality change, and shouldn't have any really
measurable impact on NDEBUG builds, as most of the complex logic is
debug-only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202139 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 11:19:56 +00:00
Chandler Carruth
19899632f7 [SROA] Rather than copying the logic for building a name prefix into the
PHI-pointer builder, just copy the builder and clobber the obvious
fields.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202136 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 11:12:04 +00:00
Chandler Carruth
b59c39e520 [SROA] Simplify some of the logic to dig out the old pointer value by
using OldPtr more heavily. Lots of this code was written before the
rewriter had an OldPtr member setup ahead of time. There are already
asserts in place that should ensure this doesn't change any
functionality.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 11:08:02 +00:00
Chandler Carruth
10b2920851 [SROA] Adjust to new clang-format style.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202134 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 11:07:58 +00:00
Chandler Carruth
64d5fd932b [SROA] Fix a *glaring* bug in r202091: you have to actually *write*
the break statement, not just think it to yourself....

No idea how this worked at all, much less survived most bots, my
bootstrap, and some bot bootstraps!

The Polly one didn't survive, and this was filed as PR18959. I don't
have a reduced test case and honestly I'm not seeing the need. What we
probably need here are better asserts / debug-build behavior in
SmallPtrSet so that this madness doesn't make it so far.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202129 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 09:45:27 +00:00
Alexey Samsonov
31c756b374 Silence GCC warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 07:56:00 +00:00
Chandler Carruth
eb1b5ba880 [SROA] Add a debugging tool which shuffles the slices sequence prior to
sorting it. This helps uncover latent reliance on the original ordering
which aren't guaranteed to be preserved by std::sort (but often are),
and which are based on the use-def chain orderings which also aren't
(technically) guaranteed.

Only available in C++11 debug builds, and behind a flag to prevent noise
at the moment, but this is generally useful so figured I'd put it in the
tree rather than keeping it out-of-tree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202106 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 03:59:29 +00:00
Chandler Carruth
2516e7fb22 [SROA] Use a more direct way of determining whether we are processing
the destination operand or source operand of a memmove.

It so happens that it was impossible for SROA to try to rewrite
self-memmove where the operands are *identical*, because either such
a think is volatile (and we don't rewrite) or it is non-volatile, and we
don't even register it as a use of the alloca.

However, making the 'IsDest' test *rely* on this subtle fact is... Very
confusing for the reader. We should use the direct and readily available
test of the Use* which gives us concrete information about which operand
is being rewritten.

No functionality changed, I hope! ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202103 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 03:50:14 +00:00
Chandler Carruth
c537759a7f [SROA] Fix another instability in SROA with respect to the slice
ordering.

The fundamental problem that we're hitting here is that the use-def
chain ordering is *itself* not a stable thing to be relying on in the
rewriting for SROA. Further, we use a non-stable sort over the slices to
arrange them based on the section of the alloca they're operating on.
With a debugging STL implementation (or different implementations in
stage2 and stage3) this can cause stage2 != stage3.

The specific aspect of this problem fixed in this commit deals with the
rewriting and load-speculation around PHIs and Selects. This, like many
other aspects of the use-rewriting in SROA, is really part of the
"strong SSA-formation" that is doen by SROA where it works very hard to
canonicalize loads and stores in *just* the right way to satisfy the
needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the
same alloca, we test that loads downstream of the select are
speculatable around it twice. If only one of the operands to the select
needs to be rewritten, then if we get lucky we rewrite that one first
and the select is immediately speculatable. This can cause the order of
operand visitation, and thus the order of slices to be rewritten, to
change an alloca from promotable to non-promotable and vice versa.

The fix is to defer all of the speculation until *after* the rewrite
phase is done. Once we've rewritten everything, we can accurately test
for whether speculation will work (once, instead of twice!) and the
order ceases to matter.

This also happens to simplify the other subtlety of speculation -- we
need to *not* speculate anything unless the result of speculating will
make the alloca fully promotable by mem2reg. I had a previous attempt at
simplifying this, but it was still pretty horrible.

There is actually already a *really* nice test case for this in
basictest.ll, but on multiple STL implementations and inputs, we just
got "lucky". Fortunately, the test case is very small and we can
essentially build it in exactly the opposite way to get reasonable
coverage in both directions even from normal STL implementations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202092 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-25 00:07:09 +00:00
Rafael Espindola
034b8f9d31 Trivial cleanup: reuse existing variable.
Extracted while trying to understand http://llvm-reviews.chandlerc.com/D1764.

Patch by Matt Arsenault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-14 19:02:01 +00:00
Paul Robinson
2684ddd72e Disable most IR-level transform passes on functions marked 'optnone'.
Ideally only those transform passes that run at -O0 remain enabled,
in reality we get as close as we reasonably can.
Passes are responsible for disabling themselves, it's not the job of
the pass manager to do it for them.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200892 91177308-0d34-0410-b5e6-96231b3b80d8
2014-02-06 00:07:05 +00:00
Chandler Carruth
c04f2c99ab [SROA] Fix a bug which could cause the common type finding to return
inconsistent results for different orderings of alloca slices. The
fundamental issue is that it is just always a mistake to return early
from this function. There is no effective early exit to leverage. This
patch stops trynig to do so and simplifies the code a bit as
a consequence.

Original diagnosis and patch by James Molloy with some name tweaks by me
in part reflecting feedback from Duncan Smith on the mailing list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-21 23:16:05 +00:00
Chandler Carruth
e1a5243053 Fix a really nasty SROA bug with how we handled out-of-bounds memcpy
intrinsics.

Reported on the list by Evan with a couple of attempts to fix, but it
took a while to dig down to the root cause. There are two overlapping
bugs here, both centering around the circumstance of discovering
a memcpy operand which is known to be completely outside the bounds of
the alloca.

First, we need to kill the *other* side of the memcpy if it was added to
this alloca. Otherwise we'll factor it into our slicing and try to
rewrite it even though we know for a fact that it is dead. This is made
more tricky because we can visit the sides in either order. So we have
to both kill the other side and skip instructions marked as dead. The
latter really should be goodness in every case, but here is a matter of
correctness.

Second, we need to actually remove the *uses* of the alloca by the
memcpy when queuing it for later deletion. Otherwise it may still be
using the alloca when we go to promote it (if the rewrite re-uses the
existing alloca instruction). Do this by factoring out the
use-clobbering used when for nixing a Phi argument and re-using it
across the operands of a to-be-deleted instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199590 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-19 12:16:54 +00:00
Chandler Carruth
7f2eff792a [PM] Split DominatorTree into a concrete analysis result object which
can be used by both the new pass manager and the old.

This removes it from any of the virtual mess of the pass interfaces and
lets it derive cleanly from the DominatorTreeBase<> template. In turn,
tons of boilerplate interface can be nuked and it turns into a very
straightforward extension of the base DominatorTree interface.

The old analysis pass is now a simple wrapper. The names and style of
this split should match the split between CallGraph and
CallGraphWrapperPass. All of the users of DominatorTree have been
updated to match using many of the same tricks as with CallGraph. The
goal is that the common type remains the resulting DominatorTree rather
than the pass. This will make subsequent work toward the new pass
manager significantly easier.

Also in numerous places things became cleaner because I switched from
re-running the pass (!!! mid way through some other passes run!!!) to
directly recomputing the domtree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-13 13:07:17 +00:00
Chandler Carruth
56e1394c88 [cleanup] Move the Dominators.h and Verifier.h headers into the IR
directory. These passes are already defined in the IR library, and it
doesn't make any sense to have the headers in Analysis.

Long term, I think there is going to be a much better way to divide
these matters. The dominators code should be fully separated into the
abstract graph algorithm and have that put in Support where it becomes
obvious that evn Clang's CFGBlock's can use it. Then the verifier can
manually construct dominance information from the Support-driven
interface while the Analysis library can provide a pass which both
caches, reconstructs, and supports a nice update API.

But those are very long term, and so I don't want to leave the really
confusing structure until that day arrives.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199082 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-13 09:26:24 +00:00
Alp Toker
395f7c2505 Add missed cleanup from r198456
All other uses of this macro in LLVM/clang have been moved to the function
definition so follow suite (and the usage advice) here too for consistency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198516 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-04 22:47:48 +00:00
Nico Weber
c3d3f0c696 Add a LLVM_DUMP_METHOD macro.
The motivation is to mark dump methods as used in debug builds so that they can
be called from lldb, but to not do so in release builds so that they can be
dead-stripped.

There's lots of potential follow-up work suggested in the thread
"Should dump methods be LLVM_ATTRIBUTE_USED only in debug builds?" on cfe-dev,
but everyone seems to agreen on this subset.

Macro name chosen by fair coin toss.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198456 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-03 22:53:37 +00:00
Chandler Carruth
ed1951e79f Fix an issue where SROA computed different results based on the relative
order of slices of the alloca which have exactly the same size and other
properties. This was found by a perniciously unstable sort
implementation used to flush out buggy uses of the algorithm.

The fundamental idea is that findCommonType should return the best
common type it can find across all of the slices in the range. There
were two bugs here previously:

1) We would accept an integer type smaller than a byte-width multiple,
   and if there were different bit-width integer types, we would accept
   the first one. This caused an actual failure in the testcase updated
   here when the sort order changed.
2) If we found a bad combination of types or a non-load, non-store use
   before an integer typed load or store we would bail, but if we found
   the integere typed load or store, we would use it. The correct
   behavior is to always use an integer typed operation which covers the
   partition if one exists.

While a clever debugging sort algorithm found problem #1 in our existing
test cases, I have no useful test case ideas for #2. I spotted in by
inspection when looking at this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195118 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-19 09:03:18 +00:00
Benjamin Kramer
7f80b75b96 Drop spurious handle in comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191172 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-22 11:24:58 +00:00
Benjamin Kramer
1ce1525ed4 SROA: Handle casts involving vectors of pointers and integer scalars.
SROA wants to convert any types of equivalent widths but it's not possible to
convert vectors of pointers to an integer scalar with a single cast. As a
workaround we add a bitcast to the corresponding int ptr type first. This type
of cast used to be an edge case but has become common with SLP vectorization.
Fixes PR17271.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191143 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-21 20:36:04 +00:00
Nick Lewycky
6c1fa7caae Revert r187191, which broke opt -mem2reg on the testcases included in PR16867.
However, opt -O2 doesn't run mem2reg directly so nobody noticed until r188146
when SROA started sending more things directly down the PromoteMemToReg path.

In order to revert r187191, I also revert dependent revisions r187296, r187322
and r188146. Fixes PR16867. Does not add the testcases from that PR, but both
of them should get added for both mem2reg and sroa when this revert gets
unreverted.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188327 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-13 22:51:58 +00:00
Chandler Carruth
5b854f1ea5 Re-instate r187323 which fast-tracks promotable allocas as soon as the
SROA-based analysis has enough information. This should work now that
both mem2reg *and* the SSAUpdater-based AllocaPromoter have been updated
to be able to promote the types of allocas that the SROA analysis
detects.

I've included tests for the AllocaPromoter that were only possible to
write once we fast-tracked promotable allocas without rewriting them.
This includes a test both for r187347 and r188145.

Original commit log for r187323:
"""
Now that mem2reg understands how to cope with a slightly wider set of uses of
an alloca, we can pre-compute promotability while analyzing an alloca for
splitting in SROA. That lets us short-circuit the common case of a bunch of
trivially promotable allocas. This cuts 20% to 30% off the run time of SROA for
typical frontend-generated IR sequneces I'm seeing. It gets the new SROA to
within 20% of ScalarRepl for such code. My current benchmark for these numbers
is PR15412, but it fits the general pattern of IR emitted by Clang so it should
be widely applicable.
"""

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188146 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-11 02:17:11 +00:00
Chandler Carruth
37508bb842 Finish fixing the SSAUpdater-based AllocaPromoter strategy in SROA to cope with
the more general set of patterns that are now handled by mem2reg and that we
can detect quickly while doing SROA's initial analysis. Notably, this allows it
to promote through no-op bitcast and GEP sequences. A core part of the
SSAUpdater approach is the ability to test whether a particular instruction is
part of the set being promoted. Testing this becomes significantly more complex
in the world where the operand to every load and store isn't the alloca itself.
I ended up using the approach of walking up the def-chain until we find the
alloca. I benchmarked this against keeping a set of pointer operands and
keeping a set of the loads and stores we care about, and this one seemed faster
although the difference was very small.

No test case yet because currently the rewriting always "fixes" the inputs to
not require this. The next patch which re-enables early promotion of easy cases
in SROA will include a test case that specifically exercises this aspect of the
alloca promoter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188145 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-11 01:56:15 +00:00
Chandler Carruth
3c7a446059 Reformat some bits of AllocaPromoter and simplify the name and type of
our visiting datastructures in the AllocaPromoter/SSAUpdater path of
SROA. Also shift the order if clears around to be more consistent.

No functionality changed here, this is just a cleanup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188144 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-11 01:03:18 +00:00
Chandler Carruth
e1361ec325 Teach the AllocaPromoter which is wrapped around the SSAUpdater
infrastructure to do promotion without a domtree the same smarts about
looking through GEPs, bitcasts, etc., that I just taught mem2reg about.
This way, if SROA chooses to promote an alloca which still has some
noisy instructions this code can cope with them.

I've not used as principled of an approach here for two reasons:
1) This code doesn't really need it as we were already set up to zip
   through the instructions used by the alloca.
2) I view the code here as more of a hack, and hopefully a temporary one.

The SSAUpdater path in SROA is a real sore point for me. It doesn't make
a lot of architectural sense for many reasons:
- We're likely to end up needing the domtree anyways in a subsequent
  pass, so why not compute it earlier and use it.
- In the future we'll likely end up needing the domtree for parts of the
  inliner itself.
- If we need to we could teach the inliner to preserve the domtree. Part
  of the re-work of the pass manager will allow this to be very powerful
  even in large SCCs with many functions.
- Ultimately, computing a domtree has gotten significantly faster since
  the original SSAUpdater-using code went into ScalarRepl. We no longer
  use domfrontiers, and much of domtree is lazily done based on queries
  rather than eagerly.
- At this point keeping the SSAUpdater-based promotion saves a total of
  0.7% on a build of the 'opt' tool for me. That's not a lot of
  performance given the complexity!

So I'm leaving this a bit ugly in the hope that eventually we just
remove all of this nonsense.

I can't even readily test this because this code isn't reachable except
through SROA. When I re-instate the patch that fast-tracks allocas
already suitable for promotion, I'll add a testcase there that failed
before this change. Before that, SROA will fix any test case I give it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187347 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-29 09:06:53 +00:00
Chandler Carruth
65f12f1d05 Temporarily revert r187323 until I update SSAUpdater to match mem2reg.
I forgot that we had two totally independent things here. :: sigh ::

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187327 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-28 09:05:49 +00:00
Chandler Carruth
cea60aff34 Now that mem2reg understands how to cope with a slightly wider set of
uses of an alloca, we can pre-compute promotability while analyzing an
alloca for splitting in SROA. That lets us short-circuit the common case
of a bunch of trivially promotable allocas. This cuts 20% to 30% off the
run time of SROA for typical frontend-generated IR sequneces I'm seeing.
It gets the new SROA to within 20% of ScalarRepl for such code. My
current benchmark for these numbers is PR15412, but it fits the general
pattern of IR emitted by Clang so it should be widely applicable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187323 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-28 08:27:12 +00:00
Chandler Carruth
6c3a95dab5 Thread DataLayout through the callers and into mem2reg. This will be
useful in a subsequent patch, but causes an unfortunate amount of noise,
so I pulled it out into a separate patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187322 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-28 06:43:11 +00:00
Chandler Carruth
89934cbd34 Don't use all the #ifdefs to hide the stats counters and instead rely on
their being optimized out in debug mode. Realistically, this just isn't
going to be the slow part anyways. This also fixes unused variable
warnings that are breaking LLD build bots. =/ I didn't see these at
first, and kept losing track of the fact that they were broken.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187297 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-27 10:17:49 +00:00
Chandler Carruth
b7f27824fb Fix a problem I introduced in r187029 where we would over-eagerly
schedule an alloca for another iteration in SROA. This only showed up
with a mixture of promotable and unpromotable selects and phis. Added
a test case for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187031 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-24 12:12:17 +00:00
Chandler Carruth
9b3b286247 Fix PR16687 where we were incorrectly promoting an alloca that had
pending speculation for a phi node. The problem here is that we were
using growth of the specluation set as an indicator of whether
speculation would occur, and if the phi node is already in the set we
don't see it grow. This is a symptom of the fact that this signal is
a total hack.

Unfortunately, I couldn't really come up with a non-hacky way of
signaling that promotion remains valid *after* speculation occurs, such
that we only speculate when all else looks good for promotion. In the
end, I went with at least a much more explicit approach of doing the
work of queuing inside the phi and select processing and setting
a preposterously named flag to convey that we're in the special state of
requiring speculating before promotion.

Thanks to Richard Trieu and Nick Lewycky for the excellent work reducing
a testcase for this from a pretty giant, nasty assert in a big
application. =] The testcase was excellent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187029 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-24 09:47:28 +00:00
Nick Lewycky
1579a0f8a6 Remove extraneous null statement. No functionality change!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186893 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-22 23:38:27 +00:00
Jakub Staszak
85f6cbd1a5 OldPtr is llvm::Instruction. Remove unneeded cast<>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186880 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-22 22:10:43 +00:00
Benjamin Kramer
916cde6416 SROA: Microoptimization: Remove dead entries first, then sort.
While there replace an explicit struct with std::mem_fun.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186761 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-20 08:38:34 +00:00
Chandler Carruth
47042bcc26 Cleanup the stats counters for the new implementation. These actually
count the right things and have the right names.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186667 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-19 10:57:36 +00:00
Chandler Carruth
fbf2a02622 Fix another assert failure very similar to PR16651's test case. This
test case came from Benjamin and found the parallel bug in the vector
promotion code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186666 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-19 10:57:32 +00:00
Chandler Carruth
c09228dba3 Try to move to a more reasonable set of naming conventions given the new
implementation of the SROA algorithm. We were using the term 'partition'
in many places that no longer ever represented an actual partition, but
rather just an arbitrary slice of an alloca.

No functionality change intended here. Mostly just renaming of types,
functions, variables, and rewording of comments. Several comments were
rewritten to make a lot more sense in the new structure of things.

The stats are still weird and not reflective of how this really works.
I'll fix those up in a separate patch as it is a touch more semantic of
a change...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186659 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-19 09:13:58 +00:00
Chandler Carruth
df5ed3f642 A long overdue cleanup in SROA to use 'DL' instead of 'TD' for the
DataLayout variables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186656 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-19 07:21:28 +00:00
Chandler Carruth
8f0a1cecc5 Fix PR16651, an assert introduced in my recent re-work of the innards of
SROA.

The crux of the issue is that now we track uses of a partition of the
alloca in two places: the iterators over the partitioning uses and the
previously collected split uses vector. We weren't accounting for the
fact that the split uses might invalidate integer widening in ways other
than due to their width (in this case due to being volatile).

Further reduced testcase added to the tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186655 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-19 07:12:23 +00:00
Chandler Carruth
f7c45ce3f5 Reapply r186316 with a fix for one bug where the code could walk off the
end of a vector. This was found with ASan. I've had one other report of
a crasher, but thus far been unable to reproduce the crash. It may well
be fixed with this version, and if not I'd like to get more information
from the build bots about what is happening.

See r186316 for the full commit log for the new implementation of the
SROA algorithm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186565 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-18 07:15:00 +00:00
Chandler Carruth
ebf72b3301 Revert r186316 while I track down an ASan failure and an assert from
a bot.

This reverts the commit which introduced a new implementation of the
fancy SROA pass designed to reduce its overhead. I'll skip the huge
commit log here, refer to r186316 if you're looking for how this all
works and why it works that way.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186332 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-15 17:36:21 +00:00
Chandler Carruth
ea2e90df15 Reimplement SROA yet again. Same fundamental principle, but a totally
different core implementation strategy.

Previously, SROA would build a relatively elaborate partitioning of an
alloca, associate uses with each partition, and then rewrite the uses of
each partition in an attempt to break apart the alloca into chunks that
could be promoted. This was very wasteful in terms of memory and compile
time because regardless of how complex the alloca or how much we're able
to do in breaking it up, all of the datastructure work to analyze the
partitioning was done up front.

The new implementation attempts to form partitions of the alloca lazily
and on the fly, rewriting the uses that make up that partition as it
goes. This has a few significant effects:
1) Much simpler data structures are used throughout.
2) No more double walk of the recursive use graph of the alloca, only
   walk it once.
3) No more complex algorithms for associating a particular use with
   a particular partition.
4) PHI and Select speculation is simplified and happens lazily.
5) More precise information is available about a specific use of the
   alloca, removing the need for some side datastructures.

Ultimately, I think this is a much better implementation. It removes
about 300 lines of code, but arguably removes more like 500 considering
that some code grew in the process of being factored apart and cleaned
up for this all to work.

I've re-used as much of the old implementation as possible, which
includes the lion's share of code in the form of the rewriting logic.
The interesting new logic centers around how the uses of a partition are
sorted, and split into actual partitions.

Each instruction using a pointer derived from the alloca gets
a 'Partition' entry. This name is totally wrong, but I'll do a rename in
a follow-up commit as there is already enough churn here. The entry
describes the offset range accessed and the nature of the access. Once
we have all of these entries we sort them in a very specific way:
increasing order of begin offset, followed by whether they are
splittable uses (memcpy, etc), followed by the end offset or whatever.
Sorting by splittability is important as it simplifies the collection of
uses into a partition.

Once we have these uses sorted, we walk from the beginning to the end
building up a range of uses that form a partition of the alloca.
Overlapping unsplittable uses are merged into a single partition while
splittable uses are broken apart and carried from one partition to the
next. A partition is also introduced to bridge splittable uses between
the unsplittable regions when necessary.

I've looked at the performance PRs fairly closely. PR15471 no longer
will even load (the module is invalid). Not sure what is up there.
PR15412 improves by between 5% and 10%, however it is nearly impossible
to know what is holding it up as SROA (the entire pass) takes less time
than reading the IR for that test case. The analysis takes the same time
as running mem2reg on the final allocas. I suspect (without much
evidence) that the new implementation will scale much better however,
and it is just the small nature of the test cases that makes the changes
small and noisy. Either way, it is still simpler and cleaner I think.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@186316 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-15 10:30:19 +00:00
Craig Topper
365ef0b197 Use SmallVectorImpl::iterator/const_iterator instead of SmallVector to avoid specifying the vector size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@185540 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-03 15:07:05 +00:00
Bob Wilson
a1fe2948ed Fix SROA to avoid unnecessary scalar conversions for 1-element vectors.
When a 1-element vector alloca is promoted, a store instruction can often be
rewritten without converting the value to a scalar and using an insertelement
instruction to stuff it into the new alloca.  This patch just adds a check
to skip that conversion when it is unnecessary.  This turns out to be really
important for some ARM Neon operations where <1 x i64> is used to get around
the fact that i64 is not a legal type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184870 91177308-0d34-0410-b5e6-96231b3b80d8
2013-06-25 19:09:50 +00:00
Nadav Rotem
fee6969463 SROA: Generate selects instead of shuffles when blending values because this is the cannonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180875 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-01 19:53:30 +00:00
Benjamin Kramer
d81a0dee5b SROA: Don't crash on a select with two identical operands.
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179982 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-21 17:48:39 +00:00
Chandler Carruth
05c7e7f99d Fix PR15674 (and PR15603): a SROA think-o.
The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.

All of the existing tests continue to pass, and I've added a test from
the PR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178974 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 11:47:54 +00:00
Jakub Staszak
6f7becfe23 Minor cleanups. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177837 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-24 09:56:28 +00:00
Chandler Carruth
d647ec5239 [SROA] Prefix names using a custom IRBuilder inserter.
The key part of this is ensuring that name prefixes remain in a Twine
form until we get to a point where we can nuke them under NDEBUG. This
is tricky using the old APIs as they played fast and loose with Twine,
which is prone to serious error. The inserter is much cleaner as it is
actually in the call stack leading to the setName call, and so has
a good opportunity to prepend the prefix.

This matters more than you might imagine because most runs over an
alloca find a single partition, and rewrite 3 or 4 instructions
referring to it. As a consequence doing this lazily and exclusively with
Twine allows the optimizer to delete more of it and shaves another 2% to
3% off of the release build's SROA run time for PR15412. I also think
the APIs are cleaner, and the use of Twine is more reliable, so
I consider it a win-win despite the churn required to reach this state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177631 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-21 09:52:18 +00:00
Chandler Carruth
fd060a94a4 Fix a silly search-and-replace goof with r177495 that only broke
non-release builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177498 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 07:40:56 +00:00
Chandler Carruth
05c6d0b16f [SROA] Don't preserve the IR names in release builds.
This is espcially important because the new SROA pass goes to great
lengths to provide helpful names for debugging, and as a consequence
they can become very slow to render.

Good for between 5% and 15% of the SROA runtime on some slow test cases
such as the one in PR15412.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177495 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 07:30:36 +00:00
Chandler Carruth
30ee9c2093 Move the endif to the correct line so we don't have warnings about
unused statistics variables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177494 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 06:47:00 +00:00
Chandler Carruth
f2b649da0f Introduce some new statistics to help track the exact behavior of the
new SROA pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177493 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-20 06:30:46 +00:00
Chandler Carruth
5e8da1773c Mark internal classes as POD-like to get better behavior out of
SmallVector and DenseMap.

This speeds up SROA by 25% on PR15412.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177259 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-18 08:36:46 +00:00
Chandler Carruth
41b55f5556 PR14972: SROA vs. GVN exposed a really bad bug in SROA.
The fundamental problem is that SROA didn't allow for overly wide loads
where the bits past the end of the alloca were masked away and the load
was sufficiently aligned to ensure there is no risk of page fault, or
other trapping behavior. With such widened loads, SROA would delete the
load entirely rather than clamping it to the size of the alloca in order
to allow mem2reg to fire. This was exposed by a test case that neatly
arranged for GVN to run first, widening certain loads, followed by an
inline step, and then SROA which miscompiles the code. However, I see no
reason why this hasn't been plaguing us in other contexts. It seems
deeply broken.

Diagnosing all of the above took all of 10 minutes of debugging. The
really annoying aspect is that fixing this completely breaks the pass.
;] There was an implicit reliance on the fact that no loads or stores
extended past the alloca once we decided to rewrite them in the final
stage of SROA. This was used to encode information about whether the
loads and stores had been split across multiple partitions of the
original alloca. That required threading explicit tracking of whether
a *use* of a partition is split across multiple partitions.

Once that was done, another problem arose: we allowed splitting of
integer loads and stores iff they were loads and stores to the entire
alloca. This is a really arbitrary limitation, and splitting at least
some integer loads and stores is crucial to maximize promotion
opportunities. My first attempt was to start removing the restriction
entirely, but currently that does Very Bad Things by causing *many*
common alloca patterns to be fully decomposed into i8 operations and
lots of or-ing together to produce larger integers on demand. The code
bloat is terrifying. That is still the right end-goal, but substantial
work must be done to either merge partitions or ensure that small i8
values are eagerly merged in some other pass. Sadly, figuring all this
out took essentially all the time and effort here.

So the end result is that we allow splitting only when the load or store
at least covers the alloca. That ensures widened loads and stores don't
hurt SROA, and that we don't rampantly decompose operations more than we
have previously.

All of this was already fairly well tested, and so I've just updated the
tests to cover the wide load behavior. I can add a test that crafts the
pass ordering magic which caused the original PR, but that seems really
brittle and to provide little benefit. The fundamental problem is that
widened loads should Just Work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177055 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-14 11:32:24 +00:00
Jakub Staszak
11687d4982 Keep coding stanard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176661 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-07 22:20:06 +00:00
Jakub Staszak
9497005d38 Don't create IRBuilder if we can return from the method earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176660 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-07 22:10:33 +00:00
Jakub Staszak
bcff7b7734 Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175568 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-19 22:17:58 +00:00
Jakub Staszak
50573b1c27 Minor cleanups. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175567 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-19 22:14:45 +00:00
Jakub Staszak
4263ed33a7 Remove unneeded #includes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175565 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-19 22:06:38 +00:00
Jakub Staszak
ba6f722d6a Fix typos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175562 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-19 22:02:21 +00:00
Edwin Vane
f1af1feeee Fixing warnings revealed by gcc release build
Fixed set-but-not-used warnings.

Reviewer: gribozavr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173810 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 17:42:24 +00:00
Chandler Carruth
0b8c9a80f2 Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171366 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 11:36:10 +00:00
Benjamin Kramer
6c30749583 Add IRBuilder::CreateVectorSplat and use it to simplify code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171349 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-01 19:55:16 +00:00
Benjamin Kramer
0ea4c3d8c0 SROA: Clean up unused assignment warnings from clang's analyzer.
No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171348 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-01 16:13:35 +00:00