Commit Graph

1112 Commits

Author SHA1 Message Date
Matthew Simpson
0749c8e439 [LV] Scalarize instructions marked scalar after vectorization
This patch ensures that we actually scalarize instructions marked scalar after
vectorization. Previously, such instructions may have been vectorized instead.

Differential Revision: https://reviews.llvm.org/D23889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282418 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 17:08:37 +00:00
Matthew Simpson
653eafda13 [LV] Don't emit unused scalars for uniform instructions
If we identify an instruction as uniform after vectorization, we know that we
should only use the value corresponding to the first vector lane of each unroll
iteration. However, when scalarizing such instructions, we still produce values
for the other vector lanes. This patch prevents us from generating the unused
scalars.

Differential Revision: https://reviews.llvm.org/D24275

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282087 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-21 16:50:24 +00:00
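Editor's sketch for 653eafda13: a toy model, not the LoopVectorize code, of why a uniform instruction needs only one scalar copy per unroll part. The function name `scalarize` and the `Emit` callback are hypothetical stand-ins for the vectorizer's instruction cloning.

```cpp
#include <cassert>
#include <functional>

// Toy model: scalarize one instruction across UF unroll parts of VF lanes.
// Emit(Part, Lane) stands in for cloning the instruction for that lane.
// Returns the number of scalar copies produced.
unsigned scalarize(unsigned UF, unsigned VF, bool IsUniform,
                   const std::function<void(unsigned, unsigned)> &Emit) {
  assert(VF > 0 && "vectorization factor must be positive");
  // A uniform instruction only needs the value for lane 0 of each part.
  unsigned Lanes = IsUniform ? 1 : VF;
  unsigned Count = 0;
  for (unsigned Part = 0; Part < UF; ++Part)
    for (unsigned Lane = 0; Lane < Lanes; ++Lane) {
      Emit(Part, Lane);
      ++Count;
    }
  return Count;
}
```

With UF = 2 and VF = 4, the uniform case produces 2 copies instead of 8.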
Matthew Simpson
8f80e9f11f [LV] Rename "Width" to "Lane" (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-21 16:09:23 +00:00
Elena Demikhovsky
305b3f3b5a [Loop Vectorizer] Consecutive memory access - fixed and simplified
Amended consecutive memory access detection in Loop Vectorizer.
Load/Store were not handled properly without preceding GEP instruction.

Differential Revision: https://reviews.llvm.org/D20789



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281853 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-18 13:56:08 +00:00
Elena Demikhovsky
868c6d1643 [Loop vectorizer] Simplified GEP cloning. NFC.
Simplified GEP cloning in vectorizeMemoryInstruction().
Added an assertion that checks consecutive GEP, which should have only one loop-variant operand.

Differential Revision: https://reviews.llvm.org/D24557



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281851 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-18 09:22:54 +00:00
Matthew Simpson
be3fec6cf2 [LV] Process pointer IVs with PHINodes in collectLoopUniforms
This patch moves the processing of pointer induction variables in
collectLoopUniforms from the consecutive pointer phase of the analysis to the
phi node phase. Previously, if a pointer induction variable was used by both a
scalarized non-memory instruction as well as a vectorized memory instruction,
we would incorrectly identify the pointer as uniform. Pointer induction
variables should be treated the same as other phi nodes. That is, they are
uniform if all users of the induction variable and induction variable update
are uniform.

Differential Revision: https://reviews.llvm.org/D24511

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281485 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-14 14:47:40 +00:00
Matthew Simpson
470b8e4d54 [LV] Clean up uniform induction variable analysis (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281368 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-13 19:01:45 +00:00
Matt Arsenault
7474f828e2 LSV: Fix incorrectly increasing alignment
If the unaligned access has a dynamic offset, it may be odd which
would make the adjusted alignment incorrect to use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281110 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 22:20:14 +00:00
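Editor's sketch for 7474f828e2: a small arithmetic illustration, under the assumption that an access address is `Base + Offset`, of why a dynamic offset prevents raising the alignment. The helper names `minAlign` and `provableAlign` are hypothetical and are not the LoadStoreVectorizer's API.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two that divides both A and B (A must be nonzero).
uint64_t minAlign(uint64_t A, uint64_t B) {
  uint64_t Bits = A | B;
  return Bits & (~Bits + 1); // isolate the lowest set bit
}

// Alignment that can be proven for the address Base + Offset.
// OffsetIsConstant == false models a dynamic (runtime) offset.
uint64_t provableAlign(uint64_t BaseAlign, bool OffsetIsConstant,
                       uint64_t Offset, uint64_t KnownAccessAlign) {
  assert(BaseAlign != 0 && (BaseAlign & (BaseAlign - 1)) == 0 &&
         "base alignment must be a power of two");
  if (!OffsetIsConstant)
    return KnownAccessAlign; // the offset may be odd: keep what is known
  return minAlign(BaseAlign, Offset);
}
```

Raising the base alignment to 16 helps when the offset is a known multiple of 8, but proves nothing when the offset is only known at run time.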
Matthew Simpson
97ea58ef28 [LV] Ensure proper handling of multi-use case when collecting uniforms
The test case included in r280979 wasn't checking what it was supposed to be
checking for the predicated store case. Fixing the test revealed that the
multi-use case (when a pointer is used by both vectorized and scalarized memory
accesses) wasn't being handled properly. We can't skip over
non-consecutive-like pointers since they may have looked consecutive-like with
a different memory access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280992 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 21:38:26 +00:00
Matthew Simpson
99a1cd6e48 [LV] Don't mark pointers used by scalarized memory accesses uniform
Previously, all consecutive pointers were marked uniform after vectorization.
However, if a consecutive pointer is used by a memory access that is eventually
scalarized, the pointer won't remain uniform after all. An example is
predicated stores. Even though a predicated store may be consecutive, it will
still be scalarized, making its pointer operand non-uniform.

This patch updates the logic in collectLoopUniforms to consider the cases where
a memory access may be scalarized. If a memory access may be scalarized, its
pointer operand is not marked uniform. The determination of whether a given
memory instruction will be scalarized or not has been moved into a common
function that is used by the vectorizer, cost model, and legality analysis.

Differential Revision: https://reviews.llvm.org/D24271

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280979 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 19:11:07 +00:00
Peter Collingbourne
1c13d2b2d2 IR: Remove Value::intersectOptionalDataWith, replace all calls with calls to Instruction::andIRFlags.
The two functions are functionally equivalent.

Differential Revision: https://reviews.llvm.org/D22830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280884 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 23:39:04 +00:00
Justin Lebar
600740bf18 [LSV] Use the original loads' names for the extractelement instructions.
Summary:
LSV replaces multiple adjacent loads with one vectorized load and a
bunch of extractelement instructions.  This patch makes the
extractelement instructions' names match those of the original loads,
for (hopefully) improved readability.

Reviewers: asbirlea, tstellarAMD

Subscribers: arsenm, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 15:49:48 +00:00
Duncan P. N. Exon Smith
0d65f1da0e ADT: Do not inherit from std::iterator in ilist_iterator
Inheriting from std::iterator uses more boiler-plate than manual
typedefs.  Avoid that in both ilist_iterator and
MachineInstrBundleIterator.

This has the side effect of removing ilist_iterator from certain ADL
lookups in namespace std; calls to std::next need to be qualified by
"std::" that didn't have to before.  The one case of this in-tree was
operating on a temporary, so I used the more compact operator++.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280570 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 02:27:35 +00:00
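Editor's sketch for 0d65f1da0e: a minimal forward iterator that declares its member typedefs by hand instead of inheriting from std::iterator. `Node` and `NodeIterator` are hypothetical types for illustration and are unrelated to ilist_iterator's real definition.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical singly-linked node, for illustration only.
struct Node {
  int Value;
  Node *Next;
};

class NodeIterator {
  Node *Ptr;

public:
  // Manual member typedefs; std::iterator_traits picks these up directly.
  using iterator_category = std::forward_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = int *;
  using reference = int &;

  explicit NodeIterator(Node *P = nullptr) : Ptr(P) {}
  reference operator*() const {
    assert(Ptr && "dereferencing the end iterator");
    return Ptr->Value;
  }
  NodeIterator &operator++() {
    Ptr = Ptr->Next;
    return *this;
  }
  NodeIterator operator++(int) {
    NodeIterator Tmp = *this;
    ++*this;
    return Tmp;
  }
  bool operator==(const NodeIterator &O) const { return Ptr == O.Ptr; }
  bool operator!=(const NodeIterator &O) const { return Ptr != O.Ptr; }
};
```

Because the class has no base in namespace std, an unqualified `next(It)` would not find std::next through argument-dependent lookup; callers write `std::next(It)`.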
Chad Rosier
454a60a86c [SLP] Don't pass a global CL option as an argument. NFC.
Differential Revision: https://reviews.llvm.org/D24199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280527 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 19:09:50 +00:00
Matthew Simpson
d768ea4620 [LV] Ensure reverse interleaved group GEPs remain uniform
For uniform instructions, we're only required to generate a scalar value for
the first vector lane of each unroll iteration. Thus, if we have a reverse
interleaved group, computing the member index off the scalar GEP corresponding
to the last vector lane of its pointer operand technically makes the GEP
non-uniform. We should compute the member index off the first scalar GEP
instead.

I've added the updated member index computation to the existing reverse
interleaved group test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280497 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 16:19:22 +00:00
Matthew Simpson
af1a999e07 [LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI)
We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer
have to create temporary vectors with insertelement instructions when handling
pointer induction variables. This case was mistakenly missed from r279649 when
refactoring the other scalarization code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280405 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 19:40:19 +00:00
Matthew Simpson
8dabfb7c14 [LV] Move VectorParts allocation and mapping into PHI widening (NFC)
This patch moves the allocation of VectorParts for PHI nodes into the actual
PHI widening code. Previously, we allocated these VectorParts in
vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon
returning, we would then map the VectorParts in VectorLoopValueMap. This
behavior is problematic for the cases where we only want to generate a scalar
version of a PHI node. For example, if in the future we only generate a scalar
version of an induction variable, we would end up inserting an empty vector
entry into the map once we return to vectorizeBlockInLoop. We now no longer
need to pass VectorParts to the various PHI widening functions, and we can keep
VectorParts allocation as close as possible to the point at which they are
actually mapped in VectorLoopValueMap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280390 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 18:14:27 +00:00
Chad Rosier
8f1c5752a9 [SLP] Update the debug based on Michael's suggestion.
Passing the types/opcode check still doesn't guarantee we'll actually vectorize.
Therefore, just make it clear we're attempting to vectorize.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280263 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 17:41:12 +00:00
Chad Rosier
71255dfcef [SLP] Sink debug after checking for matching types/opcode.
Differential Revision: https://reviews.llvm.org/D24090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280260 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 17:31:09 +00:00
Chad Rosier
5f960b8b75 [SLP] Arguments should be camel case, and start with an upper case letter. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280248 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 15:06:58 +00:00
Alina Sbirlea
74a597b31a [LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions. Resolves PR29148.
Summary:
LSV was using two vector sets (heads and tails) to track pairs of adjacent positions to vectorize.
A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions
in heads(H) and tails(T) match, which is not the case if there are multiple tails for the same head.

e.g.:
i1: store a[0]
i2: store a[1]
i3: store a[1]
Leads to:
H: i1
T: i2 i3
Instead of:
H: i1 i1
T: i2 i3
So the positions for instructions that follow i3 will have different indexes in H/T.
This patch resolves PR29148.

This issue also surfaced the fact that if the chain is too long, and TLI
returns a "not-fast" answer, the whole chain will be abandoned for
vectorization, even though a smaller one would be beneficial.
Added a testcase and FIXME for this.

Reviewers: tstellarAMD, arsenm, jlebar

Subscribers: mzolotukhin, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280179 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-30 23:53:59 +00:00
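Editor's sketch for 74a597b31a: a toy reproduction of the head/tail index mismatch, using plain integers as instruction ids. `Chains`, `pushUnique`, and `collectPairs` are hypothetical names; the set-like behavior is modeled with a deduplicating push, which is an assumption about how the original containers behaved.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct Chains {
  std::vector<int> Heads, Tails;
};

// Set-like insertion: keeps insertion order but drops duplicates.
static void pushUnique(std::vector<int> &V, int X) {
  if (std::find(V.begin(), V.end(), X) == V.end())
    V.push_back(X);
}

// Collect (head, tail) pairs either with set semantics or as plain vectors.
Chains collectPairs(const std::vector<std::pair<int, int>> &Pairs,
                    bool Deduplicate) {
  Chains C;
  for (const auto &P : Pairs) {
    if (Deduplicate) {
      pushUnique(C.Heads, P.first);
      pushUnique(C.Tails, P.second);
    } else {
      C.Heads.push_back(P.first);
      C.Tails.push_back(P.second);
    }
  }
  return C;
}
```

For the pairs (i1, i2) and (i1, i3), the deduplicating variant yields one head and two tails, so index k in Heads no longer corresponds to index k in Tails; the plain vectors stay aligned.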
Michael Kuperstein
67a6032e65 [LoopVectorizer] Predicate instructions in blocks with several incoming edges
We don't need to limit predication to blocks that have a single incoming
edge, we just need to use the right mask.
This fixes PR30172.

Differential Revision: https://reviews.llvm.org/D24009


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280148 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-30 20:22:21 +00:00
Duncan P. N. Exon Smith
1d79fff9e6 ADT: Give ilist<T>::reverse_iterator a handle to the current node
Reverse iterators to doubly-linked lists can be simpler (and cheaper)
than std::reverse_iterator.  Make it so.

In particular, change ilist<T>::reverse_iterator so that it is *never*
invalidated unless the node it references is deleted.  This matches the
guarantees of ilist<T>::iterator.

(Note: MachineBasicBlock::iterator is *not* an ilist iterator, but a
MachineInstrBundleIterator<MachineInstr>.  This commit does not change
MachineBasicBlock::reverse_iterator, but it does update
MachineBasicBlock::reverse_instr_iterator.  See note at end of commit
message for details on bundle iterators.)

Given the list (with the Sentinel showing twice for simplicity):

     [Sentinel] <-> A <-> B <-> [Sentinel]

the following is now true:
 1. begin() represents A.
 2. begin() holds the pointer for A.
 3. end() represents [Sentinel].
 4. end() holds the pointer for [Sentinel].
 5. rbegin() represents B.
 6. rbegin() holds the pointer for B.
 7. rend() represents [Sentinel].
 8. rend() holds the pointer for [Sentinel].

The changes are #6 and #8.  Here are some properties from the old
scheme (which used std::reverse_iterator):
- rbegin() held the pointer for [Sentinel] and rend() held the pointer
  for A;
- operator*() cost two dereferences instead of one;
- converting from a valid iterator to its valid reverse_iterator
  involved a confusing increment; and
- "RI++->erase()" left RI invalid.  The unintuitive replacement was
  "RI->erase(), RE = end()".

With vector-like data structures these properties are hard to avoid
(since past-the-beginning is not a valid pointer), and don't impose a
real cost (since there's still only one dereference, and all iterators
are invalidated on erase).  But with lists, this was a poor design.

Specifically, the following code (which obviously works with normal
iterators) now works with ilist::reverse_iterator as well:

    for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;)
      fooThatMightRemoveArgFromList(*RI++);

Converting between iterator and reverse_iterator for the same node uses
the getReverse() function.

    reverse_iterator iterator::getReverse();
    iterator reverse_iterator::getReverse();

Why doesn't iterator <=> reverse_iterator conversion use constructors?

In order to catch and update old code, reverse_iterator does not even
have an explicit conversion from iterator.  It wouldn't be safe because
there would be no reasonable way to catch all the bugs from the changed
semantic (see the changes at call sites that are part of this patch).

Old code used this API:

    std::reverse_iterator::reverse_iterator(iterator);
    iterator std::reverse_iterator::base();

Here's how to update from old code to new (that incorporates the
semantic change), assuming I is an ilist<>::iterator and RI is an
ilist<>::reverse_iterator:

            [Old]         ==>          [New]
    reverse_iterator(I)       (--I).getReverse()
    reverse_iterator(I)         ++I.getReverse()
  --reverse_iterator(I)           I.getReverse()
    reverse_iterator(++I)         I.getReverse()
          RI.base()          (--RI).getReverse()
          RI.base()            ++RI.getReverse()
        --RI.base()              RI.getReverse()
      (++RI).base()              RI.getReverse()
  delete &*RI, RE = end()         delete &*RI++
  RI->erase(), RE = end()         RI++->erase()

=======================================
Note: bundle iterators are out of scope
=======================================

MachineBasicBlock::iterator, also known as
MachineInstrBundleIterator<MachineInstr>, is a wrapper to represent
MachineInstr bundles.  The idea is that each operator++ takes you to the
beginning of the next bundle.  Implementing a sane reverse iterator for
this is harder than ilist.  Here are the options:
- Use std::reverse_iterator<MBB::i>.  Store a handle to the beginning of
  the next bundle.  A call to operator*() runs a loop (usually
  operator--() will be called 1 time, for unbundled instructions).
  Increment/decrement just works.  This is the status quo.
- Store a handle to the final node in the bundle.  A call to operator*()
  still runs a loop, but it iterates one time fewer (usually
  operator--() will be called 0 times, for unbundled instructions).
  Increment/decrement just works.
- Make the ilist_sentinel<MachineInstr> *always* store that it's the
  sentinel (instead of just in asserts mode).  Then the bundle iterator
  can sniff the sentinel bit in operator++().

I initially tried implementing the end() option as part of this commit,
but updating iterator/reverse_iterator conversion call sites was
error-prone.  I have a WIP series of patches that implements the final
option.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280032 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-30 00:13:12 +00:00
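Editor's sketch for 1d79fff9e6: the old scheme's awkwardness can be seen with the standard library alone. The example below uses std::list and std::reverse_iterator (not ilist) to erase elements while walking backwards; the function name `eraseEvensBackwards` is hypothetical.

```cpp
#include <iterator>
#include <list>

// Erase every even element while walking a std::list backwards.
// A std::reverse_iterator stores the position one past the element it
// denotes, so erasing requires a round trip through base().
void eraseEvensBackwards(std::list<int> &L) {
  for (auto RI = L.rbegin(); RI != L.rend();) {
    if (*RI % 2 == 0) {
      // std::next(RI).base() is the forward iterator to the element that
      // RI currently denotes.
      auto After = L.erase(std::next(RI).base());
      // Rebuild the reverse iterator; it now denotes the element that
      // preceded the erased one.
      RI = std::list<int>::reverse_iterator(After);
    } else {
      ++RI;
    }
  }
}
```

The conversion through base() and back is the kind of bookkeeping that a reverse iterator holding a handle to its own node makes unnecessary.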
Chad Rosier
2f811bdf6d [SLP] Return a boolean value for these static helpers. NFC.
Differential Revision: https://reviews.llvm.org/D24008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280020 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 22:09:51 +00:00
Matthew Simpson
1254de0c5c [LV] Move insertelement sequence after scalar definitions
After r279649 when getting a vector value from VectorLoopValueMap, we create an
insertelement sequence on-demand if the value has been scalarized instead of
vectorized. We previously inserted this insertelement sequence before the
value's first vector user. However, this insert location is problematic if that
user is the phi node of a first-order recurrence. With this patch, we move the
insertelement sequence after the last scalar instruction we created when
scalarizing the value. Thus, the value's vector definition in the new loop will
immediately follow its scalar definitions. This should fix PR30183.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280001 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 20:14:04 +00:00
Matthew Simpson
428e79c2bb [LV] Unify vector and scalar maps
This patch unifies the data structures we use for mapping instructions from the
original loop to their corresponding instructions in the new loop. Previously,
we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap.
WidenMap maintained the vector values each instruction from the old loop was
represented with, and ScalarIVMap maintained the scalar values each scalarized
induction variable was represented with. With this patch, all values created
for the new loop are maintained in VectorLoopValueMap.

The change allows for several simplifications. Previously, when an instruction
was scalarized, we had to insert the scalar values into vectors in order to
maintain the mapping in WidenMap. Then, if a user of the scalarized value was
also scalar, we had to extract the scalar values from the temporary vector we
created. We now avoid these unnecessary scalar-to-vector-to-scalar conversions.
If a scalarized value is used by a scalar instruction, the scalar value is used
directly. However, if the scalarized value is needed by a vector instruction,
we generate the needed insertelement instructions on-demand.

A common idiom in several locations in the code (including the scalarization
code), is to first get the vector values an instruction from the original loop
maps to, and then extract a particular scalar value. This patch adds
getScalarValue for this purpose alongside getVectorValue as an interface into
VectorLoopValueMap. These functions work together to return the requested
values if they're available or to produce them if they're not.

The mapping has also been made less permissive. Entries can be added to
VectorLoopValueMap with the new initVector and initScalar functions.
getVectorValue has been modified to return a constant reference to the mapped
entries.

There's no real functional change with this patch; however, in some cases we
will generate slightly different code. For example, instead of an insertelement
sequence following the definition of an instruction, it will now precede the
first use of that instruction. This can be seen in the test case changes.

Differential Revision: https://reviews.llvm.org/D23169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279649 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 18:23:17 +00:00
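Editor's sketch for 428e79c2bb: a toy model of a single map that keeps scalar and vector forms of a value and packs scalars into a vector only on demand. `ToyValueMap` is hypothetical and uses strings in place of IR values; it borrows the names initScalar, initVector, getScalarValue, and getVectorValue from the commit message but does not reproduce their real signatures, and it omits the reverse direction (extracting scalars from a vector).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

class ToyValueMap {
  std::map<std::string, std::vector<std::string>> Scalars; // one per lane
  std::map<std::string, std::string> Vectors;
  unsigned InsertsEmitted = 0;

public:
  void initScalar(const std::string &Key, std::vector<std::string> Lanes) {
    assert(!Scalars.count(Key) && "scalar entry already initialized");
    Scalars[Key] = std::move(Lanes);
  }
  void initVector(const std::string &Key, std::string Vec) {
    assert(!Vectors.count(Key) && "vector entry already initialized");
    Vectors[Key] = std::move(Vec);
  }
  // Scalar users read the lane directly; no vector round trip.
  const std::string &getScalarValue(const std::string &Key,
                                    unsigned Lane) const {
    return Scalars.at(Key).at(Lane);
  }
  // Vector users get the vector; scalars are packed the first time only.
  const std::string &getVectorValue(const std::string &Key) {
    auto It = Vectors.find(Key);
    if (It != Vectors.end())
      return It->second;
    // Models one insertelement per lane.
    InsertsEmitted += static_cast<unsigned>(Scalars.at(Key).size());
    return Vectors[Key] = "pack(" + Key + ")";
  }
  unsigned insertsEmitted() const { return InsertsEmitted; }
};
```

A value that is only ever used by scalar instructions never triggers the packing step in this model.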
Gil Rapaport
568c0d2c94 [Loop Vectorizer] Support predication of div/rem
div/rem instructions in basic blocks that require predication currently prevent
vectorization. This patch extends the existing mechanism for predicating stores
to handle other instructions and leverages it to predicate divs and rems.

Differential Revision: https://reviews.llvm.org/D22918


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279620 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 11:37:57 +00:00
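Editor's sketch for 568c0d2c94: a toy model of a predicated, scalarized division, where the divide executes only in lanes whose mask bit is set. The function name `predicatedDiv` is hypothetical, and filling inactive lanes with 0 is an assumption made for the example; it is not a claim about what the vectorizer places in those lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Divide lane by lane, but only where Mask is true. Inactive lanes never
// execute the division, so a zero divisor there is harmless.
std::vector<int> predicatedDiv(const std::vector<int> &A,
                               const std::vector<int> &B,
                               const std::vector<bool> &Mask) {
  assert(A.size() == B.size() && A.size() == Mask.size());
  std::vector<int> Result(A.size(), 0); // placeholder for inactive lanes
  for (std::size_t Lane = 0; Lane < A.size(); ++Lane)
    if (Mask[Lane])
      Result[Lane] = A[Lane] / B[Lane];
  return Result;
}
```

An unpredicated vector divide over the same inputs would evaluate every lane, including one with a zero divisor.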
Matthew Simpson
35b8916582 [SLP] Avoid signed integer overflow
The test case included with r279125 exposed an existing signed integer
overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together
with other costs, such as getReductionCost.

This patch removes the possibility of assigning a cost of INT_MAX. Since we
were previously using INT_MAX as an indicator for "should not vectorize", we
now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable"
before computing a cost.

This patch adds a run-line to the test case used for r279125 that ensures we
don't vectorize. Previously, this line would vectorize the test case by chance
due to undefined behavior in the cost calculation.

Differential Revision: https://reviews.llvm.org/D23723

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279562 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-23 20:48:50 +00:00
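Editor's sketch for 35b8916582: a minimal illustration of replacing an INT_MAX sentinel with an explicit check made before any costs are summed. The function `shouldVectorize` and its parameters are hypothetical, and the profitability rule (total cost below the negated threshold) is an assumption made for the example.

```cpp
// Decide whether to vectorize from a precomputed flag and two costs.
// Checking the flag first means no sentinel such as INT_MAX ever enters the
// sum, so adding a positive reduction cost to it can no longer overflow
// (signed overflow is undefined behavior in C++).
bool shouldVectorize(bool TreeIsTinyAndNotFullyVectorizable, int TreeCost,
                     int ReductionCost, int Threshold) {
  if (TreeIsTinyAndNotFullyVectorizable)
    return false; // explicit "should not vectorize" answer
  return TreeCost + ReductionCost < -Threshold;
}
```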
Matthew Simpson
fcc016a4ea Reapply "[SLP] Initialize VectorizedValue when gathering"
The test case included in r279125 exposed existing undefined behavior in the
SLP vectorizer that it did not introduce. This patch reapplies the original
patch, but modifies the test case to avoid hitting the undefined behavior. This
allows us to close PR28330 while keeping the UBSan bot happy. The undefined
behavior the original test uncovered will be addressed in a follow-on patch.

Reference: https://llvm.org/bugs/show_bug.cgi?id=28330

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279370 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-20 14:49:02 +00:00
Matthew Simpson
4848b39daa [SLP] Add command line option for minimum tree size (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279369 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-20 14:10:06 +00:00
Vitaly Buka
914b867b4c Revert "[SLP] Initialize VectorizedValue when gathering" to fix ubsan bot.
This reverts commit r279125.

https://reviews.llvm.org/D23410

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279363 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-20 07:09:39 +00:00
Benjamin Kramer
7393ed54be [LoopVectorize] Don't copy std::vector in for-range loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279233 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-19 12:44:24 +00:00
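Editor's sketch for 7393ed54be: the general pattern behind the fix, shown on a vector of vectors. The function names are hypothetical and the code is not taken from LoopVectorize.

```cpp
#include <cstddef>
#include <vector>

// Sum all elements, binding each inner vector by const reference.
// Writing `for (auto G : Groups)` instead would copy every inner vector.
long sumAll(const std::vector<std::vector<int>> &Groups) {
  long Total = 0;
  for (const auto &G : Groups)
    for (int X : G)
      Total += X;
  return Total;
}

// Returns true if each loop variable is the stored element itself,
// i.e. no copy was made on the way into the loop body.
bool visitsInPlace(const std::vector<std::vector<int>> &Groups) {
  std::size_t I = 0;
  for (const auto &G : Groups)
    if (&G != &Groups[I++])
      return false;
  return true;
}
```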
Matthew Simpson
3989d7e56d [SLP] Initialize VectorizedValue when gathering
We abort building vectorizable trees in some cases (e.g., if the maximum
recursion depth is reached, if the region size is too large, etc.). If this
happens for a reduction, we can be left with a root entry that needs to be
gathered. For these cases, we need to make sure we actually set VectorizedValue to
the resulting vector.

This patch ensures we properly set VectorizedValue, and it also ensures the
insertelement sequence generated for the gathers is inserted at the correct
location.

Reference: https://llvm.org/bugs/show_bug.cgi?id=28330
Differential Revision: https://reviews.llvm.org/D23410

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279125 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-18 19:50:32 +00:00
Tim Shen
87032a5e0c [LV] Move LoopBodyTraits to a better place, and add comment for simplifying LoopBlocksTraversal. NFC.
Summary: I later (after r278573) found that LoopIterator.h overlaps with LoopBodyTraits. It's good to use LoopBodyTraits because a *Traits struct is algorithm independent.

Reviewers: anemet, nadav, mkuper

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23529

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278996 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 22:20:07 +00:00
Justin Lebar
06c11279fd [LSV] Use a set rather than an ArraySlice at the end of getVectorizablePrefix. NFC
Summary: This avoids a small O(n^2) loop.

Reviewers: asbirlea

Subscribers: mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D23473

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278581 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-13 00:04:12 +00:00
Justin Lebar
50724be200 [LSV] Use OrderedBasicBlock instead of rolling it ourselves. NFC
Summary:
In getVectorizablePrefix, this is less efficient (because we have to
iterate over the BB twice), but boy is it simpler.  Given how much
trouble we've had here, I think the simplicity gain is worthwhile.

In reorder(), this is actually more efficient, as
DominatorTree::dominates iterates over the BB from the beginning when
the two instructions are in the same BB.

Reviewers: asbirlea

Subscribers: arsenm, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278580 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-13 00:04:08 +00:00
Tim Shen
9e8ae09eb8 [LoopVectorize] Detect loops in the innermost loop before creating InnerLoopVectorizer
InnerLoopVectorizer shouldn't handle a loop with cycles inside the loop
body, even if that cycle isn't a natural loop.

Fixes PR28541.

Differential Revision: https://reviews.llvm.org/D22952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278573 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-12 22:47:13 +00:00
David Majnemer
2d62ce6ee8 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278469 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-12 03:55:06 +00:00
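Editor's sketch for 2d62ce6ee8: the shape of a range-based containment helper. `isContained` below is a hypothetical stand-in written with the standard library; it is meant to illustrate the idea behind a helper like is_contained, not to reproduce LLVM's definition.

```cpp
#include <algorithm>
#include <iterator>

// True if Value occurs anywhere in the range R.
// Replaces the pattern: std::find(R.begin(), R.end(), Value) != R.end()
template <typename Range, typename T>
bool isContained(const Range &R, const T &Value) {
  return std::find(std::begin(R), std::end(R), Value) != std::end(R);
}
```

Because the helper uses std::begin and std::end, it accepts both standard containers and built-in arrays.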
David Majnemer
975248e4fb Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278433 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 22:21:41 +00:00
David Majnemer
dc9c737666 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278417 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 21:15:00 +00:00
Matthew Simpson
67914cbf05 [SLP] Make RecursionMaxDepth a command line option (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278343 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:28:45 +00:00
Benjamin Kramer
284030ab2c Move helpers into anonymous namespaces. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277916 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-06 11:13:10 +00:00
Michael Kuperstein
e788186982 [LV, X86] Be more optimistic about vectorizing shifts.
Shifts with a uniform but non-constant count were considered very expensive to
vectorize, because the splat of the uniform count and the shift would tend to
appear in different blocks. That made the splat invisible to ISel, and we'd
scalarize the shift at codegen time.

Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
are able to select the appropriate vector shifts. This updates the cost model
to take this into account by making shifts by a uniform count cheap again.

Differential Revision: https://reviews.llvm.org/D23049


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-04 22:48:03 +00:00
Alina Sbirlea
4ebfadfe76 LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments.
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.

Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures:
- x86 failing tests either did not have the right target, or the right alignment.
- NVPTX failing tests did not have the right alignment.
- AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info
  considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks
  for a maximum size allowed and relies on the next condition checking for %4 for correctness.
  This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list).

Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure),
so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context.

Reviewers: arsenm, jlebar, tstellarAMD

Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277735 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-04 16:38:44 +00:00
Gil Rapaport
3a37cd24d7 [Loop Vectorizer] Move store-predication into its own function, remove obsolete comment (NFC)
Differential Revision: https://reviews.llvm.org/D23013


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277595 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-03 13:23:43 +00:00
Wei Mi
ba9543ccae [LoopVectorize] Change comment for isOutOfScope in collectLoopUniforms, NFC
Update comment for isOutOfScope and add a testcase for uniform value being used
out of scope.

Differential Revision: https://reviews.llvm.org/D23073


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277515 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-02 20:27:49 +00:00
Matthew Simpson
155b8551c6 [LV] Generate both scalar and vector integer induction variables
This patch enables the vectorizer to generate both scalar and vector versions
of an integer induction variable for a given loop. Previously, we only
generated a scalar induction variable if we knew all its users were going to be
scalar. Otherwise, we generated a vector induction variable. In the case of a
loop with both scalar and vector users of the induction variable, we would
generate the vector induction variable and extract scalar values from it for
the scalar users. With this patch, we now generate both versions of the
induction variable when there are both scalar and vector users and select which
version to use based on whether the user is scalar or vector.

Differential Revision: https://reviews.llvm.org/D22869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277474 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-02 15:25:16 +00:00
Matthew Simpson
90b7f569d8 [LV] Untangle the concepts of uniform and scalar
This patch refactors the logic in collectLoopUniforms and
collectValuesToIgnore, untangling the concepts of "uniform" and "scalar". It
adds isScalarAfterVectorization alongside isUniformAfterVectorization to
distinguish the two. Known scalar values include those that are uniform,
getelementptr instructions that won't be vectorized, and induction variables
and induction variable update instructions whose users are all known to be
scalar.

This patch includes the following functional changes:

- In collectLoopUniforms, we mark uniform the pointer operands of interleaved
  accesses. Although non-consecutive, these pointers are treated like
  consecutive pointers during vectorization.

- In collectValuesToIgnore, we insert a value into VecValuesToIgnore if it
  isScalarAfterVectorization rather than isUniformAfterVectorization. This
  differs from the previous functionality in that we now add getelementptr
  instructions that will not be vectorized into VecValuesToIgnore.

This patch also removes the ValuesNotWidened set used for induction variable
scalarization since, after the above changes, it is now equivalent to
isScalarAfterVectorization.

Differential Revision: https://reviews.llvm.org/D22867

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277460 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-02 14:29:41 +00:00
Benjamin Kramer
df988869b8 [LoadStoreVectorizer] Don't use a linear walk for an existence check in a SmallPtrSet
No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277436 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-02 09:35:17 +00:00
Matthew Simpson
8a44831abe [LV] Move isGatherOrScatterLegal into LoopVectorizationLegality (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277376 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-01 20:11:25 +00:00