Commit Graph

71 Commits

Author SHA1 Message Date
Amara Emerson
f69362835a Re-commit r302678, fixing PR33053.
The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions
which didn't have a lowering.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303211 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 21:29:22 +00:00
Matthew Simpson
61c3d36e2e Revert 303174, 303176, and 303178
These commits are breaking the bots. Reverting to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303182 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 15:50:30 +00:00
Matthew Simpson
cce9257402 Make test target-specific
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303178 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 15:33:22 +00:00
Hans Wennborg
05f671ecfa Revert r302678 "[AArch64] Enable use of reduction intrinsics."
This caused PR33053.

Original commit message:

> The new experimental reduction intrinsics can now be used, so I'm enabling this
> for AArch64. We will need this for SVE anyway, so it makes sense to do this for
> NEON reductions as well.
>
> The existing code to match shufflevector patterns are replaced with a direct
> lowering of the reductions to AArch64-specific nodes. Tests updated with the
> new, simpler, representation.
>
> Differential Revision: https://reviews.llvm.org/D32247

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303115 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-15 20:59:32 +00:00
Amara Emerson
7c631c8afc [AArch64] Enable use of reduction intrinsics.
The new experimental reduction intrinsics can now be used, so I'm enabling this
for AArch64. We will need this for SVE anyway, so it makes sense to do this for
NEON reductions as well.

The existing code to match shufflevector patterns are replaced with a direct
lowering of the reductions to AArch64-specific nodes. Tests updated with the
new, simpler, representation.

Differential Revision: https://reviews.llvm.org/D32247


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302678 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 15:15:38 +00:00
Anna Thomas
0bd00d7d97 [LV] Fix the vector code generation for first order recurrence
Summary:
In first order recurrences where phi's are used outside the loop,
we should generate an additional vector.extract of the second last element from
the vectorized phi update.
This is because we require the phi itself (which is the value at the second last
iteration of the vector loop) and not the phi's update within the loop.
Also fix the code gen when we just unroll, but don't vectorize.
Fixes PR32396.

Reviewers: mssimpso, mkuper, anemet

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D31979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300238 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-13 18:59:25 +00:00
Anna Thomas
3d57035bd9 [LV] Avoid vectorizing first order recurrence when phi uses are outside loop
In the vectorization of first order recurrence, we vectorize such
that the last element in the vector will be the one extracted to pass into the
scalar remainder loop. However, this is not true when there is a phi (other
than the primary induction variable) is used outside the loop.
In such a case, we need the value from the second last iteration (i.e.
the phi value), not the last iteration (which would be the phi update).
I've added a test case for this. Also see PR32396.

A follow up patch would generate the correct code gen for such cases,
and turn this vectorization on.

Differential Revision: https://reviews.llvm.org/D31910

Reviewers: mssimpso

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299985 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 21:02:00 +00:00
Anna Thomas
c3ff31ab3f [LV] Move first order recurrence test to common folder. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299969 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 18:31:42 +00:00
Matthew Simpson
a9eefb0375 [LV] Make test case more robust
This test case depends on the loop being vectorized without forcing the
vectorization factor. If the profitability ever changes in the future (due to
cost model improvements), the test may no longer work as intended. Instead of
checking the resulting IR, we should just check the instruction costs. The
costs will be computed regardless if vectorization is profitable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299545 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 14:34:13 +00:00
Jonas Paulsson
85dd82a95b [TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.

Tests updates:

Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll

The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.

Review: Hal Finkel
https://reviews.llvm.org/D29540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 06:35:36 +00:00
Matthew Simpson
8ca15c2f22 [LV] Select legal insert point when fixing first-order recurrences
Because IRBuilder performs constant-folding, it's not guaranteed that an
instruction in the original loop map to an instruction in the vector loop. It
could map to a constant vector instead. The handling of first-order recurrences
was incorrectly making this assumption when setting the IRBuilder's insert
point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 18:18:20 +00:00
Matthew Simpson
b9f3ad9d10 [LV] Make the test case for PR30183 less fragile
This patch also renames the PR number the test points to. The previous
reference was PR29559, but that bug was somehow deleted and recreated under
PR30183.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297295 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 17:03:38 +00:00
Matthew Simpson
d20a0daedd [LV] Add missing check labels to tests and reformat
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297294 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 16:55:34 +00:00
Matthew Simpson
201896c9fd [ARM/AArch64] Update costs for interleaved accesses with wide types
After r296750, we're able to match interleaved accesses having types wider than
128 bits. This patch updates the associated TTI costs.

Differential Revision: https://reviews.llvm.org/D29675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-02 15:15:35 +00:00
Matthew Simpson
465f5a16f9 [LV] Considier non-consecutive but vectorizable accesses for VF selection
When computing the smallest and largest types for selecting the maximum
vectorization factor, we currently ignore loads and stores of pointer types if
the memory access is non-consecutive. We do this because such accesses must be
scalarized regardless of vectorization factor, and thus shouldn't be considered
when determining the factor. This patch makes this check less aggressive by
also considering non-consecutive accesses that may be vectorized, such as
interleaved accesses. Because we don't know at the time of the check if an
accesses will certainly be vectorized (this is a cost model decision given a
particular VF), we consider all accesses that can potentially be vectorized.

Differential Revision: https://reviews.llvm.org/D30305

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296747 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-02 13:55:05 +00:00
Karl-Johan Karlsson
29fec22c97 [LoopVectorize] Added address space check when analysing interleaved accesses
Prevent memory objects of different address spaces to be part of
the same load/store groups when analysing interleaved accesses.

This is fixing pr31900.

Reviewers: HaoLiu, mssimpso, mkuper

Reviewed By: mssimpso, mkuper

Subscribers: llvm-commits, efriedma, mzolotukhin

Differential Revision: https://reviews.llvm.org/D29717

This reverts r295042 (re-applies r295038) with an additional fix for the
buildbot problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295858 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 18:37:36 +00:00
Matthew Simpson
06f1a4b52b Reapply "[LV] Extend trunc optimization to all IVs with constant integer steps"
This reapplies commit r294967 with a fix for the execution time regressions
caught by the clang-cmake-aarch64-quick bot. We now extend the truncate
optimization to non-primary induction variables only if the truncate isn't
already free.

Differential Revision: https://reviews.llvm.org/D29847

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295063 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 16:28:32 +00:00
Karl-Johan Karlsson
ad0908f33b Revert "[LoopVectorize] Added address space check when analysing interleaved accesses"
This reverts r295038. The buildbot clang-with-thin-lto-ubuntu failed.
I'm reverting to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295042 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 10:06:16 +00:00
Karl-Johan Karlsson
26a008fb2f [LoopVectorize] Added address space check when analysing interleaved accesses
Prevent memory objects of different address spaces to be part of
the same load/store groups when analysing interleaved accesses.

This is fixing pr31900.

Reviewers: HaoLiu, mssimpso, mkuper

Reviewed By: mssimpso, mkuper

Subscribers: llvm-commits, efriedma, mzolotukhin

Differential Revision: https://reviews.llvm.org/D29717

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 08:14:06 +00:00
Elena Demikhovsky
92cc2185b8 [Loop Vectorizer] Cost-based decision for vectorization form of memory instruction.
Making the cost model selecting between Interleave, GatherScatter or Scalar vectorization form of memory instruction.
The right decision should be done for non-consecutive memory access instrcuctions that may have more than one vectorization solution.

This patch includes the following changes:
- Cost Model calculates the cost of Load/Store vector form and choose the better option between Widening, Interleave, GatherScactter and Scalarization. Cost Model keeps the widening decision.
- Arrays of Uniform and Scalar values are moved from Legality to Cost Model.
- Cost Model collects Uniforms and Scalars per VF. The collection is based on CM decision map of Loadis/Stores vectorization form.
- Vectorization of memory instruction is performed according to the CM decision.

Differential Revision: https://reviews.llvm.org/D27919



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294503 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 19:25:23 +00:00
Matthew Simpson
9e8cc9ccc4 [LV] Add new ARM/AArch64 interleaved access cost model tests (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294342 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 19:34:24 +00:00
Matthew Simpson
85dcf5ecdf [LV] Simplify ARM/AArch64 interleaved access cost model tests (NFC)
This patch removes unneeded instructions from the existing ARM/AArch64
interleaved access cost model tests. I'll be adding a similar set of tests in a
follow-on patch to increase coverage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294336 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 19:17:44 +00:00
Matthew Simpson
3499b6e23b Reapply "[LV] Enable vectorization of loops with conditional stores by default"
This patch reapplies r289863. The original patch was reverted because it
exposed a bug causing the loop vectorizer to crash in the Python runtime on
PPC. The underlying issue was fixed with r289958.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289975 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 19:12:02 +00:00
Chandler Carruth
04bc9fbf17 Revert r289863: [LV] Enable vectorization of loops with conditional
stores by default

This uncovers a crasher in the loop vectorizer on PPC when building the
Python runtime. I'll send the testcase to the review thread for the
original commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289934 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 11:31:39 +00:00
Matthew Simpson
f977c2b26d [LV] Enable vectorization of loops with conditional stores by default
This patch sets the default value of the "-enable-cond-stores-vec" command line
option to "true".

Differential Revision: https://reviews.llvm.org/D27814

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289863 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 20:11:05 +00:00
Matthew Simpson
b6b20e1aa2 [LV] Scalarize operands of predicated instructions
This patch attempts to scalarize the operand expressions of predicated
instructions if they were conditionally executed in the original loop. After
scalarization, the expressions will be sunk inside the blocks created for the
predicated instructions. The transformation essentially performs
un-if-conversion on the operands.

The cost model has been updated to determine if scalarization is profitable. It
compares the cost of a vectorized instruction, assuming it will be
if-converted, to the cost of the scalarized instruction, assuming that the
instructions corresponding to each vector lane will be sunk inside a predicated
block, possibly avoiding execution. If it's more profitable to scalarize the
entire expression tree feeding the predicated instruction, the expression will
be scalarized; otherwise, it will be vectorized. We only consider the cost of
the entire expression to accurately estimate the cost of the required
insertelement and extractelement instructions.

Differential Revision: https://reviews.llvm.org/D26083

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288909 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 15:03:32 +00:00
Dorit Nuzman
3aa311854a Second attempt at r285517.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285568 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-31 13:17:31 +00:00
Dorit Nuzman
6d3c9bdc8f Revert r285517 due to build failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285518 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-30 14:34:57 +00:00
Dorit Nuzman
b10d927158 [LoopVectorize] Make interleaved-accesses analysis less conservative about
possible pointer-wrap-around concerns, in some cases.

Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.

Differential Revision: https://reviews.llvm.org/D25276



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285517 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-30 12:23:26 +00:00
Matthew Simpson
889ff7ba68 [LV] Correct misleading comments in test (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285402 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 14:27:45 +00:00
Matthew Simpson
a913b4aab2 [LV] Account for predicated stores in instruction costs
This patch ensures that we scale the estimated cost of predicated stores by
block probability. This is a follow-on patch for r284123.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284126 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 14:54:31 +00:00
Matthew Simpson
262bc1134d [LV] Avoid rounding errors for predicated instruction costs
This patch modifies the cost calculation of predicated instructions (div and
rem) to avoid the accumulation of rounding errors due to multiple truncating
integer divisions. The calculation for predicated stores will be addressed in a
follow-on patch since we currently don't scale the cost of predicated stores by
block probability.

Differential Revision: https://reviews.llvm.org/D25333

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284123 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 14:19:48 +00:00
Matthew Simpson
1254de0c5c [LV] Move insertelement sequence after scalar definitions
After r279649 when getting a vector value from VectorLoopValueMap, we create an
insertelement sequence on-demand if the value has been scalarized instead of
vectorized. We previously inserted this insertelement sequence before the
value's first vector user. However, this insert location is problematic if that
user is the phi node of a first-order recurrence. With this patch, we move the
insertelement sequence after the last scalar instruction we created when
scalarizing the value. Thus, the value's vector definition in the new loop will
immediately follow its scalar definitions. This should fix PR30183.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280001 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 20:14:04 +00:00
Matthew Simpson
428e79c2bb [LV] Unify vector and scalar maps
This patch unifies the data structures we use for mapping instructions from the
original loop to their corresponding instructions in the new loop. Previously,
we maintained two distinct maps for this purpose: WidenMap and ScalarIVMap.
WidenMap maintained the vector values each instruction from the old loop was
represented with, and ScalarIVMap maintained the scalar values each scalarized
induction variable was represented with. With this patch, all values created
for the new loop are maintained in VectorLoopValueMap.

The change allows for several simplifications. Previously, when an instruction
was scalarized, we had to insert the scalar values into vectors in order to
maintain the mapping in WidenMap. Then, if a user of the scalarized value was
also scalar, we had to extract the scalar values from the temporary vector we
created. We now aovid these unnecessary scalar-to-vector-to-scalar conversions.
If a scalarized value is used by a scalar instruction, the scalar value is used
directly. However, if the scalarized value is needed by a vector instruction,
we generate the needed insertelement instructions on-demand.

A common idiom in several locations in the code (including the scalarization
code), is to first get the vector values an instruction from the original loop
maps to, and then extract a particular scalar value. This patch adds
getScalarValue for this purpose along side getVectorValue as an interface into
VectorLoopValueMap. These functions work together to return the requested
values if they're available or to produce them if they're not.

The mapping has also be made less permissive. Entries can be added to
VectorLoopValue map with the new initVector and initScalar functions.
getVectorValue has been modified to return a constant reference to the mapped
entries.

There's no real functional change with this patch; however, in some cases we
will generate slightly different code. For example, instead of an insertelement
sequence following the definition of an instruction, it will now precede the
first use of that instruction. This can be seen in the test case changes.

Differential Revision: https://reviews.llvm.org/D23169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279649 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 18:23:17 +00:00
Matthew Simpson
570bf9da46 Reapply "[TTI] Refine default cost for interleaved load groups with gaps"
This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272395 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 14:33:30 +00:00
Matthew Simpson
f8ad75dbd3 Revert "[TTI] Refine default cost for interleaved load groups with gaps"
This reverts commit r272385. This commit broke the build. I'm temporarily
reverting to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272391 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 12:41:33 +00:00
Matthew Simpson
8f626cdbc5 [TTI] Refine default cost for interleaved load groups with gaps
This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.

Differential Revision: http://reviews.llvm.org/D20873

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272385 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 11:27:51 +00:00
Matthew Simpson
12427adbc9 [LAA] Rename forwarding conflict detection option (NFC)
This patch renames the option enabling the store-to-load forwarding conflict
detection optimization. This change was requested in the review of D20241.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269668 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-16 17:00:56 +00:00
Matthew Simpson
20ebcfc5d1 [LV] Ensure safe VF for loops with interleaved accesses
The selection of the vectorization factor currently doesn't consider
interleaved accesses. The vectorization factor is based on the maximum safe
dependence distance computed by LAA. However, for loops with interleaved
groups, we should instead base the vectorization factor on the maximum safe
dependence distance divided by the maximum interleave factor of all the
interleaved groups. Interleaved accesses not in a group will be scalarized.

Differential Revision: http://reviews.llvm.org/D20241

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269659 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-16 15:08:20 +00:00
James Molloy
3a22301784 Revert "[VectorUtils] Query number of sign bits to allow more truncations"
This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690.

This reverts commit r268921.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269051 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 12:27:23 +00:00
James Molloy
ec0b9c8745 [VectorUtils] Query number of sign bits to allow more truncations
When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268921 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 14:32:30 +00:00
Matthew Simpson
09e9ded8a1 [LoopUtils, LV] Fix PR27246 (first-order recurrences)
This patch ensures that when we detect first-order recurrences, we reject a phi
node if its previous value is also a phi node. During vectorization the initial
and previous values of the recurrence are shuffled together to create the value
for the current iteration. However, phi nodes are not widened like other
instructions. This fixes PR27246.

Differential Revision: http://reviews.llvm.org/D18971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265983 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-11 19:48:18 +00:00
Silviu Baranga
d8cc816f81 Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
This re-commits r265535 which was reverted in r265541 because it
broke the windows bots. The problem was that we had a PointerIntPair
which took a pointer to a struct allocated with new. The problem
was that new doesn't provide sufficient alignment guarantees.
This pattern was already present before r265535 and it just happened
to work. To fix this, we now separate the PointerToIntPair from the
ExitNotTakenInfo struct into a pointer and a bool.

Original commit message:

Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Reviewers: anemet, mzolotukhin, hfinkel, sanjoy

Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17201



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265786 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-08 14:29:09 +00:00
Silviu Baranga
89e8236bfb Revert r265535 until we know how we can fix the bots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265541 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 14:06:32 +00:00
Silviu Baranga
39fbde60e1 [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV
Summary:
When the backedge taken codition is computed from an icmp, SCEV can
deduce the backedge taken count only if one of the sides of the icmp
is an AddRecExpr. However, due to sign/zero extensions, we sometimes
end up with something that is not an AddRecExpr.

However, we can use SCEV predicates to produce a 'guarded' expression.
This change adds a method to SCEV to get this expression, and the
SCEV predicate associated with it.

In HowManyGreaterThans and HowManyLessThans we will now add a SCEV
predicate associated with the guarded backedge taken count when the
analyzed SCEV expression is not an AddRecExpr. Note that we only do
this as an alternative to returning a 'CouldNotCompute'.

We use new feature in Loop Access Analysis and LoopVectorize to analyze
and transform more loops.

Reviewers: anemet, mzolotukhin, hfinkel, sanjoy

Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265535 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 13:18:26 +00:00
James Molloy
cd309008e2 [VectorUtils] Don't try and truncate PHIs to a smaller bitwidth
We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means.

However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018.

This fixes PR17018.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264852 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-30 10:11:43 +00:00
Matthew Simpson
72b5335cac [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262624 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-03 16:12:01 +00:00
Matthew Simpson
3dd74513a8 [LV] Vectorize first-order recurrences
This patch enables the vectorization of first-order recurrences. A first-order
recurrence is a non-reduction recurrence relation in which the value of the
recurrence in the current loop iteration equals a value defined in the previous
iteration. The load PRE of the GVN pass often creates these recurrences by
hoisting loads from within loops.

In this patch, we add a new recurrence kind for first-order phi nodes and
attempt to vectorize them if possible. Vectorization is performed by shuffling
the values for the current and previous iterations. The vectorization cost
estimate is updated to account for the added shuffle instruction.

Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D16197

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261346 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-19 17:56:08 +00:00
Silviu Baranga
23340531a1 [LV] Add support for insertelt/extractelt processing during type truncation
Summary:
While shrinking types according to the required bits, we can
encounter insert/extract element instructions. This will cause us to
reach an llvm_unreachable statement.

This change adds support for truncating insert/extract element
operations, and adds a regression test.

Reviewers: jmolloy

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260893 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-15 15:38:17 +00:00
James Molloy
c48890e194 [DemandedBits] Revert r249687 due to PR26071
This regresses a test in LoopVectorize, so I'll need to go away and think about how to solve this in a way that isn't broken.

From the writeup in PR26071:

What's happening is that ComputeKnownZeroes is telling us that all bits except the LSB are zero. We're then deciding that only the LSB needs to be demanded from the icmp's inputs.

This is where we're wrong - we're assuming that after simplification the bits that were known zero will continue to be known zero. But they're not - during trivialization the upper bits get changed (because an XOR isn't shrunk), so the icmp fails.

The fault is in demandedbits - its contract does clearly state that a non-demanded bit may either be zero or one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259649 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-03 15:05:06 +00:00