Commit Graph

1386 Commits

Author SHA1 Message Date
Hans Wennborg
1809c66b66 Revert r313771 "[SLP] Vectorize jumbled memory loads."
This broke the buildbots, e.g.
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask'
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Subscribers: mzolotukhin
>
> Reviewed By: ayal
>
> Differential Revision: https://reviews.llvm.org/D36130
>
> Review comments updated accordingly
>
> Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0
>
> Added a TODO for sortLoadAccesses API
>
> Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58
>
> Modified the TODO for sortLoadAccesses API
>
> Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565
>
> Review comment update for using OpdNum to insert the mask in respective location
>
> Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce
>
> Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase
>
> Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313781 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 18:00:03 +00:00
Mohammad Shahid
46e0b67b99 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask'
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Subscribers: mzolotukhin

Reviewed By: ayal

Differential Revision: https://reviews.llvm.org/D36130

Review comments updated accordingly

Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0

Added a TODO for sortLoadAccesses API

Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58

Modified the TODO for sortLoadAccesses API

Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565

Review comment update for using OpdNum to insert the mask in respective location

Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce

Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase

Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313771 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 17:19:57 +00:00
Alexander Kornienko
e1631a5af7 Revert r313736: "[SLP] Vectorize jumbled memory loads."
The revision breaks buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313758 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 14:53:07 +00:00
Mohammad Shahid
0acc54b75c [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

Commit after rebase for patch D36130

Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313736 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 08:18:28 +00:00
Adam Nemet
093624c347 Allow ORE.emit to take a closure to delay building the remark object
In the lambda we are now returning the remark by value so we need to preserve
its type in the insertion operator.  This requires making the insertion
operator generic.

I've also converted a few cases to use the new API.  It seems to work pretty
well.  See the LoopUnroller for a slightly more interesting case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313691 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 23:00:55 +00:00
Sanjay Patel
cfcdb248ec [SLP] clean up for vector store case; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313541 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 16:20:15 +00:00
Chandler Carruth
f405097e96 [SLP] Revert r312791 and other necessary commits, except for TTI and
CostModel.

The original patch added support for horizontal min/max reductions to
the SLP vectorizer.

This patch causes LLVM to miscompile fairly simple signed min
reductions. I have attached a test progrom to http://llvm.org/PR34635
that shows the behavior change after this patch. We found this in a test
for the open source Eigen library, but also in other code.

Unfortunately, the revert is moderately challenging. It required
reverting:
r313042: [SLP] Test with multiple uses of conditional op and wrong parent.
r312853: [SLP] Fix buildbots, NFC.
r312793: [SLP] Fix the warning about paths not returning the value, NFC.
r312791: [SLP] Support for horizontal min/max reduction.

And even then, I had to completely skip reverting the changes to TTI and
CostModel because r312832 rewrote so much of this code. Plus, the cost
modeling changes aren implicated in the miscompile, so they should be
fine and will just not be used until this gets re-introduced.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313409 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 22:23:27 +00:00
Vivek Pandya
18b4c37d1e This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352
It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with
command line options.
The diagnostic handler used to be callback now this patch adds a class
DiagnosticHandler. It has virtual method to provide custom diagnostic handler
and methods to control which particular remarks are enabled. 
However LLVM-C API users can still provide callback function for diagnostic handler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313390 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 20:10:09 +00:00
Vivek Pandya
4abccff981 This reverts r313381
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313387 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 19:53:54 +00:00
Vivek Pandya
bb8204f26f This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352
It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with
command line options.
The diagnostic handler used to be callback now this patch adds a class
DiagnosticHandler. It has virtual method to provide custom diagnostic handler
and methods to control which particular remarks are enabled. 
However LLVM-C API users can still provide callback function for diagnostic handler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313382 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 19:30:59 +00:00
Ilya Biryukov
653f60b6a3 Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
This reverts commit r313348.

Reason: it caused buildbot failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313352 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 10:15:00 +00:00
Dinar Temirbulatov
1dc1d99a0e [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:

void add1(int * __restrict dst, const int * __restrict src) {
  *dst++ = *src++;
  *dst++ = *src++ + 1;
  *dst++ = *src++ + 2;
  *dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.

Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D28907


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313348 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 06:56:39 +00:00
Dinar Temirbulatov
16aa13f7f2 [SLPVectorizer] Remove duplicated functionality code in initScheduleData function, NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313341 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 04:31:54 +00:00
Alon Kom
dde48e1948 [LV] Fix maximum legal VF calculation
This patch fixes pr34283, which exposed that the computation of
maximum legal width for vectorization was wrong, because it relied
on MaxInterleaveFactor to obtain the maximum stride used in the loop,
however not all strided accesses in the loop have an interleave-group
associated with them.
Instead of recording the maximum stride in the loop, which can be over
conservative (e.g. if the access with the maximum stride is not involved
in the dependence limitation), this patch tracks the actual maximum legal
width imposed by accesses that are involved in dependencies.

Differential Revision: https://reviews.llvm.org/D37507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313237 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 07:40:02 +00:00
Dinar Temirbulatov
e2d706b202 [SLPVectorizer] Prefer auto over explicit type for VL0, NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313228 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 04:28:35 +00:00
Anna Thomas
fb0da33fc1 [LV] Avoid computing the register usage for default VF. NFC
These are changes to reduce redundant computations when calculating a
feasible vectorization factor:
1. early return when target has no vector registers
2. don't compute register usage for the default VF.

Suggested during review for D37702.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313176 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 19:35:45 +00:00
Ayal Zaks
3c0fd8c54b [LV] Fix PR34523 - avoid generating redundant selects
When converting a PHI into a series of 'select' instructions to combine the
incoming values together according their edge masks, initialize the first
value to the incoming value In0 of the first predecessor, instead of
generating a redundant assignment 'select(Cond[0], In0, In0)'. The latter
fails when the Cond[0] mask is null, representing a full mask, which can
happen only when there's a single incoming value.

No functional changes intended nor expected other than surviving null Cond[0]'s.

This fix follows D35725, which introduced using null to represent full masks.

Differential Revision: https://reviews.llvm.org/D37619


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313119 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 06:28:37 +00:00
Anna Thomas
2691122302 [LV] Clamp the VF to the trip count
Summary:
When the MaxVectorSize > ConstantTripCount, we should just clamp the
vectorization factor to be the ConstantTripCount.
This vectorizes loops where the TinyTripCountThreshold >= TripCount < MaxVF.

Earlier we were finding the maximum vector width, which could be greater than
the trip count itself. The Loop vectorizer does all the work for generating a
vectorizable loop, but in the end we would always choose the scalar loop (since
the VF > trip count). This allows us to choose the VF keeping in mind the trip
count if available.

This is a fix on top of rL312472.

Reviewers: Ayal, zvi, hfinkel, dneilson

Reviewed by: Ayal

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37702

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313046 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-12 16:32:45 +00:00
Alexey Bataev
33aba127b4 [SLP] Fix for PHINode during horizontal reduction scanning, NFC.
Reduces number of loops during instructions analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313035 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-12 15:13:50 +00:00
Alexey Bataev
c2b919492b [SLP] Fix buildbots, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312853 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-09 02:08:45 +00:00
Dinar Temirbulatov
996c1871f5 [SLPVectorizer] Add struct InstructionsState that holds information about analysis of vector to be vectorized.
Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D37212


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312802 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 17:08:17 +00:00
Alexey Bataev
c95cfa7758 [SLP] Fix the warning about paths not returning the value, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312793 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 14:32:20 +00:00
Alexey Bataev
4fcc7e8528 [SLP] Support for horizontal min/max reduction.
SLP vectorizer supports horizontal reductions for Add/FAdd binary
operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for
binary operation reductions and getMinMaxReductionCost() for min/max
reductions.
Patch fixes PR26956.

Differential revision: https://reviews.llvm.org/D27846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 13:49:36 +00:00
Zvi Rackover
3438d07f09 LoopVectorize: MaxVF should not be larger than the loop trip count
Summary:
Improve how MaxVF is computed while taking into account that MaxVF should not be larger than the loop's trip count.

Other than saving on compile-time by pruning the possible MaxVF candidates, this patch fixes pr34438 which exposed the following flow:
1. Short trip count identified -> Don't bail out, set OptForSize:=True to avoid tail-loop and runtime checks.
2. Compute MaxVF returned 16 on a target supporting AVX512.
3. OptForSize -> choose VF:=MaxVF.
4. Bail out because TripCount = 8, VF = 16, TripCount % VF !=0 means we need a tail loop.

With this patch step 2. will choose MaxVF=8 based on TripCount.

Reviewers: Ayal, dorit, mkuper, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D37425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312472 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 08:35:13 +00:00
Benjamin Kramer
6bf02adee0 [LoopVectorize] Turn static DenseSet into switch.
LLVM transforms this into a bit test which is a lot faster and smaller.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312417 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-02 16:41:55 +00:00
Eugene Zelenko
cecd8f18e2 [Analysis, Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312383 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 21:37:29 +00:00
Manoj Gupta
5d292d77f5 [LoopVectorizer] Use two step casting for float to pointer types.
Summary:
LoopVectorizer is creating casts between vec<ptr> and vec<float> types
on ARM when compiling OpenCV. Since, tIs is illegal to directly cast a
floating point type to a pointer type even if the types have same size
causing a crash. Fix the crash using a two-step casting by bitcasting
to integer and integer to pointer/float.
Fixes PR33804.

Reviewers: mkuper, Ayal, dlj, rengolin, srhines

Reviewed By: rengolin

Subscribers: aemerson, kristof.beyls, mkazantsev, Meinersbur, rengolin, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312331 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 15:36:00 +00:00
Dinar Temirbulatov
7e05f0efc2 [SLPVectorizer] Move out Entry->NeedToGather check and assert of inner loop as invariant, NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 14:10:07 +00:00
Sanjay Patel
1bf0915d7a [Instruction] add moveAfter() convenience function; NFCI
As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(),
but implemented using moveBefore() for symmetry/efficiency.

Differential Revision: https://reviews.llvm.org/D37239


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312001 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-29 14:07:48 +00:00
Ayal Zaks
a183d6cf2a [LV] Fix PR34248 - recommit D32871 after revert r311304
Original commit r311077 of D32871 was reverted in r311304 due to failures
reported in PR34248.

This recommit fixes PR34248 by restricting the packing of predicated scalars
into vectors only when vectorizing, avoiding doing so when unrolling w/o
vectorizing. Added a test derived from the reproducer of PR34248.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311849 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 12:55:46 +00:00
Chandler Carruth
3f7e6da696 Revert r311077: [LV] Using VPlan ...
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311304 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 23:17:11 +00:00
Aditya Kumar
70fb4705b4 [Loop Vectorize] Added a separate metadata
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311281 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 10:32:41 +00:00
Chandler Carruth
33ba3ea4de [SLP] Fix an unused variable warning in non-asserts builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311227 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 05:06:23 +00:00
Dinar Temirbulatov
484d59e444 [SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311223 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 03:15:07 +00:00
Dinar Temirbulatov
ef0eca1bd9 [SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.
Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311221 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 02:54:20 +00:00
Ayal Zaks
cd8f8f7fd4 [LV] Using VPlan to model the vectorized code and drive its transformation
VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
patch introduces the VPlan model into LV and uses it to represent the vectorized
code and drive the generation of vectorized IR.

In this patch VPlan models the vectorized loop body: the vectorized control-flow
is represented using VPlan's Hierarchical CFG, with predication refactored from
being a post-vectorization-step into a vectorization planning step modeling
if-then VPRegionBlocks, and generating code inline with non-predicated code. The
vectorized code within each VPBasicBlock is represented as a sequence of
Recipes, each responsible for modelling and generating a sequence of IR
instructions. To keep the size of this commit manageable the Recipes in this
patch are coarse-grained and capture large chunks of LV's code-generation logic.
The constructed VPlans are dumped in dot format under -debug.

This commit retains current vectorizer output, except for minor instruction
reorderings; see associated modifications to lit tests.

For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
and its references.

Authors: Gil Rapaport and Ayal Zaks

Differential Revision: https://reviews.llvm.org/D32871


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311077 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 09:29:59 +00:00
Rui Ueyama
979a8cab62 Fix -Wunused-lambda-capture for Release build.
`I` and `this` are used only in assert or DEBUG, so they are unused
in Release build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310934 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-15 17:39:35 +00:00
Ayal Zaks
745921f87f [LV] Minor savings to Sink casts to unravel first order recurrence
Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it
is not needed.

Differential Revision: https://reviews.llvm.org/D36408


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310910 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-15 08:32:59 +00:00
Dinar Temirbulatov
822b8dab31 [SLPVectorizer] Replace VL[0] to VL0 with assert, add propagateIRFlags extra parameter VL0,
replace E->Scalars[0] to VL0, NFCI.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-15 00:31:49 +00:00
Dinar Temirbulatov
86316b8f46 [SLPVectorizer] Schedule bundle with different opcodes.
This change let us schedule a bundle with different opcodes in it, for example : [ load, add, add, add ]

Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36518


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310847 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-14 15:40:16 +00:00
Anna Thomas
0428548839 [LoopVectorize] Fix assertion failure in Fcmp vectorization
Summary:
When vectorizing fcmps we can trip on incorrect cast assertion when setting the
FastMathFlags after generating the vectorized FCmp.
This can happen if the FCmp can be folded to true or false directly. The fix
here is to set the FastMathFlag using the FastMathFlagBuilder *before* creating
the FCmp Instruction. This is what's done by other optimizations such as
InstCombine.
Added a test case which trips on cast assertion without this patch.

Reviewers: Ayal, mssimpso, mkuper, gilr

Reviewed by: Ayal, mssimpso

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D36244

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310389 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-08 18:07:44 +00:00
Alexey Bataev
a64c8c9af1 [SLP] General improvements of SLP vectorization process.
Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does:

1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts.
2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the
array. This array is processed only after the vectorization of the
first-after-these instructions key node is finished. Vectorization goes
in reverse order to try to vectorize as much code as possible.

Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon

Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits

Differential Revision: https://reviews.llvm.org/D29826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310260 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 15:25:49 +00:00
Alexey Bataev
20e83eb193 Revert "[SLP] General improvements of SLP vectorization process."
This reverts commit r310255.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310257 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 14:51:52 +00:00
Alexey Bataev
4362efa895 [SLP] General improvements of SLP vectorization process.
Summary:
Patch tries to improve two-pass vectorization analysis, existing in SLP vectorizer. What it does:
1. Defines key nodes, that are the vectorization roots. Previously vectorization started if StoreInst or ReturnInst is found. For now, the vectorization started for all Instructions with no users and void types (Terminators, StoreInst) + CallInsts.
2. CmpInsts, InsertElementInsts and InsertValueInsts are stored in the array. This array is processed only after the vectorization of the first-after-these instructions key node is finished. Vectorization goes in reverse order to try to vectorize as much code as possible.

Reviewers: mzolotukhin, Ayal, mkuper, gilr, hfinkel, RKSimon

Subscribers: ashahid, anemet, RKSimon, mssimpso, llvm-commits

Differential Revision: https://reviews.llvm.org/D29826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310255 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 14:03:17 +00:00
Dinar Temirbulatov
f24c4662b3 [SLPVectorizer] Add extra parameter to setInsertPointAfterBundle to handle different opcodes, NFCI.
Differential Revision: https://reviews.llvm.org/D35769


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310183 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-05 18:43:52 +00:00
Matt Arsenault
c8df92092d LV: Don't insert runtime ptr checks on divergent targets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309890 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 21:43:08 +00:00
Alexey Bataev
340067cec4 [SLPVectorizer] Generalize interface of functions, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 14:38:07 +00:00
Alexey Bataev
b51029d1f1 [SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements
Summary:
Currently most of the time vectors of extractelement instructions are
treated as scalars that must be gathered into vectors. But in some
cases, like when we have extractelement instructions from single vector
with different constant indeces or from 2 vectors of the same size, we
can treat this operations as shuffle of a single vector or blending of 2
vectors.
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
  %x0 = extractelement <2 x i8> %x, i32 0
  %y1 = extractelement <2 x i8> %y, i32 1
  %x0x0 = mul i8 %x0, %x0
  %y1y1 = mul i8 %y1, %y1
  %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0
  %ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1
  ret <2 x i8> %ins2
}
```
can be converted to something like
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
  %1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3>
  %2 = mul <2 x i8> %1, %1
  ret <2 x i8> %2
}
```
Currently this type of conversion is considered as high cost
transformation.

Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon

Subscribers: ashahid, RKSimon, spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D30200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309812 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 13:25:26 +00:00
Davide Italiano
4181790cb5 [SLPVectorizer] Unbreak the build with -Werror.
GCC was complaining about `&&` within `||` without explicit
parentheses. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309606 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 19:14:19 +00:00
Alexey Bataev
837b97fb9a [SLP] Initial rework for min/max horizontal reduction vectorization, NFC.
Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations.

Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D29402

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309566 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 14:36:05 +00:00