1586 Commits

Author SHA1 Message Date
Anna Thomas
edafc389f1 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343028 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-25 20:57:20 +00:00
Warren Ristow
99ea666c23 [Loop Vectorizer] Abandon vectorization when no integer IV found
Support for vectorizing loops with secondary floating-point induction
variables was added in r276554.  A primary integer IV is still required
for vectorization to be done.  If an FP IV was found, but no integer IV
was found at all (primary or secondary), the attempt to vectorize still
went forward, causing a compiler-crash.  This change abandons that
attempt when no integer IV is found.  (Vectorizing FP-only cases like
this, rather than bailing out, is discussed as possible future work
in D52327.)

See PR38800 for more information.

Differential Revision: https://reviews.llvm.org/D52327


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342786 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-21 23:03:50 +00:00
Matt Arsenault
9ae72b778a LSV: Fix adjust alloca alignment trick for AMDGPU
This was checking the hardcoded address space 0 for the stack.
Additionally, this should be checking for legality with
the adjusted alignment, so defer the alignment check.

Also try to split if the unaligned access isn't allowed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342442 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-18 02:05:44 +00:00
Hideki Saito
5d8f7dea91 Fix for the buildbot failure http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/23635
from the commit (r342197) of https://reviews.llvm.org/D50820.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342201 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-14 02:02:57 +00:00
Hideki Saito
a858f4fe0b [VPlan] Implement initial vector code generation support for simple outer loops.
Summary:
[VPlan] Implement vector code generation support for simple outer loops.

Context: Patch Series #1 for outer loop vectorization support in LV  using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
                                                          
This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following:

  - force vector code generation using explicit vectorize_width

  - add conservative early returns in cost model and other places for VPlanNativePath

  - add code for setting up outer loop inductions 

  - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches

  - support for generating uniform inner branches

We plan to add a handful C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. 

Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal

Reviewed By: fhahn, hsaito

Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D50820

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342197 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-14 00:36:00 +00:00
Florian Hahn
cf397e4d39 [LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC)
Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use
them for VPlan outside LoopVectorize.cpp

Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00

Reviewed By: rengolin, xbolva00

Differential Revision: https://reviews.llvm.org/D49488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342027 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-12 08:01:57 +00:00
Vikram TV
dfb80c8b79 Move a transformation routine from LoopUtils to LoopVectorize.
Summary:
Move InductionDescriptor::transform() routine from LoopUtils to its only uses in LoopVectorize.cpp.
Specifically, the function is renamed as InnerLoopVectorizer::emitTransformedIndex().

This is a child to D51153.

Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341776 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-10 06:16:44 +00:00
Vikram TV
5d1d358120 Move createMinMaxOp() out of RecurrenceDescriptor.
Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51838

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341773 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-10 05:05:08 +00:00
Anna Thomas
78b2eddfbc [LV] Fix code gen for conditionally executed loads and stores
Fix a latent bug in loop vectorizer which generates incorrect code for
memory accesses that are executed conditionally. As pointed in review,
this bug definitely affects uniform loads and may affect conditional
stores that should have turned into scatters as well).

The code gen for conditionally executed uniform loads on architectures
that support masked gather instructions is broken.

Without this patch, we were unconditionally executing the *conditional*
load in the vectorized version.

This patch does the following:
1. Uniform conditional loads on architectures with gather support will
   have correct code generated. In particular, the cost model
   (setCostBasedWideningDecision) is fixed.
2. For the recipes which are handled after the widening decision is set,
   we use the isScalarWithPredication(I, VF) form which is added in the
   patch.

3. Fix the vectorization cost model for scalarization
   (getMemInstScalarizationCost): implement and use isPredicatedInst to
   identify *all* predicated instructions, not just scalar+predicated. So,
   now the cost for scalarization will be increased for maskedloads/stores
   and gather/scatter operations. In short, we should be choosing the
   gather/scatter in place of scalarization on archs where it is
   profitable.
4. We needed to weaken the assert in useEmulatedMaskMemRefHack.

Reviewers: Ayal, hsaito, mkuper

Differential Revision: https://reviews.llvm.org/D51313

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341673 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-07 15:53:48 +00:00
Anna Thomas
a989ec75d4 [LV] First order recurrence phis should not be treated as uniform
This is fix for PR38786.
First order recurrence phis were incorrectly treated as uniform,
which caused them to be vectorized as uniform instructions.

Patch by Ayal Zaks and Orivej Desh!

Reviewed by: Anna

Differential Revision: https://reviews.llvm.org/D51639

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341416 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-04 22:12:23 +00:00
Matt Arsenault
bde9b3177b SLPVectorizer: Fix assert with different sized address spaces
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341215 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-31 14:34:53 +00:00
David Bolvansky
8bcab14697 [LoopVectorize][NFCI] Use find instead of count
Summary:
Avoid "count" if possible -> use "find" to check for the existence of keys.

Passed llvm test suite.

Reviewers: fhahn, dcaballe, mkuper, rengolin

Reviewed By: fhahn

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340563 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-23 18:34:58 +00:00
Anna Thomas
099d4ece4f [LV] Vectorize loops where non-phi instructions used outside loop
Summary:
Follow up change to rL339703, where we now vectorize loops with non-phi
instructions used outside the loop. Note that the cyclic dependency
identification occurs when identifying reduction/induction vars.

We also need to identify that we do not allow users where the PSCEV information
within and outside the loop are different. This was the fix added in rL307837
for PR33706.

Reviewers: Ayal, mkuper, fhahn

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D50778

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340278 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-21 14:40:27 +00:00
Anna Thomas
a3fdc1d293 NFC: Clarify comment in loop vectorization legality
Clarifying the comment about PSCEV and external IV users by referencing
the bug in question.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339722 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-14 20:25:13 +00:00
Anna Thomas
1464f16217 [LV] Teach about non header phis that have uses outside the loop
Summary:
This patch teaches the loop vectorizer to vectorize loops with non
header phis that have have outside uses.  This is because the iteration
dependence distance for these phis can be widened upto VF (similar to
how we do for induction/reduction) if they do not have a cyclic
dependence with header phis. When identifying reduction/induction/first
order recurrence header phis, we already identify if there are any cyclic
dependencies that prevents vectorization.

The vectorizer is taught to extract the last element from the vectorized
phi and update the scalar loop exit block phi to contain this extracted
element from the vector loop.

This patch can be extended to vectorize loops where instructions other
than phis have outside uses.

Reviewers: Ayal, mkuper, mssimpso, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339703 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-14 18:22:19 +00:00
Alexey Bataev
0b9e7ad06c [SLP] Fix insert point for reused extract instructions.
Summary:
Reworked the previously committed patch to insert shuffles for reused
extract element instructions in the correct position. Previous logic was
incorrect, and might lead to the crash with PHIs and EH instructions.

Reviewers: efriedma, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339166 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-07 19:21:05 +00:00
Alexey Bataev
1ae9ed698a [SLP] Fix PR38339: Instruction does not dominate all uses!
Summary:
If the ExtractElement instructions can be optimized out during the
vectorization and we need to reshuffle the parent vector, this
ShuffleInstruction may be inserted in the wrong place causing compiler
to produce incorrect code.

Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49928

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338380 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-31 14:02:43 +00:00
Diego Caballero
b2970fad9b [VPlan] Introduce VPLoopInfo analysis.
The patch introduces loop analysis (VPLoopInfo/VPLoop) for VPBlockBases.
This analysis will be necessary to perform some H-CFG transformations and
detect and introduce regions representing a loop in the H-CFG.

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn 

Differential Revision: https://reviews.llvm.org/D48816



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338346 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-31 01:57:29 +00:00
Diego Caballero
00f1a051ef [VPlan] Introduce VPlan-based dominator analysis.
The patch introduces dominator analysis for VPBlockBases and extend
VPlan's GraphTraits specialization with the required interfaces. Dominator
analysis will be necessary to perform some H-CFG transformations and
to introduce VPLoopInfo (LoopInfo analysis on top of the VPlan representation).

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48815



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338310 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 21:33:31 +00:00
Fangrui Song
af7b1832a0 Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338293 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-30 19:41:25 +00:00
Anastasis Grammenos
e4b8312715 Revert "[LV][DebugInfo] Set DL to the middle block Icmp instruction"
This reverts commit r338106.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338109 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-27 08:22:54 +00:00
Anastasis Grammenos
951194fd1c [LV][DebugInfo] Set DL to the middle block Icmp instruction
Reviewers: hsaito

Differential Revision: https://reviews.llvm.org/D49746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338106 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-27 07:12:44 +00:00
Fangrui Song
77618f2a51 [LoadStoreVectorizer] Use const reference
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337992 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-26 01:11:36 +00:00
Roman Tereshin
406c0440c5 [LSV] Look through selects for consecutive addresses
In some cases LSV sees (load/store _ (select _ <pointer expression>
<pointer expression>)) patterns in input IR, often due to sinking and
other forms of CFG simplification, sometimes interspersed with
bitcasts and all-constant-indices GEPs. With this
patch`areConsecutivePointers` method would attempt to handle select
instructions. This leads to an increased number of successful
vectorizations.

Technically, select instructions could appear in index arithmetic as
well, however, we don't see those in our test suites / benchmarks.
Also, there is a lot more freedom in IR shapes computing integral
indices in general than in what's common in pointer computations, and
it appears that it's quite unreliable to do anything short of making
select instructions first class citizens of Scalar Evolution, which
for the purposes of this patch is most definitely an overkill.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D49428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337965 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-25 21:33:00 +00:00
Hideki Saito
222b8becdd [LV] Fix for PR38110, LV encountered llvm_unreachable()
Summary: truncateToMinimalBitWidths() doesn't handle all Instructions and the worst case is compiler crash via llvm_unreachable(). Fix is to add a case to handle PHINode and changed the worst case to NO-OP (from compiler crash).

Reviewers: sbaranga, mssimpso, hsaito

Reviewed By: hsaito

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49461

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337861 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-24 22:30:31 +00:00
Roman Tereshin
6bb56ed117 Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reapplies commit r337489 reverted by r337541
Additionally, this commit contains a speculative fix to the issue reported in r337541
(the report does not contain an actionable reproducer, just a stack trace)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337606 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 20:10:04 +00:00
Sam McCall
1642979851 Revert "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reverts commit r337489.
It causes asserts to fire in some TensorFlow tests, e.g.
tensorflow/compiler/tests/gather_test.py on GPU.

Example stack trace:
Start test case: GatherTest.testHigherRank
assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits"
    @     0x5559446ebe10  __assert_fail
    @     0x55593ef32f5e  llvm::APInt::trunc()
    @     0x55593d78f86e  (anonymous namespace)::Vectorizer::lookThroughComplexAddresses()
    @     0x55593d78f2bc  (anonymous namespace)::Vectorizer::areConsecutivePointers()
    @     0x55593d78d128  (anonymous namespace)::Vectorizer::isConsecutiveAccess()
    @     0x55593d78c926  (anonymous namespace)::Vectorizer::vectorizeInstructions()
    @     0x55593d78c221  (anonymous namespace)::Vectorizer::vectorizeChains()
    @     0x55593d78b948  (anonymous namespace)::Vectorizer::run()
    @     0x55593d78b725  (anonymous namespace)::LoadStoreVectorizer::runOnFunction()
    @     0x55593edf4b17  llvm::FPPassManager::runOnFunction()
    @     0x55593edf4e55  llvm::FPPassManager::runOnModule()
    @     0x55593edf563c  (anonymous namespace)::MPPassManager::runOnModule()
    @     0x55593edf5137  llvm::legacy::PassManagerImpl::run()
    @     0x55593edf5b71  llvm::legacy::PassManager::run()
    @     0x55593ced250d  xla::gpu::IrDumpingPassManager::run()
    @     0x55593ced5033  xla::gpu::(anonymous namespace)::EmitModuleToPTX()
    @     0x55593ced40ba  xla::gpu::(anonymous namespace)::CompileModuleToPtx()
    @     0x55593ced33d0  xla::gpu::CompileToPtx()
    @     0x55593b26b2a2  xla::gpu::NVPTXCompiler::RunBackend()
    @     0x55593b21f973  xla::Service::BuildExecutable()
    @     0x555938f44e64  xla::LocalService::CompileExecutable()
    @     0x555938f30a85  xla::LocalClient::Compile()
    @     0x555938de3c29  tensorflow::XlaCompilationCache::BuildExecutable()
    @     0x555938de4e9e  tensorflow::XlaCompilationCache::CompileImpl()
    @     0x555938de3da5  tensorflow::XlaCompilationCache::Compile()
    @     0x555938c5d962  tensorflow::XlaLocalLaunchBase::Compute()
    @     0x555938c68151  tensorflow::XlaDevice::Compute()
    @     0x55593f389e1f  tensorflow::(anonymous namespace)::ExecutorState::Process()
    @     0x55593f38a625  tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()()
*** SIGABRT received by PID 7798 (TID 7837) from PID 7798; ***

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337541 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 12:03:00 +00:00
Roman Tereshin
b2f9f92413 [LSV] Refactoring + supporting bitcasts to a type of different size
This is mostly a preparation work for adding a limited support for
select instructions. It proved to be difficult to do due to size and
irregularity of Vectorizer::isConsecutiveAccess, this is fixed here I
believe.

It also turned out that these changes make it simpler to finish one of
the TODOs and fix a number of other small issues, namely:

1. Looking through bitcasts to a type of a different size (requires
careful tracking of the original load/store size and some math
converting sizes in bytes to expected differences in indices of GEPs).

2. Reusing partial analysis of pointers done by first attempt in proving
them consecutive instead of starting from scratch. This added limited
support for nested GEPs co-existing with difficult sext/zext
instructions. This also required a careful handling of negative
differences between constant parts of offsets.

3. Handing a case where the first pointer index is not an add, but
something else (a function parameter for instance).

I observe an increased number of successful vectorizations on a large
set of shader programs. Only few shaders are affected, but those that
are affected sport >5% less loads and stores than before the patch.

Reviewed By: rampitec

Differential-Revision: https://reviews.llvm.org/D49342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337489 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 19:42:43 +00:00
Farhana Aleen
ae72a8c570 [LoadStoreVectorizer] Use getMinusScev() to compute the distance between two pointers.
Summary: Currently, isConsecutiveAccess() detects two pointers(PtrA and PtrB) as consecutive by
         comparing PtrB with BaseDelta+PtrA. This works when both pointers are factorized or
         both of them are not factorized. But isConsecutiveAccess() fails if one of the
         pointers is factorized but the other one is not.

         Here is an example:
         PtrA = 4 * (A + B)
         PtrB = 4 + 4A + 4B

         This patch uses getMinusSCEV() to compute the distance between two pointers.
         getMinusSCEV() allows combining the expressions and computing the simplified distance.

Author: FarhanaAleen

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D49516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337471 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 16:50:27 +00:00
Simon Pilgrim
1a0909c7b2 [SLPVectorizer] Avoid duplicate scalar cost calculations in BoUpSLP::getEntryCost. NFCI.
Pulled out from D49225, we have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337390 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 13:53:55 +00:00
Simon Pilgrim
de720479bb [SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191)
TTI::getMinMaxReductionCost typically can't handle pointer types - until this is changed its better to limit horizontal reduction to integer/float vector types only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337280 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 13:43:33 +00:00
Simon Pilgrim
1e086c7b69 [SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED-2)
We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Reapplied with fix to only accept 2 different casts if they come from the same source type (PR38154).

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336989 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 11:09:52 +00:00
Martin Storsjo
54919303bf Revert "[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)"
This reverts commit r336812, which broke compilation of a number
of projects, see PR38154.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336949 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 21:33:42 +00:00
Simon Pilgrim
33f4d61062 [SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)
We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Reapplied with fix to only accept 2 different casts if they come from the same source type.

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336812 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 15:05:10 +00:00
Simon Pilgrim
191ae9ef3c Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for cast instructions.
Reverting due to buildbot failures

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336806 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 14:08:16 +00:00
Simon Pilgrim
71b0da15d2 [SLPVectorizer] Add initial alternate opcode support for cast instructions.
We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336804 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 13:34:09 +00:00
Anastasis Grammenos
c1175857e2 [DebugInfo][LoopVectorize] Preserve DL in induction PHI and Add
Differential Revision: https://reviews.llvm.org/D48968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336667 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 13:29:50 +00:00
Diego Caballero
dc4af2458e [VPlan][LV] Introduce condition bit in VPBlockBase
This patch introduces a VPValue in VPBlockBase to represent the condition
bit that is used as successor selector when a block has multiple successors.
This information wasn't necessary until now, when we are about to introduce
outer loop vectorization support in VPlan code gen.

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48814



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336554 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 15:57:09 +00:00
Simon Pilgrim
7029d5a52f [SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI.
This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism.

Differential Revision: https://reviews.llvm.org/D48945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336344 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 12:30:44 +00:00
Simon Pilgrim
79d7b5cd91 Fix some irregular whitespace/indentation. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336291 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 17:24:05 +00:00
Anastasis Grammenos
fb31c6481f [DebugInfo][LoopVectorize] Preserve DL in generated phi instruction
When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.

Differential Revision: https://reviews.llvm.org/D48769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336256 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 10:16:55 +00:00
Farhana Aleen
13f7859c20 [SLP] Recognize min/max pattern using instructions producing same values.
Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization.

         %1 = extractelement <2 x i32> %a, i32 0
         %2 = extractelement <2 x i32> %a, i32 1
         %cond = icmp sgt i32 %1, %2
         %3 = extractelement <2 x i32> %a, i32 0
         %4 = extractelement <2 x i32> %a, i32 1
         %select = select i1 %cond, i32 %3, i32 %4

Author: FarhanaAleen

Reviewed By: ABataev, RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D47608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:55:31 +00:00
Simon Pilgrim
a40b37f909 [SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost
This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336102 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 13:41:29 +00:00
Simon Pilgrim
386f15c93a [SLPVectorizer] Fix alternate opcode + shuffle cost function to correct handle SK_Select patterns.
We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case.

This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336095 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 11:28:01 +00:00
Simon Pilgrim
a43dcd4394 [SLPVectorizer] Only Alternate opcodes use ShuffleVector cases for getEntryCost/vectorizeTree. NFCI.
Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336092 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 10:54:19 +00:00
Simon Pilgrim
a634a07771 [SLPVectorizer] Call InstructionsState.isOpcodeOrAlt with Instruction instead of an opcode. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336069 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 20:22:46 +00:00
Simon Pilgrim
b1dbeaa9ce [SLPVectorizer] Replace sameOpcodeOrAlt with InstructionsState.isOpcodeOrAlt helper. NFCI.
This is a basic step towards matching more general instructions types than just opcodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336068 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 20:07:30 +00:00
Simon Pilgrim
4c7a6ba2a4 [SLPVectorizer] Use InstructionsState Op/Alt opcodes directly. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336063 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-01 13:41:58 +00:00
Simon Pilgrim
40be0055ae [SLPVectorizer] Recognise non uniform power of 2 constants
Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them.

As SLP works with arrays of values I don't think we can easily use the pattern match helpers here.

Differential Revision: https://reviews.llvm.org/D48214

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335621 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 16:20:16 +00:00
Simon Pilgrim
5a40cf8639 [SLPVectorizer] Support alternate opcodes in tryToVectorizeList
Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree.

NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc.

Differential Revision: https://reviews.llvm.org/D48488

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335364 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-22 16:37:34 +00:00