archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Anna Thomas	edafc389f1	[LV][LAA] Vectorize loop invariant values stored into loop invariant address Summary: We are overly conservative in loop vectorizer with respect to stores to loop invariant addresses. More details in https://bugs.llvm.org/show_bug.cgi?id=38546 This is the first part of the fix where we start with vectorizing loop invariant values to loop invariant addresses. This also includes changes to ORE for stores to invariant address. Reviewers: anemet, Ayal, mkuper, mssimpso Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50665 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343028 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-25 20:57:20 +00:00
Warren Ristow	99ea666c23	[Loop Vectorizer] Abandon vectorization when no integer IV found Support for vectorizing loops with secondary floating-point induction variables was added in r276554. A primary integer IV is still required for vectorization to be done. If an FP IV was found, but no integer IV was found at all (primary or secondary), the attempt to vectorize still went forward, causing a compiler-crash. This change abandons that attempt when no integer IV is found. (Vectorizing FP-only cases like this, rather than bailing out, is discussed as possible future work in D52327.) See PR38800 for more information. Differential Revision: https://reviews.llvm.org/D52327 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342786 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-21 23:03:50 +00:00
Matt Arsenault	9ae72b778a	LSV: Fix adjust alloca alignment trick for AMDGPU This was checking the hardcoded address space 0 for the stack. Additionally, this should be checking for legality with the adjusted alignment, so defer the alignment check. Also try to split if the unaligned access isn't allowed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342442 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-18 02:05:44 +00:00
Hideki Saito	5d8f7dea91	Fix for the buildbot failure http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/23635 from the commit (r342197) of https://reviews.llvm.org/D50820. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342201 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-14 02:02:57 +00:00
Hideki Saito	a858f4fe0b	[VPlan] Implement initial vector code generation support for simple outer loops. Summary: [VPlan] Implement vector code generation support for simple outer loops. Context: Patch Series #1 for outer loop vectorization support in LV using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following: - force vector code generation using explicit vectorize_width - add conservative early returns in cost model and other places for VPlanNativePath - add code for setting up outer loop inductions - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches - support for generating uniform inner branches We plan to add a handful of C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal Reviewed By: fhahn, hsaito Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D50820 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342197 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-14 00:36:00 +00:00
Florian Hahn	cf397e4d39	[LV] Move InterleaveGroup and InterleavedAccessInfo to VectorUtils.h (NFC) Move the 2 classes out of LoopVectorize.cpp to make it easier to re-use them for VPlan outside LoopVectorize.cpp Reviewers: Ayal, mssimpso, rengolin, dcaballe, mkuper, hsaito, hfinkel, xbolva00 Reviewed By: rengolin, xbolva00 Differential Revision: https://reviews.llvm.org/D49488 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342027 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-12 08:01:57 +00:00
Vikram TV	dfb80c8b79	Move a transformation routine from LoopUtils to LoopVectorize. Summary: Move InductionDescriptor::transform() routine from LoopUtils to its only uses in LoopVectorize.cpp. Specifically, the function is renamed as InnerLoopVectorizer::emitTransformedIndex(). This is a child to D51153. Reviewers: dmgreen, llvm-commits Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D51837 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341776 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-10 06:16:44 +00:00
Vikram TV	5d1d358120	Move createMinMaxOp() out of RecurrenceDescriptor. Reviewers: dmgreen, llvm-commits Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D51838 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341773 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-10 05:05:08 +00:00
Anna Thomas	78b2eddfbc	[LV] Fix code gen for conditionally executed loads and stores Fix a latent bug in loop vectorizer which generates incorrect code for memory accesses that are executed conditionally. As pointed out in review, this bug definitely affects uniform loads and may also affect conditional stores that should have turned into scatters. The code gen for conditionally executed uniform loads on architectures that support masked gather instructions is broken. Without this patch, we were unconditionally executing the conditional load in the vectorized version. This patch does the following: 1. Uniform conditional loads on architectures with gather support will have correct code generated. In particular, the cost model (setCostBasedWideningDecision) is fixed. 2. For the recipes which are handled after the widening decision is set, we use the isScalarWithPredication(I, VF) form which is added in the patch. 3. Fix the vectorization cost model for scalarization (getMemInstScalarizationCost): implement and use isPredicatedInst to identify all predicated instructions, not just scalar+predicated. So, now the cost for scalarization will be increased for maskedloads/stores and gather/scatter operations. In short, we should be choosing the gather/scatter in place of scalarization on archs where it is profitable. 4. We needed to weaken the assert in useEmulatedMaskMemRefHack. Reviewers: Ayal, hsaito, mkuper Differential Revision: https://reviews.llvm.org/D51313 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341673 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-07 15:53:48 +00:00
Anna Thomas	a989ec75d4	[LV] First order recurrence phis should not be treated as uniform This is a fix for PR38786. First order recurrence phis were incorrectly treated as uniform, which caused them to be vectorized as uniform instructions. Patch by Ayal Zaks and Orivej Desh! Reviewed by: Anna Differential Revision: https://reviews.llvm.org/D51639 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341416 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-04 22:12:23 +00:00
Matt Arsenault	bde9b3177b	SLPVectorizer: Fix assert with different sized address spaces git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341215 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-31 14:34:53 +00:00
David Bolvansky	8bcab14697	[LoopVectorize][NFCI] Use find instead of count Summary: Avoid "count" if possible -> use "find" to check for the existence of keys. Passed llvm test suite. Reviewers: fhahn, dcaballe, mkuper, rengolin Reviewed By: fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51054 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340563 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-23 18:34:58 +00:00
Anna Thomas	099d4ece4f	[LV] Vectorize loops where non-phi instructions used outside loop Summary: Follow up change to rL339703, where we now vectorize loops with non-phi instructions used outside the loop. Note that the cyclic dependency identification occurs when identifying reduction/induction vars. We also need to identify that we do not allow users where the PSCEV information within and outside the loop are different. This was the fix added in rL307837 for PR33706. Reviewers: Ayal, mkuper, fhahn Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D50778 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340278 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-21 14:40:27 +00:00
Anna Thomas	a3fdc1d293	NFC: Clarify comment in loop vectorization legality Clarifying the comment about PSCEV and external IV users by referencing the bug in question. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339722 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-14 20:25:13 +00:00
Anna Thomas	1464f16217	[LV] Teach about non header phis that have uses outside the loop Summary: This patch teaches the loop vectorizer to vectorize loops with non header phis that have outside uses. This is because the iteration dependence distance for these phis can be widened up to VF (similar to how we do for induction/reduction) if they do not have a cyclic dependence with header phis. When identifying reduction/induction/first order recurrence header phis, we already identify if there are any cyclic dependencies that prevent vectorization. The vectorizer is taught to extract the last element from the vectorized phi and update the scalar loop exit block phi to contain this extracted element from the vector loop. This patch can be extended to vectorize loops where instructions other than phis have outside uses. Reviewers: Ayal, mkuper, mssimpso, efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50579 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339703 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-14 18:22:19 +00:00
Alexey Bataev	0b9e7ad06c	[SLP] Fix insert point for reused extract instructions. Summary: Reworked the previously committed patch to insert shuffles for reused extract element instructions in the correct position. Previous logic was incorrect, and might lead to the crash with PHIs and EH instructions. Reviewers: efriedma, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50143 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339166 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-07 19:21:05 +00:00
Alexey Bataev	1ae9ed698a	[SLP] Fix PR38339: Instruction does not dominate all uses! Summary: If the ExtractElement instructions can be optimized out during the vectorization and we need to reshuffle the parent vector, this ShuffleInstruction may be inserted in the wrong place causing compiler to produce incorrect code. Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49928 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338380 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 14:02:43 +00:00
Diego Caballero	b2970fad9b	[VPlan] Introduce VPLoopInfo analysis. The patch introduces loop analysis (VPLoopInfo/VPLoop) for VPBlockBases. This analysis will be necessary to perform some H-CFG transformations and detect and introduce regions representing a loop in the H-CFG. Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48816 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338346 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-31 01:57:29 +00:00
Diego Caballero	00f1a051ef	[VPlan] Introduce VPlan-based dominator analysis. The patch introduces dominator analysis for VPBlockBases and extend VPlan's GraphTraits specialization with the required interfaces. Dominator analysis will be necessary to perform some H-CFG transformations and to introduce VPLoopInfo (LoopInfo analysis on top of the VPlan representation). Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48815 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338310 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-30 21:33:31 +00:00
Fangrui Song	af7b1832a0	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h} git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338293 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-30 19:41:25 +00:00
Anastasis Grammenos	e4b8312715	Revert "[LV][DebugInfo] Set DL to the middle block Icmp instruction" This reverts commit r338106. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338109 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-27 08:22:54 +00:00
Anastasis Grammenos	951194fd1c	[LV][DebugInfo] Set DL to the middle block Icmp instruction Reviewers: hsaito Differential Revision: https://reviews.llvm.org/D49746 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338106 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-27 07:12:44 +00:00
Fangrui Song	77618f2a51	[LoadStoreVectorizer] Use const reference git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337992 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-26 01:11:36 +00:00
Roman Tereshin	406c0440c5	[LSV] Look through selects for consecutive addresses In some cases LSV sees (load/store _ (select _ <pointer expression> <pointer expression>)) patterns in input IR, often due to sinking and other forms of CFG simplification, sometimes interspersed with bitcasts and all-constant-indices GEPs. With this patch, the `areConsecutivePointers` method attempts to handle select instructions. This leads to an increased number of successful vectorizations. Technically, select instructions could appear in index arithmetic as well, however, we don't see those in our test suites / benchmarks. Also, there is a lot more freedom in IR shapes computing integral indices in general than in what's common in pointer computations, and it appears that it's quite unreliable to do anything short of making select instructions first class citizens of Scalar Evolution, which for the purposes of this patch is most definitely an overkill. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D49428 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337965 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-25 21:33:00 +00:00
Hideki Saito	222b8becdd	[LV] Fix for PR38110, LV encountered llvm_unreachable() Summary: truncateToMinimalBitWidths() doesn't handle all Instructions and the worst case is a compiler crash via llvm_unreachable(). The fix is to add a case to handle PHINode and to change the worst case to a NO-OP (from a compiler crash). Reviewers: sbaranga, mssimpso, hsaito Reviewed By: hsaito Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49461 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337861 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-24 22:30:31 +00:00
Roman Tereshin	6bb56ed117	Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size" This reapplies commit r337489 reverted by r337541 Additionally, this commit contains a speculative fix to the issue reported in r337541 (the report does not contain an actionable reproducer, just a stack trace) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337606 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-20 20:10:04 +00:00
Sam McCall	1642979851	Revert "[LSV] Refactoring + supporting bitcasts to a type of different size" This reverts commit r337489. It causes asserts to fire in some TensorFlow tests, e.g. tensorflow/compiler/tests/gather_test.py on GPU. Example stack trace: Start test case: GatherTest.testHigherRank assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits" @ 0x5559446ebe10 __assert_fail @ 0x55593ef32f5e llvm::APInt::trunc() @ 0x55593d78f86e (anonymous namespace)::Vectorizer::lookThroughComplexAddresses() @ 0x55593d78f2bc (anonymous namespace)::Vectorizer::areConsecutivePointers() @ 0x55593d78d128 (anonymous namespace)::Vectorizer::isConsecutiveAccess() @ 0x55593d78c926 (anonymous namespace)::Vectorizer::vectorizeInstructions() @ 0x55593d78c221 (anonymous namespace)::Vectorizer::vectorizeChains() @ 0x55593d78b948 (anonymous namespace)::Vectorizer::run() @ 0x55593d78b725 (anonymous namespace)::LoadStoreVectorizer::runOnFunction() @ 0x55593edf4b17 llvm::FPPassManager::runOnFunction() @ 0x55593edf4e55 llvm::FPPassManager::runOnModule() @ 0x55593edf563c (anonymous namespace)::MPPassManager::runOnModule() @ 0x55593edf5137 llvm::legacy::PassManagerImpl::run() @ 0x55593edf5b71 llvm::legacy::PassManager::run() @ 0x55593ced250d xla::gpu::IrDumpingPassManager::run() @ 0x55593ced5033 xla::gpu::(anonymous namespace)::EmitModuleToPTX() @ 0x55593ced40ba xla::gpu::(anonymous namespace)::CompileModuleToPtx() @ 0x55593ced33d0 xla::gpu::CompileToPtx() @ 0x55593b26b2a2 xla::gpu::NVPTXCompiler::RunBackend() @ 0x55593b21f973 xla::Service::BuildExecutable() @ 0x555938f44e64 xla::LocalService::CompileExecutable() @ 0x555938f30a85 xla::LocalClient::Compile() @ 0x555938de3c29 tensorflow::XlaCompilationCache::BuildExecutable() @ 0x555938de4e9e tensorflow::XlaCompilationCache::CompileImpl() @ 0x555938de3da5 tensorflow::XlaCompilationCache::Compile() @ 0x555938c5d962 tensorflow::XlaLocalLaunchBase::Compute() @ 0x555938c68151 tensorflow::XlaDevice::Compute() @ 0x55593f389e1f tensorflow::(anonymous namespace)::ExecutorState::Process() @ 0x55593f38a625 tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()() * SIGABRT received by PID 7798 (TID 7837) from PID 7798; * git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337541 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-20 12:03:00 +00:00
Roman Tereshin	b2f9f92413	[LSV] Refactoring + supporting bitcasts to a type of different size This is mostly a preparation work for adding a limited support for select instructions. It proved to be difficult to do due to size and irregularity of Vectorizer::isConsecutiveAccess, this is fixed here I believe. It also turned out that these changes make it simpler to finish one of the TODOs and fix a number of other small issues, namely: 1. Looking through bitcasts to a type of a different size (requires careful tracking of the original load/store size and some math converting sizes in bytes to expected differences in indices of GEPs). 2. Reusing partial analysis of pointers done by first attempt in proving them consecutive instead of starting from scratch. This added limited support for nested GEPs co-existing with difficult sext/zext instructions. This also required a careful handling of negative differences between constant parts of offsets. 3. Handling a case where the first pointer index is not an add, but something else (a function parameter for instance). I observe an increased number of successful vectorizations on a large set of shader programs. Only a few shaders are affected, but those that are affected sport >5% fewer loads and stores than before the patch. Reviewed By: rampitec Differential-Revision: https://reviews.llvm.org/D49342 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337489 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-19 19:42:43 +00:00
Farhana Aleen	ae72a8c570	[LoadStoreVectorizer] Use getMinusSCEV() to compute the distance between two pointers. Summary: Currently, isConsecutiveAccess() detects two pointers (PtrA and PtrB) as consecutive by comparing PtrB with BaseDelta+PtrA. This works when both pointers are factorized or both of them are not factorized. But isConsecutiveAccess() fails if one of the pointers is factorized but the other one is not. Here is an example: PtrA = 4 * (A + B), PtrB = 4 + 4A + 4B. This patch uses getMinusSCEV() to compute the distance between two pointers. getMinusSCEV() allows combining the expressions and computing the simplified distance. Author: FarhanaAleen Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D49516 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337471 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-19 16:50:27 +00:00
Simon Pilgrim	1a0909c7b2	[SLPVectorizer] Avoid duplicate scalar cost calculations in BoUpSLP::getEntryCost. NFCI. Pulled out from D49225, we have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337390 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-18 13:53:55 +00:00
Simon Pilgrim	de720479bb	[SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191) TTI::getMinMaxReductionCost typically can't handle pointer types - until this is changed it's better to limit horizontal reduction to integer/float vector types only. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337280 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-17 13:43:33 +00:00
Simon Pilgrim	1e086c7b69	[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED-2) We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism: 1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly. 2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this. 3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc. 4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements. Reapplied with fix to only accept 2 different casts if they come from the same source type (PR38154). Differential Revision: https://reviews.llvm.org/D49135 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336989 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-13 11:09:52 +00:00
Martin Storsjo	54919303bf	Revert "[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)" This reverts commit r336812, which broke compilation of a number of projects, see PR38154. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336949 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-12 21:33:42 +00:00
Simon Pilgrim	33f4d61062	[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED) We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism: 1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly. 2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this. 3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc. 4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements. Reapplied with fix to only accept 2 different casts if they come from the same source type. Differential Revision: https://reviews.llvm.org/D49135 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336812 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-11 15:05:10 +00:00
Simon Pilgrim	191ae9ef3c	Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for cast instructions. Reverting due to buildbot failures git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336806 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-11 14:08:16 +00:00
Simon Pilgrim	71b0da15d2	[SLPVectorizer] Add initial alternate opcode support for cast instructions. We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism: 1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly. 2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this. 3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc. 4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements. Differential Revision: https://reviews.llvm.org/D49135 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336804 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-11 13:34:09 +00:00
Anastasis Grammenos	c1175857e2	[DebugInfo][LoopVectorize] Preserve DL in induction PHI and Add Differential Revision: https://reviews.llvm.org/D48968 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336667 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-10 13:29:50 +00:00
Diego Caballero	dc4af2458e	[VPlan][LV] Introduce condition bit in VPBlockBase This patch introduces a VPValue in VPBlockBase to represent the condition bit that is used as successor selector when a block has multiple successors. This information wasn't necessary until now, when we are about to introduce outer loop vectorization support in VPlan code gen. Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D48814 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336554 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-09 15:57:09 +00:00
Simon Pilgrim	7029d5a52f	[SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI. This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism. Differential Revision: https://reviews.llvm.org/D48945 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336344 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-05 12:30:44 +00:00
Simon Pilgrim	79d7b5cd91	Fix some irregular whitespace/indentation. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336291 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-04 17:24:05 +00:00
Anastasis Grammenos	fb31c6481f	[DebugInfo][LoopVectorize] Preserve DL in generated phi instruction When creating `phi` instructions to resume at the scalar part of the loop, copy the DebugLoc from the original phi over to the new one. Differential Revision: https://reviews.llvm.org/D48769 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336256 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-04 10:16:55 +00:00
Farhana Aleen	13f7859c20	[SLP] Recognize min/max pattern using instructions producing same values. Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization. %1 = extractelement <2 x i32> %a, i32 0 %2 = extractelement <2 x i32> %a, i32 1 %cond = icmp sgt i32 %1, %2 %3 = extractelement <2 x i32> %a, i32 0 %4 = extractelement <2 x i32> %a, i32 1 %select = select i1 %cond, i32 %3, i32 %4 Author: FarhanaAleen Reviewed By: ABataev, RKSimon, spatel Differential Revision: https://reviews.llvm.org/D47608 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336130 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 17:55:31 +00:00
Simon Pilgrim	a40b37f909	[SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336102 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 13:41:29 +00:00
Simon Pilgrim	386f15c93a	[SLPVectorizer] Fix alternate opcode + shuffle cost function to correctly handle SK_Select patterns. We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case. This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336095 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 11:28:01 +00:00
Simon Pilgrim	a43dcd4394	[SLPVectorizer] Only Alternate opcodes use ShuffleVector cases for getEntryCost/vectorizeTree. NFCI. Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336092 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 10:54:19 +00:00
Simon Pilgrim	a634a07771	[SLPVectorizer] Call InstructionsState.isOpcodeOrAlt with Instruction instead of an opcode. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336069 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-01 20:22:46 +00:00
Simon Pilgrim	b1dbeaa9ce	[SLPVectorizer] Replace sameOpcodeOrAlt with InstructionsState.isOpcodeOrAlt helper. NFCI. This is a basic step towards matching more general instructions types than just opcodes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336068 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-01 20:07:30 +00:00
Simon Pilgrim	4c7a6ba2a4	[SLPVectorizer] Use InstructionsState Op/Alt opcodes directly. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336063 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-01 13:41:58 +00:00
Simon Pilgrim	40be0055ae	[SLPVectorizer] Recognise non uniform power of 2 constants Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them. As SLP works with arrays of values I don't think we can easily use the pattern match helpers here. Differential Revision: https://reviews.llvm.org/D48214 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335621 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-26 16:20:16 +00:00
Simon Pilgrim	5a40cf8639	[SLPVectorizer] Support alternate opcodes in tryToVectorizeList Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree. NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc. Differential Revision: https://reviews.llvm.org/D48488 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335364 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-22 16:37:34 +00:00
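
The r337471 entry above (ae72a8c570) explains that comparing PtrB against BaseDelta+PtrA fails when only one of the two pointer expressions is factorized, while subtracting the expressions and simplifying the result succeeds. The following is a minimal sketch of that idea on a toy linear-expression model. `LinearExpr`, `sym`, `cst`, `add`, `scale`, `constantDistance` and `isConsecutive` are hypothetical names invented for this illustration; they are not LLVM's ScalarEvolution API, and the real patch operates on SCEV expressions rather than on this simplified form.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model only: a linear expression Constant + sum(Coeff[s] * s).
// LinearExpr and the helpers below are hypothetical names for illustration;
// they are not LLVM's SCEV classes.
struct LinearExpr {
  int64_t Constant = 0;
  std::map<std::string, int64_t> Coeff; // symbol name -> coefficient
};

// A single symbolic value such as A or B.
LinearExpr sym(const std::string &Name) {
  LinearExpr E;
  E.Coeff[Name] = 1;
  return E;
}

// A plain constant.
LinearExpr cst(int64_t V) {
  LinearExpr E;
  E.Constant = V;
  return E;
}

// Term-by-term sum; like terms are merged, which is the "simplification".
LinearExpr add(const LinearExpr &A, const LinearExpr &B) {
  LinearExpr R = A;
  R.Constant += B.Constant;
  for (const auto &KV : B.Coeff)
    R.Coeff[KV.first] += KV.second;
  return R;
}

// Multiply every term by K, so 4 * (A + B) becomes 4A + 4B.
LinearExpr scale(const LinearExpr &E, int64_t K) {
  LinearExpr R;
  R.Constant = E.Constant * K;
  for (const auto &KV : E.Coeff)
    R.Coeff[KV.first] = KV.second * K;
  return R;
}

// Compute B - A. Returns true and sets Dist only if all symbolic terms cancel.
bool constantDistance(const LinearExpr &A, const LinearExpr &B, int64_t &Dist) {
  LinearExpr D = add(B, scale(A, -1));
  for (const auto &KV : D.Coeff)
    if (KV.second != 0)
      return false; // distance still depends on a symbol
  Dist = D.Constant;
  return true;
}

// Two accesses are consecutive when B sits exactly AccessSize bytes after A.
bool isConsecutive(const LinearExpr &A, const LinearExpr &B,
                   int64_t AccessSize) {
  assert(AccessSize > 0 && "access size must be positive");
  int64_t Dist = 0;
  return constantDistance(A, B, Dist) && Dist == AccessSize;
}
```

With PtrA built as `scale(add(sym("A"), sym("B")), 4)` and PtrB as `add(cst(4), add(scale(sym("A"), 4), scale(sym("B"), 4)))`, the subtraction cancels every symbolic term and leaves the constant 4, so `isConsecutive(PtrA, PtrB, 4)` holds even though the two expressions were written in different forms.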


1586 Commits