archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Matt Arsenault	84b3660bac	AMDGPU: Allow vectorization of packed types git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305844 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 20:38:06 +00:00
Taewook Oh	9f93c9df69	Improve profile-guided heuristics to use estimated trip count. Summary: Existing heuristic uses the ratio between the function entry frequency and the loop invocation frequency to find cold loops. However, even if the loop executes frequently, if it has a small trip count per each invocation, vectorization is not beneficial. On the other hand, even if the loop invocation frequency is much smaller than the function invocation frequency, if the trip count is high it is still beneficial to vectorize the loop. This patch uses estimated trip count computed from the profile metadata as a primary metric to determine coldness of the loop. If the estimated trip count cannot be computed, it falls back to the original heuristics. Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32451 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305729 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 18:48:58 +00:00
George Burgess IV	9276050d30	[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops. If we're shrinking a binary operation, it may be the case that the new operations wraps where the old didn't. If this happens, the behavior should be well-defined. So, we can't always carry wrapping flags with us when we shrink operations. If we do, we get incorrect optimizations in cases like: void foo(const unsigned char from, unsigned char to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] - 128; } which gets optimized to: void foo(const unsigned char from, unsigned char to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] \| 128; } Because: - InstCombine turned `sub i32 %from.i, 128` into `add nuw nsw i32 %from.i, 128`. - LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a vector full of `i8 128`s - InstCombine took advantage of the fact that the newly-shrunken add "couldn't wrap", and changed the `add` to an `or`. InstCombine seems happy to figure out whether we can add nuw/nsw on its own, so I just decided to drop the flags. There are already a number of places in LoopVectorize where we rely on InstCombine to clean up. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305053 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-09 03:56:15 +00:00
Matthew Simpson	ed4243c350	[LV] Reapply r303763 with fix for PR33193 r303763 caused build failures in some out-of-tree tests due to an assertion in TTI. The original patch updated cost estimates for induction variable update instructions marked for scalarization. However, it didn't consider that the incoming value of an induction variable phi node could be a cast instruction. This caused queries for cast instruction costs with a mix of vector and scalar types. This patch includes a fix for cast instructions and the test case from PR33193. The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>. Reference: https://bugs.llvm.org/show_bug.cgi?id=33193 Original Differential Revision: https://reviews.llvm.org/D33457 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304235 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 19:55:57 +00:00
Joerg Sonnenberger	ef8c4cd636	Revert r303763, results in asserts i.e. while building Ruby. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304179 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-29 22:52:17 +00:00
Matthew Simpson	9e8c6339d7	[LV] Update type in cost model for scalarization For non-uniform instructions marked for scalarization, we should update `VectorTy` when computing instruction costs to reflect the scalar type. In addition to determining instruction costs, this type is also used to signal that all instructions in the loop will be scalarized. This currently affects memory instructions and non-pointer induction variables and their updates. (We also mark GEPs scalar after vectorization, but their cost is computed together with memory instructions.) For scalarized induction updates, this patch also scales the scalar cost by the vectorization factor, corresponding to each induction step. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303763 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-24 15:26:15 +00:00
Jonas Paulsson	a551a28baa	[LoopVectorizer] Let target prefer scalar addressing computations. The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking show improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303744 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-24 13:42:56 +00:00
Ayal Zaks	f93293ef42	[LV] Report multiple reasons for not vectorizing under allowExtraAnalysis The default behavior of -Rpass-analysis=loop-vectorizer is to report only the first reason encountered for not vectorizing, if one is found, at which time the vectorizer aborts its handling of the loop. This patch allows multiple reasons for not vectorizing to be identified and reported, at the potential expense of additional compile-time, under allowExtraAnalysis which can currently be turned on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed. Removed from LoopVectorizationLegality::canVectorize() the redundant checking and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also does that. This redundancy is caught by a lit test once multiple reasons are reported. Patch initially developed by Dror Barak. Differential Revision: https://reviews.llvm.org/D33396 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303613 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-23 07:08:02 +00:00
Amara Emerson	f69362835a	Re-commit r302678, fixing PR33053. The issue was that the AArch64 TTI hook allowed unpacked integer cmp reductions which didn't have a lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303211 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 21:29:22 +00:00
Matthew Simpson	61c3d36e2e	Revert 303174, 303176, and 303178 These commits are breaking the bots. Reverting to investigate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303182 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 15:50:30 +00:00
Matthew Simpson	cce9257402	Make test target-specific git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303178 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 15:33:22 +00:00
Matthew Simpson	f62f568123	Fix test case to unbreak bots git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303176 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 15:20:27 +00:00
Matthew Simpson	d9ed77b3ad	[LV] Avoid potentential division by zero when selecting IC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303174 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 14:43:55 +00:00
Hans Wennborg	05f671ecfa	Revert r302678 "[AArch64] Enable use of reduction intrinsics." This caused PR33053. Original commit message: > The new experimental reduction intrinsics can now be used, so I'm enabling this > for AArch64. We will need this for SVE anyway, so it makes sense to do this for > NEON reductions as well. > > The existing code to match shufflevector patterns are replaced with a direct > lowering of the reductions to AArch64-specific nodes. Tests updated with the > new, simpler, representation. > > Differential Revision: https://reviews.llvm.org/D32247 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303115 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 20:59:32 +00:00
Simon Pilgrim	07ac640d6c	[LoopOptimizer][Fix]PR32859, PR24738 The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form. Here it is: before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ] and after this change we have: %e.0.ph = phi i32 [ 0, %for.inc.2.i ] %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ] Committed on behalf of @dtemirbulatov Differential Revision: https://reviews.llvm.org/D33055 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302988 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-13 13:25:57 +00:00
Andrew Kaylor	a0e10ad06a	[TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines Patch by Chris Chrulski Differential Revision: https://reviews.llvm.org/D31789 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302957 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 22:11:26 +00:00
Amara Emerson	7c631c8afc	[AArch64] Enable use of reduction intrinsics. The new experimental reduction intrinsics can now be used, so I'm enabling this for AArch64. We will need this for SVE anyway, so it makes sense to do this for NEON reductions as well. The existing code to match shufflevector patterns are replaced with a direct lowering of the reductions to AArch64-specific nodes. Tests updated with the new, simpler, representation. Differential Revision: https://reviews.llvm.org/D32247 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302678 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-10 15:15:38 +00:00
Anna Thomas	1fbed43c3e	[LV] Fix insertion point for shuffle vectors in first order recurrence Summary: In first order recurrence vectorization, when the previous value is a phi node, we need to set the insertion point to the first non-phi node. We can have the previous value being a phi node, due to the generation of new IVs as part of trunc optimization [1]. [1] https://reviews.llvm.org/rL294967 Reviewers: mssimpso, mkuper Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302532 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-09 14:29:33 +00:00
Elad Cohen	ea59a24717	Support arbitrary address space pointers in masked gather/scatter intrinsics. Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a non-default address space pointer we fail with a "Calling a function with a bad singature!" assertion. This patch solves this by adding the 'vector of pointers' argument as an overloaded type which will determine the address space. Differential revision: https://reviews.llvm.org/D31490 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302018 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-03 12:28:54 +00:00
Matthew Simpson	86197951a0	[LV] Handle external uses of floating-point induction variables Reference: https://bugs.llvm.org/show_bug.cgi?id=32758 Differential Revision: https://reviews.llvm.org/D32445 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301428 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-26 16:23:02 +00:00
Gil Rapaport	52d55b49a8	[LV] Make LIT test insensitive to basic block numbering This patch is part of D28975's breakdown. induction.ll encodes the specific (and rather arbitrary) numbers given to predicated basic blocks by the unique naming mechanism, which makes it sensitive to changes in LV's instruction generation order. This patch replaces those specific numbers with a numeric pattern. Differential Revision: https://reviews.llvm.org/D32404 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301345 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-25 18:14:24 +00:00
Matthew Simpson	915a4bc0f9	[LV] Model if-converted phi node costs Phi nodes in non-header blocks are converted to select instructions after if-conversion. This patch updates the cost model to account for the selects. Differential Revision: https://reviews.llvm.org/D31906 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300980 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-21 14:14:54 +00:00
Anna Thomas	0bd00d7d97	[LV] Fix the vector code generation for first order recurrence Summary: In first order recurrences where phi's are used outside the loop, we should generate an additional vector.extract of the second last element from the vectorized phi update. This is because we require the phi itself (which is the value at the second last iteration of the vector loop) and not the phi's update within the loop. Also fix the code gen when we just unroll, but don't vectorize. Fixes PR32396. Reviewers: mssimpso, mkuper, anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D31979 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300238 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-13 18:59:25 +00:00
Renato Golin	73ba5e9415	[SystemZ] Fix more target specific tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300081 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 18:03:09 +00:00
Sanjay Patel	e930b092c4	[InstCombine] morph an existing instruction instead of creating a new one One potential way to make InstCombine (very slightly?) faster is to recycle instructions when possible instead of creating new ones. It's not explicitly stated AFAIK, but we don't consider this an "InstSimplify". We could, however, make a new layer to house transforms like this if that makes InstCombine more manageable (just throwing out an idea; not sure how much opportunity is actually here). Differential Revision: https://reviews.llvm.org/D31863 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300067 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 15:11:33 +00:00
Jonas Paulsson	16b23e95d5	[LoopVectorizer] Improve handling of branches during cost estimation. The cost for a branch after vectorization is very different depending on if the vectorizer will if-convert the block (branch is eliminated), or if scalarized and predicated blocks will be produced (branch duplicated before each block). There is also the case of remaining scalar branches, such as the back-edge branch. This patch handles these cases differently with TTI based cost estimates. Review: Matthew Simpson https://reviews.llvm.org/D31175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300058 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 13:13:15 +00:00
Jonas Paulsson	43d439da88	[LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore() Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300056 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 12:41:37 +00:00
Jonas Paulsson	c33bdfa7b1	[SystemZ] TargetTransformInfo cost functions implemented. getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300052 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 11:49:08 +00:00
Anna Thomas	3d57035bd9	[LV] Avoid vectorizing first order recurrence when phi uses are outside loop In the vectorization of first order recurrence, we vectorize such that the last element in the vector will be the one extracted to pass into the scalar remainder loop. However, this is not true when there is a phi (other than the primary induction variable) is used outside the loop. In such a case, we need the value from the second last iteration (i.e. the phi value), not the last iteration (which would be the phi update). I've added a test case for this. Also see PR32396. A follow up patch would generate the correct code gen for such cases, and turn this vectorization on. Differential Revision: https://reviews.llvm.org/D31910 Reviewers: mssimpso git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299985 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 21:02:00 +00:00
Anna Thomas	c3ff31ab3f	[LV] Move first order recurrence test to common folder. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299969 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 18:31:42 +00:00
Matt Arsenault	bdbe8280f2	Add address space mangling to lifetime intrinsics In preparation for allowing allocas to have non-0 addrspace. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299876 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-10 20:18:21 +00:00
Matthew Simpson	ccc38cc5e7	Reapply r298620: [LV] Vectorize GEPs This patch reapplies r298620. The original patch was reverted because of two issues. First, the patch exposed a bug in InstCombine that caused the Chromium builds to fail (PR32414). This issue was fixed in r299017. Second, the patch introduced a bug in the vectorizer's scalars analysis that caused test suite builds to fail on SystemZ. The scalars analysis was too aggressive and marked a memory instruction scalar, even though it was going to be vectorized. This issue has been fixed in the current patch and several new test cases for the scalars analysis have been added. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299770 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-07 14:15:34 +00:00
Matthew Simpson	a9eefb0375	[LV] Make test case more robust This test case depends on the loop being vectorized without forcing the vectorization factor. If the profitability ever changes in the future (due to cost model improvements), the test may no longer work as intended. Instead of checking the resulting IR, we should just check the instruction costs. The costs will be computed regardless if vectorization is profitable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299545 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-05 14:34:13 +00:00
Matthew Simpson	fcdf36fabb	[LV] Transform truncations of non-primary induction variables The vectorizer tries to replace truncations of induction variables with new induction variables having the smaller type. After r295063, this optimization was applied to all integer induction variables, including non-primary ones. When optimizing the truncation of a non-primary induction variable, we still need to transform the new induction so that it has the correct start value. This should fix PR32419. Reference: https://bugs.llvm.org/show_bug.cgi?id=32419 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298882 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-27 20:07:38 +00:00
Ivan Krasin	76f79b8934	Revert r298620: [LV] Vectorize GEPs Reason: breaks linking Chromium with LLD + ThinLTO (a pass crashes) LLVM bug: https://bugs.llvm.org//show_bug.cgi?id=32413 Original change description: [LV] Vectorize GEPs This patch adds support for vectorizing GEPs. Previously, we only generated vector GEPs on-demand when creating gather or scatter operations. All GEPs from the original loop were scalarized by default, and if a pointer was to be stored to memory, we would have to build up the pointer vector with insertelement instructions. With this patch, we will vectorize all GEPs that haven't already been marked for scalarization. The patch refines collectLoopScalars to more exactly identify the scalar GEPs. The function now more closely resembles collectLoopUniforms. And the patch moves vector GEP creation out of vectorizeMemoryInstruction and into the main vectorization loop. The vector GEPs needed for gather and scatter operations will have already been generated before vectoring the memory accesses. Original Differential Revision: https://reviews.llvm.org/D30710 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298735 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 20:49:43 +00:00
Gil Rapaport	a1c1557043	[LV] Add regression test for r297610 The new test asserts that scalarized memory operations get memcheck metadata added even if the loop is only unrolled. Differential Revision: https://reviews.llvm.org/D30972 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298641 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 20:02:23 +00:00
Matthew Simpson	77a842c0a2	[LV] Vectorize GEPs This patch adds support for vectorizing GEPs. Previously, we only generated vector GEPs on-demand when creating gather or scatter operations. All GEPs from the original loop were scalarized by default, and if a pointer was to be stored to memory, we would have to build up the pointer vector with insertelement instructions. With this patch, we will vectorize all GEPs that haven't already been marked for scalarization. The patch refines collectLoopScalars to more exactly identify the scalar GEPs. The function now more closely resembles collectLoopUniforms. And the patch moves vector GEP creation out of vectorizeMemoryInstruction and into the main vectorization loop. The vector GEPs needed for gather and scatter operations will have already been generated before vectoring the memory accesses. Differential Revision: https://reviews.llvm.org/D30710 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298620 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 16:29:58 +00:00
Matthew Simpson	49901e7a48	[LV] Delete unneeded scalar GEP creation code The code for generating scalar base pointers in vectorizeMemoryInstruction is not needed. We currently scalarize all GEPs and maintain the scalarized values in VectorLoopValueMap. The GEP cloning in this unneeded code is the same as that in scalarizeInstruction. The test cases that changed as a result of this patch changed because we were able to reuse the scalarized GEP that we previously generated instead of cloning a new one. Differential Revision: https://reviews.llvm.org/D30587 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298615 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-23 16:07:21 +00:00
Matt Arsenault	d706d030af	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298444 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 21:39:51 +00:00
Jonas Paulsson	85dd82a95b	[TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved getIntrinsicInstrCost() used to only compute scalarization cost based on types. This patch improves this so that the actual arguments are checked when they are available, in order to handle only unique non-constant operands. Tests updates: Analysis/CostModel/X86/arith-fp.ll Transforms/LoopVectorize/AArch64/interleaved_cost.ll Transforms/LoopVectorize/ARM/interleaved_cost.ll The improvement in getOperandsScalarizationOverhead() to differentiate on constants made it necessary to update the interleaved_cost.ll tests even though they do not relate to intrinsics. Review: Hal Finkel https://reviews.llvm.org/D29540 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-14 06:35:36 +00:00
Changpeng Fang	1970df176f	AMDGPU/SI: Disable unrolling in the loop vectorizer if the loop is not vectorized. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D30719 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297328 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-09 00:07:00 +00:00
Matthew Simpson	8ca15c2f22	[LV] Select legal insert point when fixing first-order recurrences Because IRBuilder performs constant-folding, it's not guaranteed that an instruction in the original loop map to an instruction in the vector loop. It could map to a constant vector instead. The handling of first-order recurrences was incorrectly making this assumption when setting the IRBuilder's insert point. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297302 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 18:18:20 +00:00
Matthew Simpson	b9f3ad9d10	[LV] Make the test case for PR30183 less fragile This patch also renames the PR number the test points to. The previous reference was PR29559, but that bug was somehow deleted and recreated under PR30183. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297295 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 17:03:38 +00:00
Matthew Simpson	d20a0daedd	[LV] Add missing check labels to tests and reformat git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297294 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 16:55:34 +00:00
Matthew Simpson	30bc56bd3c	[LV] Consider users that are memory accesses in uniforms expansion step When expanding the set of uniform instructions beyond the seed instructions (e.g., consecutive pointers), we mark a new instruction uniform if all its loop-varying users are uniform. We should also allow users that are consecutive or interleaved memory accesses. This fixes cases where we have an instruction that is used as the pointer operand of a consecutive access but also used by a non-memory instruction that later becomes uniform as part of the expansion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297179 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 18:47:30 +00:00
Matthew Simpson	201896c9fd	[ARM/AArch64] Update costs for interleaved accesses with wide types After r296750, we're able to match interleaved accesses having types wider than 128 bits. This patch updates the associated TTI costs. Differential Revision: https://reviews.llvm.org/D29675 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296751 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-02 15:15:35 +00:00
Matthew Simpson	465f5a16f9	[LV] Considier non-consecutive but vectorizable accesses for VF selection When computing the smallest and largest types for selecting the maximum vectorization factor, we currently ignore loads and stores of pointer types if the memory access is non-consecutive. We do this because such accesses must be scalarized regardless of vectorization factor, and thus shouldn't be considered when determining the factor. This patch makes this check less aggressive by also considering non-consecutive accesses that may be vectorized, such as interleaved accesses. Because we don't know at the time of the check if an accesses will certainly be vectorized (this is a cost model decision given a particular VF), we consider all accesses that can potentially be vectorized. Differential Revision: https://reviews.llvm.org/D30305 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296747 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-02 13:55:05 +00:00
Adam Nemet	90c4f1ee16	[LV] These remark should have been missed remarks The practice in LV is that we emit analysis remarks and then finally report either a missed or applied remark on the final decision whether vectorization is taking place. On this code path, we were closing with an analysis remark. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296578 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-01 04:31:15 +00:00
Craig Topper	f46e01bb23	[AVX-512] Fix the execution domain for AVX-512 integer broadcasts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296290 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-26 06:45:51 +00:00
Matthew Simpson	aceef6fa37	[LV] Merge floating-point and integer induction widening code This patch merges the existing floating-point induction variable widening code into the integer induction variable widening code, creating a single set of functions for both kinds of inductions. The primary motivation for doing this is to enable vector phi node creation for floating-point induction variables. Differential Revision: https://reviews.llvm.org/D30211 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296145 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-24 18:20:12 +00:00

1 2 3 4 5 ...

600 Commits