archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Matt Arsenault	c8df92092d	LV: Don't insert runtime ptr checks on divergent targets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309890 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-02 21:43:08 +00:00
Alexey Bataev	340067cec4	[SLPVectorizer] Generalize interface of functions, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309816 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-02 14:38:07 +00:00
Alexey Bataev	b51029d1f1	[SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on extracted elements Summary: Currently most of the time vectors of extractelement instructions are treated as scalars that must be gathered into vectors. But in some cases, like when we have extractelement instructions from single vector with different constant indices or from 2 vectors of the same size, we can treat these operations as shuffle of a single vector or blending of 2 vectors. ``` define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) { %x0 = extractelement <2 x i8> %x, i32 0 %y1 = extractelement <2 x i8> %y, i32 1 %x0x0 = mul i8 %x0, %x0 %y1y1 = mul i8 %y1, %y1 %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0 %ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1 ret <2 x i8> %ins2 } ``` can be converted to something like ``` define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) { %1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3> %2 = mul <2 x i8> %1, %1 ret <2 x i8> %2 } ``` Currently this type of conversion is considered a high-cost transformation. Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon Subscribers: ashahid, RKSimon, spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D30200 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309812 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-02 13:25:26 +00:00
Davide Italiano	4181790cb5	[SLPVectorizer] Unbreak the build with -Werror. GCC was complaining about `&&` within `||` without explicit parentheses. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309606 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-31 19:14:19 +00:00
Alexey Bataev	837b97fb9a	[SLP] Initial rework for min/max horizontal reduction vectorization, NFC. Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations. Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D29402 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309566 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-31 14:36:05 +00:00
Alexey Bataev	5a34abfe3e	[Cost] Rename getReductionCost() to getArithmeticReductionCost(), NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309563 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-31 14:19:32 +00:00
Ayal Zaks	343f60c4b2	[LV] Avoid redundant operations manipulating masks The Loop Vectorizer generates redundant operations when manipulating masks: AND with true, OR with false, compare equal to true. Instead of relying on a subsequent pass to clean them up, this patch avoids generating them. Use null (no-mask) to represent all-one full masks, instead of a constant all-one vector, following the convention of masked gathers and scatters. Preparing for a follow-up VPlan patch in which these mask manipulating operations are modeled using recipes. Differential Revision: https://reviews.llvm.org/D35725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309558 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-31 13:21:42 +00:00
Alexey Bataev	2976ab9c15	[SLP] Allow vectorization of the instruction from the same basic blocks only, NFC. Summary: After some changes in SLP vectorizer we missed some additional checks to limit the instructions for vectorization. We should not perform analysis of the instructions if the parent of instruction is not the same as the parent of the first instruction in the tree or it was analyzed already. Subscribers: mzolotukhin Differential Revision: https://reviews.llvm.org/D34881 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309425 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-28 20:11:16 +00:00
Alexey Bataev	fb84191e18	[SLP] Outline code for the check that instruction users are part of vectorization tree, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309284 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-27 15:48:44 +00:00
Dinar Temirbulatov	cce6cac026	[SLPVectorizer] Replace E->Scalars to VL0 at vectorizeTree and move comment, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308750 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-21 16:02:56 +00:00
Dinar Temirbulatov	4a3583d89e	[SLPVectorizer] buildTree_rec replace cast<Instruction>(VL[0]) to VL0, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308745 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-21 15:31:54 +00:00
Dinar Temirbulatov	e234ef0a5f	[SLPVectorizer] Change canReuseExtract function parameter Opcode from unsigned to Value *, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308739 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-21 13:32:36 +00:00
Ayal Zaks	e1f7499ee7	[LV] Test once if vector trip count is zero, instead of twice Generate a single test to decide if there are enough iterations to jump to the vectorized loop, or else go to the scalar remainder loop. This test compares the Scalar Trip Count: if STC < VF * UF go to the scalar loop. If requiresScalarEpilogue() holds, at least one iteration must remain scalar; the rest can be used to form vector iterations. So in this case the test checks instead if (STC - 1) < VF * UF by comparing STC <= VF * UF, and going to the scalar loop if so. Otherwise the vector loop is entered for at least one vector iteration. This test covers the case where incrementing the backedge-taken count will overflow leading to an incorrect trip count of zero. In this (rare) case we will also avoid the vector loop and jump to the scalar loop. This patch simplifies the existing tests and effectively removes the basic-block originally named "min.iters.checked", leaving the single test in block "vector.ph". Original observation and initial patch by Evgeny Stupachenko. Differential Revision: https://reviews.llvm.org/D34150 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308421 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-19 05:16:39 +00:00
Simon Pilgrim	42916d8d85	Remove unnecessary cast. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308166 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-17 09:35:03 +00:00
Dinar Temirbulatov	31b76d9b4a	[SLPVectorizer] Add an extra parameter to tryScheduleBundle function, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308081 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-15 05:43:54 +00:00
Dinar Temirbulatov	4cbfb4282b	[SLPVectorizer] Add an extra parameter to alreadyVectorized function, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307996 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 03:48:29 +00:00
Michael Kuperstein	73d05a2a19	[LV] Don't allow outside uses of IVs if the SCEV is predicated on loop conditions. This fixes PR33706. Differential Revision: https://reviews.llvm.org/D35227 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307837 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-12 19:53:55 +00:00
Dinar Temirbulatov	fa3d66c27c	[SLPVectorizer] Revert change in cancelScheduling with referencing to FirstInBundle, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307667 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-11 15:54:50 +00:00
Dinar Temirbulatov	18b9e001a8	[SLPVectorizer] Add an extra parameter to cancelScheduling function, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307158 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-05 13:53:03 +00:00
Teresa Johnson	9d923b35aa	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306936 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-01 03:24:09 +00:00
Teresa Johnson	f7497bfb4a	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306935 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-01 03:24:08 +00:00
Teresa Johnson	423d09931a	revert r306336 for breaking ppc test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306934 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-01 03:24:07 +00:00
Teresa Johnson	005cfad2e8	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306933 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-01 03:24:06 +00:00
Dinar Temirbulatov	7bf0a87e3a	[SLPVectorizer] Add isOdd() helper function, NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306887 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-30 21:16:26 +00:00
Ayal Zaks	e050d57c74	[LV] Sink casts to unravel first order recurrence Check if a single cast is preventing handling a first-order-recurrence Phi, because the scheduling constraints it imposes on the first-order-recurrence shuffle are infeasible; but they can be made feasible by moving the cast downwards. Record such casts and move them when vectorizing the loop. Differential Revision: https://reviews.llvm.org/D33058 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306884 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-30 21:05:06 +00:00
Ayal Zaks	bfae62c2cb	[LV] Optimize for size when vectorizing loops with tiny trip count It may be detrimental to vectorize loops with very small trip count, as various costs of the vectorized loop body as well as enclosing overheads including runtime tests and scalar iterations may outweigh the gains of vectorizing. The current cost model measures the cost of the vectorized loop body only, expecting it will amortize other costs, and loops with known or expected very small trip counts are not vectorized at all. This patch allows loops with very small trip counts to be vectorized, but under OptForSize constraints, which ensure the cost of the loop body is dominant, having no runtime guards nor scalar iterations. Patch inspired by D32451. Differential Revision: https://reviews.llvm.org/D34373 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306803 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-30 08:02:35 +00:00
Chandler Carruth	cebf3467bc	Remove the BBVectorize pass. It served us well, helped kick-start much of the vectorization efforts in LLVM, etc. Its time has come and passed. Back in 2014: http://lists.llvm.org/pipermail/llvm-dev/2014-November/079091.html Time to actually let go and move forward. =] I've updated the release notes both about the removal and the deprecation of the corresponding C API. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306797 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-30 07:09:08 +00:00
Daniel Jasper	ed1642feee	Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default." This still breaks PPC tests we have. I'll forward reproduction instructions to dehao. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306792 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-30 06:32:21 +00:00
Dinar Temirbulatov	d5b3cba3bb	[SLPVectorizer] Moving Entry->NeedToGather check out of inner loop, since it is invariant there. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306749 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-29 21:56:33 +00:00
Dinar Temirbulatov	c0dfd2f671	[SLPVectorizer] Introducing getTreeEntry() helper function [NFC] Differential Revision: https://reviews.llvm.org/D34756 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306655 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-29 08:46:18 +00:00
Ayal Zaks	9a06b5298e	[LV] Fix PR33613 - retain order of insertelement per part r306381 caused PR33613, by reversing the order in which insertelements were generated per unroll part. This patch fixes PR33613 by retaining this order, placing each set of insertelements per part immediately after the last scalar being packed for this part. Includes a test case derived from PR33613. Reference: https://bugs.llvm.org/show_bug.cgi?id=33613 Differential Revision: https://reviews.llvm.org/D34760 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306575 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 17:59:33 +00:00
Dehao Chen	c9d2291c96	re-commit r306336: Enable vectorizer-maximize-bandwidth by default. Differential Revision: https://reviews.llvm.org/D33341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306473 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 22:05:58 +00:00
Ayal Zaks	84b5668c17	Recommitting 306331. Undoing revert 306338 after fixed bug: add metadata to the load instead of the reverse shuffle added to it, retaining the original ValueMap implementation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306381 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 08:41:19 +00:00
Dehao Chen	74c2abe3c6	revert r306336 for breaking ppc test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306344 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 23:05:35 +00:00
Ayal Zaks	bfc8711de9	reverting 306331. Causes TBAA metadata to be generated on reverse shuffles, investigating. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306338 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 22:26:54 +00:00
Dehao Chen	fd167cf907	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306336 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 21:41:09 +00:00
Ayal Zaks	faf416b5ae	[LV] Changing the interface of ValueMap, NFC. Instead of providing access to the internal MapStorage holding all Values associated with a given Key, used for setting or resetting them all together, ValueMap keeps its MapStorage internal; its new interface allows getting, setting or resetting a single Value, per part or per part-and-lane. Follows the discussion in https://reviews.llvm.org/D32871. Differential Revision: https://reviews.llvm.org/D34473 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306331 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-26 21:03:51 +00:00
Diana Picus	e36adbda88	Revert "Enable vectorizer-maximize-bandwidth by default." This reverts commit r305960 because it broke self-hosting on AArch64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305990 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 10:00:28 +00:00
Dehao Chen	998914d301	Enable vectorizer-maximize-bandwidth by default. Summary: vectorizer-maximize-bandwidth is generally useful in terms of performance. I've tested the impact of changing this to default on speccpu benchmarks on sandybridge machines. The result shows non-negative impact: spec/2006/fp/C++/444.namd 26.84 -0.31% spec/2006/fp/C++/447.dealII 46.19 +0.89% spec/2006/fp/C++/450.soplex 42.92 -0.44% spec/2006/fp/C++/453.povray 38.57 -2.25% spec/2006/fp/C/433.milc 24.54 -0.76% spec/2006/fp/C/470.lbm 41.08 +0.26% spec/2006/fp/C/482.sphinx3 47.58 -0.99% spec/2006/int/C++/471.omnetpp 22.06 +1.87% spec/2006/int/C++/473.astar 22.65 -0.12% spec/2006/int/C++/483.xalancbmk 33.69 +4.97% spec/2006/int/C/400.perlbench 33.43 +1.70% spec/2006/int/C/401.bzip2 23.02 -0.19% spec/2006/int/C/403.gcc 32.57 -0.43% spec/2006/int/C/429.mcf 40.35 +0.27% spec/2006/int/C/445.gobmk 26.96 +0.06% spec/2006/int/C/456.hmmer 24.4 +0.19% spec/2006/int/C/458.sjeng 27.91 -0.08% spec/2006/int/C/462.libquantum 57.47 -0.20% spec/2006/int/C/464.h264ref 46.52 +1.35% geometric mean +0.29% The regression on 453.povray seems real, but is due to secondary effects as all hot functions are bit-identical with and without the flag. I started this patch to consult upstream opinions on this. It will be greatly appreciated if the community can help test the performance impact of this change on other architectures so that we can decide if this should be target-dependent. Reviewers: hfinkel, mkuper, davidxl, chandlerc Reviewed By: chandlerc Subscribers: rengolin, sanjoy, javed.absar, bjope, dorit, magabari, RKSimon, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305960 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-21 22:01:32 +00:00
Taewook Oh	9f93c9df69	Improve profile-guided heuristics to use estimated trip count. Summary: Existing heuristic uses the ratio between the function entry frequency and the loop invocation frequency to find cold loops. However, even if the loop executes frequently, if it has a small trip count per each invocation, vectorization is not beneficial. On the other hand, even if the loop invocation frequency is much smaller than the function invocation frequency, if the trip count is high it is still beneficial to vectorize the loop. This patch uses estimated trip count computed from the profile metadata as a primary metric to determine coldness of the loop. If the estimated trip count cannot be computed, it falls back to the original heuristics. Reviewers: Ayal, mssimpso, mkuper, danielcdh, wmi, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D32451 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305729 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 18:48:58 +00:00
Dinar Temirbulatov	8bcd7ee921	Remove brackets, NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305706 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 16:44:07 +00:00
George Burgess IV	9276050d30	[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops. If we're shrinking a binary operation, it may be the case that the new operation wraps where the old didn't. If this happens, the behavior should be well-defined. So, we can't always carry wrapping flags with us when we shrink operations. If we do, we get incorrect optimizations in cases like: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] - 128; } which gets optimized to: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] | 128; } Because: - InstCombine turned `sub i32 %from.i, 128` into `add nuw nsw i32 %from.i, 128`. - LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a vector full of `i8 128`s - InstCombine took advantage of the fact that the newly-shrunken add "couldn't wrap", and changed the `add` to an `or`. InstCombine seems happy to figure out whether we can add nuw/nsw on its own, so I just decided to drop the flags. There are already a number of places in LoopVectorize where we rely on InstCombine to clean up. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305053 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-09 03:56:15 +00:00
Chandler Carruth	e3e43d9d57	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-06 11:49:48 +00:00
Ayal Zaks	2cfe765f46	[LV] Make scalarizeInstruction() non-virtual. NFC. Following the request made in https://reviews.llvm.org/D32871, scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is hereby made non-virtual in InnerLoopVectorizer. Should have been part of r297580 originally. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304685 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-04 13:29:51 +00:00
Galina Kistanova	143302b9f0	Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304636 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-03 05:18:46 +00:00
Alexey Bataev	c20deb63f8	[SLP] Improve comments and naming of functions/variables/members, NFC. Fixed some comments, added an additional description of the algorithms, improved readability of the code. Differential revision: https://reviews.llvm.org/D33320 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304616 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-03 00:08:21 +00:00
Alexey Bataev	cb453a0d29	Revert "[SLP] Improve comments and naming of functions/variables/members, NFC." This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304609 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-02 23:09:15 +00:00
Alexey Bataev	37aaa827f4	[SLP] Improve comments and naming of functions/variables/members, NFC. Summary: Fixed some comments, added an additional description of the algorithms, improved readability of the code. Reviewers: anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33320 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304593 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-02 20:39:27 +00:00
Matthew Simpson	ed4243c350	[LV] Reapply r303763 with fix for PR33193 r303763 caused build failures in some out-of-tree tests due to an assertion in TTI. The original patch updated cost estimates for induction variable update instructions marked for scalarization. However, it didn't consider that the incoming value of an induction variable phi node could be a cast instruction. This caused queries for cast instruction costs with a mix of vector and scalar types. This patch includes a fix for cast instructions and the test case from PR33193. The fix was suggested by Jonas Paulsson <paulsson@linux.vnet.ibm.com>. Reference: https://bugs.llvm.org/show_bug.cgi?id=33193 Original Differential Revision: https://reviews.llvm.org/D33457 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304235 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 19:55:57 +00:00
Joerg Sonnenberger	ef8c4cd636	Revert r303763, results in asserts i.e. while building Ruby. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304179 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-29 22:52:17 +00:00
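
The wrap-flag problem described in r305053 (9276050d30) can be checked with plain 8-bit arithmetic. The following is a minimal sketch in Python, not LLVM code: the function names are hypothetical, and masking with `0xFF` stands in for truncation to `unsigned char`. It suggests that the shrunken `add` still matches the original subtraction for every byte, while the `or` rewrite that relies on a stale "cannot wrap" flag diverges once the input has its top bit set.

```python
# Illustrative model only; names are hypothetical and not part of LLVM.

def reference(x: int) -> int:
    """to[i] = from[i] - 128, computed in int and truncated to 8 bits."""
    return (x - 128) & 0xFF


def shrunken_add(x: int) -> int:
    """The same operation narrowed to 8 bits as an add of 128, which may wrap."""
    return (x + 128) & 0xFF


def or_rewrite(x: int) -> int:
    """The add rewritten as an or, which is only valid if the add never wraps."""
    return (x | 128) & 0xFF


# The narrowed add agrees with the original subtraction for every byte value.
add_matches = all(shrunken_add(x) == reference(x) for x in range(256))

# The or form disagrees exactly where the 8-bit add would have wrapped.
mismatches = [x for x in range(256) if or_rewrite(x) != reference(x)]

print(add_matches)                                   # True
print(mismatches[0], mismatches[-1], len(mismatches))  # 128 255 128
```

Under this model the narrowed operation itself is fine; the miscompile only appears when the no-wrap flags from the wide operation are carried over, which is consistent with the commit's decision to drop them.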
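
The single minimum-iterations test described in r308421 (e1f7499ee7) can be sketched as a small predicate. This is a hedged illustration in Python rather than the vectorizer's actual code: `scalar_trip_count`, `enters_vector_loop`, and the 32-bit counter width are assumptions made for the example. STC is the scalar trip count, VF the vectorization factor, and UF the unroll factor, as in the commit message.

```python
# Illustrative model only; function names and the 32-bit width are assumptions.

def scalar_trip_count(backedge_taken_count: int, bits: int = 32) -> int:
    """Trip count = backedge-taken count + 1, wrapping at the counter width."""
    return (backedge_taken_count + 1) % (1 << bits)


def enters_vector_loop(stc: int, vf: int, uf: int,
                       requires_scalar_epilogue: bool) -> bool:
    """Return True if the single guard lets execution enter the vector loop."""
    step = vf * uf
    if requires_scalar_epilogue:
        # One iteration must stay scalar, so test (STC - 1) < VF * UF,
        # expressed as STC <= VF * UF.
        return not (stc <= step)
    return not (stc < step)


print(enters_vector_loop(8, 4, 2, False))  # True: exactly one vector iteration
print(enters_vector_loop(8, 4, 2, True))   # False: no iteration left for the epilogue
print(enters_vector_loop(9, 4, 2, True))   # True

# Overflow case: the incremented backedge-taken count wraps to zero,
# and the same comparison sends execution to the scalar loop.
wrapped = scalar_trip_count(2**32 - 1)
print(wrapped, enters_vector_loop(wrapped, 4, 2, False))  # 0 False
```

The point of the sketch is that one comparison covers both the "too few iterations" case and the wrapped trip count of zero, so no second check is needed.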


1341 Commits