llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-04-06 17:51:44 +00:00

Author	SHA1	Message	Date
Adam Nemet	c956f416cc	[SLP] Visualize SLP trees with -view-slp-tree Analyzing larger trees is extremely difficult with the current debug output so this adds GraphTraits and DOTGraphTraits on top of the VectorizableTree data structure. We can now display the SLP trees with Graphviz as in https://reviews.llvm.org/F3132765. I decorated the graph where a value needs to be gathered for one reason or another. These are the red nodes. There are other improvement I am planning to make as I work through my case here. For example, I would also like to mark nodes that need to be extracted. Differential Revision: https://reviews.llvm.org/D30731 llvm-svn: 297303	2017-03-08 18:47:50 +00:00
Michael Kuperstein	ecb8f70721	[SLP] Revert r296863 due to miscompiles. Details and reproducer are on the email thread for r296863. llvm-svn: 297103	2017-03-06 23:54:51 +00:00
Mohammad Shahid	d8acf02cf1	[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR. The fix is to compute the mask for out of order memory accesses while building the vectorizable tree instead of actual vectorization of vectorizable tree.It also needs to recompute the proper Lane for external use of vectorizable scalars based on shuffle mask. Reviewers: mkuper Differential Revision: https://reviews.llvm.org/D30159 Change-Id: Ide8773ce0ad3562f3cf4d1a0ad0f487e2f60ce5d llvm-svn: 296863	2017-03-03 10:02:47 +00:00
Hans Wennborg	358597d3c9	Revert r296575 "[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available" It caused miscompiles, e.g. in Chromium (PR32109). llvm-svn: 296654	2017-03-01 18:57:16 +00:00
Alexey Bataev	d9b0bf33dd	[SLP] Preserve IR flags when vectorizing horizontal reductions. Summary: The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math) when converting scalar horizontal reductions instructions into vectors, just like for other vectorized instructions. It doe not include IR propagation for extra arguments, we need to handle original scalar operations for extra args to propagate correct flags. Reviewers: mkuper, mzolotukhin, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30418 llvm-svn: 296614	2017-03-01 12:43:39 +00:00
Alexey Bataev	28bee685c1	[SLP] Preserve IR flags for extra args. Summary: We should preserve IR flags for extra args. These IR flags should be taken from original scalar operations, not from the reduction operations. Reviewers: mkuper, mzolotukhin, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30447 llvm-svn: 296613	2017-03-01 12:22:33 +00:00
Alexey Bataev	f077ffe566	[SLP] Fix for PR32038: extra add of PHI node when it is not required. Summary: If horizontal reduction tree starts from the binary operation that is used in PHI node, but this PHI is not used in horizontal reduction, we may end up with extra addition of this PHI node after vectorization. Here is an example: ``` %phi = phi i32 [ %tmp, %end], ... ... %tmp = add i32 %tmp1, %tmp2 end: ``` after vectorization we always have something like: ``` %phi = phi i32 [ %tmp, %end], ... ... %red = extractelement <8 x 32> %vec.red, 0 %tmp = add i32 %red, %phi end: ``` even if `%phi` is not used in reduction tree. Patch considers these PHI nodes as extra arguments and considers them in the final result iff they really used in reduction. Reviewers: mkuper, hfinkel, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30409 llvm-svn: 296606	2017-03-01 10:50:44 +00:00
Mohammad Shahid	8ddc0dd2a4	[SLP] Fixes the bug due to absence of in order uses of scalars which needs to be available for VectorizeTree() API.This API uses it for proper mask computation to be used in shufflevector IR. The fix is to compute the mask for out of order memory accesses while building the vectorizable tree instead of actual vectorization of vectorizable tree. Reviewers: mkuper Differential Revision: https://reviews.llvm.org/D30159 Change-Id: Id1e287f073fa4959713ba545fa4254db5da8b40d llvm-svn: 296575	2017-03-01 03:51:54 +00:00
Alexey Bataev	437d10997b	[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong result Summary: If the same value is used several times as an extra value, SLP vectorizer takes it into account only once instead of actual number of using. For example: ``` int val = 1; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { val = val + input[y * 8 + x] + 3; } } ``` We have 2 extra rguments: `1` - initial value of horizontal reduction and `3`, which is added 8*8 times to the reduction. Before the patch we added `1` to the reduction value and added once `3`, though it must be added 64 times. Reviewers: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30262 llvm-svn: 295972	2017-02-23 13:37:09 +00:00
Alexey Bataev	04f7dbe68e	Revert "[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong" This reverts commit 7c5141e577d9efd1c8e3087566a38ce6b3a41a84. llvm-svn: 295957	2017-02-23 11:09:35 +00:00
Alexey Bataev	f86b9bd592	[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong result Summary: If the same value is used several times as an extra value, SLP vectorizer takes it into account only once instead of actual number of using. For example: ``` int val = 1; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { val = val + input[y * 8 + x] + 3; } } ``` We have 2 extra rguments: `1` - initial value of horizontal reduction and `3`, which is added 8*8 times to the reduction. Before the patch we added `1` to the reduction value and added once `3`, though it must be added 64 times. Reviewers: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30262 llvm-svn: 295956	2017-02-23 10:57:15 +00:00
Alexey Bataev	165908b766	Revert "[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong" This reverts commit d83c81ee6a8dea662808ac22b396d1bb0595c89d. llvm-svn: 295951	2017-02-23 09:59:29 +00:00
Alexey Bataev	06a8650a4b	[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong result Summary: If the same value is used several times as an extra value, SLP vectorizer takes it into account only once instead of actual number of using. For example: ``` int val = 1; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { val = val + input[y * 8 + x] + 3; } } ``` We have 2 extra rguments: `1` - initial value of horizontal reduction and `3`, which is added 8*8 times to the reduction. Before the patch we added `1` to the reduction value and added once `3`, though it must be added 64 times. Reviewers: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30262 llvm-svn: 295949	2017-02-23 09:40:38 +00:00
Michael Kuperstein	bb42bf14f7	Revert r295868 because it breaks a different SLP lit test. llvm-svn: 295906	2017-02-22 23:35:13 +00:00
Alexey Bataev	59c0b46177	[SLP] Fix for PR32036: Vectorized horizontal reduction returning wrong result Summary: If the same value is used several times as an extra value, SLP vectorizer takes it into account only once instead of actual number of using. For example: ``` int val = 1; for (int y = 0; y < 8; y++) { for (int x = 0; x < 8; x++) { val = val + input[y * 8 + x] + 3; } } ``` We have 2 extra rguments: `1` - initial value of horizontal reduction and `3`, which is added 8*8 times to the reduction. Before the patch we added `1` to the reduction value and added once `3`, though it must be added 64 times. Reviewers: mkuper, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30262 llvm-svn: 295868	2017-02-22 20:06:40 +00:00
Alexey Bataev	da1fd33d4e	[SLP] Remove unused initial value from the variable, NFC. llvm-svn: 295826	2017-02-22 12:57:58 +00:00
Alexey Bataev	195cd24304	[SLP] nullptr'ize initial value in `findBuildAggregate()`, NFC. Initial value of V is sett nullptr, as it is not used. llvm-svn: 295642	2017-02-20 08:04:11 +00:00
Alexey Bataev	97eecd7096	[SLP] Rework `findBuildAggregate()` from ercursive form to iterative, NFC. Reviewers: mkuper Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30103 llvm-svn: 295641	2017-02-20 07:49:39 +00:00
Alexey Bataev	eab1c078e4	[SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put back into a vector Previously the cost of the existing ExtractElement/ExtractValue instructions was considered as a dead cost only if it was detected that they have only one use. But these instructions may be considered dead also if users of the instructions are also going to be vectorized, like: ``` %x0 = extractelement <2 x float> %x, i32 0 %x1 = extractelement <2 x float> %x, i32 1 %x0x0 = fmul float %x0, %x0 %x1x1 = fmul float %x1, %x1 %add = fadd float %x0x0, %x1x1 ``` This can be transformed to ``` %1 = fmul <2 x float> %x, %x %2 = extractelement <2 x float> %1, i32 0 %3 = extractelement <2 x float> %1, i32 1 %add = fadd float %2, %3 ``` because though `%x0` and `%x1` have 2 users each other, these users are part of the vectorized tree and we can consider these `extractelement` instructions as dead. Differential Revision: https://reviews.llvm.org/D29900 llvm-svn: 295056	2017-02-14 15:20:48 +00:00
Alexey Bataev	89acda24b8	[SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. Currently, LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also, it supports a vectorization of instructions with some extra arguments, like: ``` float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } ``` Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D29727 llvm-svn: 294934	2017-02-13 08:01:26 +00:00
Michael Kuperstein	5951cf494c	[SLP] Revert "Allow using of extra values in horizontal reductions." This breaks when one of the extra values is also a scalar that participates in the same vectorization tree which we'll end up reducing. llvm-svn: 294245	2017-02-06 21:50:59 +00:00
Michael Kuperstein	f2f8127335	[SLP] Make sortMemAccesses explicitly return an error. NFC. llvm-svn: 294029	2017-02-03 19:32:50 +00:00
Alexey Bataev	dde66d4ab8	[SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. Currently LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also it supports a vectorization of instructions with some extra arguments, like: float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D28961 llvm-svn: 293994	2017-02-03 08:08:50 +00:00
Mohammad Shahid	a2646e67e7	[SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way. The jumbled scalar loads will be sorted while building the tree and these accesses will be marked to generate shufflevector after the vectorized load with proper mask. Reviewers: hfinkel, mssimpso, mkuper Differential Revision: https://reviews.llvm.org/D26905 Change-Id: I9c0c8e6f91a00076a7ee1465440a3f6ae092f7ad llvm-svn: 293386	2017-01-28 17:59:44 +00:00
Alexey Bataev	76d6416ffb	[SLP] Refactoring of horizontal reduction analysis, NFC. Some checks in SLP horizontal reduction analysis function are performed several times, though it is enough to perform these checks only once during an initial attempt at adding candidate for the reduction instruction/reduced value. Differential Revision: https://reviews.llvm.org/D29175 llvm-svn: 293274	2017-01-27 10:54:04 +00:00
Alexey Bataev	cb8becd477	[SLP] Improve horizontal vectorization for non-power-of-2 number of instructions. If number of instructions in horizontal reduction list is not power of 2 then only PowerOf2Floor(NumberOfInstructions) last elements are actually vectorized, other instructions remain scalar. Patch tries to vectorize the remaining elements either. Differential Revision: https://reviews.llvm.org/D28959 llvm-svn: 293042	2017-01-25 09:54:38 +00:00
Alexey Bataev	5239450fd0	[SLP] Refactoring of HorizontalReduction class, NFC. Removed data members ReduxWidth and MinVecRegSize + some C++11 stylish improvements. Differential Revision: https://reviews.llvm.org/D29010 llvm-svn: 292899	2017-01-24 08:57:17 +00:00
Michael Kuperstein	fe5c6f1abe	[SLP] Make ReductionOpcode have the right (enum) type. NFC. llvm-svn: 292703	2017-01-21 02:03:03 +00:00
Michael Kuperstein	88e863dd60	[SLP] Delete useless helper. NFC. The helper contained a branch for a special case that is unnecessary, and a cast. llvm-svn: 292698	2017-01-21 01:33:25 +00:00
Chandler Carruth	531a8d8a72	[PM] Introduce an analysis set used to preserve all analyses over a function's CFG when that CFG is unchanged. This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could. Differential Revision: https://reviews.llvm.org/D28627 llvm-svn: 292054	2017-01-15 06:32:49 +00:00
Eric Fiselier	8117e6a0a9	Give comparator const call operator llvm-svn: 292043	2017-01-15 02:06:44 +00:00
Michael Kuperstein	172556c384	[SLP] Remove bogus assert. The removed assert seems bogus - it's perfectly legal for the roots of the vectorized subtrees to be equal even if the original scalar values aren't, if the original scalars happen to be equivalent. This fixes PR31599. Differential Revision: https://reviews.llvm.org/D28539 llvm-svn: 291692	2017-01-11 19:23:57 +00:00
Daniel Jasper	162ffcacd6	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086	2016-12-19 08:22:17 +00:00
Hal Finkel	f224db75d2	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756	2016-12-15 03:02:15 +00:00
Matthew Simpson	1d43cda909	[SLP] Fix sign-extends for type-shrinking This patch ensures the correct minimum bit width during type-shrinking. Previously when type-shrinking, we always sign-extended values back to their original width. However, if we are going to sign-extend, and the sign bit is unknown, we have to increase the minimum bit width by one bit so the sign-extend will fill the upper bits correctly. If the sign bit is known to be zero, we can perform a zero-extend instead. This should fix PR31243. Reference: https://llvm.org/bugs/show_bug.cgi?id=31243 Differential Revision: https://reviews.llvm.org/D27466 llvm-svn: 289470	2016-12-12 21:11:04 +00:00
Alexey Bataev	3aadace193	[SLP] Fix for PR6246: vectorization for scalar ops on vector elements. When trying to vectorize trees that start at insertelement instructions function tryToVectorizeList() uses vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree cost for this fixed vectorization factor is too high. Patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one. Differential Revision: https://reviews.llvm.org/D27215 llvm-svn: 289043	2016-12-08 11:57:51 +00:00
Renato Golin	78b2b93486	Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements." This reverts commit r288497, as it broke the AArch64 build of Compiler-RT's builtins (twice: once in r288412 and once in r288497). We should investigate this offline. llvm-svn: 288508	2016-12-02 16:56:26 +00:00
Alexey Bataev	b767847ade	[SLP] Fix for PR6246: vectorization for scalar ops on vector elements. When trying to vectorize trees that start at insertelement instructions function tryToVectorizeList() uses vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree cost for this fixed vectorization factor is too high. Patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one. Differential Revision: https://reviews.llvm.org/D27215 llvm-svn: 288497	2016-12-02 12:20:22 +00:00
Artem Belevich	18e8ce875f	Revert "[SLP] Fix for PR6246: vectorization for scalar ops on vector elements." This reverts r288412 which causes severe compile-time regression. llvm-svn: 288431	2016-12-01 22:52:15 +00:00
Alexey Bataev	ec86a0f4e9	[SLP] Fix for PR6246: vectorization for scalar ops on vector elements. When trying to vectorize trees that start at insertelement instructions function tryToVectorizeList() uses vectorization factor calculated as MinVecRegSize/ScalarTypeSize. But sometimes it does not work as tree cost for this fixed vectorization factor is too high. Patch tries to improve the situation. It tries different vectorization factors from max(PowerOf2Floor(NumberOfVectorizedValues), MinVecRegSize/ScalarTypeSize) to MinVecRegSize/ScalarTypeSize and tries to choose the best one. Differential Revision: https://reviews.llvm.org/D27215 llvm-svn: 288412	2016-12-01 20:06:53 +00:00
Alexey Bataev	7135b5ad24	[SLP] Fixed cost model for horizontal reduction. Currently when cost of scalar operations is evaluated the vector type is used for scalar operations. Patch fixes this issue and fixes evaluation of the vector operations cost. Several test showed that vector cost model is too optimistic. It allowed vectorization of 8 or less add/fadd operations, though scalar code is faster. Actually, only for 16 or more operations vector code provides better performance. Differential Revision: https://reviews.llvm.org/D26277 llvm-svn: 288398	2016-12-01 18:42:42 +00:00
Alexey Bataev	62124599cb	[SLPVectorizer] Improved support of partial tree vectorization. Currently SLP vectorizer tries to vectorize a binary operation and dies immediately after unsuccessful the first unsuccessfull attempt. Patch tries to improve the situation, trying to vectorize all binary operations of all children nodes in the binop tree. Differential Revision: https://reviews.llvm.org/D25517 llvm-svn: 288115	2016-11-29 08:21:14 +00:00
Simon Pilgrim	e116136251	Fix spelling mistakes in Transforms comments. NFC. Identified by Pedro Giffuni in PR27636. llvm-svn: 287488	2016-11-20 13:19:49 +00:00
Vyacheslav Klochkov	90d5b65bd8	Fixed the lost FastMathFlags for CALL operations in SLPVectorizer. Reviewer: Michael Zolotukhin. Differential Revision: https://reviews.llvm.org/D26575 llvm-svn: 287064	2016-11-16 00:55:50 +00:00
Vyacheslav Klochkov	6d9c315fe7	Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer. Reviewer: Michael Zolotukhin. Differential Revision: https://reviews.llvm.org/D26543 llvm-svn: 286626	2016-11-11 19:55:29 +00:00
Alexey Bataev	b6c0260e86	[SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer. After successfull horizontal reduction vectorization attempt for PHI node vectorizer tries to update root binary op by combining vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node then can be used in new binary operation, causing "Use still stuck around after Def is destroyed" crash upon PHI node deletion. Also the test is fixed to make it perform actual testing. Differential Revision: https://reviews.llvm.org/D25671 llvm-svn: 285286	2016-10-27 12:02:28 +00:00
Alexey Bataev	0a23402e2a	[SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling. The next code is not vectorized by the SLPVectorizer: ``` int test(unsigned int *p) { int sum = 0; for (int i = 0; i < 8; i++) sum += p[i]; return sum; } ``` During optimization this loop is fully unrolled and SLPVectorizer is unable to vectorize it. Patch tries to fix this problem. Differential Revision: https://reviews.llvm.org/D24796 llvm-svn: 283535	2016-10-07 09:39:22 +00:00
Duncan P. N. Exon Smith	3dccc3d478	ADT: Do not inherit from std::iterator in ilist_iterator Inheriting from std::iterator uses more boiler-plate than manual typedefs. Avoid that in both ilist_iterator and MachineInstrBundleIterator. This has the side effect of removing ilist_iterator from certain ADL lookups in namespace std; calls to std::next need to be qualified by "std::" that didn't have to before. The one case of this in-tree was operating on a temporary, so I used the more compact operator++. llvm-svn: 280570	2016-09-03 02:27:35 +00:00
Chad Rosier	ac5b082a70	[SLP] Don't pass a global CL option as an argument. NFC. Differential Revision: https://reviews.llvm.org/D24199 llvm-svn: 280527	2016-09-02 19:09:50 +00:00
Chad Rosier	ce78968521	[SLP] Update the debug based on Michael's suggestion. Passing the types/opcode check still doesn't guarantee we'll actually vectorize. Therefore, just make it clear we're attempting to vectorize. llvm-svn: 280263	2016-08-31 17:41:12 +00:00

1 2 3 4 5 ...

370 Commits