llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-04-15 06:01:20 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	2f0b52e5f6	[SLP] getVectorElementSize and isTreeTinyAndNotFullyVectorizable are const methods. NFCI. llvm-svn: 357416	2019-04-01 17:48:03 +00:00
Simon Pilgrim	ddbded2c44	[SLP] getGatherCost and isFullyVectorizableTinyTree are const methods. NFCI. llvm-svn: 357414	2019-04-01 17:32:46 +00:00
Simon Pilgrim	8a9486eaf0	[SLP] Add support for commutative icmp/fcmp predicates For the cases where the icmp/fcmp predicate is commutative, use reorderInputsAccordingToOpcode to collect and commute the operands. This requires a helper to recognise commutativity in both general Instruction and CmpInstr types - the CmpInst::isCommutative doesn't overload the Instruction::isCommutative method for reasons I'm not clear on (maybe because its based on predicate not opcode?!?). Differential Revision: https://reviews.llvm.org/D59992 llvm-svn: 357266	2019-03-29 15:28:25 +00:00
Simon Pilgrim	73b9c02f2c	[SLP] Add support for swapping icmp/fcmp predicates to permit vectorization We should be able to match elements with the swapped predicate as well - as long as we commute the source operands. Differential Revision: https://reviews.llvm.org/D59956 llvm-svn: 357243	2019-03-29 10:41:00 +00:00
Simon Pilgrim	6b2a982e08	[SLPVectorizer] Merge reorderAltShuffleOperands into reorderInputsAccordingToOpcode As discussed on D59738, this generalizes reorderInputsAccordingToOpcode to handle multiple + non-commutative instructions so we can get rid of reorderAltShuffleOperands and make use of the extra canonicalizations that reorderInputsAccordingToOpcode brings. Differential Revision: https://reviews.llvm.org/D59784 llvm-svn: 356939	2019-03-25 20:05:27 +00:00
Simon Pilgrim	b3aca7fab1	[SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction canonicalization Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops. This is prep work towards: (a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands (b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops. Differential Revision: https://reviews.llvm.org/D59738 llvm-svn: 356913	2019-03-25 15:53:55 +00:00
Simon Pilgrim	abd5dedc77	[SLPVectorizer] shouldReorderOperands - just check for reordering. NFCI. Remove the I.getOperand() calls from inside shouldReorderOperands - reorderInputsAccordingToOpcode should handle the creation of the operand lists and shouldReorderOperands should just check to see whether the i'th element should be commuted. llvm-svn: 356854	2019-03-24 13:36:32 +00:00
Simon Pilgrim	5676909b87	Fix unused variable warning on non-asserts builds. NFCI. llvm-svn: 356841	2019-03-23 16:56:23 +00:00
Simon Pilgrim	16cc343a0c	Remove unused function argument. NFCI. llvm-svn: 356840	2019-03-23 16:20:34 +00:00
Simon Pilgrim	63fb2c71e4	[SLPVectorizer] reorderInputsAccordingToOpcode - use InstructionState directly. NFCI. llvm-svn: 356832	2019-03-23 13:44:06 +00:00
Simon Pilgrim	63dafb2876	[SLPVectorizer] Don't repeat VL.size() call. NFCI. llvm-svn: 356830	2019-03-23 12:11:25 +00:00
Simon Pilgrim	fff90b95eb	[SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 llvm-svn: 356814	2019-03-22 21:27:11 +00:00
Simon Pilgrim	6746a9b58a	Revert rL355906: [SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 ........ Reverted due to buildbot failures that I don't have time to track down. llvm-svn: 355913	2019-03-12 11:51:59 +00:00
Simon Pilgrim	7351169b59	Try to fix SLPVectorizer BoUpSLP::BoEdgeInfo::dump visibility on non-debug builds llvm-svn: 355912	2019-03-12 11:31:06 +00:00
Simon Pilgrim	fcc980a913	[SLP] Remove redundancy of performing operand reordering twice: once in buildTree() and later in vectorizeTree(). This is a refactoring patch that removes the redundancy of performing operand reordering twice, once in buildTree() and later in vectorizeTree(). To achieve this we need to keep track of the operands within the TreeEntry struct while building the tree, and later in vectorizeTree() we are just accessing them from the TreeEntry in the right order. This patch is the first in a series of patches that will allow for better operand reordering across chains of instructions (e.g., a chain of ADDs), as presented here: https://www.youtube.com/watch?v=gIEn34LvyNo Patch by: @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59059 llvm-svn: 355906	2019-03-12 10:51:51 +00:00
Sanjoy Das	c6d3c3a4b5	Reland "Relax constraints for reduction vectorization" Change from original commit: move test (that uses an X86 triple) into the X86 subdirectory. Original description: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355889	2019-03-12 01:31:44 +00:00
Sanjoy Das	244bc57544	Revert "Relax constraints for reduction vectorization" This reverts commit r355868. Breaks hexagon. llvm-svn: 355873	2019-03-11 22:37:31 +00:00
Sanjoy Das	367bdd4c9b	Relax constraints for reduction vectorization Summary: Gating vectorizing reductions on all fastmath flags seems unnecessary; `reassoc` should be sufficient. Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal Reviewed By: sdesmalen Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57728 llvm-svn: 355868	2019-03-11 21:36:41 +00:00
Simon Pilgrim	bbece4ffe7	[Vectorizer] Add vectorization support for fixed smul/umul intrinsics This requires a couple of tweaks to existing vectorization functions as they were assuming that only the second call argument (ctlz/cttz/powi) could ever be the 'always scalar' argument, but for smul.fix + umul.fix its the third argument. Differential Revision: https://reviews.llvm.org/D58616 llvm-svn: 354790	2019-02-25 15:42:02 +00:00
James Y Knight	c8b30de05f	[opaque pointer types] Pass value type to LoadInst creation. This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911	2019-02-01 20:44:24 +00:00
Yevgeny Rouban	93a3601da5	[SLPVectorizer] Get rid of IndexQueue array from vectorizeStores. NFCI. Indices are checked as they are generated. No need to fill the whole array of indices. Differential Revision: https://reviews.llvm.org/D57144 llvm-svn: 352839	2019-02-01 06:44:08 +00:00
Chandler Carruth	ae65e281f3	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Alexey Bataev	4fa42a0c79	[SLP] Fix PR40310: The reduction nodes should stay scalar. Summary: Sometimes the SLP vectorizer tries to vectorize the horizontal reduction nodes during regular vectorization. This may happen inside of the loops, when there are some vectorizable PHIs. Patch fixes this by checking if the node is the reduction node and thus it must not be vectorized, it must be gathered. Reviewers: RKSimon, spatel, hfinkel, fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56783 llvm-svn: 351349	2019-01-16 15:39:52 +00:00
Anton Afanasyev	d3ca8c67ce	Test commit Fix typos. llvm-svn: 349644	2018-12-19 17:18:40 +00:00
Alexey Bataev	5e0205ab38	[SLP]PR39774: Update references of the replaced external instructions. Summary: An additional fix for PR39774. Need to update the references for the ReductionRoot instruction when it is replaced during the vectorization phase to avoid a compiler crash on reduction vectorization. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55017 llvm-svn: 347997	2018-11-30 15:14:20 +00:00
Alexey Bataev	27daa0afed	[SLP]Fix PR39774: Set ReductionRoot if the original instruction is vectorized. Summary: If the original reduction root instruction was vectorized, it might be removed from the tree. It means that the insertion point may become invalidated and the whole vectorization of the reduction leads to the incorrect output result. The ReductionRoot instruction must be marked as externally used so it could not be removed. Otherwise it might cause inconsistency with the cost model and we may end up with too optimistic optimization. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54955 llvm-svn: 347759	2018-11-28 14:34:11 +00:00
Simon Pilgrim	775ebccf6b	[SLPVectorizer] Add basic support for mul/and/or/xor horizontal reductions Expand arithmetic reduction to include mul/and/or/xor instructions. This patch just fixes the SLPVectorizer - the effective reduction costs for AVX1+ are still poor (see rL344846) and will need to be improved before SLP sees this as a valid transform - but we can already see the effect on SSE2 tests. This partially helps PR37731, but doesn't fix it all as it still falls over on the extraction/reduction order for some reason. Differential Revision: https://reviews.llvm.org/D53473 llvm-svn: 345037	2018-10-23 15:13:09 +00:00
Chandler Carruth	cdfd07538f	[TI removal] Make `getTerminator()` return a generic `Instruction`. This removes the primary remaining API producing `TerminatorInst` which will reduce the rate at which code is introduced trying to use it and generally make it much easier to remove the remaining APIs across the codebase. Also clean up some of the stragglers that the previous mechanical update of variables missed. Users of LLVM and out-of-tree code generally will need to update any explicit variable types to handle this. Replacing `TerminatorInst` with `Instruction` (or `auto`) almost always works. Most of these edits were made in prior commits using the perl one-liner: ``` perl -i -ple 's/TerminatorInst(\b.* = .*getTerminator)/Instruction\1/g' ``` This may also break some rare use cases where people overload for both `Instruction` and `TerminatorInst`, but these should be easily fixed by removing the `TerminatorInst` overload. llvm-svn: 344504	2018-10-15 10:42:50 +00:00
Matt Arsenault	a7d0550d67	SLPVectorizer: Fix assert with different sized address spaces llvm-svn: 341215	2018-08-31 14:34:53 +00:00
Alexey Bataev	cfab433847	[SLP] Fix insert point for reused extract instructions. Summary: Reworked the previously committed patch to insert shuffles for reused extract element instructions in the correct position. Previous logic was incorrect, and might lead to the crash with PHIs and EH instructions. Reviewers: efriedma, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50143 llvm-svn: 339166	2018-08-07 19:21:05 +00:00
Alexey Bataev	97ffeb4b92	[SLP] Fix PR38339: Instruction does not dominate all uses! Summary: If the ExtractElement instructions can be optimized out during the vectorization and we need to reshuffle the parent vector, this ShuffleInstruction may be inserted in the wrong place causing compiler to produce incorrect code. Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D49928 llvm-svn: 338380	2018-07-31 14:02:43 +00:00
Fangrui Song	121474a01b	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
Simon Pilgrim	0e1947ac6c	[SLPVectorizer] Avoid duplicate scalar cost calculations in BoUpSLP::getEntryCost. NFCI. Pulled out from D49225, we have a lot of repeated scalar cost calculations, often with arguments that don't look the same but turn out to be. llvm-svn: 337390	2018-07-18 13:53:55 +00:00
Simon Pilgrim	dfc2294680	[SLPVectorizer] Don't attempt horizontal reduction on pointer types (PR38191) TTI::getMinMaxReductionCost typically can't handle pointer types - until this is changed its better to limit horizontal reduction to integer/float vector types only. llvm-svn: 337280	2018-07-17 13:43:33 +00:00
Simon Pilgrim	b9a35123dc	[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED-2) We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism: 1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly. 2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this. 3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc. 4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements. Reapplied with fix to only accept 2 different casts if they come from the same source type (PR38154). Differential Revision: https://reviews.llvm.org/D49135 llvm-svn: 336989	2018-07-13 11:09:52 +00:00
Martin Storsjo	6000ce8df0	Revert "[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)" This reverts commit r336812, which broke compilation of a number of projects, see PR38154. llvm-svn: 336949	2018-07-12 21:33:42 +00:00
Simon Pilgrim	c007d602eb	[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED) We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism: 1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly. 2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this. 3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc. 4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements. Reapplied with fix to only accept 2 different casts if they come from the same source type. Differential Revision: https://reviews.llvm.org/D49135 llvm-svn: 336812	2018-07-11 15:05:10 +00:00
Simon Pilgrim	4fd0b950a3	Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for cast instructions. Reverting due to buildbot failures llvm-svn: 336806	2018-07-11 14:08:16 +00:00
Simon Pilgrim	17f835882b	[SLPVectorizer] Add initial alternate opcode support for cast instructions. We currently only support binary instructions in the alternate opcode shuffles. This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism: 1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly. 2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this. 3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc. 4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements. Differential Revision: https://reviews.llvm.org/D49135 llvm-svn: 336804	2018-07-11 13:34:09 +00:00
Simon Pilgrim	f4a0972396	[SLPVectorizer] Begin abstracting InstructionsState alternate matching away from opcodes. NFCI. This is an early step towards matching Instructions by attributes other than the opcode. This will be necessary for cast/call alternates which share the same opcode but have different types/intrinsicIDs etc. - which we could vectorize as long as we split them using the alternate mechanism. Differential Revision: https://reviews.llvm.org/D48945 llvm-svn: 336344	2018-07-05 12:30:44 +00:00
Simon Pilgrim	d7c9b3a70c	Fix some irregular whitespace/indentation. NFCI. llvm-svn: 336291	2018-07-04 17:24:05 +00:00
Farhana Aleen	d3e3e16e60	[SLP] Recognize min/max pattern using instructions producing same values. Summary: It is common to have the following min/max pattern during the intermediate stages of SLP since we only optimize at the end. This patch tries to catch such patterns and allow more vectorization. %1 = extractelement <2 x i32> %a, i32 0 %2 = extractelement <2 x i32> %a, i32 1 %cond = icmp sgt i32 %1, %2 %3 = extractelement <2 x i32> %a, i32 0 %4 = extractelement <2 x i32> %a, i32 1 %select = select i1 %cond, i32 %3, i32 %4 Author: FarhanaAleen Reviewed By: ABataev, RKSimon, spatel Differential Revision: https://reviews.llvm.org/D47608 llvm-svn: 336130	2018-07-02 17:55:31 +00:00
Simon Pilgrim	bd73b34751	[SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure. llvm-svn: 336102	2018-07-02 13:41:29 +00:00
Simon Pilgrim	ffa8d2ee7c	[SLPVectorizer] Fix alternate opcode + shuffle cost function to correctly handle SK_Select patterns. We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case. This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now... llvm-svn: 336095	2018-07-02 11:28:01 +00:00
Simon Pilgrim	a0b4decfe6	[SLPVectorizer] Only Alternate opcodes use ShuffleVector cases for getEntryCost/vectorizeTree. NFCI. Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators. llvm-svn: 336092	2018-07-02 10:54:19 +00:00
Simon Pilgrim	978f4e7f3d	[SLPVectorizer] Call InstructionsState.isOpcodeOrAlt with Instruction instead of an opcode. NFCI. llvm-svn: 336069	2018-07-01 20:22:46 +00:00
Simon Pilgrim	4016f678d7	[SLPVectorizer] Replace sameOpcodeOrAlt with InstructionsState.isOpcodeOrAlt helper. NFCI. This is a basic step towards matching more general instructions types than just opcodes. llvm-svn: 336068	2018-07-01 20:07:30 +00:00
Simon Pilgrim	1b899b8559	[SLPVectorizer] Use InstructionsState Op/Alt opcodes directly. NFCI. llvm-svn: 336063	2018-07-01 13:41:58 +00:00
Simon Pilgrim	2c643d6425	[SLPVectorizer] Recognise non uniform power of 2 constants Since D46637 we are better at handling uniform/non-uniform constant Pow2 detection; this patch tweaks the SLP argument handling to support them. As SLP works with arrays of values I don't think we can easily use the pattern match helpers here. Differential Revision: https://reviews.llvm.org/D48214 llvm-svn: 335621	2018-06-26 16:20:16 +00:00
Simon Pilgrim	dab1220353	[SLPVectorizer] Support alternate opcodes in tryToVectorizeList Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree. NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc. Differential Revision: https://reviews.llvm.org/D48488 llvm-svn: 335364	2018-06-22 16:37:34 +00:00


561 Commits