archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Mark Searles	dfc7fb5622	Revert "AMDGPU: Split block for si_end_cf" This reverts commit `7a6ef30046`. We discovered some internal test failures, so reverting for now. Differential Revision: https://reviews.llvm.org/D61213 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359363 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-27 00:51:18 +00:00
Stanislav Mekhanoshin	09f8a0f6a0	[AMDGPU] gfx1010 VOP3 and VOP3P implementation Differential Revision: https://reviews.llvm.org/D61202 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359328 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-26 17:56:03 +00:00
Stanislav Mekhanoshin	834873d34d	[AMDGPU] gfx1010 VOP2 changes Differential Revision: https://reviews.llvm.org/D61156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359316 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-26 16:37:51 +00:00
Stanislav Mekhanoshin	f43d543c45	[AMDGPU] Add gfx1010 target definitions Differential Revision: https://reviews.llvm.org/D61041 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359113 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-24 17:03:15 +00:00
Stanislav Mekhanoshin	b4fad1ffbb	[AMDGPU] Fixed addReg() in SIOptimizeExecMaskingPreRA.cpp The second argument is flags, not subreg. Differential Revision: https://reviews.llvm.org/D61031 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359017 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-23 17:59:26 +00:00
Scott Linder	5bc7b0c856	[AMDGPU] Fix hidden argument metadata duplication for V3 Essentially complete a proper rebase of the V3 metadata change over https://reviews.llvm.org/D49096. Minimize the diff between the V2 and V3 variants of the relevant lit tests, and clean up some trailing whitespace. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358992 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-23 14:31:17 +00:00
Nicolai Haehnle	f1c7e10a44	AMDGPU: Fix LCSSA phi lowering in SILowerI1Copies Summary: When an LCSSA phi survives through instruction selection, the pass ends up removing that phi entirely because it is dominated by the logic that does the lanemask merging. This then used to trigger an assertion when processing a dependent phi instruction. Change-Id: Id4949719f8298062fe476a25718acccc109113b6 Reviewers: llvm-commits Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, tpr, dstuttard, rtaylor, arsenm Tags: #llvm Differential Revision: https://reviews.llvm.org/D60999 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358983 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-23 13:12:52 +00:00
Bjorn Pettersson	734bf57ca8	[DAGCombiner] Combine OR as ADD when no common bits are set Summary: The DAGCombiner is rewriting (canonicalizing) an ISD::ADD with no common bits set in the operands as an ISD::OR node. This could sometimes result in "missing out" on some combines that normally are performed for ADD. To be more specific this could happen if we already have rewritten an ADD into OR, and later (after legalizations or combines) we expose patterns that could have been optimized if we had seen the OR as an ADD (e.g. reassociations based on ADD). To make the DAG combiner less sensitive to if ADD or OR is used for these "no common bits set" ADD/OR operations we now apply most of the ADD combines also to an OR operation, when value tracking indicates that the operands have no common bits set. Reviewers: spatel, RKSimon, craig.topper, kparzysz Reviewed By: spatel Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59758 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358965 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-23 10:01:08 +00:00
Michael Liao	77d3eb784a	[AMDGPU] Fix an issue in `op_sel_hi` skipping. Summary: - Only apply packed literal `op_sel_hi` skipping on operands requiring packed literals. Even if an instruction is `packed`, it may have an operand requiring a non-packed literal, such as `v_dot2_f32_f16`. Reviewers: rampitec, arsenm, kzhuravl Subscribers: jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60978 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358922 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-22 22:05:49 +00:00
Matt Arsenault	22846997f5	AMDGPU: Skip debug instructions in assert These are inserted after branch relaxation, and for some reason it's decided to put them in the long branch expansion block. It's probably not great to rely on the source block address, so this should probably be switched to being PC relative instead of relying on the block address git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358909 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-22 19:14:26 +00:00
Matt Arsenault	4d1518d22a	AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sources git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358894 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-22 15:22:46 +00:00
Matt Arsenault	3a3f73cffa	GlobalISel: Legalize scalar G_EXTRACT sources git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358892 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-22 15:10:42 +00:00
Simon Pilgrim	524491cf14	[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly. The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGCombine but it caused a lot of noise on other targets - some improvements, some regressions. The X86 changes are all definite wins. Differential Revision: https://reviews.llvm.org/D60462 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358887 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-22 14:04:35 +00:00
Simon Pilgrim	101d574bdb	[AMDGPU] Regenerate uitofp i8 to float conversion tests. Prep work for D60462 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358879 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-22 10:19:09 +00:00
Simon Pilgrim	2df0acd2ff	[AMDGPU] Regenerate extractelt->truncate test. Prep work for D60462 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358746 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-19 09:49:04 +00:00
Tim Renouf	fed299e82d	[AMDGPU] Avoid DAG combining assert with fneg(fadd(A,0)) fneg combining attempts to turn it into fadd(fneg(A), fneg(0)), but creating the new fadd folds to just fneg(A). When A has multiple uses, this confuses it and you get an assert. Fixed. Differential Revision: https://reviews.llvm.org/D60633 Change-Id: I0ddc9b7286abe78edc0cd8d734fdeb05ff09821c git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358640 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-18 05:27:01 +00:00
Rhys Perry	788a809b3c	AMDGPU: Force skip over SMRD, VMEM and s_waitcnt instructions Summary: This fixes a large Dawn of War 3 performance regression with RADV from Mesa 19.0 to master which was caused by creating less code in some branches. Reviewers: arsen, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358592 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-17 16:31:52 +00:00
Matt Arsenault	91b4e08b81	AMDGPU: Fix unreachable when counting register usage of SGPR96 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358447 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-15 20:51:12 +00:00
Matt Arsenault	c468ee9242	AMDGPU: Fix printed format of SReg_96 These are artificial, so I think this should only come up with inline asm comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358446 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-15 20:42:18 +00:00
Amara Emerson	040f61e117	[GlobalISel] Enable CSE in the IRTranslator & legalizer for -O0 with constants only. Other opcodes shouldn't be CSE'd until we can be sure debug info quality won't be degraded. This change also improves the IRTranslator so that in most places, but not all, it creates constants using the MIRBuilder directly instead of first creating a new destination vreg and then creating a constant. By doing this, the buildConstant() method can just return the vreg of an existing G_CONSTANT instead of having to create a COPY from it. I measured a 0.2% improvement in compile time and a 0.9% improvement in code size at -O0 ARM64. Compile time: Program base cse diff test-suite...ark/tramp3d-v4/tramp3d-v4.test 9.04 9.12 0.8% test-suite...Mark/mafft/pairlocalalign.test 2.68 2.66 -0.7% test-suite...-typeset/consumer-typeset.test 5.53 5.51 -0.4% test-suite :: CTMark/lencod/lencod.test 5.30 5.28 -0.3% test-suite :: CTMark/Bullet/bullet.test 25.82 25.76 -0.2% test-suite...:: CTMark/ClamAV/clamscan.test 6.92 6.90 -0.2% test-suite...TMark/7zip/7zip-benchmark.test 34.24 34.17 -0.2% test-suite :: CTMark/SPASS/SPASS.test 6.25 6.24 -0.1% test-suite...:: CTMark/sqlite3/sqlite3.test 1.66 1.66 -0.1% test-suite :: CTMark/kimwitu++/kc.test 13.61 13.60 -0.0% Geomean difference -0.2% Code size: Program base cse diff test-suite...-typeset/consumer-typeset.test 1315632 1266480 -3.7% test-suite...:: CTMark/ClamAV/clamscan.test 1313892 1297508 -1.2% test-suite :: CTMark/lencod/lencod.test 1439504 1423112 -1.1% test-suite...TMark/7zip/7zip-benchmark.test 2936980 2904172 -1.1% test-suite :: CTMark/Bullet/bullet.test 3478276 3445460 -0.9% test-suite...ark/tramp3d-v4/tramp3d-v4.test 8082868 8033492 -0.6% test-suite :: CTMark/kimwitu++/kc.test 3870380 3853972 -0.4% test-suite :: CTMark/SPASS/SPASS.test 1434904 1434896 -0.0% test-suite...Mark/mafft/pairlocalalign.test 764528 764528 0.0% test-suite...:: CTMark/sqlite3/sqlite3.test 782092 782092 0.0% Geomean difference -0.9% Differential Revision: https://reviews.llvm.org/D60580 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358369 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-15 05:04:20 +00:00
David Green	45a375eb6b	Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that. I will try to follow this up with some better tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358113 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 18:00:41 +00:00
Matt Arsenault	9b41947020	GlobalISel: Support legalizing G_CONSTANT with irregular breakdown git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358109 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 17:27:53 +00:00
Matt Arsenault	1963c0bccb	GlobalISel: Handle odd breakdowns for bit ops git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358105 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 17:07:56 +00:00
Stanislav Mekhanoshin	f53ab9d6ab	Revert LIS handling in MachineDCE One of the out-of-tree targets has regressed with this patch. Reverting it for now and letting liveness be fully reconstructed in case the pass was used after the LIS is created, to resolve the regression. Differential Revision: https://reviews.llvm.org/D60466 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358015 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-09 16:13:53 +00:00
Tom Stellard	0f50aefa77	AMDGPU/GlobalISel: Implement call lowering for shaders returning values Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits Differential Revision: https://reviews.llvm.org/D57166 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357964 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-09 02:26:03 +00:00
Nikita Popov	cdee3b7f23	Reapply [ValueTracking] Support min/max selects in computeConstantRange() Add support for min/max flavor selects in computeConstantRange(), which allows us to fold comparisons of a min/max against a constant in InstSimplify. This fixes an infinite InstCombine loop, with the test case taken from D59378. Relative to the previous iteration, this contains some adjustments for AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen, which ends up constant folding some existing med3 tests after this change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option is added, which allows disabling scalar IR passes (that use InstSimplify) for testing purposes. Differential Revision: https://reviews.llvm.org/D59506 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357870 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-07 17:22:16 +00:00
Stanislav Mekhanoshin	5d206b6327	[AMDGPU] Add MachineDCE pass after RenameIndependentSubregs Detect dead lanes can create some dead defs. Then RenameIndependentSubregs will break a REG_SEQUENCE which may use these dead defs. At this point a dead instruction can be removed but we do not run a DCE anymore. MachineDCE was only running before live variable analysis. The patch adds a means to preserve LiveIntervals and SlotIndexes in case it works past this. Differential Revision: https://reviews.llvm.org/D59626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357805 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 20:11:32 +00:00
Matt Arsenault	6c7dd5967a	AMDGPU/GlobalISel: Fix non-power-of-2 select git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357762 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 14:03:04 +00:00
Piotr Sobczak	959b42493f	[SelectionDAG] Compute known bits of CopyFromReg Summary: Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if the virtual reg used has one def only. This can be particularly useful when calling isBaseWithConstantOffset() with the ISD::CopyFromReg argument, as more optimizations may get enabled in the result. Also add a missing truncation on X86, found by testing of this patch. Change-Id: Id1c9fceec862d118c54a5b53adf72ada5d6daefa Reviewers: bogner, craig.topper, RKSimon Reviewed By: RKSimon Subscribers: lebedev.ri, nemanjai, jvesely, nhaehnle, javed.absar, jsji, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59535 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357745 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 07:44:09 +00:00
Matt Arsenault	7a6ef30046	AMDGPU: Split block for si_end_cf Relying on no spill or other code being inserted before this was precarious. It relied on code diligently checking isBasicBlockPrologue which is likely to be forgotten. Ideally this could be done earlier, but this doesn't work because of phis. Any other instruction can't be placed before them, so we have to accept the position being incorrect during SSA. This avoids regressions in the fast register allocator rewrite from inverting the direction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357634 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-03 20:53:20 +00:00
Matt Arsenault	cd3a8dda10	AMDGPU: Assume ECC is enabled by default if supported The test should really be checking for the property directly in the code object headers, but there are problems with this. I don't see this directly represented in the text form, and for the binary emission this is depending on a function level subtarget feature to emit a global flag. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357558 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-03 01:58:57 +00:00
Matt Arsenault	d833ca999e	AMDGPU: Don't use the default cpu in a few tests Avoids unnecessary test changes in a future commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357539 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-03 00:00:58 +00:00
Michael Liao	c23ae47bcb	[AMDGPU] Add more test cases of D59608. Summary: - Add more test cases. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60071 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357442 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-02 00:36:37 +00:00
Neil Henning	fcc236c268	[AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure. This change incorporates an effort by Connor Abbot to change how we deal with WWM operations potentially trashing valid values in inactive lanes. Previously, the SIFixWWMLiveness pass would work out which registers were being trashed within WWM regions, and ensure that the register allocator did not have any values it was depending on resident in those registers if the WWM section would trash them. This worked perfectly well, but would cause sometimes severe register pressure when the WWM section resided before divergent control flow (or at least that is where I mostly observed it). This fix instead runs through the WWM sections and pre allocates some registers for WWM. It then reserves these registers so that the register allocator cannot use them. This results in a significant register saving on some WWM shaders I'm working with (130 -> 104 VGPRs, with just this change!). Differential Revision: https://reviews.llvm.org/D59295 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357400 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-01 15:19:52 +00:00
Matt Arsenault	234b3a117e	AMDGPU: Remove dx10-clamp from subtarget features Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without defining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357302 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 19:14:54 +00:00
Nirav Dave	d3c5ebd041	[DAGCombine] Prune unused nodes. Summary: Nodes that have no uses are eventually pruned when they are selected from the worklist. Record nodes newly added to the worklist or DAG and perform pruning after every combine attempt. Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight Reviewed By: jyknight Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58070 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357283 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 17:35:56 +00:00
Konstantin Zhuravlyov	142ef796c0	AMDGPU: Make sram-ecc off by default for Vega20 Differential Revision: https://reviews.llvm.org/D59718 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357247 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 12:04:18 +00:00
Matt Arsenault	768023a783	AMDGPU/GlobalISel: Insert waterfall loop for vector indexing The register index can only really be an SGPR. Lie that a VGPR index is legal, and then rewrite the instruction in a waterfall loop to handle the index. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357235 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 03:54:56 +00:00
Matt Arsenault	a88dcdbff2	AMDGPU: Make exec mask optimizations more resistant to block splits Also improve the check for SALU instructions to also ignore implicit_def and other fake instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357170 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-28 14:01:39 +00:00
Piotr Sobczak	b6bb254aa4	[SelectionDAG] Add 2 tests for selection across basic blocks Summary: Add tests for selection across basic block boundary: * one test containing a buffer load, where part of the offset computation is placed in the predecessor of the load * similar test, but containing two buffer loads and shared computations Please note that the behaviour being tested will be updated in a subsequent commit. This commit was extracted from https://reviews.llvm.org/D59535. Reviewers: RKSimon Reviewed By: RKSimon Subscribers: jvesely, nhaehnle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59690 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357149 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-28 07:06:26 +00:00
Justin Bogner	c466f32ecf	[LegalizeVectorTypes] Allow single loads and stores for more short vectors When lowering a load or store for TypeWidenVector, the type legalizer would use a single load or store if the associated integer type was legal or promoted. E.g. it loads a v4i8 as an i32 if i32 is legal/promotable. (See https://reviews.llvm.org/rL236528 for reference.) This applies that behaviour to vector types. If the vector type is TypePromoteInteger, the element type is going to be TypePromoteInteger as well, which will lead to have a single promoting load rather than N individual promoting loads. For instance, if we have a v3i1, we would now have a load of v4i1 instead of 3 loads of i1. Patch by Guillaume Marques. Thanks! Differential Revision: https://reviews.llvm.org/D56201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357120 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 20:35:56 +00:00
Matt Arsenault	53202c3a19	RegPressure: Fix crash on blocks with only dbg_value If there were only dbg_values in the block, recede would hit the beginning of the block and try to use that dbg_value as a real instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357105 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 18:14:02 +00:00
Amara Emerson	a3e702346e	[GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357101 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:47:42 +00:00
Matt Arsenault	50359ffe2c	Reapply "AMDGPU: Scavenge register instead of findUnusedReg" This reapplies r356149, using the correct overload of findUnusedReg which passes the current iterator. This worked most of the time, because the scavenger iterator was moved at the end of the frame index loop in PEI. This would fail if the spill was the first instruction. This was further hidden by the fact that the scavenger wasn't passed in for normal frame index elimination. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357098 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:31:29 +00:00
Matt Arsenault	5503f81d14	AMDGPU: Add testcase I meant to merge into r357093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357097 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:31:26 +00:00
Quentin Colombet	5a4d441633	[PeepholeOpt] Don't stop simplifying copies on sequence of subregs This patch removes an overly conservative check that would prevent simplifying copies when the value we were tracking would go through several subregister indices. Indeed, the intent of this check was to not track values whenever we have to compose subregisters, but actually what the check was doing was bailing anytime we see a second subreg, even if that second subreg would actually be the new source of truth (as opposed to a part of that subreg). Differential Revision: https://reviews.llvm.org/D59891 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357095 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:27:56 +00:00
Matt Arsenault	0755a8d19c	AMDGPU: Enable the scavenger for large frames Another test is needed for the case where the scavenge fails, but there's another issue with that which needs an additional fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357093 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 17:14:32 +00:00
Matt Arsenault	44ee20512e	AMDGPU: Add additional MIR tests for exec mask optimizations Also includes one example of how this transform is unsound. This isn't verifying the copies are used in the control flow intrinsic patterns. Also add option to disable exec mask opt pass. Since this pass is unsound, it may be useful to turn it off until it is fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357091 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:58:30 +00:00
Matt Arsenault	99267d8b98	AMDGPU: Skip debug_instr when collapsing end_cf Based on how these are inserted, I doubt this was causing a problem in practice. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357090 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:58:27 +00:00
Matt Arsenault	502b049dcb	AMDGPU: Fix missing scc implicit def on s_andn2_b64_term Introduce new helper class to copy properties directly from the base instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357089 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-27 16:58:22 +00:00


2212 Commits