archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Tom Stellard	adf1294087	Merging r266825: ------------------------------------------------------------------------ r266825 \| nhaehnle \| 2016-04-19 14:58:22 -0700 (Tue, 19 Apr 2016) \| 12 lines AMDGPU: Guard VOPC instructions against incorrect commute Summary: The added testcase, which triggered this, was derived from a shader-db case via bugpoint. A separate question is why scalar branching wasn't used. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19208 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271767 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-04 03:43:02 +00:00
Tom Stellard	0afb7d7e71	Merging r266152: ------------------------------------------------------------------------ r266152 \| thomas.stellard \| 2016-04-12 16:57:30 -0700 (Tue, 12 Apr 2016) \| 13 lines AMDGPU/SI: Fix spilling of 96-bit registers Summary: It seems like this was broken in r252327. I thought we had test cases for this, but it's really hard to trigger spills of this exact register size since they aren't used very much. Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19021 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271735 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 20:48:40 +00:00
Tom Stellard	47bf9db963	Merging r262732: ------------------------------------------------------------------------ r262732 \| thomas.stellard \| 2016-03-04 10:31:18 -0800 (Fri, 04 Mar 2016) \| 12 lines AMDGPU/SI: Add support for spilling SGPRs to scratch buffer Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271722 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 20:22:44 +00:00
Tom Stellard	d2741da1cd	Merging r261385: ------------------------------------------------------------------------ r261385 \| thomas.stellard \| 2016-02-19 16:37:25 -0800 (Fri, 19 Feb 2016) \| 20 lines AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271700 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 18:16:01 +00:00
Tom Stellard	737edaf048	Merging r260651: ------------------------------------------------------------------------ r260651 \| Matthew.Arsenault \| 2016-02-11 18:40:47 -0800 (Thu, 11 Feb 2016) \| 7 lines AMDGPU: Set element_size in private resource descriptor Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271679 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 15:58:20 +00:00
Tom Stellard	d1c65e8935	Merging r260599: ------------------------------------------------------------------------ r260599 \| thomas.stellard \| 2016-02-11 13:45:07 -0800 (Thu, 11 Feb 2016) \| 14 lines AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRs Summary: It's possible to have resource descriptors and samplers stored in VGPRs, either by a VMEM instruction or in the case of samplers, floating-point calculations. When this happens, we need to use v_readfirstlane to copy these values back to sgprs. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17102 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271642 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 09:50:12 +00:00
Tom Stellard	ab4b667eea	Merging r260588: ------------------------------------------------------------------------ r260588 \| thomas.stellard \| 2016-02-11 13:14:34 -0800 (Thu, 11 Feb 2016) \| 20 lines AMDGPU/SI: When splitting SMRD instructions, add its users to VALU worklist Summary: When we split SMRD instructions into two MUBUFs we were adding the users of the newly created MUBUFs to the VALU worklist. However, the only users these instructions had was the REG_SEQUENCE that was inserted by splitSMRD when the original SMRD instruction was split. We need to make sure to add the users of the original SMRD to the VALU worklist before it is split. I have a test case, but it requires one other bug fix, so it will be added in a later commit. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17101 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271641 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 09:50:09 +00:00
Tom Stellard	873623bac3	Merging r260495: ------------------------------------------------------------------------ r260495 \| Matthew.Arsenault \| 2016-02-10 22:15:39 -0800 (Wed, 10 Feb 2016) \| 9 lines AMDGPU: Fix constant bus use check with subregisters If the two operands to an instruction were both subregisters of the same super register, it would incorrectly think this counted as the same constant bus use. This fixes the verifier error in fmin_legacy.ll which was missing -verify-machineinstrs. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_38@271640 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 09:50:08 +00:00
Nicolai Haehnle	cead1b4a6d	AMDGPU/SI: Add SI Machine Scheduler Summary: It is off by default, but can be used with --misched=si Patch by: Axel Davy Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D11885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257609 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-13 16:10:10 +00:00
Nicolai Haehnle	702b589510	AMDGPU/SI: Fold operands with sub-registers Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257074 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-07 17:10:29 +00:00
Nicolai Haehnle	0589c22ae9	AMDGPU/SI: use S_MOV_B64 for larger copies in copyPhysReg Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15629 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256073 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-19 01:36:26 +00:00
Nicolai Haehnle	710bb5a598	AMDGPU: fix overlapping copies in copyPhysReg Summary: When copying aggregate registers within the same register class, there may be an overlap between source and destination that forces us to do the copy backwards. Do the simplest possible thing that guarantees the correct order of moves when there are overlaps, and does whatever when there is no overlap. (The last part forces some trivial adjustments to test cases.) Together with r255906, this fixes a VM fault in Unreal Elemental Demo. While at it, change the generation of kill and def flags to something that looks more reasonable. This method is used very late during compilation, so it probably doesn't matter in practice, and to be honest, I don't know if this change is actually correct because the semantics in connection with aggregate registers vs. sub-registers are not clear to me. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93264 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15622 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256072 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-19 01:16:06 +00:00
Changpeng Fang	cd00b72f32	AMDGPU/SI: Test commit Summary: This is just my first commit. Test! Reviewers: none Subscribers: none Differential Revision: none git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256022 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-18 20:04:28 +00:00
Changpeng Fang	4990717b55	Revert "AMDGPU/SI: Test commit" This reverts commit a493cb636e0152ad28210934a47c6c44b1437193. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256021 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-18 20:04:26 +00:00
Changpeng Fang	dce96eea12	AMDGPU/SI: Test commit Summary: This is just my first commit. Test! Reviewers: none Subscribers: none Differential Revision: none git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256020 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-18 19:57:41 +00:00
Nicolai Haehnle	7c502030bf	AMDGPU: Fix off-by-one in SIRegisterInfo::eliminateFrameIndex Summary: The method insertNOPs expected the number of wait states to be passed as parameter, while eliminateFrameIndex passed the immediate argument for the S_NOP, leading to an off-by-one error. Rename the method to make the meaning of its parameter clearer. The number of 4 / 5 wait states (which is what the method has always _tried_ to do according to the comment) is correct according to the hardware docs. I stumbled upon this while trying to track down the cause of https://bugs.freedesktop.org/show_bug.cgi?id=93264. While clearly needed, this patch unfortunately does not fix that bug... Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15542 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255906 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-17 16:46:42 +00:00
Tom Stellard	7d2a810fef	AMDGPU/SI: Emit constant arrays in the .text section Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255204 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-10 02:13:01 +00:00
Matt Arsenault	dc53fde2a4	AMDGPU: Optimize VOP2 operand legalization Don't use commuteInstruction, and don't commute if doing so will not improve legality. Skip the more complex checks for literal operands and constant bus restrictions, which are not a concern for VOP2 instructions because src1 does not accept SGPRs or constants and few implicitly read vcc. This gets called quite a few times and the attempts at commuting are a significant fraction of the time spent in SIFixSGPRCopies, so it's somewhat worthwhile to optimize. With this patch and others leading up to it, this reduces the compile time of SIFixSGPRCopies on some of the LuxMark 2 kernels from ~8ms to ~5ms on my system. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254452 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-01 19:57:17 +00:00
Matt Arsenault	0f1b95f818	AMDGPU: Rework how private buffer passed for HSA If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue. This also only selectively enables all of the input registers which are really required instead of always enabling them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254331 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-30 21:16:03 +00:00
Matt Arsenault	d4a0a430cc	AMDGPU: Rename enums to be consistent with HSA code object terminology git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254330 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-30 21:15:57 +00:00
Matt Arsenault	956f59ab56	AMDGPU: Remove SIPrepareScratchRegs It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254329 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-30 21:15:53 +00:00
Marek Olsak	73f0848ca2	AMDGPU/SI: select S_ABS_I32 when possible (v2) v2: added more tests, moved the SALU->VALU conversion to a separate function It looks like it's not possible to get subregisters in the S_ABS lowering code, and I don't feel like guessing without testing what the correct code would look like. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254095 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-25 21:22:45 +00:00
Matt Arsenault	ade9b95acb	AMDGPU: Create emergency stack slots during frame lowering Test has a bogus verifier error which will be fixed by later commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252327 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-06 18:17:45 +00:00
Matt Arsenault	2b642eb437	AMDGPU: Remove unused scratch resource operands The SGPR spill pseudos don't actually use them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252324 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-06 18:07:53 +00:00
Matt Arsenault	af4bb57907	AMDGPU: Fix hardcoded alignment of spill. Instead of forcing 4 alignment when spilled, set register class alignments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252322 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-06 17:54:47 +00:00
Matt Arsenault	26c74838a7	AMDGPU: Also track whether SGPRs were spilled git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252145 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-05 05:27:10 +00:00
Matt Arsenault	76b6b15dcd	AMDGPU: Fix assert when legalizing atomic operands The operand layout is slightly different for the atomic opcodes from the usual MUBUF loads and stores. This should only fix it on SI/CI. VI is still broken because it still emits the addr64 replacement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252140 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-05 02:46:56 +00:00
Matt Arsenault	4447636b49	AMDGPU: Make findUsedSGPR more readable Add more comments etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251996 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-03 22:30:15 +00:00
Matt Arsenault	110f55db52	AMDGPU: Simplify VOP3 operand legalization. This was checking for a variety of situations that should never happen. This saves a tiny bit of compile time. We should not be selecting instructions with invalid operands in the first place. Most of the time for registers, copies are inserted to the correct operand register class. For VOP3, since all operand types are supported and literal constants never are, we just need to verify the constant bus requirements (all immediates should be legal inline ones). The only possibly tricky case to maybe worry about is when legalizing operands in moveToVALU with s_add_i32 and similar instructions. If the original s_add_i32 had a literal constant and we need to replace it with v_add_i32_e64 we would have an unsupported literal operand. However, I don't think we should worry about that because SIFoldOperands should handle folding literal constant operands into the SALU instructions based on the uses. At SIFoldOperands time, the legality and profitability of operand types is a bit different. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250951 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-21 21:51:02 +00:00
Matt Arsenault	ca4c86d2fd	AMDGPU: Fix not checking implicit operands in verifyInstruction When verifying constant bus restrictions, this wasn't catching uses in implicit operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250948 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-21 21:15:01 +00:00
Matt Arsenault	d2643e2ff9	AMDGPU: Add MachineInstr overloads for instruction format tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250797 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-20 04:35:43 +00:00
Matt Arsenault	3f7c35a966	AMDGPU: Use explicit register size indirect pseudos This stops using an unknown reg class operand. Currently build_vector selection has a broken looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. When the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will take care of normally later, which will allow removing the weird check of build_vector users. Without this, when removed v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249494 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-07 00:42:51 +00:00
Matt Arsenault	29467e755f	AMDGPU/SI: Add verifier check for exec reads Make sure we aren't accidentally not setting these in the instruction definitions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249170 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-02 18:58:37 +00:00
Marek Olsak	bc68baa694	AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set to prevent setting a huge stride, because DATA_FORMAT has a different meaning if ADD_TID_ENABLE is set. This is a candidate for stable llvm 3.7. Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248858 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-29 23:37:32 +00:00
Matt Arsenault	e706695c2f	AMDGPU: Factor switch into separate function git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248742 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-28 20:54:57 +00:00
Matt Arsenault	3443ffa833	AMDGPU: Fix splitting x16 SMRD loads When used recursively, this would set the kill flag on the intermediate step from first splitting x16 to x8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248741 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-28 20:54:52 +00:00
Matt Arsenault	33d8695b88	AMDGPU: Fix moving SMRD loads with literal offsets on CI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248740 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-28 20:54:46 +00:00
Matt Arsenault	9ed2f31125	AMDGPU: Fix splitting SMRD with large offset The splitting of > 4 dword SMRD instructions if using an offset in an SGPR instead of an immediate was not setting the destination register, resulting in an instruction missing an operand which would assert later. Test will be included in a following commit which fixes a related issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248739 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-28 20:54:42 +00:00
Andrew Kaylor	aac3c943f3	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-28 20:33:22 +00:00
Matt Arsenault	728cde2865	AMDGPU: Construct new buffer instruction when moving SMRD It's easier to understand creating a full instruction than the current situation where sometimes a new instruction is created and sometimes it is awkwardly mutated in place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248627 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-25 22:21:19 +00:00
Matt Arsenault	323c9fbce2	AMDGPU: Re-justify workaround and fix worked around problem When buffer resource descriptors were built, the upper two components of the descriptor were first composed into a 64-bit register because legalizeOperands assumed all operands had the same register class. Fix that problem, but keep the workaround. I'm not sure anything is actually emitting such a REG_SEQUENCE now. If multiple resource descriptors are set up with different base pointers, this is copied with a single s_mov_b64. We probably should fix this better by recognizing a pair of s_mov_b32 later, but for now delete the dead code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248585 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-25 17:08:42 +00:00
Matt Arsenault	7ba1878629	AMDGPU: Don't create REG_SEQUENCE with SGPR dest and VGPR sources This avoids needing to re-legalize the new REG_SEQUENCE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248584 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-25 17:08:40 +00:00
Matt Arsenault	a5e772ea93	AMDGPU: Return after instruction is processed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248476 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-24 07:51:28 +00:00
Matt Arsenault	e7de900cec	AMDGPU: Remove another unnecessary check from commuteInstruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248475 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-24 07:51:25 +00:00
Matt Arsenault	bb9c0afde5	AMDGPU: Reduce number of copies emitted Instead of always inserting a copy in case the super register is itself a subregister, only extract to the super reg class if this is actually the case. This shouldn't really change codegen, but makes looking at the output of SIFixSGPRCopies easier to read. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-24 07:16:37 +00:00
Matt Arsenault	d89d4bccff	AMDGPU: Remove unnecessary check If the instruction doesn't have enough operands, it either shouldn't be marked as isCommutable or is malformed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248242 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-22 04:17:45 +00:00
Matt Arsenault	3a2cec85a7	AMDGPU/SI: Fix more cases of losing exec operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247230 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-10 01:23:28 +00:00
Matt Arsenault	92a899b660	AMDGPU: Extract full 64-bit subregister and use subregs Instead of extracting both 32-bit components from the 128-bit register. This produces fewer copies and is easier for the copy peephole optimizer to understand and see the actual uses as extracts from a reg_sequence. This avoids needing to handle subregister composing in the PeepholeOptimizer's ValueTracker for this case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247162 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-09 17:03:29 +00:00
Matt Arsenault	6bf871423e	AMDGPU: Fix adding redundant implicit operands These are already added during the MachineInstr construction, so this was adding the implicit registers twice. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246525 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-01 02:02:21 +00:00
Matt Arsenault	fe59e8ecf3	AMDGPU: Set mem operands for spill instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246357 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-29 06:48:57 +00:00
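
Commit 710bb5a598 in the log above describes copying aggregate registers whose source and destination may overlap, which forces the copy to run backwards in some cases. The following is a minimal sketch of that ordering idea only. It is not the LLVM implementation: the function name `copy_subregs` and the list-based register file are hypothetical and chosen purely for illustration.

```python
def copy_subregs(regs, dst, src, count):
    """Copy `count` consecutive sub-registers from index `src` to index `dst`.

    `regs` is a hypothetical flat register file modelled as a Python list.
    The copy is done in place, one element at a time, like a series of moves.
    """
    # Forward order is safe when the destination starts at or below the source.
    # Otherwise copy backwards, so that no source element is overwritten
    # before it has been read.
    forward = dst <= src
    order = range(count) if forward else range(count - 1, -1, -1)
    for i in order:
        regs[dst + i] = regs[src + i]
    return regs


if __name__ == "__main__":
    # Destination overlaps the tail of the source: the moves run backwards.
    print(copy_subregs([0, 1, 2, 3, 4, 5], dst=2, src=0, count=4))
    # Destination overlaps the head of the source: the moves run forwards.
    print(copy_subregs([0, 1, 2, 3, 4, 5], dst=0, src=2, count=4))
```

When the ranges do not overlap, either order gives the same result, which matches the commit's remark that the order is arbitrary in that case.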

71 Commits