archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Stanislav Mekhanoshin	64373efcab	[AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32 If there is an immediate operand we shall not shrink V_SUBB_U32 and V_ADDC_U32, it does not fit e32 encoding. Differential Revison: https://reviews.llvm.org/D34291 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305840 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 20:33:44 +00:00
Matt Arsenault	5516bd1387	AMDGPU: Do operand folding in program order Before it was possible to partially fold use instructions before the defs. After the xor is folded into a copy, the same mov can end up in the fold list twice, so on the second attempt it will fail expecting to see a register to fold. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305821 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:56:32 +00:00
Matthias Braun	0cc137e051	RegisterScavenging: Followup to r305625 This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitset and just need 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305817 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:43:14 +00:00
Matt Arsenault	73854fd751	AMDGPU: Preserve undef when folding register operands If the source was a copy of an undef register, this would produce a read of an undefined register which is a verifier error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305816 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:41:31 +00:00
Stanislav Mekhanoshin	423a449bd5	[AMDGPU] Eliminate SGPR to VGPR copy when possible SGPRs are generally cheaper, so try to use them over VGPRs. Differential Revision: https://reviews.llvm.org/D34130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305815 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:32:42 +00:00
Matt Arsenault	4cabef582f	AMDGPU: Fix crash with undef vreg input operand git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305814 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-20 18:28:02 +00:00
Matt Arsenault	5a7b3305c5	AMDGPU: Fix scratch wave offset relative FI expansion The offset may not be an inline immediate, so this needs to be materialized into a register. The post-RA run of SIShrinkInstructions is able to fold it later if it can. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305761 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 23:47:21 +00:00
Stanislav Mekhanoshin	7c44c2a308	[AMDGPU] Add infer address spaces pass before SROA It adds it for the target after inlining but before SROA where we can get most out of it. Differential Revision: https://reviews.llvm.org/D34366 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305759 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 23:17:36 +00:00
Tom Stellard	865802ce11	AMDGPU/GlobalISel: Mark G_BITCAST s32 <--> <2 x s16> legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D34129 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305692 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-19 13:15:45 +00:00
Matthias Braun	cd03942492	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse to place spills as the very first instruciton of a basic block and thus artifically increase pressure (test in test/CodeGen/PowerPC/scavenging.mir:spill_at_begin) This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305625 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-17 02:08:18 +00:00
Matthias Braun	bd1a266898	Revert "RegScavenging: Add scavengeRegisterBackwards()" Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305566 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-16 17:48:08 +00:00
Matthias Braun	02688b00ef	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305516 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-15 22:14:55 +00:00
Alexander Timofeev	7807f69e9b	DivergencyAnalysis patch for review git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305494 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-15 19:33:10 +00:00
Tom Stellard	c746f23920	AMDGPU/GlobalISel: Mark 32-bit G_ADD as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33992 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305232 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 20:54:56 +00:00
Matt Arsenault	24acabd9f9	AMDGPU: Teach isLegalAddressingMode about flat offsets Also fix reporting r+r as a valid addressing mode without offsets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305203 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 17:06:35 +00:00
Matt Arsenault	23ef7ef4e3	AMDGPU: Start selecting flat instruction offsets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305201 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 16:53:51 +00:00
Matt Arsenault	4923776ab2	AMDGPU: Start adding offset fields to flat instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305194 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 15:55:58 +00:00
Wei Ding	2d0aa2e9ba	AMDGPU : Fix ISA Version Definitions. Differential Revision: http://reviews.llvm.org/D28531 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305137 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-10 03:53:19 +00:00
Stanislav Mekhanoshin	525f44047e	[AMDGPU] Add intrinsics for alignbit and alignbyte instructions Differential Revision: https://reviews.llvm.org/D34046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305098 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-09 19:03:00 +00:00
David Stuttard	00d555d436	[AMDGPU] Fix for issue in alloca to vector promotion pass Summary: Alloca promotion pass not dealing with non-canonical input Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical) Also added some test cases in non-canonical form to check that it no longer crashes Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31710 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305079 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-09 14:16:22 +00:00
Matt Arsenault	271bf6ebf9	AMDGPU: Use correct register names in inline assembly Fixes using physical registers in inline asm from clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305004 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 19:03:20 +00:00
Mark Searles	c5293f3fa4	[AMDGPU] Force qsads instrs to use different dest register than source registers The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes. Differential Revision: https://reviews.llvm.org/D33783 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304998 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-08 18:21:19 +00:00
Tom Stellard	55d2b21ff7	AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33949 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304910 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-07 13:54:51 +00:00
Tom Stellard	5d24d88bc7	AMDGPU/GlobalISel: Mark 32-bit G_ICMP as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33890 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304797 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-06 14:16:50 +00:00
Vivek Pandya	de22782d75	[Improve CodeGen Testing] This patch renables MIRPrinter print fields which have value equal to its default. If -simplify-mir option is passed then MIRPrinter will not print such fields. This change also required some lit test cases in CodeGen directory to be changed. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D32304 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304779 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-06 08:16:19 +00:00
Matt Arsenault	323e6e9ede	RenameIndependentSubregs: Fix handling of undef tied operands If a tied source operand was undef, it would be replaced but not update the other tied operand, which would end up using different virtual registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304747 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-05 22:58:57 +00:00
Stanislav Mekhanoshin	ca0adcb320	[AMDGPU] Fix SIFoldOperands crash with clamp Fixes bug #33302. Pass did not account that Src1 of max instruction can be an immediate. Differential Revision: https://reviews.llvm.org/D33884 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304696 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-05 01:03:04 +00:00
Stanislav Mekhanoshin	110a2bc818	[AMDGPU] Untangle SDWA pass from SIShrinkInstructions Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid second run of MachineLICM, MachineCSE and SIFoldOperands. Also added handling to preserve original src modifiers. Differential Revision: https://reviews.llvm.org/D33860 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304665 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-03 17:39:47 +00:00
Tom Stellard	c9a1489af2	AMDGPU/GlobalISel: Mark 1-bit integer constants as legal Summary: These are mostly legal, but will probably need special lowering for some cases. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304628 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-03 01:13:33 +00:00
Stanislav Mekhanoshin	ced381c038	[AMDGPU] Preserve operand order in SIFoldOperands SIFoldOperands can commute operands even if no folding was done. This change is to preserve IR is no folding was done. Differential Revision: https://reviews.llvm.org/D33802 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304625 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-03 00:41:52 +00:00
Konstantin Zhuravlyov	e0fcf72467	AMDGPU: Make auto waitcnt before barrier a feature Differential Revision: https://reviews.llvm.org/D33793 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304571 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-02 17:40:26 +00:00
Alexander Timofeev	dfdb788875	AMDGPUAnnotateUniformValue should always treat volatile loads as divergent git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-02 15:25:52 +00:00
Mark Searles	48e0515b4f	[AMDGPU] Turn on the new waitcnt insertion pass. Adjust tests. -enable-si-insert-waitcnts=1 becomes the default -enable-si-insert-waitcnts=0 to use old pass Differential Revision: https://reviews.llvm.org/D33730 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304551 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-02 14:19:25 +00:00
Yaxun Liu	2bb8e7e8cc	[AMDGPU] Fix kernel arg segment size for amdgizcl Differential Revision: https://reviews.llvm.org/D33307 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304482 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-01 21:31:53 +00:00
Mark Searles	fa784827e1	[AMDGPU] Fix bugs in new waitcnt pass. Add test. - new waitcnt pass remains off by default; -enable-si-insert-waitcnts=1 to enable it - fix handling of PERMUTE ops - fix insertion of waitcnt instrs at function begin/end ( port of analogous code that was added to old waitcnt pass ) - add new test Differential Revision: https://reviews.llvm.org/D33114 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304311 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 16:44:23 +00:00
Dmitry Preobrazhensky	4a31d77be2	[AMDGPU][MC] New syntax for ds_swizzle_b32 offset See Bug 28601: https://bugs.llvm.org//show_bug.cgi?id=28601 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33542 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304309 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-31 16:26:47 +00:00
Tim Northover	3fd4db31a7	MIR: update test for noVRegs removal. I think I hadn't git pulled recently enough to bring it in. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304250 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 22:02:19 +00:00
Tim Northover	837e2e977f	MIR: remove explicit "noVRegs" property. We can infer this from the incoming MIR, so there's no reason to represent it with a special flag. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304246 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 21:28:57 +00:00
Stanislav Mekhanoshin	f0c3d71794	[AMDGPU] Allow SDWA in instructions with immediates and SGPRs An encoding does not allow to use SDWA in an instruction with scalar operands, either literals or SGPRs. That is however possible to copy these operands into a VGPR first. Several copies of the value are produced if multiple SDWA conversions were done. To cleanup MachineLICM (to hoist copies out of loops), MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace SGPR to VGPR copy with immediate copy right to the VGPR) runs are added after the SDWA pass. Differential Revision: https://reviews.llvm.org/D33583 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304219 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 16:49:24 +00:00
Mark Searles	6781296ab7	[AMDGPU] Require waitcnt before barrier for all targets; adjust tests. Differential Revision: https://reviews.llvm.org/D33576 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304217 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-30 16:22:43 +00:00
Konstantin Zhuravlyov	071c0d91bb	Resubmit r303859 with test fixed. [AMDGPU] add intrinsic for s_getpc Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Patch by Tim Corringham git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304031 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-26 20:38:26 +00:00
Tom Stellard	f696b32065	AMDGPU/GlobalISel: Mark 32-bit float constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33212 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304003 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-26 16:40:03 +00:00
Matthias Braun	94c4904dc5	CodeGen: Rename DEBUG_TYPE to match passnames Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303921 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-25 21:26:32 +00:00
Nico Weber	b6e91f4292	Revert r303859, CodeGen/AMDGPU/llvm.amdgcn.s.getpc.ll fails on bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303902 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-25 19:19:29 +00:00
Tim Corringham	b88c01b1f7	[AMDGPU] add intrinsic for s_getpc Summary: The s_getpc instruction is exposed as intrinsic llvm.amdgcn.s.getpc. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32862 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303859 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-25 14:04:14 +00:00
Simon Pilgrim	6b1d32c6d7	[AMDGPU] Add INDIRECT_BASE_ADDR to R600_Reg32 class (PR33045) This fixes 17 of the 41 -verify-machineinstrs test failures identified in PR33045 Differential Revision: https://reviews.llvm.org/D33451 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303691 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-23 21:27:15 +00:00
Changpeng Fang	6b7bd0e1f9	AMDGPU/SI: Move the local memory usage related checking after calling convention checking in PromoteAlloca Summary: Promoting Alloca to Vector and Promoting Alloca to LDS are two independent handling of Alloca and should not affect each other. As a result, we should not give up promoting to vector if there is not enough LDS. This patch factors out the local memory usage related checking out and replace it after the calling convention checking. Reviewer: arsenm Differential Revision: http://reviews.llvm.org/D33139 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303684 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-23 20:25:41 +00:00
Stanislav Mekhanoshin	6ff1a723f7	[AMDGPU] Combine and (srl) into shl (bfe) Perform DAG combine: and (srl x, c), mask => shl (bfe x, nb + c, mask >> nb), nb Where nb is a number of trailing zeroes in mask. It replaces two instructions with two and BFE is generally a more expensive one. However this is only done if we are selecting a byte or word at an aligned boundary which results in a proper SDWA operand pattern. It is only done if SDWA is supported. TODO: improve SDWA pass to actually convert this pattern. It is not done now because we have an immediate in the instruction, which has be moved into a VGPR. Differential Revision: https://reviews.llvm.org/D33455 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303681 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-23 19:54:48 +00:00
Stanislav Mekhanoshin	ddde657138	[AMDGPU] Convert shl (add) into add (shl) shl (or\|add x, c2), c1 => or\|add (shl x, c1), (c2 << c1) This allows to fold a constant into an address in some cases as well as to eliminate second shift if the expression is used as an address and second shift is a result of a GEP. Differential Revision: https://reviews.llvm.org/D33432 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303641 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-23 15:59:58 +00:00
Stanislav Mekhanoshin	69edad7913	[AMDGPU] Narrow lshl from 64 to 32 bit if possible Turn expensive 64 bit shift into 32 bit if shift does not overflow int: shl (ext x) => zext (shl x) Differential Revision: https://reviews.llvm.org/D33367 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303569 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-22 16:58:10 +00:00

1 2 3 4 5 ...

1068 Commits