RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-01-10 22:46:20 +00:00

Author	SHA1	Message	Date
Matt Arsenault	c3aa2775c2	AMDGPU: Set flat_scratch from flat_scratch_init reg This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260658 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 06:31:30 +00:00
Matt Arsenault	e3601c75c9	AMDGPU: Set element_size in private resource descriptor Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260651 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 02:40:47 +00:00
Nicolai Haehnle	cb2ad1e249	AMDGPU/SI: Do not move scratch resource register on Tonga & Iceland Due to the SGPR init bug, every program claims to use the same number of SGPRs anyway, so there's no point in trying to shift those registers down from their initial spot of reservation. Add a test that uses VGPR spilling and blocks most SGPRs from being used for the scratch resource register. Previously, this would run into an assertion. Differential Revision: http://reviews.llvm.org/D15724 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256870 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-05 20:42:49 +00:00
Changpeng Fang	89e60598f6	AMDGPU/SI: Use flat for global load/store when targeting HSA Summary: For some reason doing executing an MUBUF instruction with the addr64 bit set and a zero base pointer in the resource descriptor causes the memory operation to be dropped when the shader is executed using the HSA runtime. This kind of MUBUF instruction is commonly used when the pointer is stored in VGPRs. The base pointer field in the resource descriptor is set to zero and and the pointer is stored in the vaddr field. This patch resolves the issue by only using flat instructions for global memory operations when targeting HSA. This is an overly conservative fix as all other configurations of MUBUF instructions appear to work. NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15543 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256282 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-22 20:55:23 +00:00
Rafael Espindola	a00544a653	Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA" This reverts commit r256273. It broke CodeGen/AMDGPU/llvm.dbg.value.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256275 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-22 19:46:44 +00:00
Changpeng Fang	808f9643e6	AMDGPU/SI: Use flat for global load/store when targeting HSA Summary: For some reason doing executing an MUBUF instruction with the addr64 bit set and a zero base pointer in the resource descriptor causes the memory operation to be dropped when the shader is executed using the HSA runtime. This kind of MUBUF instruction is commonly used when the pointer is stored in VGPRs. The base pointer field in the resource descriptor is set to zero and and the pointer is stored in the vaddr field. This patch resolves the issue by only using flat instructions for global memory operations when targeting HSA. This is an overly conservative fix as all other configurations of MUBUF instructions appear to work. Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15543 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256273 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-22 19:32:28 +00:00
Tom Stellard	09dd945fa5	AMDGPU/SI: Set the code objects private segment size when targeting HSA. Summary: I'm not sure how things worked before without this. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15492 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255692 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-15 22:55:30 +00:00
Matt Arsenault	0f1b95f818	AMDGPU: Rework how private buffer passed for HSA If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the progloue. This also only selectively enables all of the input registers which are really required instead of always enabling them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254331 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-30 21:16:03 +00:00
Matt Arsenault	956f59ab56	AMDGPU: Remove SIPrepareScratchRegs It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254329 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-30 21:15:53 +00:00

9 Commits