RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-23 05:46:05 +00:00

Author	SHA1	Message	Date
Matt Arsenault	dc55587b7f	AMDGPU: Rename SI_RETURN This is used for a specific type of return to a shader part's epilog code. Rename to try avoiding confusion from a true call's return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298452 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 22:18:10 +00:00
Matt Arsenault	f06f68a796	AMDGPU: Don't wait at end of block with a trivial successor If there is only one successor, and that successor only has one predecessor the wait can obviously be delayed until uses or the end of the next block. This avoids code quality regressions when there are trivial fallthrough blocks inserted for structurization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297251 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 01:06:58 +00:00
Konstantin Zhuravlyov	017228cd76	[AMDGPU] Add target information that is required by tools to metadata Differential Revision: https://reviews.llvm.org/D28760#fb670e28 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294449 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-08 14:05:23 +00:00
Eugene Zelenko	5751b552da	[AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292688 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-21 00:53:49 +00:00
Jan Vesely	bf64cb107c	AMDGPU/SI: Implement sendmsghalt intrinsic v2: expose using amdgcn prefix Differential Revision: https://reviews.llvm.org/D23511 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290977 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-04 18:06:55 +00:00
Matt Arsenault	350d0dab1e	AMDGPU: Refactor exp instructions Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in the shader. Previously all exp instructions were inferred to have unmodeled side effects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288695 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-05 20:23:10 +00:00
Matt Arsenault	a89dd1d1c4	AMDGPU: Rename flat operands to match mubuf Use vaddr/vdst for the same purposes. This also fixes a bug in SIInsertWaits for the operand check. The stored value operand is currently called data0 in the single offset case, not data. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288188 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-29 19:30:44 +00:00
Marek Olsak	2a24827c23	AMDGPU/SI: Add back reverted SGPR spilling code, but disable it suggested as a better solution by Matt git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287942 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-25 17:37:09 +00:00
Marek Olsak	82bcf466c2	Revert "AMDGPU: Implement SGPR spilling with scalar stores" This reverts commit 4404d0d6e354e80dd7f8f0a0e12d8ad809cf007e. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287936 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-25 16:03:34 +00:00
Matt Arsenault	4404d0d6e3	AMDGPU: Implement SGPR spilling with scalar stores This avoids the nasty problems caused by using memory instructions that read the exec mask while spilling / restoring registers used for control flow masking, but only for VI when these were added. This always uses the scalar stores when enabled currently, but it may be better to still try to spill to a VGPR and use this on the fallback memory path. The cache also needs to be flushed before wave termination if a scalar store is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286766 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-13 18:20:54 +00:00
Matt Arsenault	e5fd9c09ad	AMDGPU: Preserve vcc undef flags when inverting branch If the branch was on a read-undef of vcc, passes that used analyzeBranch to invert the branch condition wouldn't preserve the undef flag resulting in a verifier error. Fixes verifier failures in a future commit. Also fix verifier error when inserting copy for vccz corruption bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286133 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-07 19:09:27 +00:00
Tom Stellard	b15bbca10c	AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions Summary: Flat instructions can return out of order, so we always need to wait for all the outstanding flat operations. Reviewers: tony-tye, arsenm Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D25998 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285479 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 23:53:48 +00:00
Konstantin Zhuravlyov	c7a23a58d8	[AMDGPU] Refactor waitcnt encoding - Refactor bit packing/unpacking - Calculate bit mask given bit shift and bit width - Introduce function for decoding bits of waitcnt - Introduce function for encoding bits of waitcnt - Introduce function for getting waitcnt mask (instead of using bare numbers) - Introduce function for getting max waitcnt(s) (instead of using bare numbers) Differential Revision: https://reviews.llvm.org/D25298 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283919 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 18:58:22 +00:00
Mehdi Amini	67f335d992	Use StringRef in Pass/PassManager APIs (NFC) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-01 02:56:57 +00:00
Konstantin Zhuravlyov	69560a642c	[AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa version Differential Revision: https://reviews.llvm.org/D24973 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282877 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-30 17:01:40 +00:00
Konstantin Zhuravlyov	07f122bad7	[AMDGPU] Ask subtarget if waitcnt instruction is needed before barrier instruction Differential Revision: https://reviews.llvm.org/D24985 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282875 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-30 16:50:36 +00:00
Duncan P. N. Exon Smith	83b2ab7c4c	AMDGPU: Remove implicit iterator conversions, NFC Remove remaining implicit conversions from MachineInstrBundleIterator to MachineInstr* from the AMDGPU backend. In most cases, I made them less attractive by preferring MachineInstr& or using a range-based for loop. Once all the backends are fixed I'll make the operator explicit so that this doesn't bitrot back. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274906 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-08 19:16:05 +00:00
Matt Arsenault	759ed7e410	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:30:11 +00:00
Tom Stellard	8478bd5765	AMDGPU/SI: Use the hazard recognizer to break SMEM soft clauses Summary: Add support for detecting hazards in SMEM soft clauses, so that we only break the clauses when necessary, either by adding s_nop or re-ordering other alu instructions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18870 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268260 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-02 17:39:06 +00:00
Tom Stellard	b8a2cc5119	AMDGPU/SI: Use hazard recognizer to detect DPP hazards Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18603 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268247 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-02 16:23:09 +00:00
Tom Stellard	df1aa5c25d	AMDGPU/SI: Remove wait state handling for SMRD in SIInsertWaits This was supposed to be part of r268143. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268154 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-30 04:04:48 +00:00
Nicolai Haehnle	0493c734a2	AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsic Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267729 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-27 15:46:01 +00:00
Tom Stellard	cb6c943dc2	AMDGPU/SI: Insert wait states required after v_readfirstlane on SI Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266105 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-12 18:40:43 +00:00
Tom Stellard	c2d9280e43	AMDGPU/SI: Add MachineBasicBlock parameter to SIInstrInfo::insertWaitStates Summary: This makes it possible to insert nops at the end of blocks. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18549 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265678 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-07 14:47:07 +00:00
Tom Stellard	f53246799f	AMDGPU/SI: Handle wait states required for DPP instructions Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17543 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263447 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 17:05:56 +00:00
Marek Olsak	01d3696081	AMDGPU/SI: Incomplete shader binaries need to finish execution at the end Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D18058 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263441 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 15:57:14 +00:00
Matt Arsenault	e407a109c1	AMDGPU: Simplify boolean conditional return statements Patch by Richard Thomson git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262536 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-02 23:00:21 +00:00
Matt Arsenault	a0358fa131	AMDGPU: Fix bug 26659. Fix checking the same instruction twice instead of the second branch that uses vccz. I don't think this matters currently because s_branch_vccnz is always used currently. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262457 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-02 04:12:39 +00:00
Tom Stellard	aced110517	AMDGPU/SI: Fix s_waitcnt insertion for flat instructions Summary: This was broken in r260694 which swapped the address and data operands for flat store instructions. The code in SIInsertWaits assumes that the data operand always comes before the address operand, so we need to add a special case for flat. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17366 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261330 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-19 15:33:13 +00:00
Tom Stellard	58abe5f3d2	AMDGPU/SI: Implement a work-around for smrd corrupting vccz bit Summary: We will hit this once we have enabled uniform branches. The smrd-vccz-bug.ll test will be added with the uniform branch commit. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260137 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-08 19:49:20 +00:00
Tom Stellard	35cb73cd06	AMDGPU/SI: Correctly initialize SIInsertWaits pass Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16724 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259894 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-05 17:42:38 +00:00
Tom Stellard	641de45f36	AMDGPU: waitcnt operand fixes Summary: Allow lgkmcnt up to 0xF (hardware allows that). Fix mask for ExpCnt in AMDGPUInstPrinter. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16314 Patch by: Nikolay Haustov git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259059 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 17:13:44 +00:00
Marek Olsak	8578c2b6fb	AMDGPU/SI: Remove ending s_endpgm from non-void functions Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16035 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257623 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-13 17:23:12 +00:00
Marek Olsak	fe319de414	AMDGPU/SI: Add s_waitcnt at the end of non-void functions Summary: v2: Make ReturnsVoid private, so that I can add another 8 lines of code and look more productive. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16034 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257622 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-13 17:23:09 +00:00
Matt Arsenault	d2643e2ff9	AMDGPU: Add MachineInstr overloads for instruction format tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250797 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-20 04:35:43 +00:00
Matt Arsenault	fb65d2c241	AMDGPU: Fix unused variable warning in release build git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249091 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-01 22:40:35 +00:00
Matt Arsenault	ae6db4bdd7	AMDGPU: Make SIInsertWaits about a factor of 4 faster This was the slowest target custom pass and was spending 80% of the time in getMinimalPhysRegClass which was called for every register operand. Try to use the statically known register class when possible from the instruction's MCOperandInfo. There are a few pseudo instructions which are not well behaved with unknown register classes which still require the expensive physical register class search. There are a few other possibilities for making this even faster, such as not inspecting implicit operands. For now those are checked because it is technically possible to have a scalar load into exec or vcc which can be implicitly used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249079 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-01 21:43:15 +00:00
Matt Arsenault	7a6a7f2409	AMDGPU: Fix recomputing dominator tree unnecessarily SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248587 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-25 17:21:28 +00:00
Matt Arsenault	d0edb1f758	AMDGPU: Add s_dcache_* instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-24 19:52:27 +00:00
Tom Stellard	7d43ecc4d4	AMDGPU/SI: Better handle s_wait insertion We can wait on either VM, EXP or LGKM. The waits are independent. Without this patch, a wait inserted because of one of them would also wait for all the previous others. This patch makes s_wait only wait for the ones we need for the next instruction. Here's an example of the subtle perf reduction this patch solves. This is without the patch: buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen s_load_dwordx4 s[44:47], s[8:9], 0xc s_waitcnt lgkmcnt(0) buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen s_load_dwordx4 s[48:51], s[8:9], 0x10 s_waitcnt vmcnt(1) buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen The s_waitcnt vmcnt(1) is useless. The reason it is added is that the last buffer_load_format_xyzw needs s[44:47], which was issued by the first s_load_dwordx4. It waits for all VM before that call to have finished. Internally, 3 counters (for VM, EXP and LGKM) are updated after every instruction. For example buffer_load_format_xyzw will increase the VM counter, and s_load_dwordx4 the LGKM one. Without the patch, for every defined register, the current 3 counters are stored, and are used to know how long to wait when an instruction needs the register. Because of that, the counters stored for s[44:47] also record that, to use the register, you need to wait for the previous buffer_load_format_xyzw. Instead this patch stores only the counters that matter for the register, and puts zero for the other ones, since we don't need any wait for them. Patch by: Axel Davy Differential Revision: http://reviews.llvm.org/D11883 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245755 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-21 22:47:27 +00:00
Benjamin Kramer	d3c712e50b	Fix some comment typos. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244402 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-08 18:27:36 +00:00
Tom Stellard	953c681473	R600 -> AMDGPU rename git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-13 03:28:10 +00:00
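
The "Refactor waitcnt encoding" entry (c7a23a58d8) lists helpers for computing a bit mask from a shift and a width, for packing and unpacking fields, and for querying the maximum count. The following is a minimal sketch of that idea, not the code from the commit: the `Field` struct, all function names, and the (shift, width) values are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one counter field inside the s_waitcnt immediate.
struct Field {
  unsigned Shift; // position of the lowest bit of the field
  unsigned Width; // number of bits in the field
};

// Assumed layout, for illustration only; real layouts depend on the ISA version.
constexpr Field VmCnt{0, 4};
constexpr Field ExpCnt{4, 3};
constexpr Field LgkmCnt{8, 4};

// Bit mask covering the field, derived from shift and width instead of a bare number.
constexpr uint32_t bitMask(Field F) {
  return ((1u << F.Width) - 1u) << F.Shift;
}

// Largest value the field can hold.
constexpr uint32_t maxCount(Field F) { return (1u << F.Width) - 1u; }

// Write Value into field F of Dst, leaving all other bits untouched.
constexpr uint32_t packBits(uint32_t Value, uint32_t Dst, Field F) {
  return (Dst & ~bitMask(F)) | ((Value << F.Shift) & bitMask(F));
}

// Read field F out of Src.
constexpr uint32_t unpackBits(uint32_t Src, Field F) {
  return (Src & bitMask(F)) >> F.Shift;
}

// Combine the three counters into one immediate.
inline uint32_t encodeWaitcnt(uint32_t Vm, uint32_t Exp, uint32_t Lgkm) {
  assert(Vm <= maxCount(VmCnt) && Exp <= maxCount(ExpCnt) &&
         Lgkm <= maxCount(LgkmCnt) && "counter does not fit its field");
  uint32_t Waitcnt = 0;
  Waitcnt = packBits(Vm, Waitcnt, VmCnt);
  Waitcnt = packBits(Exp, Waitcnt, ExpCnt);
  Waitcnt = packBits(Lgkm, Waitcnt, LgkmCnt);
  return Waitcnt;
}
```

Describing each field by a (shift, width) pair means a different ISA version only needs different field constants, which is in the spirit of the later entry that chooses masks and shifts based on the ISA version.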

42 Commits
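
The "Better handle s_wait insertion" entry (7d43ecc4d4) describes per-register counter snapshots. Below is a rough sketch of that bookkeeping under simplifying assumptions: the `WaitTracker` class, its methods, and the integer register ids are all hypothetical, results are assumed to return in order within each counter, and waits that have already been emitted are not tracked. The `SnapshotAll` flag switches between the behaviour described as "without the patch" and the patched behaviour.

```cpp
#include <array>
#include <map>

enum Counter { VM = 0, EXP = 1, LGKM = 2, NumCounters = 3 };
using Counters = std::array<unsigned, NumCounters>;

// Marker meaning "no wait needed on this counter".
constexpr unsigned NoWait = ~0u;

class WaitTracker {
  bool SnapshotAll;                    // true: pre-patch behaviour
  Counters LastIssued{};               // operations issued so far, per counter
  std::map<int, Counters> DefinedRegs; // register id -> snapshot at definition

public:
  explicit WaitTracker(bool SnapshotAll) : SnapshotAll(SnapshotAll) {}

  // Record an instruction that bumps counter C and asynchronously defines Reg.
  void issue(Counter C, int Reg) {
    ++LastIssued[C];
    Counters Snap{};          // zero means "nothing to wait for"
    if (SnapshotAll)
      Snap = LastIssued;      // pre-patch: remember all three counters
    else
      Snap[C] = LastIssued[C]; // patched: remember only the relevant counter
    DefinedRegs[Reg] = Snap;
  }

  // For each counter, how many operations may still be outstanding before
  // Reg is safe to read; NoWait if that counter is irrelevant for Reg.
  Counters waitFor(int Reg) const {
    Counters Allowed;
    Allowed.fill(NoWait);
    auto It = DefinedRegs.find(Reg);
    if (It == DefinedRegs.end())
      return Allowed;
    for (unsigned I = 0; I < NumCounters; ++I)
      if (It->second[I] != 0)
        Allowed[I] = LastIssued[I] - It->second[I];
    return Allowed;
  }
};
```

Replaying the sequence from the commit message (two VM loads, a scalar load of s[44:47], another VM load, another scalar load, then a use of s[44:47]) with `SnapshotAll` set yields an allowed VM count of 1, which corresponds to the redundant `s_waitcnt vmcnt(1)`; with the flag cleared, the VM counter reports `NoWait`.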