
RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-01-10 22:46:20 +00:00

Author	SHA1	Message	Date
Tom Stellard	7d43ecc4d4	AMDGPU/SI: Better handle s_wait insertion We can wait on either VM, EXP or LGKM. The waits are independent. Without this patch, a wait inserted because of one of them would also wait for all the previous others. This patch makes s_wait only wait for the ones we need for the next instruction. Here's an example of a subtle perf reduction this patch solves: This is without the patch: buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen s_load_dwordx4 s[44:47], s[8:9], 0xc s_waitcnt lgkmcnt(0) buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen s_load_dwordx4 s[48:51], s[8:9], 0x10 s_waitcnt vmcnt(1) buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen The s_waitcnt vmcnt(1) is useless. It is added because the last buffer_load_format_xyzw needs s[44:47], which was loaded by the first s_load_dwordx4. It waits for all VM operations issued before that s_load_dwordx4 to have finished. Internally, 3 counters (for VM, EXP and LGKM) are updated after every instruction. For example buffer_load_format_xyzw will increase the VM counter, and s_load_dwordx4 the LGKM one. Without the patch, for every defined register, the current 3 counters are stored, and are used to know how long to wait when an instruction needs the register. Because of that, the counters stored for s[44:47] imply that using the register requires waiting for the previous buffer_load_format_xyzw. Instead this patch stores only the counters that matter for the register, and puts zero for the other ones, since we don't need any wait for them. Patch by: Axel Davy Differential Revision: http://reviews.llvm.org/D11883 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245755 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-21 22:47:27 +00:00
Matt Arsenault	5ba7cf9de0	AMDGPU/SI: Fix printing useless info with amdhsa The comments at the bottom would all report 0 if amdhsa was used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245135 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-15 00:12:39 +00:00
Matt Arsenault	892803aa81	AMDGPU: Fix assert on dbg_value instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244728 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-12 09:04:44 +00:00
Tom Stellard	945ad7d241	AMDGPU: Add pass to lower OpenCL image and sampler arguments. The pass adds new kernel arguments for image attributes, and resolves calls to dummy attribute and resource id getter functions. Patch by: Zoltan Gilian git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244372 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-07 23:19:30 +00:00
Matt Arsenault	48b5b553ae	AMDGPU: Assume SMRD access for constant address space Since r243294 these are selected to SMRD and moved later if required. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244354 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-07 20:18:34 +00:00
Tom Stellard	825c884e40	AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11604 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244254 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-06 19:28:38 +00:00
Tom Stellard	732a4ceeee	AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes Summary: This allows us to consolidate several of the TableGen patterns. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11602 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244253 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-06 19:28:30 +00:00
Alex Lorenz	f5cf675376	AMDGPU/SI: Add implicit register operands in the correct order. This commit fixes a bug in the class 'SIInstrInfo' where the implicit register machine operands were added to a machine instruction in an incorrect order - the implicit uses were added before the implicit defs. I found this bug while working on moving the implicit register operand verification code from the MIR parser to the machine verifier. This commit also makes the method 'addImplicitDefUseOperands' in the machine instruction class public so that it can be reused in the 'SIInstrInfo' class. Reviewers: Matt Arsenault Differential Revision: http://reviews.llvm.org/D11689 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243799 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-31 23:30:09 +00:00
Matt Arsenault	30eac4b85a	AMDGPU: Fix v16i32 to v16i8 truncstore git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243731 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-31 04:12:04 +00:00
Matt Arsenault	03b49c843a	AMDGPU: Don't try to use LDS/vector for private if pointer value stored If the pointer is the store's value operand, this would produce a broken module. Make sure the use is actually for the pointer operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-28 18:47:00 +00:00
Matt Arsenault	7a1c02d1f7	AMDGPU: Fix crash if called function is a bitcast getCalledFunction() is null, so this would crash. Replace crash with an error on unsupported call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243461 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-28 18:29:14 +00:00
Marek Olsak	dbd8d4f056	AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler, only SGPRs. This should be applicable for llvm 3.7 as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243294 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-27 18:16:08 +00:00
Marek Olsak	bf26b3fcae	AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243263 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-27 11:37:42 +00:00
Tom Stellard	f799b25cfc	AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242673 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 14:28:41 +00:00
Matt Arsenault	ac69d5205b	Only do fmul (fadd x, x), c combine if the fadd only has one use This was increasing the instruction count if the fadd has multiple uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242498 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-17 01:14:35 +00:00
Tom Stellard	104dab3e04	AMDGPU/SI: Negative offsets aren't allowed in MUBUF's vaddr operand Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11226 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242434 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-16 19:40:09 +00:00
Matt Arsenault	ba38e6c2ae	AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32) This can be done only with moves which theoretically will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242177 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 18:20:33 +00:00
Matt Arsenault	3aa0d7cb53	AMDGPU/SI: Fix read2 merging into a super register. If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242174 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:57:36 +00:00
Tom Stellard	adb194b458	AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11061 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242146 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 14:15:03 +00:00
Tom Stellard	f5be357d37	AMDGPU/SI: Select mad patterns to v_mac_f32 The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242038 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 15:47:57 +00:00
Matt Arsenault	ee6d5d1c9e	DAGCombiner: Assume invariant load cannot alias a store The motivation is to allow GatherAllAliases / FindBetterChain to not give up on dependent loads of a pointer from constant memory. This is important for AMDGPU, because most loads are pointers derived from a load of a kernel argument from constant memory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241948 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-10 22:17:40 +00:00
Matt Arsenault	6fe7acaaf8	AMDGPU/SI: Add debugging subtarget feature for DS offsets We don't have a good way to detect most situations where DS offsets are usable on SI, so add an option to force using them even if unsafe for debugging performance problems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-06 16:01:58 +00:00
Matthias Braun	3c76e5f588	Test for specific output in lit test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241200 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-01 22:34:59 +00:00
Matthias Braun	1a5b04c725	RegisterCoalescer: Cleanup empty subranges after shrinkToUses() A call to removeEmptySubranges() is necessary after every operation that potentially removes all segments from a subregister range; this case in the register coalescer was missing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241027 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-30 00:33:44 +00:00
Matt Arsenault	8be74e16ef	AMDGPU/SI: Fix extra space when printing v_div_fmas_* git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240911 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-28 18:16:14 +00:00
Tom Stellard	0be7d0cf17	AMDGPU/SI: Use correct resource descriptors for VI on HSA Summary: We need to set MTYPE = 2 for VI shaders when targeting the HSA runtime. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D10777 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240841 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:58:42 +00:00
Tom Stellard	4a888086a4	AMDGPU/SI: Update amd_kernel_code_t definition and add assembler support Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240839 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:58:31 +00:00
Tom Stellard	d40b451727	AMDGPU/SI: Set ELF OS/ABI to ELFOSABI_AMDGPU_HSA Reviewers: arsenm, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10708 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240832 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:15:11 +00:00
Tom Stellard	ac1a45e511	AMDGPU/SI: Add hsa code object directives Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10757 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240831 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:15:07 +00:00
Tom Stellard	4aad126e37	AMDGPU/SI: There are no implicit kernel args in the amdhsa ABI Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10706 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240830 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:15:03 +00:00
Tom Stellard	0d1bd457c6	AMDGPU/SI: Emit amd_kernel_code_t in EmitFunctionBodyStart() Summary: This way the function symbol points to the start of amd_kernel_code_t rather than the start of the function. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10705 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240829 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:14:58 +00:00
Marek Olsak	e874345be4	AMDGPU: really don't commute REV opcodes if the target variant doesn't exist If pseudoToMCOpcode failed, we would return the original opcode, so operands would be swapped, but the instruction would remain the same. It resulted in LSHLREV a, b ---> LSHLREV b, a. This fixes Glamor text rendering and piglit/arb_sample_shading-builtin-gl-sample-mask on VI. This is a candidate for stable branches. v2: the test was simplified by Tom Stellard git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240824 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 20:29:10 +00:00
Tom Stellard	9e7f0c8e77	R600/SI: Use ELF64 format instead of ELF32 Reviewers: arsenm, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10392 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240331 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-22 21:03:54 +00:00
Tom Stellard	309f60c15a	R600: Use EM_AMDGPU for the ELF Machine type Reviewers: arsenm, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10390 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240330 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-22 21:03:52 +00:00
Eric Christopher	933d2bd391	Fix "the the" in comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240112 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-19 01:53:21 +00:00
Matt Arsenault	5202cab841	Revert "Revert "Fix merges of non-zero vector stores"" Reapply r239539. Don't assume the collected number of stores is the same vector size. Just take the first N stores to fill the vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239825 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-16 15:51:48 +00:00
Tom Stellard	953c681473	R600 -> AMDGPU rename git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-13 03:28:10 +00:00

37 Commits
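
The counter bookkeeping described in commit 7d43ecc4d4 can be illustrated with a small model. The following is only a sketch under stated assumptions, not the code of the actual LLVM pass: the function `insert_waits`, the tuple layout of `program`, and the leading `s_load_dwordx4 s[52:55]` are hypothetical and introduced purely for illustration. The sketch also assumes that LGKM results may return out of order (so it always waits for `lgkmcnt(0)`), and it ignores every hazard other than reading a register whose load is still in flight.

```python
COUNTERS = ("vmcnt", "expcnt", "lgkmcnt")
# Assumption of this sketch: VM and EXP complete in issue order, LGKM may not.
IN_ORDER = {"vmcnt": True, "expcnt": True, "lgkmcnt": False}


def insert_waits(program, per_counter):
    """Return the instruction names with s_waitcnt lines inserted.

    program: list of (name, counter, defs, uses) tuples (hypothetical layout).
    per_counter=False models the behaviour before the patch (every defined
    register remembers all three counters); True models the patched behaviour
    (only the counter of the defining instruction is remembered).
    """
    issued = dict.fromkeys(COUNTERS, 0)  # operations issued so far
    done = dict.fromkeys(COUNTERS, 0)    # operations known complete after waits
    pending = {}                         # register -> counter values at its def
    out = []
    for name, counter, defs, uses in program:
        waits = {}
        for reg in uses:
            for c, need in pending.get(reg, {}).items():
                if need > done[c]:
                    # Number of newer operations allowed to stay in flight.
                    allowed = issued[c] - need if IN_ORDER[c] else 0
                    waits[c] = min(waits.get(c, allowed), allowed)
                    done[c] = max(done[c], issued[c] - allowed)
        if waits:
            args = " ".join(f"{c}({waits[c]})" for c in COUNTERS if c in waits)
            out.append("s_waitcnt " + args)
        out.append(name)
        issued[counter] += 1
        if per_counter:
            # Patched: zero for the counters that do not matter for this register.
            snapshot = {c: issued[c] if c == counter else 0 for c in COUNTERS}
        else:
            # Unpatched: remember all three counters.
            snapshot = dict(issued)
        for reg in defs:
            pending[reg] = snapshot
    return out


# The sequence from the commit message, preceded by a hypothetical earlier
# load of s[52:55] that accounts for the s_waitcnt lgkmcnt(0).
program = [
    ("s_load_dwordx4 s[52:55]", "lgkmcnt", ["s[52:55]"], []),
    ("buffer_load_format_xyzw v[8:11]", "vmcnt", ["v[8:11]"], ["s[44:47]"]),
    ("buffer_load_format_xyzw v[12:15]", "vmcnt", ["v[12:15]"], ["s[48:51]"]),
    ("s_load_dwordx4 s[44:47]", "lgkmcnt", ["s[44:47]"], []),
    ("buffer_load_format_xyzw v[16:19]", "vmcnt", ["v[16:19]"], ["s[52:55]"]),
    ("s_load_dwordx4 s[48:51]", "lgkmcnt", ["s[48:51]"], []),
    ("buffer_load_format_xyzw v[20:23]", "vmcnt", ["v[20:23]"], ["s[44:47]"]),
]

before_patch = insert_waits(program, per_counter=False)
after_patch = insert_waits(program, per_counter=True)
```

In this model `before_patch` contains the redundant `s_waitcnt vmcnt(1)` ahead of the last `buffer_load_format_xyzw`, while `after_patch` keeps only the `s_waitcnt lgkmcnt(0)`, which matches the behaviour the commit message describes.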