archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Matt Arsenault	4fd45ebabd	AMDGPU: Fix shouldConvertConstantLoadToIntImm behavior This should really be true for any immediate, not just inline ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277260 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-30 01:40:36 +00:00
Changpeng Fang	539fec5dc2	AMDGPU/SI: Don't handle a loop if there is no loop at all for a terminator BB. Differential Revision: http://reviews.llvm.org/D22021 Reviewed by: arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277073 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 23:01:45 +00:00
Wei Ding	ee8c4ca1e1	AMDGPU : Add intrinsics for compare with the full wavefront result Differential Revision: http://reviews.llvm.org/D22482 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276998 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 16:42:13 +00:00
Nicolai Haehnle	b18ca96c79	AMDGPU: add execfix flag to SI_ELSE Summary: SI_ELSE is lowered into two parts: s_or_saveexec_b64 dst, src (at the start of the basic block) s_xor_b64 exec, exec, dst (at the end of the basic block) The idea is that dst contains the exec mask of the preceding IF block. It can happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside the basic block that contains SI_ELSE, in which case it introduces an instruction s_and_b64 exec, exec, s[...] which masks out bits that can correspond to both the IF and the ELSE paths. So the resulting sequence must be: s_or_saveexec_b64 dst, src s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode s_and_b64 dst, dst, exec <-- added by SILowerControlFlow s_xor_b64 exec, exec, dst Whether to add the additional s_and_b64 dst, dst, exec is currently determined via the ExecModified tracking. With this change, it is instead determined by an additional flag on SI_ELSE which is set by SIWholeQuadMode. Finally: It also occurred to me that an alternative approach for the long run is for SILowerControlFlow to unconditionally emit s_or_saveexec_b64 dst, src ... s_and_b64 dst, dst, exec s_xor_b64 exec, exec, dst and have a pass that detects and cleans up the "redundant AND with exec" pattern where possible. This could be useful anyway, because we also add instructions s_and_b64 vcc, exec, vcc before s_cbranch_scc (in moveToALU), and those are often redundant. I have some pending changes to how KILL is lowered that could also benefit from such a cleanup pass. In any case, this current patch could help in the short term with the whole ExecModified business. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22846 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276972 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 11:39:24 +00:00
Matt Arsenault	f799c706db	AMDGPU: Use rcp for fdiv 1, x with fpmath metadata Using rcp should be OK for safe math usually, so this should not be replacing the original fdiv. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276823 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 23:25:44 +00:00
Matt Arsenault	c43677a11d	AMDGPU: Add more tests for LDS size with occupancy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276821 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 23:15:59 +00:00
Matthias Braun	ad0f5f6b52	MIRParser: Use dot instead of colon to mark subregisters Change the syntax to use `%0.sub8` to denote a subregister. This seems like a more natural fit to denote subregisters; I also plan to introduce a new ":classname" syntax in upcoming patches to denote the register class of a vreg. Note that this commit disallows plain identifiers to start with a '.' character. This shouldn't affect anything as external names/IR references are all prefixed with '$'/'%', plain identifiers are only used for instruction names, register mask names and subreg indexes. Differential Revision: https://reviews.llvm.org/D22390 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276815 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 21:49:34 +00:00
Tim Northover	d96170e773	GlobalISel: omit braces on MachineInstr types when there's only one. Tidies up the representation a bit in the common case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276772 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 17:28:01 +00:00
Matt Arsenault	cc67a0a36a	AMDGPU: Add missing tests for xnack option for HSA git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276765 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:50 +00:00
Matt Arsenault	ee4cdb7b75	AMDGPU: Add fp legacy instruction intrinsics This could use some additional optimization work to use mad/mac legacy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276764 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:45 +00:00
Jan Vesely	4a44da0c82	AMDGPU: Remove read_workdim intrinsic Differential revision: https://reviews.llvm.org/D22732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276682 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-25 20:17:02 +00:00
Matt Arsenault	9b4a967989	AMDGPU: Fix missing verify-machineinstrs in control flow test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276679 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-25 19:39:06 +00:00
Tom Stellard	a6b9e20623	Revert "[AMDGPU] Emit read-only data to .rodata for hsa" This reverts commit r276298. Data stored in .rodata can have a negative offset from .text, but we don't support negative values in relocations yet. This caused a regression in one of the amp conformance tests: 5_Data_Cont/5_2_a_v/5_2_3_m/Assignment/Test.02.01 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276498 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 23:46:40 +00:00
Tim Northover	3921674c30	GlobalISel: allow multiple types on MachineInstrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276481 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 22:13:36 +00:00
Anna Thomas	80ee170cb3	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any address space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: apilipenko, reames Subscribers: llvm-commits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276447 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:49:40 +00:00
Matt Arsenault	c5a5706d17	AMDGPU: Remove redundant test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276439 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:36 +00:00
Matt Arsenault	9da217ee1e	AMDGPU: Fix groupstaticsize for large LDS The size can exceed s_movk_i32's limit, and we don't want to use it this early since it inhibits optimizations. This should probably be merged to the release branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276438 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:33 +00:00
Matt Arsenault	30f0e3e4be	AMDGPU: Add HSA dispatch id intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276437 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:30 +00:00
Matt Arsenault	7488ab3114	AMDGPU: Fix i1 fp_to_int R600's i1 fp_to_uint selected but was incorrect according to what instcombine constant folds to. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276435 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:21 +00:00
Anna Thomas	d89a69b5fd	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276316. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276320 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-21 19:06:28 +00:00
Anna Thomas	4227f92f58	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any address space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: tstellarAMD, reames, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22519 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276316 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-21 18:41:44 +00:00
Konstantin Zhuravlyov	82910c89dd	[AMDGPU] Emit read-only data to .rodata for hsa Differential Revision: https://reviews.llvm.org/D22538 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276298 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-21 15:59:23 +00:00
Matt Arsenault	a9994065f9	AMDGPU: Fix phis from blocks split due to register indexing git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276257 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-21 09:40:57 +00:00
Tim Northover	4951996d06	GlobalISel: implement low-level type with just size & vector lanes. This should be all the low-level instruction selection needs to determine how to implement an operation, with the remaining context taken from the opcode (e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276158 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 19:09:30 +00:00
Matt Arsenault	20e6e25350	AMDGPU: Add missing test coverage for control flow breaks None of the current lit tests hit si_break handling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276129 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 15:20:35 +00:00
Yaxun Liu	59e8cabf31	AMDGPU: Fix bug causing crash due to invalid opencl version metadata. Differential Revision: https://reviews.llvm.org/D22526 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 14:38:06 +00:00
Matthias Braun	e3d8cd87b2	Revert "RegScavenging: Add scavengeRegisterBackwards()" Reverting this commit for now as it seems to be causing failures on test-suite tests on the clang-ppc64le-linux-lnt bot. This reverts commit r276044. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276068 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 00:21:32 +00:00
Matt Arsenault	63be72069d	AMDGPU: Change fdiv lowering based on !fpmath metadata If 2.5 ulp is acceptable, denormals are not required, and isn't a reciprocal which will already be handled, replace with a faster fdiv. Simplify the lowering tests by using per function subtarget features. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276051 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 23:16:53 +00:00
Matthias Braun	c5e14e0478	RegScavenging: Add scavengeRegisterBackwards() This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276044 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 22:37:09 +00:00
Matt Arsenault	4cead0b564	AMDGPU: Expand register indexing pseudos in custom inserter This is to help move SILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275934 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 00:35:03 +00:00
Matt Arsenault	f36ea238a4	AMDGPU: Fix test name and broken CHECK-LABEL git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275928 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-18 23:09:51 +00:00
Matt Arsenault	bb09cfd86f	AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275871 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-18 18:35:05 +00:00
Matt Arsenault	40ca91a07a	AMDGPU/R600: Replace barrier intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275870 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-18 18:34:59 +00:00
Matt Arsenault	865e2fa1dc	AMDGPU: Remove dead check in AMDGPUPromoteAlloca This is currently only called with GEP users. A direct alloca would only happen with current typed pointers for arrays which are a perverse case. Also fix crashes on 0 x and 1 x arrays. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275869 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-18 18:34:53 +00:00
Nicolai Haehnle	0c05ce4746	AMDGPU: Disable AMDGPUPromoteAlloca pass for shader calling conventions. Summary: The work item intrinsics are not available for the shader calling conventions. And even if we did hook them up most shader stages have some extra restrictions on the amount of available LDS. Reviewers: tstellarAMD, arsenm Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D20728 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275779 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-18 09:02:47 +00:00
Yaxun Liu	384c6423e5	Re-commit [AMDGPU] Add metadata for runtime Attempting to fix lit test failure on ppc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275676 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-16 05:09:21 +00:00
Matt Arsenault	e066e581b1	AMDGPU: Fix verifier error from partially undef copy In this situation: %VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11, %VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use> %VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3 %VGPR4<def> = COPY %VGPR2 The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1, but VGPR4 is defined immediately after this copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275635 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 22:32:02 +00:00
Matt Arsenault	35290cc53d	AMDGPU: Remove brev intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275620 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 21:27:13 +00:00
Matt Arsenault	5fecfa22e5	AMDGPU: Fix TargetPrefix for remaining r600 intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275619 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 21:27:08 +00:00
Matt Arsenault	a47e87a336	AMDGPU: Remove AMDGPU.ldexp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275618 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 21:26:56 +00:00
Matt Arsenault	7150fbf236	AMDGPU: Remove legacy rsq.clamped intrinsic Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining. Also fix mismatch with non-IEEE rsq selecting to IEEE rsq. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275617 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 21:26:52 +00:00
Vitaly Buka	a6cb7108c4	Revert "[AMDGPU] Add metadata for runtime" This reverts commit r275566. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275599 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 19:14:57 +00:00
Yaxun Liu	6b0141c6fb	[AMDGPU] Add metadata for runtime Added emitting metadata to elf for runtime. Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream. Differential Revision: https://reviews.llvm.org/D21849 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275566 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 14:58:21 +00:00
Matt Arsenault	beff7fe056	AMDGPU: Fix not expanding control flow after some kill blocks Also stop trying to insert skip blocks at end_cf. This was inserting them at the end of the block which doesn't make sense. The skip should be inserted at the beginning of the block right after the end cf. Just remove this for now since no tests seem to stress this and I think this can be handled more generally later. Fixes bug 28550 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275510 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 00:58:15 +00:00
Matt Arsenault	011dcf3d90	AMDGPU: Fix trying to skip from a block with no successors Found while reducing bug 28550 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275509 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 00:58:13 +00:00
Matt Arsenault	435a4467a3	AMDGPU: Fix splitting kill blocks with defs before kill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275508 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 00:58:09 +00:00
Matt Arsenault	4d120e9b24	AMDGPU/R600: Delete/rename intrinsics no longer used by mesa Use the replacement pass to update the tests, and delete old names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275375 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-14 05:47:17 +00:00
Matt Arsenault	759af1e5a2	AMDGPU: Remove unused intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275371 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-14 05:23:19 +00:00
Matt Arsenault	fb5f7807e0	AMDGPU: Fix test not actually testing anything It wasn't actually running the pass, and since it is missing the llvm prefix, the eh intrinsic was not really an IntrinsicInst. Also add missing test for lifetime markers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275370 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-14 05:23:15 +00:00
Quentin Colombet	3d35f0d482	[MIR] Print on the given output instead of stderr. Currently the MIR framework prints all its outputs (errors and actual representation) on stderr. This patch fixes that by printing the regular output in the output specified with -o. Differential Revision: http://reviews.llvm.org/D22251 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275314 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-13 20:36:03 +00:00


509 Commits