archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Nicolai Haehnle	a2a1a4f194	AMDGPU: Fix return of non-void-returning shaders Summary: Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that ensures that a non-void-returning shader falls off the end of the last basic block was effectively disabled, since SI_RETURN is now used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21975 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274612 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-06 08:35:17 +00:00
Matt Arsenault	18a21c4966	DAGCombiner: Fold away vector extract of insert with the same index This only really matters when the index is non-constant since the constant case already gets taken care of by other combines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274569 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-05 18:25:02 +00:00
Matt Arsenault	6de48c50fe	AMDGPU: Fix folding SGPRs into madak/madmk src0 Because of the special immediate operand, the constant bus is already used so SGPRs are never useful. r263212 changed the name of the immediate operand, which broke the verifier check for the restriction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274564 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-05 17:09:01 +00:00
Matt Arsenault	4a37139fe5	TII: Fix inlineasm size counting comments as insts The main problem was counting comments on their own line as instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274405 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 23:26:50 +00:00
Matt Arsenault	6e4fd1497d	AMDGPU: Add feature for unaligned access git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274398 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 23:03:44 +00:00
Matt Arsenault	d4452f8fcf	AMDGPU: Expand unaligned accesses early Due to visit order problems, in the case of an unaligned copy the legalized DAG fails to eliminate extra instructions introduced by the expansion of both unaligned parts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274397 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 22:55:55 +00:00
Matt Arsenault	df587174eb	AMDGPU: Improve load/store of illegal types. There was a combine before to handle the simple copy case. Split this into handling loads and stores separately. We might want to change how this handles some of the vector extloads, since this can result in large code size increases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274394 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 22:47:50 +00:00
Nikolay Haustov	2619d465a5	Resubmit r268719 - AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2. This was reverted in r268740 because of problems with corresponding Clang change. Clang change was updated and resubmitted in r274220. Check calling convention in AMDGPUMachineFunction::isKernel This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF. Also, in the future unused non-kernels may be optimized. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19917 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274341 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 10:00:58 +00:00
Matt Arsenault	1bf162a64a	AMDGPU: Fix out of bounds indirect indexing errors This was producing accesses to registers beyond the super register's limits, resulting in verifier failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273977 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-28 01:09:00 +00:00
Matt Arsenault	d35aece639	AMDGPU: Implement per-function subtargets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273940 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 20:48:03 +00:00
Matt Arsenault	dca409d5ad	AMDGPU: Move subtarget feature checks into passes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273937 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 20:32:13 +00:00
Matt Arsenault	5123c149e7	AMDGPU: Fix verifier errors with undef vector indices Also fix pointlessly adding exec to liveins. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273916 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 19:57:44 +00:00
Matt Arsenault	bd288e1778	DAGCombiner: Don't narrow volatile vector loads + extract git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273909 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 19:31:04 +00:00
Jan Vesely	d207fc4c12	AMDGPU/R600: Fix GlobalValue regressions. Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary expressions by including offset even if the offset is 0 (we haven't hit this sooner since tested workloads don't include static offsets). We don't really care about the type of expression, so set it directly. Fixes: r272705 Consider section relative relocations. Since all const as data is in one buffer, section relative is equivalent to abs32. Fixes: r273166 Differential Revision: http://reviews.llvm.org/D21633 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273785 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-25 18:24:16 +00:00
Konstantin Zhuravlyov	20c7a48718	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273769 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-25 03:11:28 +00:00
Tom Stellard	16fa6f1061	AMDGPU/SI: Make sure not to fold offsets into local address space globals Summary: Offset folding only works if you are emitting relocations, and we don't emit relocations for local address space globals. Reviewers: arsenm, nhaustov Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21647 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273765 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-25 01:59:16 +00:00
Matthias Braun	f011e37181	MachineScheduler: Fully compare top/bottom candidates In bidirectional scheduling this gives more stable results than just comparing the "reason" fields of the top/bottom node because the reason field may be higher depending on what other nodes are in the queue. Differential Revision: http://reviews.llvm.org/D19401 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273755 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-25 00:23:00 +00:00
Matthias Braun	0791b66fef	AMDGPU: Define a schedule class for COPY. COPY was lacking a scheduling class, define it to avoid regressions in the upcoming change to the bidirectional MachineScheduler. Approved by tstellar on IRC. Differential Revision: http://reviews.llvm.org/D21540 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273751 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 23:52:11 +00:00
Matt Arsenault	11c2d4bf28	AMDGPU: Add stub custom CodeGenPrepare pass This will do various things including ones CodeGenPrepare does, but with knowledge of uniform values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273657 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 07:07:55 +00:00
Matt Arsenault	8b2f86f045	AMDGPU: Un-xfail and add tests Un XFAIL a few tests plus a few more I had lying around in my tree, which seem to all work now but I don't see tests that quite test the same things. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273655 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:58:01 +00:00
Matt Arsenault	9af2418e41	AMDGPU: Remove disable-irstructurizer subtarget feature The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273653 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:30:22 +00:00
Diana Picus	a4a23eae96	[AMDGPU] Remove exit-on-error in test (PR27761) The exit-on-error flag was necessary in order to avoid an assertion when handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize. We can avoid the assertion by creating some dummy nodes. This enables us to remove the exit-on-error flag on the first 2 run lines (SI), but on the third run line (R600) we would run into another assertion when trying to reserve indirect registers. This patch also replaces that assertion with an early exit from the function. Fixes PR27761. Differential Revision: http://reviews.llvm.org/D20852 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273550 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-23 09:19:16 +00:00
Matt Arsenault	fddf7f599f	AMDGPU: Fix liveness when expanding m0 loop git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273514 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-22 23:40:57 +00:00
Changpeng Fang	7cde679f44	AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32 Reviewers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D21533 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273496 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-22 21:33:49 +00:00
Matt Arsenault	e22857013f	AMDGPU: Fix verifier errors in SILowerControlFlow The main sin this was committing was using terminator instructions in the middle of the block, and then not updating the block successors / predecessors. Split the blocks up to avoid this and introduce new pseudo instructions for branches taken with exec masking. Also use a pseudo instead of emitting s_endpgm and erasing it in the special case of a non-void return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273467 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-22 20:15:28 +00:00
Wei Ding	ef86963806	AMDGPU: Add convergent flag to INLINEASM instruction. Differential Revision: http://reviews.llvm.org/D21214 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273455 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-22 18:51:08 +00:00
Jan Vesely	7d5ce4d892	AMDGPU: Add implicitarg.ptr intrinsic. Points to the start of implicit arguments (appended after explicit arguments) Differential Revision: http://reviews.llvm.org/D20297 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273317 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-21 20:46:20 +00:00
Matt Arsenault	b2902b2eb0	AMDGPU: Preserve undef flag on vcc when shrinking v_cndmask_b32 The implicit operand is added by the initial instruction construction, so this was adding an additional vcc use. The original one was missing the undef flag the original condition had, so the verifier would complain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273182 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 18:34:00 +00:00
Matt Arsenault	17f22f98eb	AMDGPU: Fold more custom nodes to undef This will help sneak undefs past GVN into the DAG for some tests. Also add missing intrinsic for rsq_legacy, even though the node was already selected to the instruction. Also start passing the debug location to intrinsic errors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273181 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 18:33:56 +00:00
Matt Arsenault	96ad9ea23d	Generalize DiagnosticInfoStackSize to support other limits Backends may want to report errors on resources other than stack size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273177 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 18:13:04 +00:00
Matt Arsenault	61691ce470	AMDGPU: Use correct method for determining instruction size git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273172 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 17:51:32 +00:00
Tom Stellard	75473ec73e	AMDGPU: Add support for R_AMDGPU_REL32 relocations Reviewers: arsenm, kzhuravl, rafael Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21401 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273168 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 17:33:43 +00:00
Tom Stellard	a0adb8d997	AMDGPU: Emit R_AMDGPU_ABS32_{HI,LO} for scratch buffer relocations Reviewers: arsenm, rafael, kzhuravl Subscribers: rafael, arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21400 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273166 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 16:59:44 +00:00
Matt Arsenault	115244a728	AMDGPU: Fix kernel argument alignment impacting stack size Don't use AllocateStack because kernel arguments have nothing to do with the stack. The ensureMaxAlignment call was still changing the stack alignment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273080 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-18 05:15:53 +00:00
Matt Arsenault	863cff46f2	AMDGPU: Temporarily select trap to s_endpgm This should select to s_trap, but that requires additional work to set up and enable the trap handler. For now emit s_endpgm so bugpoint stops getting stuck on the unsupported call to abort. Emit a warning that this will only terminate the wave and not really trap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273062 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-17 22:27:03 +00:00
Matt Arsenault	310a3752c0	AMDGPU: Remove llvm.SI.tid intrinsic Mesa doesn't emit this for llvm >= 3.8 anymore. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273050 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-17 21:18:41 +00:00
Matt Arsenault	11e5e3bbe1	AMDGPU: Disable scheduling in some slow tests Disabling the pre-RA scheduler on large-work-group-registers causes it to be ~50% slower. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272860 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-16 00:56:47 +00:00
Nicolai Haehnle	682fc3e780	AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272761 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-15 07:13:05 +00:00
Matt Arsenault	6af03e5068	AMDGPU: Run pointer optimization passes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272736 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-15 00:11:01 +00:00
Marek Olsak	760c36c5ae	AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272556 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-13 16:05:57 +00:00
Tom Stellard	4ee3d0cb4d	AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations Summary: We need to set the fixup type to FK_Data_4 for the SCRATCH_RSRC_DWORD[01] symbols, since these require absolute relocations, and fixup_si_rodata is for relative relocations. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21153 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272417 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 19:26:38 +00:00
Matt Arsenault	dbaa4b4486	AMDGPU: v_cndmask_b32 does not def vcc Fixes verifier errors after SIShrinkInstructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272351 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 00:18:41 +00:00
Tom Stellard	60f588f570	AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_permute Summary: This fixes a bug with ds_permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272349 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 00:01:04 +00:00
Matt Arsenault	4080a06a24	AMDGPU: Fix flat atomics The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272345 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:42:54 +00:00
Matt Arsenault	bada556f73	AMDGPU: Fix i64 global cmpxchg This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272344 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:42:48 +00:00
Matt Arsenault	003d842e7f	AMDGPU: Fix missing and broken check lines in atomic tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272343 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:42:44 +00:00
Wei Ding	39ce7152a2	AMDGPU/SI: Fix 32-bit fdiv lowering We were using the fast fdiv lowering for all division; an implementation of IEEE754 fdiv is added. http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272292 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 19:17:15 +00:00
Jan Vesely	406c47ff89	SelectionDAG: Implement expansion of {S,U}MIN/MAX in integer legalization Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D17898 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272272 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 16:04:00 +00:00
Nicolai Haehnle	2ac1fa00c9	AMDGPU: Add amdgpu-ps-wqm-outputs function attributes Summary: The presence of this attribute indicates that VGPR outputs should be computed in whole quad mode. This will be used by Mesa for prolog pixel shaders, so that derivatives can be taken of shader inputs computed by the prolog, fixing a bug. The generated code could certainly be improved: if a prolog pixel shader is used (which isn't common in modern OpenGL - they're used for gl_Color, polygon stipples, and forcing per-sample interpolation), Mesa will use this attribute unconditionally, because it has to be conservative. So WQM may be used in the prolog when it isn't really needed, and furthermore a silly back-and-forth switch is likely to happen at the boundary between prolog and main shader parts. Fixing this is a bit involved: we'd first have to add a mechanism by which LLVM writes the WQM-related input requirements to the main shader part binary, and then Mesa specializes the prolog part accordingly. At that point, we may as well just compile a monolithic shader... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272063 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 21:37:17 +00:00
Eric Christopher	c2a7f10882	Revert "Differential Revision: http://reviews.llvm.org/D20557 " Author: Wei Ding <wei.ding2@amd.com> Date: Tue Jun 7 19:04:44 2016 +0000 Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8 as it was breaking the bots. This reverts commit r272044. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272056 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 20:27:12 +00:00
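
The scratch layout described in the debugger prologue commit (20c7a48718) lists fixed byte offsets for the work group and work item IDs. A minimal Python sketch of that layout follows. The offsets come from the commit message; everything else is an assumption made for illustration: the 32-bit little-endian value width (suggested by the 4-byte spacing but not stated), the 28-byte region size, the unused bytes at offsets 12 to 15, and the hypothetical helper names `pack_debugger_ids` and `read_debugger_id`. It models the memory image only and is not code from the backend.

```python
import struct

# Byte offsets as listed in the commit message for 20c7a48718.
DEBUGGER_PROLOGUE_OFFSETS = {
    "work_group_id_x": 0,
    "work_group_id_y": 4,
    "work_group_id_z": 8,
    "work_item_id_x": 16,
    "work_item_id_y": 20,
    "work_item_id_z": 24,
}

# Assumption: each ID is an unsigned 32-bit little-endian value.
VALUE_FORMAT = "<I"
# Assumption: the region ends right after the last listed field.
REGION_SIZE = 24 + struct.calcsize(VALUE_FORMAT)


def pack_debugger_ids(work_group_id, work_item_id):
    """Build a scratch image holding (x, y, z) IDs at the listed offsets."""
    values = dict(zip(
        ("work_group_id_x", "work_group_id_y", "work_group_id_z"),
        work_group_id,
    ))
    values.update(zip(
        ("work_item_id_x", "work_item_id_y", "work_item_id_z"),
        work_item_id,
    ))
    region = bytearray(REGION_SIZE)  # bytes 12..15 stay zero (assumed unused)
    for name, offset in DEBUGGER_PROLOGUE_OFFSETS.items():
        struct.pack_into(VALUE_FORMAT, region, offset, values[name])
    return bytes(region)


def read_debugger_id(region, name):
    """Read one ID back from a scratch image by field name."""
    offset = DEBUGGER_PROLOGUE_OFFSETS[name]
    return struct.unpack_from(VALUE_FORMAT, region, offset)[0]


if __name__ == "__main__":
    image = pack_debugger_ids((1, 2, 3), (4, 5, 6))
    for field in DEBUGGER_PROLOGUE_OFFSETS:
        print(field, read_debugger_id(image, field))
```

A debugger-side reader would only need the offset table; the gap between offset 8 and offset 16 is kept as zero padding here because the commit message does not say what, if anything, lives there.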

442 Commits