archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Konstantin Zhuravlyov	ae3b2037b4	AMDGPU/Metadata: Always report a fixed number of hidden arguments Currently it is 6. If the "feature" was not used, report dummy hidden argument. Otherwise it does not match the kernarg size reported in the kernel header. Differential Revision: https://reviews.llvm.org/D45129 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329341 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-05 20:46:04 +00:00
Nicolai Haehnle	83bfebdaca	AMDGPU: Dimension-aware image intrinsics Summary: These new image intrinsics contain the texture type as part of their name and have each component of the address/coordinate as individual parameters. This is a preparatory step for implementing the A16 feature, where coordinates are passed as half-floats or -ints, but the Z compare value and texel offsets are still full dwords, making it difficult or impossible to distinguish between A16 on or off in the old-style intrinsics. Additionally, these intrinsics pass the 'texfailpolicy' and 'cachectrl' as i32 bit fields to reduce operand clutter and allow for future extensibility. v2: - gather4 supports 2darray images - fix a bug with 1D images on SI Change-Id: I099f309e0a394082a5901ea196c3967afb867f04 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44939 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329166 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 10:58:54 +00:00
Nicolai Haehnle	126cd7e831	AMDGPU: Fix copying i1 value out of loop with non-uniform exit Summary: When an i1-value is defined inside of a loop and used outside of it, we cannot simply use the SGPR bitmask from the loop's last iteration. There are also useful and correct cases of an i1-value being copied between basic blocks, e.g. when a condition is computed outside of a loop and used inside it. The concept of dominators is not sufficient to capture what is going on, so I propose the notion of "lane-dominators". Fixes a bug encountered in Nier: Automata. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743 Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40547 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329164 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-04 10:57:58 +00:00
Farhana Aleen	a59291c1f6	[AMDGPU] performMinMaxCombine should not optimize patterns of vectors to min3/max3. Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3. Author: FarhanaAleen Reviewed By: arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D45219 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329131 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 23:00:30 +00:00
Farhana Aleen	d82ffe5dae	Revert "MSG" This reverts commit `9a0ce889d1`. This was committed by mistake. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329119 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 21:51:45 +00:00
Farhana Aleen	9a0ce889d1	MSG git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329114 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 21:20:39 +00:00
Stanislav Mekhanoshin	936a756969	[AMDGPU] Fixed some instructions latencies Differential Revision: https://reviews.llvm.org/D45073 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328874 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-30 16:19:13 +00:00
Michael Bedy	5488d68d0b	[AMDGPU] Fix the SDWA Peephole phase to handle src for dst:UNUSED_PRESERVE. Summary: The phase attempts to transform operations that extract a portion of a value into an SDWA src operand in cases where that value is used only once. It was not prepared for this use to be the preserved portion of a value for dst:UNUSED_PRESERVE, resulting in a crash or assert. This change either rejects the illegal SDWA attempt, or in the case where dst:WORD_1 and the src_sel would be WORD_0, removes the unneeded extract instruction. Reviewers: arsenm, #amdgpu Reviewed By: arsenm, #amdgpu Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D44364 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328856 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-30 05:03:36 +00:00
Matt Arsenault	b18554c107	AMDGPU: Support realigning stack While the stack access instructions don't care about alignment > 4, some transformations on the pointer calculation do make assumptions based on knowing the low bits of a pointer are 0. If a stack object ends up being accessed through its absolute address (relative to the kernel scratch wave offset), the addressing expression may depend on the stack frame being properly aligned. This was breaking in a testcase due to the add->or combine. I think some of the SP/FP handling logic is still backwards, and overly simplistic to support all of the stack features. Code which tries to modify the SP with inline asm for example or variable sized objects will probably require redoing this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328831 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-29 21:30:06 +00:00
Matt Arsenault	ad41f941dc	AMDGPU: Increase default stack alignment 8 and 16-byte values are common, so increase the default alignment to avoid realigning the stack in most functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328821 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-29 20:22:04 +00:00
Matt Arsenault	3ca9749f0a	AMDGPU: Fix selection error on constant loads with < 4 byte alignment git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328818 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-29 19:59:28 +00:00
Tim Renouf	9f475f3a91	Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader" This reverts commit `0daf86291d`. It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a release-with-asserts build. I will resubmit the change when I have fixed that. Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328695 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-28 11:21:07 +00:00
Tim Renouf	0daf86291d	[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader Summary: For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders). This commit fixes that to use offset 0x10 instead of offset 0 for a compute shader, per the PAL ABI spec. Reviewers: kzhuravl, nhaehnle, timcorringham Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D44468 Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328673 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-27 21:35:00 +00:00
Tim Renouf	fccceddef3	[CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo source value Summary: Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing" broke -print-machineinstrs for us on AMDGPU, because we have custom pseudo source values, and MIR serialization does not implement that. This commit at least restores the functionality of -print-machineinstrs, even if it does not properly implement the missing MIR serialization functionality. Differential Revision: https://reviews.llvm.org/D44871 Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328668 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-27 21:14:04 +00:00
Matt Arsenault	a2f8776c07	AMDGPU: Fix not preserving CSR VGPR if used for SGPR spills Before this was not done if the function had no calls in it. This is still a possible issue with any callable function, regardless of calls present. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328659 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-27 19:42:55 +00:00
Matt Arsenault	bf82806c2e	AMDGPU: Fix crash when MachinePointerInfo invalid The combine on a select of a load only triggers for addrspace 0, and discards the MachinePointerInfo. The conservative default needs to be used for this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328652 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-27 18:39:45 +00:00
Matt Arsenault	a833b4252d	AMDGPU: Fix register name format in tests These were changed to match the asm output name a long time ago, although I think the old tablegenerated names still work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328651 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-27 18:39:42 +00:00
Matt Arsenault	aaf7156232	AMDGPU: Fix FP restore from being reordered with stack ops In a function, s5 is used as the frame base SGPR. If a function is calling another function, during the call sequence it is copied to a preserved SGPR and restored. Before it was possible for the scheduler to move stack operations before the restore of s5, since there's nothing to associate a frame index access with the restore. Add an implicit use of s5 to the adjcallstack pseudo which ends the call sequence to prevent this from happening. I'm not 100% satisfied with this solution, but I'm not sure what else would be better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328650 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-27 18:38:51 +00:00
Tony Tye	9272c8addc	[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue. Differential Revision: https://reviews.llvm.org/D44697 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328351 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-23 18:58:47 +00:00
Tony Tye	2b4b7fe362	[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target. - Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS. Differential Revision: https://reviews.llvm.org/D43736 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328349 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-23 18:45:18 +00:00
Sanjay Patel	3e65cc15a0	[InstSimplify] fp_binop X, NaN --> NaN We propagate the existing NaN value when possible. Differential Revision: https://reviews.llvm.org/D44521 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328140 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-21 19:31:53 +00:00
Sanjay Patel	0245f1cd62	[AMDGPU] change test to avoid NaN math git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327891 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-19 19:26:22 +00:00
Sanjay Patel	8a350b553d	[AMDGPU] adjust tests to be nan-free As suggested in D44521 - bitcast to integer for the math, so we preserve the intent of these tests when NaN math gets folded away. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327890 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-19 19:23:53 +00:00
Matt Arsenault	41fae9f61a	AMDGPU/GlobalISel: RegBankSelect for basic int ops git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327843 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-19 14:07:23 +00:00
Matt Arsenault	fe57640983	AMDGPU: Don't leave dead illegal VGPR->SGPR copies Normally DCE kills these, but at -O0 these get left behind leaving suspicious looking illegal copies. Replace with IMPLICIT_DEF to avoid iterator issues. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327842 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-19 14:07:15 +00:00
Matt Arsenault	65181f7b75	AMDGPU/GlobalISel: Cleanup constant legality git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327774 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-17 15:17:48 +00:00
Matt Arsenault	417485b734	AMDGPU/GlobalISel: Basic G_GEP legality git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327773 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-17 15:17:45 +00:00
Matt Arsenault	177d1142dd	AMDGPU/GlobalISel: Basic legality for load/store git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327772 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-17 15:17:41 +00:00
Farhana Aleen	7c98e88dc9	[AMDGPU] Supported ds_write_b128 generation. Summary: This is a follow-on patch of https://reviews.llvm.org/D44210 Author: FarhanaAleen Reviewed By: msearles Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44319 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327726 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-16 18:12:00 +00:00
Dmitry Preobrazhensky	a5e8c708f7	[AMDGPU][MC][GFX8][GFX9][DISASSEMBLER] Added "_e32" suffix to 32-bit VINTRP opcodes See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751 Differential Revision: https://reviews.llvm.org/D44529 Reviewers: artem.tamazov, arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327723 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-16 16:38:04 +00:00
Mark Searles	b30a83dec3	[AMDGPU] Waitcnt pass: Modify the waitcnt pass to propagate info in the case of a single basic block loop. mergeInputScoreBrackets() does this for us; update it so that it processes the single bb's score bracket when processing the single bb's preds. It is, after all, a pred of itself, so its score bracket is needed. Differential Revision: https://reviews.llvm.org/D44434 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327583 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-14 22:04:32 +00:00
Francis Visoiu Mistrih	0d758f3663	[CodeGen] Use MIR syntax for MachineMemOperand printing Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)". rdar://38163529 Differential Revision: https://reviews.llvm.org/D42377 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327580 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-14 21:52:13 +00:00
Yaxun Liu	d4b84fce52	[AMDGPU] Fix lowering enqueue kernel when kernel has no name Since the enqueued kernels have internal linkage, their names may be dropped. In this case, give them unique names __amdgpu_enqueued_kernel or __amdgpu_enqueued_kernel.n where n is a sequential number starting from 1. Differential Revision: https://reviews.llvm.org/D44322 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327291 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-12 16:34:06 +00:00
Dmitry Preobrazhensky	a70206a47a	[AMDGPU][MC] Corrected GATHER4 opcodes See bug 36252: https://bugs.llvm.org/show_bug.cgi?id=36252 Differential Revision: https://reviews.llvm.org/D43874 Reviewers: artem.tamazov, arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327278 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-12 15:03:34 +00:00
Matt Arsenault	7f9dbc4419	AMDGPU/GlobalISel: Legality and RegBankInfo for G_{INSERT\|EXTRACT}_VECTOR_ELT git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327269 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-12 13:35:53 +00:00
Matt Arsenault	e0eff38b22	AMDGPU/GlobalISel: InstrMapping for G_MERGE_VALUES git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327268 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-12 13:35:49 +00:00
Matt Arsenault	b3834e5d6b	AMDGPU/GlobalISel: Make some G_MERGE_VALUEs legal git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327267 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-12 13:35:43 +00:00
Sanjay Patel	00cb8ab926	[AMDGPU] fix tests to be independent of FP undef git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327211 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-10 16:39:59 +00:00
Matt Arsenault	5c56853ab7	AMDGPU: Fix crash when constant folding with physreg operand git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327209 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-10 16:05:35 +00:00
Farhana Aleen	2006e6286b	[AMDGPU] Supported ds_read_b128 generation; Widened vector length for local address-space. Summary: Starting from GCN 2nd generation, ISA supports ds_read_b128 on top of ds_read_b64. This patch supports ds_read_b128 instruction pattern and generation of this instruction. In the vectorizer, this patch also widens the vector length so that the vectorizer generates 128 bit loads for local address-space which get translated to ds_read_b128. Since the performance benefit is not clear, the compiler generates ds_read_b128 under -amdgpu-ds128. Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44210 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327153 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-09 17:41:39 +00:00
Sanjay Patel	21de18a5cc	[AMDGPU] fix test to be independent of FP undef git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327147 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-09 16:33:34 +00:00
Stanislav Mekhanoshin	c88f3543c0	[AMDGPU] Fixed V_DIV_FIXUP_F16 selection on GFX9 GFX9 should select opsel version. Differential Revision: https://reviews.llvm.org/D44279 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327106 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-09 07:21:43 +00:00
Sanjay Patel	74ff3cc8bd	[AMDGPU] fix test to survive more FP undef constant folding git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327066 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-08 21:30:56 +00:00
Sanjay Patel	a6a4aed947	[AMDGPU] fix test to survive the most basic undef constant folding This will likely need to be changed again for anything more than: fmul undef, undef -> undef git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327034 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-08 17:34:25 +00:00
Farhana Aleen	084dcd89de	[AMDGPU] Increased vector length for global/constant loads. Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44179 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326910 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-07 17:09:18 +00:00
Farhana Aleen	832984ded2	Revert "[AMDGPU] Widened vector length for global/constant address space." This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326907 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-07 16:55:27 +00:00
Farhana Aleen	a446275ee2	[AMDGPU] Widened vector length for global/constant address space. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326904 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-07 16:29:05 +00:00
Yaxun Liu	2d74623e2e	[AMDGPU] Fix lowering OpenCL enqueue_kernel One addrspacecast disappeared in clang emitted IR for block invoke function due to adoption of the new addr space mapping. Differential Revision: https://reviews.llvm.org/D43785 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326806 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-06 16:04:39 +00:00
Matt Arsenault	f246669c10	AMDGPU/GlobalISel: Add InstrMapping for G_EXTRACT git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326715 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-05 16:25:18 +00:00
Matt Arsenault	4e77263fcb	AMDGPU/GlobalISel: Make some G_EXTRACTs legal As far as I can tell legalization of weird sizes for the output type isn't implemented. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326714 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-05 16:25:15 +00:00

1518 Commits