archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Matt Arsenault	1ad9b2d946	AMDGPU: Actually write nops for writeNopData Before this was just writing 0s, which ends up looking like a v_cndmask_b32 v0, s0, v0, vcc. Write out an encoded s_nop instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299816 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-08 21:28:38 +00:00
Stanislav Mekhanoshin	abcd91992d	[AMDGPU] Unroll more to eliminate phis and conditions Increase threshold to unroll a loop which contains an "if" statement whose condition defined by a PHI belonging to the loop. This may help to eliminate if region and potentially even PHI itself, saving on both divergence and registers used for the PHI. Add a small bonus for each of such "if" statements. Differential Revision: https://reviews.llvm.org/D31693 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299779 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-07 16:26:28 +00:00
Sam Kolton	218a5a7e27	[AMDGPU] Move SiShrinkInstruction and SDWAPeephole to SSAOptimization passes Summary: The difference between PreRegAlloc() and MachineSSAOptimization() is that the former runs even at the -O0 optimization level. In my understanding SiShrinkInstructions and SDWAPeephole shouldn't run when optimizations are disabled. With this change the order of passes will not change. Reviewers: arsenm, vpykhtin, rampitec Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31705 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299757 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-07 10:53:12 +00:00
Konstantin Zhuravlyov	4d40c97796	AMDGPU/GFX9: Fix shared and private aperture queries Differential Revision: https://reviews.llvm.org/D31786 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299727 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 23:02:33 +00:00
Eli Friedman	f25acacbe6	Turn on -addr-sink-using-gep by default. The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here. Differential Revision: https://reviews.llvm.org/D30596 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299723 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 22:42:18 +00:00
Matt Arsenault	c82755f01b	AMDGPU: Diagnose illegal SGPR to VGPR copies This is possible in ways that are not compiler bugs, so stop asserting on them. This emits an extra error when emitting objects when it can't encode the new pseudo, but I'm not sure that matters. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299712 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 21:09:53 +00:00
Matt Arsenault	34d5677726	AMDGPU: Replace fp16SrcZerosHighBits with a whitelist FCOPYSIGN is lowered to bit operations which don't clear the high bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299708 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 20:58:30 +00:00
Stanislav Mekhanoshin	3b10e5fb8d	[AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront size If a workgroup size is known to be not greater than the wavefront size, the s_barrier instruction is not needed since all threads are guaranteed to come to the same point at the same time. Differential Revision: https://reviews.llvm.org/D31731 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299659 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 16:48:30 +00:00
Sam Kolton	3481b0868e	[AMDGPU] Resubmit SDWA peephole: enable by default Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299654 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 15:03:28 +00:00
Ivan Krasin	f836c7dbde	Revert r299536. [AMDGPU] SDWA peephole: enable by default. Reason: breaks multiple bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173 Original Review URL: https://reviews.llvm.org/D31671 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299583 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-05 19:58:12 +00:00
Sam Kolton	4523389543	[AMDGPU] SDWA peephole: enable by default Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299536 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-05 12:00:45 +00:00
Matt Arsenault	dce3b51aea	Verifier: Check some amdgpu calling convention restrictions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299457 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 18:43:11 +00:00
Matt Arsenault	452506a655	AMDGPU: Remove legacy export intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299444 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 16:34:39 +00:00
Matt Arsenault	f486610d1f	AMDGPU: Remove legacy image intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299443 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 16:34:35 +00:00
Matt Arsenault	513e714dfd	AMDGPU: Remove llvm.SI.vs.load.input git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299391 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-03 21:45:13 +00:00
Matt Arsenault	5b7b340242	DAG: Fix missing legalization for any_extend_vector_inreg operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299389 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-03 21:28:13 +00:00
Matt Arsenault	cd7c9c3178	AMDGPU: Remove legacy bfe intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299372 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-03 18:08:08 +00:00
Matt Arsenault	c4de629ce2	AMDGPU: Remove unnecessary ands when f16 is legal Add a new node to act as a fancy bitcast from f16 operations to i32 that implicitly zero the high 16-bits of the result. Alternatively could try making v2f16 legal and canonicalizing on build_vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299246 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 19:53:03 +00:00
Jan Vesely	1abd9ecbfc	AMDGPU/R600: Fix amdgpu alias analysis pass. R600 uses higher AS number to access kernel parameters Fixes: r298846 Differential Revision: https://reviews.llvm.org/D31520 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299245 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 19:26:23 +00:00
Sam Kolton	ff45a18189	[AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns Previously compiler often extracted common immediates into specific register, e.g.: ``` %vreg0 = S_MOV_B32 0xff; %vreg2 = V_AND_B32_e32 %vreg0, %vreg1 %vreg4 = V_AND_B32_e32 %vreg0, %vreg3 ``` Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands: ``` SDWA src: %vreg2 src_sel:BYTE_0 SDWA src: %vreg4 src_sel:BYTE_0 ``` With this change peephole check if operand is either immediate or register that is copy of immediate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299202 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 11:42:43 +00:00
Matt Arsenault	3c1dcddf86	AMDGPU: Add all atomicrmw fields to atomic.inc/dec Add scope, order, isVolatile git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299122 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-30 22:21:40 +00:00
Stanislav Mekhanoshin	38e381b5a3	[AMDGPU] Add GlobalOpt parameter to Always Inliner pass If set to false it does not remove global aliases. With this parameter set to false it should be safe to run the pass before link. Differential Revision: https://reviews.llvm.org/D31489 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299108 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-30 20:16:02 +00:00
Stanislav Mekhanoshin	e42e44c7e3	[AMDGPU] Boost unroll threshold for loops reading local memory This is less important than increase threshold for private memory, but still brings performance improvements in a wide range of tests. Unrolling more for local memory serves three purposes: it allows to combine ds operations if offset becomes static, saves registers used for offsets in case of static offsets, and allows better lds latency hiding. Differential Revision: https://reviews.llvm.org/D31412 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298948 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 22:13:51 +00:00
Stanislav Mekhanoshin	88f7810574	[AMDGPU] Split -amdgpu-early-inline-all option Previously it was covered by the internalization. It turns out we cannot run the internalizer in the FE; it breaks separate compilation tests. Thus the early inliner gets its own option. Differential Revision: https://reviews.llvm.org/D31429 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298935 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-28 18:23:24 +00:00
Yaxun Liu	d81243066b	[AMDGPU] Switch data layout by triple environment amdgiz Switch data layout by target triple environment amdgiz and amdgizcl indicating using of an address space mapping in which generic address space is 0. amdgiz is for non-OpenCL environment where generic address space is 0. amdgizcl is for OpenCL environment where generic address space is 0. Differential Revision: https://reviews.llvm.org/D31211 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298758 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-25 02:05:44 +00:00
Matt Arsenault	a959750939	AMDGPU: Fix annotating loops with nested loop conditions If the branch condition for a loop was a phi which itself was fed from a phi from a loop, it isn't safe to try to delete the phi until after the loop is handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298737 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 20:57:10 +00:00
Matt Arsenault	d4f6485173	AMDGPU: Implement f16 fround git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298730 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 20:04:18 +00:00
Matt Arsenault	876bc45420	AMDGPU: Unify divergent function exits. StructurizeCFG can't handle cases with multiple returns creating regions with multiple exits. Create a copy of UnifyFunctionExitNodes that only unifies exit nodes that skips exit nodes with uniform branch sources. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298729 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 19:52:05 +00:00
Stanislav Mekhanoshin	a332f465b2	[AMDGPU] Fold V_CNDMASK with identical source operands Such instructions sometimes appear after lowering and folding. Differential Revision: https://reviews.llvm.org/D31318 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298723 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 18:55:20 +00:00
Konstantin Zhuravlyov	1007ee7060	[AMDGPU] Rename Kind to ValueKind in metadata to be consistent git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298722 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 18:43:15 +00:00
Stanislav Mekhanoshin	30b1056d8f	[AMDGPU] Add AMDGPUAliasAnalysis to opt pipeline Previously it was added only to the BE. Differential Revision: https://reviews.llvm.org/D31323 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298721 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 18:01:14 +00:00
Konstantin Zhuravlyov	f49ec0fc3f	[AMDGPU] Do not emit isa info as code object metadata - It was decided to expose this information through other means (rocr) Differential Revision: https://reviews.llvm.org/D30970 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298560 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-22 23:27:09 +00:00
Konstantin Zhuravlyov	a539af96a4	[AMDGPU] Emit kernel debug properties as code object metadata Differential Revision: https://reviews.llvm.org/D30969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298558 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-22 23:10:46 +00:00
Konstantin Zhuravlyov	93cb3da5a9	[AMDGPU] Emit kernel code properties as code object metadata - These are not required for low level runtime Differential Revision: https://reviews.llvm.org/D29949 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298556 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-22 22:54:39 +00:00
Konstantin Zhuravlyov	1c4f1852fb	[AMDGPU] Restructure code object metadata creation - Rename runtime metadata -> code object metadata - Make metadata not flow - Switch enums to use ScalarEnumerationTraits - Cleanup and move AMDGPUCodeObjectMetadata.h to AMDGPU/MCTargetDesc - Introduce in-memory representation for attributes - Code object metadata streamer - Create metadata for isa and printf during EmitStartOfAsmFile - Create metadata for kernel during EmitFunctionBodyStart - Finalize and emit metadata to .note during EmitEndOfAsmFile - Other minor improvements/bug fixes Differential Revision: https://reviews.llvm.org/D29948 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298552 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-22 22:32:22 +00:00
Matt Arsenault	dc55587b7f	AMDGPU: Rename SI_RETURN This is used for a specific type of return to a shader part's epilog code. Rename to try avoiding confusion from a true call's return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298452 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 22:18:10 +00:00
Matthias Braun	9c890fc5b3	SplitKit: Fix subreg copy related problems Fix two problems related to r298025: - SplitKit would create duplicate VNIs in some cases leading to crashes when hoisting copies. - VirtRegMap could fail expanding copies at the beginning of a basic block. This fixes http://llvm.org/PR32353 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298448 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 21:58:08 +00:00
Matt Arsenault	d706d030af	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298444 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 21:39:51 +00:00
George Burgess IV	3479ed63a6	Let llvm.objectsize be conservative with null pointers This adds a parameter to @llvm.objectsize that makes it return conservative values if it's given null. This fixes PR23277. Differential Revision: https://reviews.llvm.org/D28494 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298430 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 20:08:59 +00:00
Marek Olsak	1f6c4f9203	AMDGPU: Buffer descriptor changes for GFX9 Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298397 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 17:00:39 +00:00
Marek Olsak	fe4c8daa0f	AMDGPU: Always use VGPR indexing on GFX9 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31157 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298396 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 17:00:32 +00:00
Matt Arsenault	b5f0660c6b	AMDGPU: Fix asserting on 0 dmask for image intrinsics Fold these to undef during lowering so users get eliminated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298387 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 16:32:17 +00:00
Matt Arsenault	b216ff22ea	AMDGPU: Convert image intrinsic uses in tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298386 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 16:24:12 +00:00
Matt Arsenault	0597e5e3b0	DAG: Fold bitcast/extract_vector_elt of undef to undef Fixes not eliminating store when intrinsic is lowered to undef. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298385 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 16:20:16 +00:00
Valery Pykhtin	4b7cf6c175	[AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler Differential revision: https://reviews.llvm.org/D31046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298368 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 13:15:46 +00:00
Sam Kolton	f885500f47	[ADMGPU] SDWA peephole optimization pass. Summary: First iteration of SDWA peephole. This pass tries to combine several instructions into one SDWA instruction. E.g. it converts: ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1 V_ADD_I32_e32 %vreg2, %vreg0, %vreg3 V_LSHLREV_B32_e32 %vreg4, 16, %vreg2 ''' Into: ''' V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD ''' Pass structure: 1. Iterate over machine instructions in a basic block and try to apply "SDWA patterns" to each of them. SDWA patterns match a machine instruction into either a source or a destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to source SDWA operand '''%vreg1 src_sel:WORD_1'''. 2. Iterate over the found SDWA operands and find instructions that could potentially be converted into SDWA. E.g. for a source SDWA operand the potential instructions are all instructions in this basic block that use '''%vreg0''' 3. Iterate over all potential instructions and check if they can be converted into SDWA. 4. Convert instructions to SDWA. This review contains a basic implementation of the SDWA peephole pass. This pass requires additional testing for both correctness and performance (no performance testing done). There are several ways this pass can be improved: 1. Make this pass work on the whole function, not only a basic block. As far as I can see this can be done right now without changes to the pass. 2. Introduce more SDWA patterns 3. Introduce mnemonics to limit when SDWA patterns should apply Reviewers: vpykhtin, alex-t, arsenm, rampitec Subscribers: wdng, nhaehnle, mgorny Differential Revision: https://reviews.llvm.org/D30038 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298365 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 12:51:34 +00:00
Konstantin Zhuravlyov	f6fa4ce469	[AMDGPU] Run always inliner early in opt Differential Revision: https://reviews.llvm.org/D31141 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298281 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-20 18:06:45 +00:00
Konstantin Zhuravlyov	e50788dcb8	Revert "[AMDGPU] Run always inliner early in opt" This reverts commit r297958, it breaks device-libs build. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298239 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-20 09:26:08 +00:00
Ahmed Bougacha	c86e4bce8e	[GlobalISel] Don't select trivially dead instructions. Folding instructions when selecting can cause them to become dead. Don't select these dead instructions (if they don't have other side effects, and don't define physical registers). Preserve existing tests by adding COPYs. In some tests, the G_CONSTANT vregs never get constrained to a class: the only use of the vreg was folded into another instruction, so the G_CONSTANT, now dead, never gets selected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298224 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-19 16:13:00 +00:00
Stanislav Mekhanoshin	ee8b410dd7	[AMDGPU] Add address space based alias analysis pass This is direct port of HSAILAliasAnalysis pass, just cleaned for style and renamed. Differential Revision: https://reviews.llvm.org/D31103 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298172 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-17 23:56:58 +00:00

958 Commits