archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Nirav Dave	acc2c1d71d	Elide stores which are overwritten without being observed. Summary: In SelectionDAG, when a store is immediately chained to another store to the same address, elide the first store as it has no observable effects. This causes small improvements when dealing with intrinsics lowered to stores. Test notes: * Many testcases overwrite store addresses multiple times and needed minor changes, mainly making stores volatile to prevent the optimization from optimizing the test away. * Many X86 test cases optimized out instructions associated with va_start. * Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has dependencies to check and can probably be removed and potentially replaced with another test. Reviewers: rnk, john.brawn Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33206 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303198 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 19:43:56 +00:00
Dmitry Preobrazhensky	232c3d52ea	[AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64 See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D33123 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303070 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 14:28:23 +00:00
Changpeng Fang	ed4c8077b0	AMDGPU/SI: Don't promote to vector if the load/store is volatile. Summary: We should not change volatile loads/stores in promoting alloca to vector. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D33107 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302943 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 20:31:12 +00:00
Tom Stellard	593d52aaad	AMDGPU: Add lit.local.cfg to disable global-isel tests when global-isel is disabled This should fix bots broken by r302919. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302928 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 17:59:30 +00:00
Tom Stellard	f366f4cc57	AMDGPU/GlobalISel: Mark 32-bit integer constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33115 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302919 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-12 16:46:46 +00:00
Matt Arsenault	9f4e5a06c6	AMDGPU: Remove tfe bit from flat instruction definitions We don't use it and it was removed in gfx9, and the encoding bit repurposed. Additionally actually using it requires changing the output register class, which wasn't done anyway. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302814 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 17:38:33 +00:00
Matt Arsenault	2bbb56fd75	AMDGPU: Pull fneg out of extract_vector_elt This allows folding source modifiers in more f16 cases. Makes it easier to select per-component packed neg modifiers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302813 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-11 17:26:25 +00:00
Dmitry Preobrazhensky	0e72980cc2	[AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in disassembler output See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927 Reviewers: vpykhtin, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D32913 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302648 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-10 13:00:28 +00:00
Kannan Narayanan	96d48fac54	[AMDGPU] In the new waitcnt insertion pass, use getHeader instead of getTopBlock to find the loop header. Differential Revision: https://reviews.llvm.org/D32831 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302290 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-05 21:10:17 +00:00
Matthias Braun	0cb25a2a10	MIParser/MIRPrinter: Compute block successors if not explicitly specified - MIParser: If the successor list is not specified, successors will be added based on basic block operands in the block and possible fallthrough. - MIRPrinter: Adds a new `simplify-mir` option, with that option set: Skip printing of block successor lists in cases where the parser is guaranteed to reconstruct it. This means we still print the list if some successor cannot be determined (happens for example for jump tables), if the successor order changes or if branch probabilities are unequal. Differential Revision: https://reviews.llvm.org/D31262 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302289 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-05 21:09:30 +00:00
Konstantin Zhuravlyov	d2ff9194d6	AMDGPU/AMDHSA: Set COMPUTE_PGM_RSRC2:LDS_SIZE to 0 This field is populated by the CP Differential Revision: https://reviews.llvm.org/D32619 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302277 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-05 20:13:55 +00:00
Marek Olsak	24aaeeb480	AMDGPU: GFX9 GS and HS shaders always have the scratch wave offset in SGPR5 Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D32645 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302200 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-04 22:25:20 +00:00
Chad Rosier	fd8f24ed83	[DAGCombine] Transform (fadd A, (fmul B, -2.0)) -> (fsub A, (fadd B, B)). Differential Revision: http://reviews.llvm.org/D32596 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302153 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-04 14:14:44 +00:00
Matt Arsenault	5c95b810cb	AMDGPU: Don't promote alloca to LDS for leaf functions LDS use in leaf functions not currently handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301958 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-02 18:33:18 +00:00
Matt Arsenault	822c8e2ad2	AMDGPU: Make intrinsics speculatable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301937 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-02 16:57:44 +00:00
Matt Arsenault	23450e5997	AMDGPU: Fix copies from physical registers in SIFixSGPRCopies This would assert when there were multiple defs of a physical register. We just need to move all of the users of it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301730 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-29 01:26:34 +00:00
Marek Olsak	007530e1a2	AMDGPU: Add new amdgcn.init.exec intrinsics v2: More tests, bug fixes, cosmetic changes. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D31762 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301677 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-28 20:21:58 +00:00
Konstantin Zhuravlyov	c75cdfc65b	AMDGPU: Fix ValueKind code object metadata for images Differential Revision: https://reviews.llvm.org/D32504 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301360 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-25 20:38:26 +00:00
Matt Arsenault	47fbd9bc5d	Revert "StructurizeCFG: Directly invert cmp instructions" This reverts commit r300732. This breaks a few tests. I think the problem is related to adding more uses of the condition that don't yet exist at this point. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301242 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 20:25:01 +00:00
Matt Arsenault	38bd5524b0	AMDGPU: Select scratch mubuf offsets when pointer is a constant In call sequence setups, there may not be a frame index base and the pointer is a constant offset from the frame pointer / scratch wave offset register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301230 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 19:40:59 +00:00
Stanislav Mekhanoshin	49a37e6bb1	[AMDGPU] Merge M0 initializations Merges equivalent initializations of M0 and hoists them into a common dominator block. Technically the same code can be used with any register, physical or virtual. Differential Revision: https://reviews.llvm.org/D32279 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301228 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 19:37:54 +00:00
Yaxun Liu	76c532ddba	CodeGen: Add a hook for getFenceOperandTy Currently the operand type for ATOMIC_FENCE assumes the value type of a pointer in address space 0. This is fine for most targets. However, for the amdgcn target, the size of a pointer in address space 0 depends on the triple environment. For the amdgiz environment, it is 64 bit, but for other environments it is 32 bit. On the other hand, the amdgcn target expects 32 bit fence operands independent of the target triple environment. Therefore a hook is needed in target lowering for getting the fence operand type. This patch has no effect on targets other than amdgcn. Differential Revision: https://reviews.llvm.org/D32186 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301215 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 18:26:27 +00:00
Matt Arsenault	666020a37d	AMDGPU: Move trap lowering to DAG Fixes traps in any block besides the entry block, and fixes depending on a live-in physical register by using a virtual register copy. Also happens to stop emitting a nop in the case debug trap is not supported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301206 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 17:49:13 +00:00
Nicolai Haehnle	c3187b408e	AMDGPU: Move v_readlane lane select from VGPR to SGPR Summary: Fix a compiler bug when the lane select happens to end up in a VGPR. Clarify the semantic of the corresponding intrinsic to be that of the corresponding GLSL: the lane select must be uniform across a wave front, otherwise results are undefined. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D32343 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301197 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 17:17:36 +00:00
Nicolai Haehnle	1c1f7ef631	AMDGPU: Fix crash when scheduling non-memory SMRD instructions Summary: Fixes piglit spec/arb_shader_clock/execution/* Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D32345 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301191 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 16:53:52 +00:00
David Blaikie	dae36df6c6	Fix test from polluting the source tree (though this seems like a "does this not crash" test - which isn't very good. Should be fixed) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301071 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-22 07:53:40 +00:00
Konstantin Zhuravlyov	8c373cc5e6	AMDGPU: Temporarily disable packed inlinable literals (v2f16, v2i16) Differential Revision: https://reviews.llvm.org/D32361 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301028 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-21 19:45:22 +00:00
Konstantin Zhuravlyov	ac73bb1e6a	AMDGPU: Fix S_PACK_HH_B32_B16 - We really ought to zero out lower 16 bits Differential Revision: https://reviews.llvm.org/D32356 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301026 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-21 19:35:05 +00:00
Yaxun Liu	5255fef87c	[AMDGPU] Handle SI_MASKED_UNREACHABLE in instruction emitter SI_MASKED_UNREACHABLE does not have machine instruction encoding. It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some other pseudo instructions. This patch fixes compilation failure of RadeonRays. Differential Revision: https://reviews.llvm.org/D32364 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301025 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-21 19:32:02 +00:00
Konstantin Zhuravlyov	989643fa78	AMDGPU: Do not lower fast unsafe div for safe, f32, with fp32 denormals Differential Revision: https://reviews.llvm.org/D32085 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301023 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-21 19:25:33 +00:00
Yaxun Liu	1baa360f32	CodeGen: Let frame index value type match alloca addr space Recently the alloca address space has been added to the data layout. Due to this change, the pointer returned by alloca may have a different size than a pointer in address space 0. However, currently the value type of a frame index is assumed to be of the same size as a pointer in address space 0. This patch fixes that. Most targets assume alloca returns a pointer in address space 0, which is the default alloca address space. Therefore it is NFC for them. The AMDGCN target with the amdgiz environment requires this change since it assumes alloca returns a pointer to addr space 5 whose size is 32, which is different from the size of a pointer in addr space 0, which is 64. Differential Revision: https://reviews.llvm.org/D32021 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300864 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-20 18:15:34 +00:00
Matt Arsenault	ac9b651ab6	AMDGPU: Custom lower illegal small select types Promote them to i32 vectors to avoid unpacking and re-packing the vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300754 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-19 20:53:07 +00:00
Matt Arsenault	0e1e60b73a	AMDGPU: Don't emit amd_kernel_code_t for callable functions This is inserted directly in the text section. The relocation for the function ends up resolving to the beginning of the amd_kernel_code_t header rather than the actual function entry point. Also skip some of the comments for initialization that only makes sense for kernels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300736 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-19 19:38:10 +00:00
Matt Arsenault	fe6b2045f8	StructurizeCFG: Directly invert cmp instructions The most common case for a branch condition is a single use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true which the DAG will have to fold later. This produces nicer to read structurizer output. This produces some random changes in codegen due to the DAG swapping branch conditions itself, and then does a poor job of dealing with those inverts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300732 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-19 18:29:07 +00:00
Matt Arsenault	902e7e59d1	AMDGPU: Don't align callable functions to 256 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300720 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-19 17:42:39 +00:00
Matt Arsenault	610621c4ba	AMDGPU: Change DivergenceAnalysis for function arguments Stop assuming all functions are kernels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300719 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-19 17:42:34 +00:00
Matt Arsenault	74ac54ec5b	AMDGPU: Use MachineRegisterInfo to find max used register Avoid looping through program to determine register counts. This avoids needing to look at regmask operands. Also fixes some counting errors with flat_scr when there are no stack objects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300482 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-17 19:48:30 +00:00
Stanislav Mekhanoshin	d8c6515dbf	[AMDGPU] set read_only access qualifier for pointers If a kernel's pointer argument is known to be readonly set access qualifier accordingly. This allows RT not to flush caches before dispatches. Differential Revision: https://reviews.llvm.org/D32091 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300362 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-14 19:11:40 +00:00
Stanislav Mekhanoshin	b02882850b	[AMDGPU] added SIInstrInfo::getAddNoCarry() helper Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300288 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-14 00:33:44 +00:00
Konstantin Zhuravlyov	2b1bb6d8f8	AMDGPU/GFX9: Do not use v_pack_b32_f16 when packing Differential Revision: https://reviews.llvm.org/D31819 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300275 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-13 23:17:00 +00:00
Stanislav Mekhanoshin	881e9f3177	[AMDGPU] Combine DS operations with offsets bigger than byte In many cases ds operations can be combined even if offsets do not fit into 8 bit encoding. What it takes is to adjust base address. Differential Revision: https://reviews.llvm.org/D31993 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300227 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-13 17:53:07 +00:00
Wei Ding	24e7b0fb5c	AMDGPU : Fix common dominator of two incoming blocks terminates with uniform branch issue. Differential Revision: http://reviews.llvm.org/D31350 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300142 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 23:51:47 +00:00
Matt Arsenault	ab28f3b39e	AMDGPU: Fix invalid copies when copying i1 to phys reg Insert a VReg_1 virtual register so the i1 workaround pass can handle it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300113 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 21:58:23 +00:00
Stanislav Mekhanoshin	bb9002fbb2	[AMDGPU] Generate range metadata for workitem id If workgroup size is known inform llvm about range returned by local id and local size queries. Differential Revision: https://reviews.llvm.org/D31804 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300102 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 20:48:56 +00:00
Sam Kolton	9a421935fb	[AMDGPU] SDWA: make pass global Summary: Remove checks for basic blocks. Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31935 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300040 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 09:36:05 +00:00
Kannan Narayanan	d3302ddc52	[AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing. Based on comments in https://reviews.llvm.org/D31161. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300023 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 03:25:12 +00:00
Matt Arsenault	8c86ad544b	AMDGPU: Insert wait at start of callee functions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300000 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 22:29:31 +00:00
Matt Arsenault	56db90276b	AMDGPU: Refactor SIMachineFunctionInfo slightly Prepare for handling non-entry functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299999 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 22:29:28 +00:00
Matt Arsenault	938bfaf893	AMDGPU: Refactor argument lowering Split into smaller functions and prepare for handling non-entry functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299998 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 22:29:24 +00:00
Matt Arsenault	651ac56097	AMDGPU: Fix folding reg_sequence into copy to phys reg This was producing an illegal reg_sequence defining a physical register with virtual register inputs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299997 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-11 22:29:19 +00:00

1010 Commits