archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Tom Stellard	a9c6165732	AMDGPU/SI: Don't allow unaligned scratch access Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284257 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 18:10:39 +00:00
Nicolai Haehnle	5d662038f2	AMDGPU: Fix use-after-frees Reviewers: arsenm, tstellarAMD Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D25312 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284215 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 09:03:04 +00:00
Konstantin Zhuravlyov	a91924b28a	[AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables Differential Revision: https://reviews.llvm.org/D25562 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284196 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 04:37:34 +00:00
Matt Arsenault	c31d80dbf5	AMDGPU: Assume spilling will occur at -O0 Because everything live is spilled at the end of a block by fast regalloc, assume this will happen and avoid the copies of the resource descriptor. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 13:10:00 +00:00
Matt Arsenault	3339279d4a	AMDGPU: Fix truncate to bool warnings git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284116 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 12:45:16 +00:00
Matt Arsenault	7b6a558f24	AMDGPU: Initial implementation of VGPR indexing mode This is the most basic handling of the indirect access pseudos using GPR indexing mode. This currently only enables the mode for a single v_mov_b32 and then disables it. This is much more complicated to use than the movrel instructions, so a new optimization pass is probably needed to fold the access into the uses and keep the mode enabled for them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284031 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 18:49:05 +00:00
Matt Arsenault	548c83d355	AMDGPU: Refactor indirect vector lowering Allow inserting multiple instructions in the expanded loop. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283177 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-04 01:41:05 +00:00
Konstantin Zhuravlyov	f9bcd7b189	[AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit instructions Differential Revision: https://reviews.llvm.org/D24125 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282624 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-28 20:05:39 +00:00
Matt Arsenault	de82da5521	AMDGPU: Fix broken FrameIndex handling We were trying to avoid using a FrameIndex operand in non-pointer operands in a convoluted way, and would break because of using TargetFrameIndex. The TargetFrameIndex should only be used in the case where it makes sense to fold it as part of the addressing mode, otherwise it requires materialization like a normal constant. This wasn't working reliably and failed in the added testcase, hitting the assert when processing the frame index. The TargetFrameIndex was coming from trying to produce an AssertZext limiting the maximum stack size. I'm not sure this was correct to begin with, because it is apparently possible to have a single workitem dispatch that requires all 4G of private memory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281824 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-17 16:09:55 +00:00
Matt Arsenault	8f824feb12	AMDGPU: Allow some control flow intrinsics to be CSEd These clean up some unnecessary or instructions in cases with complex loops. In the original testcase I noticed this, the same or with exec was repeated 5 or 6 times in a row. With this only one is emitted or sometimes a copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281786 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-16 22:11:18 +00:00
Tom Stellard	77361ae206	AMDGPU: Refactor kernel argument lowering Summary: The main challenge in lowering kernel arguments for AMDGPU is determining the memory type of the argument. The generic calling convention code assumes that only legal register types can be stored in memory, but this is not the case for AMDGPU. This consolidates all the logic AMDGPU uses for deducing memory types into a single function. This will make it much easier to support different ABIs in the future. Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24614 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281781 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-16 21:53:00 +00:00
Tom Stellard	1961591989	AMDGPU/SI: Add support for triples with the mesa3d operating system Summary: mesa3d will use the same kernel calling convention as amdhsa, but it will handle everything else like the default 'unknown' OS type. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22783 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281779 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-16 21:34:26 +00:00
Matt Arsenault	8bc95d0a47	AMDGPU: Improve splitting 64-bit bit ops by constants This addresses a TODO to handle operations besides and. This also starts eliminating no-op operations with a constant that can emerge later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281488 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:19:03 +00:00
Justin Lebar	c71d5b41ef	[CodeGen] Split out the notions of MI invariance and MI dereferenceability. Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281151 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-11 01:38:58 +00:00
Matt Arsenault	605a81a85c	AMDGPU: Relax SGPR asm constraint register class s should be SReg_32 to be as general as possible. This can avoid a copy from m0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280154 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-30 20:50:08 +00:00
Matt Arsenault	e52dfc95ef	AMDGPU: Move cndmask pseudo to be isel pseudo There's only one use of this for the convenience of a pattern. I think v_mov_b64_pseudo should also be moved, but SIFoldOperands does currently make use of it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279901 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-27 01:00:37 +00:00
NAKAMURA Takumi	805f0aacc0	Untabify. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279408 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-22 00:58:04 +00:00
Matt Arsenault	4da2d32371	AMDGPU: Remove dead option git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278965 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-17 20:07:16 +00:00
Justin Bogner	6673ea81f6	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278902 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-17 05:10:15 +00:00
Matt Arsenault	b24aaff187	AMDGPU: Fix missing test for addressing mode with odd offsets Add test if the constant offset looks unaligned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-13 01:43:51 +00:00
Alina Sbirlea	4ebfadfe76	LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments. Summary: TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses. A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted. Previous patch (D22936) was reverted because tests were failing. This patch also fixes the cause of those failures: - x86 failing tests either did not have the right target, or the right alignment. - NVPTX failing tests did not have the right alignment. - AMDGPU failing test (merge-stores) should allow vectorization with the given alignment but the target info considers <3xi32> a non-standard type and gives up early. This patch removes the condition and only checks for a maximum size allowed and relies on the next condition checking for %4 for correctness. This should be revisited to include 3xi32 as a MVT type (on arsenm's non-immediate todo list). Note that checking the sizeInBits for a MVT is undefined (leads to an assertion failure), so we need to create an EVT, hence the interface change in allowsMisaligned to include the Context. Reviewers: arsenm, jlebar, tstellarAMD Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D23068 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277735 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 16:38:44 +00:00
Matt Arsenault	94166e75ac	AMDGPU: fdiv -1, x -> rcp -x git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277535 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-02 22:25:04 +00:00
Matt Arsenault	4fd45ebabd	AMDGPU: Fix shouldConvertConstantLoadToIntImm behavior This should really be true for any immediate, not just inline ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277260 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-30 01:40:36 +00:00
Matthias Braun	f79c57a412	MachineFunction: Return reference for getFrameInfo(); NFC getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277017 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 18:40:00 +00:00
Wei Ding	ee8c4ca1e1	AMDGPU : Add intrinsics for compare with the full wavefront result Differential Revision: http://reviews.llvm.org/D22482 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276998 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 16:42:13 +00:00
Matt Arsenault	b5a809e37c	AMDGPU: Remove analyzeImmediate This no longer uses the more complicated classification of constants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276945 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 00:32:02 +00:00
Matt Arsenault	d506595769	AMDGPU: Make AMDGPUMachineFunction fields private ABIArgOffset is a problem because properly setting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276766 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:58 +00:00
Matt Arsenault	ee4cdb7b75	AMDGPU: Add fp legacy instruction intrinsics This could use some additional optimization work to use mad/mac legacy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276764 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:45 +00:00
Jan Vesely	4a44da0c82	AMDGPU: Remove read_workdim intrinsic Differential revision: https://reviews.llvm.org/D22732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276682 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-25 20:17:02 +00:00
Matt Arsenault	9da217ee1e	AMDGPU: Fix groupstaticsize for large LDS The size can exceed s_movk_i32's limit, and we don't want to use it this early since it inhibits optimizations. This should probably be merged to the release branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276438 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:33 +00:00
Matt Arsenault	30f0e3e4be	AMDGPU: Add HSA dispatch id intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276437 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:30 +00:00
Matt Arsenault	7c8be6eeb7	AMDGPU: Don't reinvent transferSuccessorsAndUpdatePHIs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276434 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:15 +00:00
Matt Arsenault	a9994065f9	AMDGPU: Fix phis from blocks split due to register indexing git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276257 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-21 09:40:57 +00:00
Matt Arsenault	63be72069d	AMDGPU: Change fdiv lowering based on !fpmath metadata If 2.5 ulp is acceptable, denormals are not required, and isn't a reciprocal which will already be handled, replace with a faster fdiv. Simplify the lowering tests by using per function subtarget features. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276051 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 23:16:53 +00:00
Matt Arsenault	1ce58d721f	AMDGPU: Only use legal inline immediates with kill pseudo Only if the value is negative or positive is what matters, so use a constant that doesn't require an instruction to materialize. These should really just emit the write exec directly, but for now stick with the kill pseudo-terminator. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275988 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 16:27:56 +00:00
Matt Arsenault	4cead0b564	AMDGPU: Expand register indexing pseudos in custom inserter This is to help move SILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275934 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 00:35:03 +00:00
Matt Arsenault	bb09cfd86f	AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275871 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-18 18:35:05 +00:00
Matt Arsenault	7150fbf236	AMDGPU: Remove legacy rsq.clamped intrinsic Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining. Also fix mismatch with non-IEEE rsq selecting to IEEE rsq. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275617 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 21:26:52 +00:00
Justin Lebar	b2d6ad7cfd	[SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275592 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 18:27:10 +00:00
Matt Arsenault	435a4467a3	AMDGPU: Fix splitting kill blocks with defs before kill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275508 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 00:58:09 +00:00
Matt Arsenault	759af1e5a2	AMDGPU: Remove unused intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275371 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-14 05:23:19 +00:00
Tom Stellard	fab569e180	AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL Reviewers: rafael, ruiu, tony-tye, arsenm, kzhuravl Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21484 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275268 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-13 14:23:33 +00:00
Matt Arsenault	e6aa9db488	AMDGPU: Fold out no-op kill intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275253 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-13 06:04:22 +00:00
Matt Arsenault	8a85be7236	AMDGPU: Follow up to r275203 I meant to squash this into it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275220 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-12 21:41:32 +00:00
Nicolai Haehnle	ac1e47a211	AMDGPU: Treat texture gather instructions more like other MIMG instructions Summary: Setting MIMG to 0 has a bunch of unexpected side effects, including that isVMEM returns false which leads to incorrect treatment in the hazard recognizer. The reason I noticed it is that it also leads to incorrect treatment in VGPR-to-SGPR copies, which is one cause of the referenced bug. The only reason why MIMG was set to 0 is to signal the special handling of dmasks, but that can be checked differently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96877 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D22210 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275113 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-11 21:59:43 +00:00
Matt Arsenault	762cdd4ae8	Revert "AMDGPU: Remove unused control flow intrinsic" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274978 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-09 17:18:39 +00:00
Matt Arsenault	9ddc329c4e	AMDGPU: Fix fdiv lowering when f32 denormals supported Also fix test not actually using function labels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274969 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-09 07:48:11 +00:00
Matt Arsenault	5e2ec03cf4	AMDGPU: Remove unused control flow intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274939 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-08 21:39:44 +00:00
Matt Arsenault	6e4fd1497d	AMDGPU: Add feature for unaligned access git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274398 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 23:03:44 +00:00
Matt Arsenault	df587174eb	AMDGPU: Improve load/store of illegal types. There was a combine before to handle the simple copy case. Split this into handling loads and stores separately. We might want to change how this handles some of the vector extloads, since this can result in large code size increases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274394 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 22:47:50 +00:00

189 Commits