Change the interface of CallLowering::lowerFormalArguments to accept
several virtual registers for each formal argument, instead of just one.
This is a follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660. lowerCall
will be refactored in the same way in follow-up patches.
With this change, we forward the virtual registers generated for
aggregates to CallLowering. Therefore, the target can decide itself
whether it wants to handle them as separate pieces or use one big
register. We also copy the pack/unpackRegs helpers to CallLowering to
facilitate this.
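For reference, a rough sketch of the updated hook (simplified; the exact
parameter types in the tree may differ):

  class CallLowering {
  public:
    /// Lower the incoming formal arguments of \p F. VRegs[i] holds all the
    /// virtual registers produced for the i-th argument, so an aggregate
    /// split up by the IRTranslator arrives as several small registers
    /// instead of one pre-merged wide one.
    virtual bool lowerFormalArguments(MachineIRBuilder &MIRBuilder,
                                      const Function &F,
                                      ArrayRef<ArrayRef<unsigned>> VRegs) const {
      return false; // unsupported -> fall back to SelectionDAG
    }
  };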
ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.
AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was
put into an s64 instead of a p0. Added a test case which illustrates the
problem more clearly (it crashes without this patch) and fixed the
existing test case to expect p0.
AMDGPU has been updated to unpack into the virtual registers for
kernels. I think the other code paths fall back for aggregates, so this
should be NFC.
Mips doesn't support aggregates yet, so it's also NFC.
x86 seems to have code for dealing with aggregates, but I couldn't find
the tests for it, so I just added a fallback to DAGISel if we get more
than one virtual register for an argument.
Differential Revision: https://reviews.llvm.org/D63549
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364510 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The +DumpCode attribute is a horrible hack in AMDGPU to embed the
disassembly of the generated code into the ELF file. It is used by LLPC
to implement an extension that allows the application to read back the
disassembly of the code.
It tries to print an entry label at the start of every function, but
that didn't work for the first function in the module because
DumpCodeInstEmitter wasn't initialised until EmitFunctionBodyStart,
which is too late.
Change-Id: I790d73ddf4f51fd02ab32529380c7cb7c607c4ee
Reviewers: arsenm, tpr, kzhuravl
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63712
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364508 91177308-0d34-0410-b5e6-96231b3b80d8
The LivePhysRegs calculated in order to find a scratch register in the
epilogue code wrongly uses the 'LiveIns' set. Instead, it should use the
'LiveOuts' set. For the liveness, the operands of the terminator
(return) instruction, which is the insertion point for the
scratch-exec-copy instruction, are also considered.
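A minimal sketch of the corrected computation (the helper and its
candidate list are hypothetical; addLiveOuts/stepBackward/available are
the relevant LivePhysRegs calls):

  static MCPhysReg
  findEpilogueScratchReg(const MachineBasicBlock &MBB,
                         MachineBasicBlock::const_iterator InsertPt,
                         const TargetRegisterInfo &TRI,
                         const MachineRegisterInfo &MRI,
                         ArrayRef<MCPhysReg> Candidates) {
    LivePhysRegs LiveRegs(TRI);
    // Seed with the block's live-outs (not its live-ins) and walk
    // backwards, so the return's operands are live at the insertion point.
    LiveRegs.addLiveOuts(MBB);
    for (auto I = MBB.rbegin(), E = MBB.rend(); I != E; ++I) {
      LiveRegs.stepBackward(*I);
      if (&*I == &*InsertPt)
        break;
    }
    for (MCPhysReg Reg : Candidates)
      if (LiveRegs.available(MRI, Reg))
        return Reg;
    return 0; // no free scratch register
  }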
Patch by Christudasan Devadasan
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364470 91177308-0d34-0410-b5e6-96231b3b80d8
Original patch https://reviews.llvm.org/D63659 from
Steven Perron <stevenperron@google.com>
The pass AMDGPUUnifyDivergentExitNodes does not update the phi nodes in
the successors of blocks that it splits. This is fixed by calling
BasicBlock::splitBasicBlock to split the block instead of doing it
manually. This does extra work because a new unconditional branch is
created in BB which is immediately replaced, but I think the simplicity
is worth it. It also helps make the code more future proof in case other
things need to be updated.
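Roughly, the fixed splitting looks like this (Cond and
UnifiedReturnBlock stand in for values the pass already has):

  // splitBasicBlock moves everything from I onwards into a new block,
  // terminates BB with an unconditional branch to it, and crucially
  // updates PHI nodes in the successors to refer to the new block.
  BasicBlock *TransitionBB = BB->splitBasicBlock(I, "TransitionBlock");
  // The just-created branch is then replaced with the conditional
  // branch the pass actually wants.
  BB->getTerminator()->eraseFromParent();
  BranchInst::Create(TransitionBB, UnifiedReturnBlock, Cond, BB);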
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364342 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The symbols use the processor-specific SHN_AMDGPU_LDS section index
introduced with a previous change. The linker is then expected to resolve
relocations, which are also emitted.
Initially disabled for HSA and PAL environments until they have caught up
in terms of linker and runtime loader.
Some notes:
- The llvm.amdgcn.groupstaticsize intrinsic can no longer be lowered
  to a constant at compile time, which means some tests can no longer
  be applied.
The current "solution" is a terrible hack, but the intrinsic isn't
used by Mesa, so we can keep it for now.
- We no longer know the full LDS size per kernel at compile time, which
means that we can no longer generate a relevant error message at
compile time. It would be possible to add a check for the size of
individual variables, but ultimately the linker will have to perform
the final check.
Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275
Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin
Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61494
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364297 91177308-0d34-0410-b5e6-96231b3b80d8
Scalar extends to s64 can use S_BFE_{I64|U64}, but vector extends need
to extend to the 32-bit half first and then to 64.
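Illustratively, with a MachineIRBuilder B the 64-bit path amounts to
something like this (the opcode choices are the obvious ones, not
necessarily the exact code in AMDGPURegisterBankInfo):

  LLT S32 = LLT::scalar(32);
  // Sign-extend into the low 32-bit half first...
  auto Lo = B.buildSExtOrTrunc(S32, Src);
  // ...then synthesize the high half by broadcasting the sign bit...
  auto Hi = B.buildAShr(S32, Lo, B.buildConstant(S32, 31));
  // ...and assemble the s64 result from the two halves.
  B.buildMerge(Dst, {Lo.getReg(0), Hi.getReg(0)});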
I'm not sure where the line should be between what RegBankSelect
handles and what instruction selection does, but for now I'm erring on
the side of RegBankSelect for future post-RBS combines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364212 91177308-0d34-0410-b5e6-96231b3b80d8
This needs different handling depending on whether or not the source is
known to be a valid condition. Handle turning it into shifts or a
select during regbankselect.
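For illustration, the two strategies look roughly like this with a
MachineIRBuilder B (register names are placeholders):

  LLT S32 = LLT::scalar(32);
  // Source known to be a valid condition: select between the results.
  auto FromBool = B.buildSelect(S32, Cond, B.buildConstant(S32, -1),
                                B.buildConstant(S32, 0));
  // Arbitrary register that merely holds a bit: normalize with shifts.
  auto Shl = B.buildShl(S32, Src, B.buildConstant(S32, 31));
  auto FromBit = B.buildAShr(S32, Shl, B.buildConstant(S32, 31));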
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364186 91177308-0d34-0410-b5e6-96231b3b80d8
This matters for byval uses outside of the entry block, which appear
as copies.
Previously, the only folding done was during selection, which could
not see the underlying frame index. For any uses outside the entry
block, the frame index was materialized in the entry block relative to
the global scratch wave offset.
This may produce worse code in cases where the offset ends up not
fitting in the MUBUF offset field. A better heuristic would be helpful
for extreme frames.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364185 91177308-0d34-0410-b5e6-96231b3b80d8
We sometimes get poor code size because constants of types < 32b are
legalized as 32-bit G_CONSTANTs with a truncate to fit. This works, but
means that the localizer can no longer sink them (although it's
possible to extend it to do so).
On AArch64 however s8 and s16 constants can be selected in the same way as s32
constants, with a mov pseudo into a W register. If we make s8 and s16 constants
legal then we can avoid unnecessary truncates, they can be CSE'd, and the
localizer can sink them as normal.
There is a caveat: if the user of a smaller constant has to widen the sources,
we end up with an anyext of the smaller typed G_CONSTANT. This can cause
regressions because of the additional extend and missed pattern matching. To
remedy this, there's a new artifact combiner to generate the wider G_CONSTANT
if it's legal for the target.
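A rough sketch of that combine, under the obvious assumptions (this is
not the exact in-tree implementation):

  // Fold G_ANYEXT(G_CONSTANT) into a wider G_CONSTANT when that is legal.
  bool tryCombineAnyExtOfConstant(MachineInstr &MI, MachineRegisterInfo &MRI,
                                  MachineIRBuilder &B, const LegalizerInfo &LI) {
    assert(MI.getOpcode() == TargetOpcode::G_ANYEXT);
    unsigned Dst = MI.getOperand(0).getReg();
    MachineInstr *Def = MRI.getVRegDef(MI.getOperand(1).getReg());
    if (!Def || Def->getOpcode() != TargetOpcode::G_CONSTANT)
      return false;
    LLT DstTy = MRI.getType(Dst);
    if (!LI.isLegal({TargetOpcode::G_CONSTANT, {DstTy}}))
      return false;
    // Re-emit the constant directly at the wider type: the low bits match,
    // and the high bits of a G_ANYEXT are undefined anyway.
    B.setInstr(MI);
    const APInt &Val = Def->getOperand(1).getCImm()->getValue();
    B.buildConstant(Dst, Val.sext(DstTy.getSizeInBits()));
    MI.eraseFromParent();
    return true;
  }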
Differential Revision: https://reviews.llvm.org/D63587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364075 91177308-0d34-0410-b5e6-96231b3b80d8
Every called function could possibly need this to calculate the
absolute address of stack objects, and this avoids inserting a copy
around every call site in the kernel. It's also somewhat cleaner to
keep this in a callee saved SGPR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363990 91177308-0d34-0410-b5e6-96231b3b80d8
The attribute can specify elimination for leaf or non-leaf, so it
should always be considered. I copied this bug from AArch64, which
probably should also be fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363949 91177308-0d34-0410-b5e6-96231b3b80d8
This should only matter in vectors with an undef component, since a
full undef vector would have been folded out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363941 91177308-0d34-0410-b5e6-96231b3b80d8
Introducing VCC defs during SIFixSGPRCopies is generally
problematic. Avoid it by starting with the VOP3 form with the general
condition register. This is the easiest instance to fix, but doesn't
solve any specific problems I'm looking at.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363904 91177308-0d34-0410-b5e6-96231b3b80d8
The def instruction for the vreg may not match, because it may be
folding through a reg_sequence. The assert was overly conservative and
not necessary. It's not actually important if DefMI really defined the
register, because the fold that will be done cares about the def of
the value that will be folded.
For some reason copies aren't making it through the reg_sequence,
although they should.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363876 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies r363678, using the correct chain for the CopyToReg for
v0. glueCopyToM0 counterintuitively changes the operands of the
original node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363870 91177308-0d34-0410-b5e6-96231b3b80d8
This allows targets to make more decisions about reserved registers
after isel. For example, it should now be certain whether or not there
are calls or stack objects in the frame, which could have been
introduced by legalization.
Patch by Matthias Braun
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363757 91177308-0d34-0410-b5e6-96231b3b80d8
This was ignoring the flag on fneg, and using the source instruction's
flags. Also fixes tests missing from r358702.
Note the expansion itself isn't correct without nnan, but that should
be fixed separately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363637 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The purpose of the padding is to guard against stale code being
fetched into the instruction cache by the lowest level prefetching.
We're generating relocatable ELF here, and so the padding should
arguably be added by the linker. This is in fact what Mesa does.
This also fixes multi-part shaders for Mesa.
Change-Id: I6bfede58f20e9f337762ccf39ef9e0e263e69e82
Reviewers: arsenm, rampitec, t-tye
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363602 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is related to the changes to the groupstaticsize intrinsic in
D61494, which would otherwise make the related tests in these files
fail or be much less useful.
Note that for some reason, SOPK generation is less effective in the
amdhsa OS, which is why I chose PAL. I haven't investigated this
further.
Change-Id: I6bb99569338f7a433c28b4c9eb1e3e036b00d166
Reviewers: arsenm
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63392
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363600 91177308-0d34-0410-b5e6-96231b3b80d8
The pass works in two modes:
Mode 1: Just set attributes starting from kernels. This can work at
the very beginning of the opt and llc pipelines, but cannot clone
functions because it must be a function pass.
Mode 2: Actually clone functions for new attributes. This can only work
after all function passes in the opt pipeline because it has to be a
module pass.
Differential Revision: https://reviews.llvm.org/D63208
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363586 91177308-0d34-0410-b5e6-96231b3b80d8