archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Marek Olsak	0ce6825d9e	AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset Summary: Apps that benefit: - alien isolation - bioshock infinite - civilization: beyond earth - company of heroes 2 - dirt showdown - dota 2 - F1 2015 - grid autosport - hitman - legend of grimrock - serious sam 3: bfe - shadow warrior - talos principle - total war: warhammer - UE4 demos: effects cave, elemental, sun temple Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38914 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317038 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-31 21:06:42 +00:00
Marek Olsak	4fda278e9b	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316427 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-24 10:27:13 +00:00
Matt Arsenault	b151df8f6d	AMDGPU: Fix not accounting for instruction size in bundles These were counted as 0. Fixes branch limit exceeded errors in some large programs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314944 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-04 22:59:12 +00:00
Nicolai Haehnle	da47d9d5d0	AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC The hardware will only forward EXEC_LO; the high 32 bits will be zero. Additionally, inline constants do not work. At least, v_addc_u32_e64 v0, vcc, v0, v1, -1 which could conceivably be used to combine (v0 + v1 + 1) into a single instruction, acts as if all carry-in bits are zero. The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine s_mov_b64 s[0:1], exec v_cndmask_b32_e64 v0, v1, v2, s[0:1] into v_mov_b32 v0, v3 but it's not particularly high priority. Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.* git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314522 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-29 15:37:31 +00:00
Matt Arsenault	ec61af4bcc	AMDGPU: Fix crash on immediate operand We can have a v_mac with an immediate src0. We can still fold if it's an inline immediate, otherwise it already uses the constant bus. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313852 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-21 00:45:59 +00:00
Konstantin Zhuravlyov	fe0a82a17c	AMDGPU: Start selecting s_xnor_{b32, b64} Differential Revision: https://reviews.llvm.org/D37981 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313565 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-18 21:22:45 +00:00
Jan Sjodin	ac413e0287	Fix warnings in r313297. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313302 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 21:49:52 +00:00
Matt Arsenault	11283fb2c8	AMDGPU: Fix violating constant bus restriction You can't use madak/madmk if it already uses an SGPR input. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313298 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 20:54:29 +00:00
Jan Sjodin	028255f1f7	Add AddressSpace to PseudoSourceValue. Differential Revision: https://reviews.llvm.org/D35089 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313297 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-14 20:53:51 +00:00
Matt Arsenault	a1a416812a	AMDGPU: Don't spill SP reg like a normal CSR git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313217 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-13 23:47:01 +00:00
Stanislav Mekhanoshin	63c545da3a	Allow target to decide when to cluster loads/stores in misched MachineScheduler when clustering loads or stores checks if base pointers point to the same memory. This check is done through comparison of base registers of two memory instructions. This works fine when instructions have separate offset operand. If they require a full calculated pointer such instructions can never be clustered according to such logic. Changed shouldClusterMemOps to accept base registers as well and let it decide what to do about it. Differential Revision: https://reviews.llvm.org/D37698 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313208 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-13 22:20:47 +00:00
Stanislav Mekhanoshin	46582be974	[AMDGPU] Produce madak and madmk from the two-address pass These two instructions are normally selected, but when the two address pass converts mac into mad we end up with the mad where we could have one of these. Differential Revision: https://reviews.llvm.org/D37389 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312928 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-11 17:13:57 +00:00
Stanislav Mekhanoshin	651c4efd77	[AMDGPU] Fix shouldClusterMemOps to process flat loads Flat loads do not have vdata operand but have vdst instead. Differential Revision: https://reviews.llvm.org/D37502 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312640 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-06 15:31:30 +00:00
Eugene Zelenko	5ca94f31ee	[AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310328 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-08 00:47:13 +00:00
Connor Abbott	c300b1a6d3	[AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic Summary: This intrinsic lets us set inactive lanes to an identity value when implementing wavefront reductions. In combination with Whole Wavefront Mode, it lets inactive lanes be skipped over as required by GLSL/Vulkan. Lowering the intrinsic needs to happen post-RA so that RA knows that the destination isn't completely overwritten due to the EXEC shenanigans, so we need another pseudo-instruction to represent the un-lowered intrinsic. Reviewers: tstellar, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34719 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310088 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-04 18:36:54 +00:00
Connor Abbott	ecf573917a	[AMDGPU] Add support for Whole Wavefront Mode Summary: Whole Wavefront Mode (WWM) is similar to WQM, except that all of the lanes are always enabled, regardless of control flow. This is required for implementing wavefront reductions in non-uniform control flow, where we need to use the inactive lanes to propagate intermediate results, so they need to be enabled. We need to propagate WWM to uses (unless they're explicitly marked as exact) so that they also propagate intermediate results correctly. We do the analysis and exec mask munging during the WQM pass, since there are interactions with WQM for things that require both WQM and WWM. For simplicity, WWM is entirely block-local -- blocks are never WWM on entry or exit of a block, and WWM is not propagated to the block level. This means that computations involving WWM cannot involve control flow, but we only ever plan to use WWM for a few limited purposes (none of which involve control flow) anyways. Shaders can ask for WWM using the @llvm.amdgcn.wwm intrinsic. There isn't yet a way to turn WWM off -- that will be added in a future change. Finally, it turns out that turning on inactive lanes causes a number of problems with register allocation. While the best long-term solution seems like teaching LLVM's register allocator about predication, for now we need to add some hacks to prevent ourselves from getting into trouble due to constraints that aren't currently expressed in LLVM. For the gory details, see the comments at the top of SIFixWWMLiveness.cpp. Reviewers: arsenm, nhaehnle, tpr Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D35524 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310087 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-04 18:36:52 +00:00
Connor Abbott	7af89e579d	[AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM Summary: Previously, we assumed that certain types of instructions needed WQM in pixel shaders, particularly DS instructions and image sampling instructions. This was ok because with OpenGL, the assumption was correct. But we want to start using DPP instructions for derivatives as well as other things, so the assumption that we can infer whether to use WQM based on the instruction won't continue to hold. This intrinsic lets frontends like Mesa indicate what things need WQM based on their knowledge of the API, rather than second-guessing them in the backend. We need to keep around the old method of enabling WQM, but eventually we should remove it once Mesa catches up. For now, this will let us use DPP instructions for computing derivatives correctly. Reviewers: arsenm, tpr, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D35167 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310085 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-04 18:36:49 +00:00
Matt Arsenault	c60159767d	AMDGPU: Pass special input registers to functions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309998 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-03 23:00:29 +00:00
Matt Arsenault	1ab1e79a01	AMDGPU: Make areMemAccessesTriviallyDisjoint more aware of segment flat Checking the encoding is insufficient since now there can be global or scratch instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309472 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-29 01:26:21 +00:00
Matt Arsenault	3cf3981405	AMDGPU: Fix getMemOpBaseRegImmOfs for flat with offsets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308762 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-21 18:06:36 +00:00
Matt Arsenault	20f8334c2a	Add an ID field to StackObjects On AMDGPU SGPR spills are really spilled to another register. The spiller creates the spills to new frame index objects, which is used as a placeholder. This will eventually be replaced with a reference to a position in a VGPR to write to and the frame index deleted. It is most likely not a real stack location that can be shared with another stack object. This is a problem when StackSlotColoring decides it should combine a frame index used for a normal VGPR spill with a real stack location and a frame index used for an SGPR. Add an ID field so that StackSlotColoring has a way of knowing the different frame index types are incompatible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308673 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-20 21:03:45 +00:00
Alfred Huang	1356a150af	[AMDGPU] Do not insert an instruction into worklist twice in movetovalu In moveToVALU(), move to vector ALU is performed, all instrs in the use chain will be visited. We do not want the same node to be pushed to the visit worklist more than once. Differential Revision: https://reviews.llvm.org/D34726 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308039 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-14 17:56:55 +00:00
Simon Pilgrim	26aa51226a	[AMDGPU] Fix -Wimplicit-fallthrough warnings. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307381 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 10:18:57 +00:00
Matt Arsenault	ff0022d12c	AMDGPU: Add operand target flags serialization git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306995 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-02 23:21:48 +00:00
Sam Kolton	06ed4a14fd	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306413 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 15:02:23 +00:00
Nicolai Haehnle	7ca35760c5	AMDGPU: M0 operands to spill/restore opcodes are dead Summary: With scalar stores, M0 is clobbered and therefore marked as implicitly defined. However, it is also dead. This fixes an assertion when the Greedy Register Allocator decides to optimize a spill/restore pair away again (via tryHintsRecoloring). Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33319 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306375 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-27 08:04:13 +00:00
Sam Kolton	e88fc4046f	[AMDGPU] SDWA: add support for GFX9 in peephole pass Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026 Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34241 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305986 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-22 06:26:41 +00:00
Sam Kolton	7ff8af4ed8	[AMDGPU] SDWA: merge VI and GFX9 pseudo instructions Summary: Previously there were two separate pseudo instruction for SDWA on VI and on GFX9. Created one pseudo instruction that is union of both of them. Added verifier to check that operands conform either VI or GFX9. Reviewers: dp, arsenm, vpykhtin Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, artem.tamazov Differential Revision: https://reviews.llvm.org/D34026 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305886 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-21 08:53:38 +00:00
Matt Arsenault	9cae1d2455	AMDGPU: Don't add same implicit use multiple times For the last component, the same register use was added as an implicit use and another implicit kill use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305205 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 17:19:20 +00:00
Matt Arsenault	37ebc3a859	AMDGPU: Verify that flat offsets aren't used pre-GFX9 For convenience the operand is always present in the instruction, but it isn't valid to use except on GFX9. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305200 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-12 16:37:55 +00:00
Chandler Carruth	e3e43d9d57	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-06 11:49:48 +00:00
Tom Stellard	f696b32065	AMDGPU/GlobalISel: Mark 32-bit float constants as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D33212 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304003 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-26 16:40:03 +00:00
Matt Arsenault	928b9308a6	AMDGPU: Use appropriate soffset for spilling This needs to be the frame offset register, and not the global scratch wave offset register. For kernels, these are the same. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303287 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-17 19:37:57 +00:00
NAKAMURA Takumi	4ab9bae292	AMDGPUCodeGen: Fix warnings in r303111. [-Wunused-variable] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303137 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-16 04:01:23 +00:00
Jan Sjodin	4f10728b0c	Re-submit AMDGPUMachineCFGStructurizer. Differential Revision: https://reviews.llvm.org/D23209 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303111 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 20:18:37 +00:00
Jan Sjodin	dd98c46159	Revert 303091. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303098 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 18:39:47 +00:00
Jan Sjodin	9ee4b4d97c	Add AMDGPUMachineCFGStructurizer. Differential Revision: https://reviews.llvm.org/D23209 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303091 91177308-0d34-0410-b5e6-96231b3b80d8	2017-05-15 18:13:56 +00:00
Krzysztof Parzyszek	36d7c2b2e5	Move size and alignment information of regclass to TargetRegisterInfo 1. RegisterClass::getSize() is split into two functions: - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const; - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const; 2. RegisterClass::getAlignment() is replaced by: - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const; This will allow making those values depend on subtarget features in the future. Differential Revision: https://reviews.llvm.org/D31783 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301221 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 18:55:33 +00:00
Nicolai Haehnle	c3187b408e	AMDGPU: Move v_readlane lane select from VGPR to SGPR Summary: Fix a compiler bug when the lane select happens to end up in a VGPR. Clarify the semantic of the corresponding intrinsic to be that of the corresponding GLSL: the lane select must be uniform across a wave front, otherwise results are undefined. Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D32343 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301197 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 17:17:36 +00:00
Nicolai Haehnle	1c1f7ef631	AMDGPU: Fix crash when scheduling non-memory SMRD instructions Summary: Fixes piglit spec/arb_shader_clock/execution/* Reviewers: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D32345 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301191 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-24 16:53:52 +00:00
Konstantin Zhuravlyov	ac73bb1e6a	AMDGPU: Fix S_PACK_HH_B32_B16 - We really ought to zero out lower 16 bits Differential Revision: https://reviews.llvm.org/D32356 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301026 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-21 19:35:05 +00:00
Stanislav Mekhanoshin	b02882850b	[AMDGPU] added SIInstrInfo::getAddNoCarry() helper Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300288 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-14 00:33:44 +00:00
Konstantin Zhuravlyov	2b1bb6d8f8	AMDGPU/GFX9: Do not use v_pack_b32_f16 when packing Differential Revision: https://reviews.llvm.org/D31819 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300275 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-13 23:17:00 +00:00
Matt Arsenault	c82755f01b	AMDGPU: Diagnose illegal SGPR to VGPR copies This is possible in ways that are not compiler bugs, so stop asserting on them. This emits an extra error when emitting objects when it can't encode the new pseudo, but I'm not sure that matters. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299712 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-06 21:09:53 +00:00
Sam Kolton	ff45a18189	[AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns Previously compiler often extracted common immediates into specific register, e.g.: ``` %vreg0 = S_MOV_B32 0xff; %vreg2 = V_AND_B32_e32 %vreg0, %vreg1 %vreg4 = V_AND_B32_e32 %vreg0, %vreg3 ``` Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands: ``` SDWA src: %vreg2 src_sel:BYTE_0 SDWA src: %vreg4 src_sel:BYTE_0 ``` With this change peephole check if operand is either immediate or register that is copy of immediate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299202 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-31 11:42:43 +00:00
Yaxun Liu	ab3be33d40	[AMDGPU] Get address space mapping by target triple environment As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which are not depend on target triple, use static const members, so that they don't occupy extra memory space and is equivalent to a compile time constant. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298846 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-27 14:04:01 +00:00
Matt Arsenault	876bc45420	AMDGPU: Unify divergent function exits. StructurizeCFG can't handle cases with multiple returns creating regions with multiple exits. Create a copy of UnifyFunctionExitNodes that only unifies exit nodes that skips exit nodes with uniform branch sources. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298729 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-24 19:52:05 +00:00
Marek Olsak	1f6c4f9203	AMDGPU: Buffer descriptor changes for GFX9 Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298397 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-21 17:00:39 +00:00
Matt Arsenault	dbe625a311	AMDGPU: Keep track of modifiers when converting v_mac to v_mad Since v_max_f32_e64/v_max_f16_e64 can be folded if the target instruction supports the clamp bit, we also need to maintain modifiers when converting v_mac to v_mad. This fixes a rendering issue with Dirt Rally because a v_mac instruction with the clamp bit set was converted to a v_mad but that bit was lost during the conversion. Fixes: `e184e01dd7` ("AMDGPU: Fold FP clamp as modifier bit") Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297556 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-11 05:40:40 +00:00
Matt Arsenault	27f4f2f4bc	AMDGPU: Support v2i16/v2f16 packed operations git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 22:15:25 +00:00


236 Commits