archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Piotr Sobczak	8db4d72fa5	[AMDGPU] Extend buffer intrinsics with swizzling Summary: Extend cachepolicy operand in the new VMEM buffer intrinsics to supply information whether the buffer data is swizzled. Also, propagate this information to MIR. Intrinsics updated: int_amdgcn_raw_buffer_load int_amdgcn_raw_buffer_load_format int_amdgcn_raw_buffer_store int_amdgcn_raw_buffer_store_format int_amdgcn_raw_tbuffer_load int_amdgcn_raw_tbuffer_store int_amdgcn_struct_buffer_load int_amdgcn_struct_buffer_load_format int_amdgcn_struct_buffer_store int_amdgcn_struct_buffer_store_format int_amdgcn_struct_tbuffer_load int_amdgcn_struct_tbuffer_store Furthermore, disable merging of VMEM buffer instructions in SI Load/Store optimizer, if the "swizzled" bit on the instruction is on. The default value of the bit is 0, meaning that data in buffer is linear and buffer instructions can be merged. There is no difference in the generated code with this commit. However, in the future it will be expected that front-ends use buffer intrinsics with correct "swizzled" bit set. Reviewers: arsenm, nhaehnle, tpr Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68200 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373491 91177308-0d34-0410-b5e6-96231b3b80d8	2019-10-02 17:22:36 +00:00
Jay Foad	35b4599076	Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0" Summary: D61491 caused us to use relocs when they're not strictly necessary, to refer to symbols in the text section. This is a pessimization and it's a problem for some loaders that don't support relocs yet. Reviewers: nhaehnle, arsenm, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65813 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370667 91177308-0d34-0410-b5e6-96231b3b80d8	2019-09-02 14:40:57 +00:00
Matt Arsenault	295d2ad713	AMDGPU: Disambiguate v3f16 format in load/store tables Currently the searchable tables report the number of dwords. These round to the same number for 3 and 4 component d16 instructions. Change this to report the number of elements so this isn't ambiguous. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369202 91177308-0d34-0410-b5e6-96231b3b80d8	2019-08-18 00:20:43 +00:00
Stanislav Mekhanoshin	ac72201f67	[AMDGPU] Fixed occupancy calculation for gfx10 Differential Revision: https://reviews.llvm.org/D65010 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366616 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-19 21:29:51 +00:00
Dmitry Preobrazhensky	e201b95d1a	[AMDGPU][MC][GFX9][GFX10] Added support of GET_DOORBELL message Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D64729 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366071 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-15 15:12:16 +00:00
Stanislav Mekhanoshin	bc8cb016c6	[AMDGPU] gfx908 mAI instructions, MC part Differential Revision: https://reviews.llvm.org/D64446 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365563 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-09 21:43:09 +00:00
Dmitry Preobrazhensky	71895e64c4	[AMDGPU][MC] Corrected parsing of FLAT offset modifier Summary of changes: - simplified handling of FLAT offset: offset_s13 and offset_u12 have been replaced with flat_offset; - provided information about error position for pre-gfx9 targets; - improved errors handling. Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D64244 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365321 91177308-0d34-0410-b5e6-96231b3b80d8	2019-07-08 14:27:37 +00:00
Dmitry Preobrazhensky	539c69ef52	[AMDGPU][MC] Fix 2 for sanitizer failure in 364645 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364656 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 16:28:46 +00:00
Dmitry Preobrazhensky	3c357a49a3	[AMDGPU][MC] Enabled constant expressions as operands of sendmsg See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D62735 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364645 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-28 14:14:02 +00:00
Stanislav Mekhanoshin	c33d0e6650	[AMDGPU] gfx1010 wave32 metadata Differential Revision: https://reviews.llvm.org/D63207 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363577 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-17 16:48:56 +00:00
Stanislav Mekhanoshin	5c3ee716e4	[AMDGPU] gfx1010 base changes for wave32 Differential Revision: https://reviews.llvm.org/D63293 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363299 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-13 19:18:29 +00:00
Dmitry Preobrazhensky	e4a2667a65	[AMDGPU][MC] Enabled constant expressions as operands of s_getreg/s_setreg See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61125 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363255 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-13 12:46:37 +00:00
Piotr Sobczak	944600b17a	[AMDGPU] Optimize image_[load\|store]_mip Summary: Replace image_load_mip/image_store_mip with image_load/image_store if lod is 0. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63073 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362957 91177308-0d34-0410-b5e6-96231b3b80d8	2019-06-10 15:58:51 +00:00
Stanislav Mekhanoshin	6ae9485eb1	[AMDGPU] gfx1010 utility functions Differential Revision: https://reviews.llvm.org/D61094 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359224 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-25 18:53:41 +00:00
Stanislav Mekhanoshin	f43d543c45	[AMDGPU] Add gfx1010 target definitions Differential Revision: https://reviews.llvm.org/D61041 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359113 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-24 17:03:15 +00:00
Matt Arsenault	234b3a117e	AMDGPU: Remove dx10-clamp from subtarget features Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and Introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without definining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357302 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-29 19:14:54 +00:00
Tim Renouf	7684aab92a	[AMDGPU] Added v5i32 and v5f32 register classes They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords. Differential Revision: https://reviews.llvm.org/D58903 Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356735 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-22 10:11:21 +00:00
Tim Renouf	a047778b62	[AMDGPU] Support for v3i32/v3f32 Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet. SI (gfx6) does not have dwordx3 instructions, so they are not enabled there. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58902 Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356659 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-21 12:01:21 +00:00
Dmitry Preobrazhensky	203bf432d6	[AMDGPU][MC] Enable lds_direct operand for v_readfirstlane_b32, v_readlane_b32 and v_writelane_b32 See bug 40662: https://bugs.llvm.org/show_bug.cgi?id=40662 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58713 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355312 91177308-0d34-0410-b5e6-96231b3b80d8	2019-03-04 12:48:32 +00:00
Matt Arsenault	163f2c349f	AMDGPU: Remove GCN features and predicates These are no longer necessary since the R600 tablegen files are split out now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353548 91177308-0d34-0410-b5e6-96231b3b80d8	2019-02-08 19:18:01 +00:00
Chandler Carruth	6b547686c5	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8	2019-01-19 08:50:56 +00:00
Neil Henning	4778d2ba93	[AMDGPU] Extend the SI Load/Store optimizer to combine more things. I've extended the load/store optimizer to be able to produce dwordx3 loads and stores, This change allows many more load/stores to be combined, and results in much more optimal code for our hardware. Differential Revision: https://reviews.llvm.org/D54042 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348937 91177308-0d34-0410-b5e6-96231b3b80d8	2018-12-12 16:15:21 +00:00
Nicolai Haehnle	e3924b1c15	AMDGPU: Divergence-driven selection of scalar buffer load intrinsics Summary: Moving SMRD to VMEM in SIFixSGPRCopies is rather bad for performance if the load is really uniform. So select the scalar load intrinsics directly to either VMEM or SMRD buffer loads based on divergence analysis. If an offset happens to end up in a VGPR -- either because a floating point calculation was involved, or due to other remaining deficiencies in SIFixSGPRCopies -- we use v_readfirstlane. There is some unrelated churn in tests since we now select MUBUF offsets in a unified way with non-scalar buffer loads. Change-Id: I170e6816323beb1348677b358c9d380865cd1a19 Reviewers: arsenm, alex-t, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53283 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348050 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-30 22:55:38 +00:00
Nicolai Haehnle	f2ec2633c5	AMDGPU/InsertWaitcnts: Untangle some semi-global state Summary: Reduce the statefulness of the algorithm in two ways: 1. More clearly split generateWaitcntInstBefore into two phases: the first one which determines the required wait, if any, without changing the ScoreBrackets, and the second one which actually inserts the wait and updates the brackets. 2. Communicate pre-existing s_waitcnt instructions using an argument to generateWaitcntInstBefore instead of through the ScoreBrackets. To simplify these changes, a Waitcnt structure is introduced which carries the counts of an s_waitcnt instruction in decoded form. There are some functional changes: 1. The FIXME for the VCCZ bug workaround was implemented: we only wait for SMEM instructions as required instead of waiting on all counters. 2. We now properly track pre-existing waitcnt's in all cases, which leads to less conservative waitcnts being emitted in some cases. s_load_dword ... s_waitcnt lgkmcnt(0) <-- pre-existing wait count ds_read_b32 v0, ... ds_read_b32 v1, ... s_waitcnt lgkmcnt(0) <-- this is too conservative use(v0) more code use(v1) This increases code size a bit, but the reduced latency should still be a win in basically all cases. The worst code size regressions in my shader-db are: WORST REGRESSIONS - Code Size Before After Delta Percentage 1724 1736 12 0.70 % shaders/private/f1-2015/1334.shader_test [0] 2276 2284 8 0.35 % shaders/private/f1-2015/1306.shader_test [0] 4632 4640 8 0.17 % shaders/private/ue4_elemental/62.shader_test [0] 2376 2384 8 0.34 % shaders/private/f1-2015/1308.shader_test [0] 3284 3292 8 0.24 % shaders/private/talos_principle/1955.shader_test [0] Reviewers: msearles, rampitec, scott.linder, kanarayan Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits, hakzsam Differential Revision: https://reviews.llvm.org/D54226 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347848 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-29 11:06:06 +00:00
Ron Lieberman	322a8075ef	[AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST Add a pass to fixup various vector ISel issues. Currently we handle converting GLOBAL_{LOAD\|STORE}_* and GLOBAL_Atomic_* instructions into their _SADDR variants. This involves feeding the sreg into the saddr field of the new instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347008 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-16 01:13:34 +00:00
Konstantin Zhuravlyov	92d120add2	AMDHSA: More code object v3 fixes: - Make sure IsaInfo::hasCodeObjectV3 returns true only for AMDHSA - Update assembler metadata tests to use v2 by default git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347001 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-15 23:14:23 +00:00
Nicolai Haehnle	69f971eb18	Revert "AMDGPU: Divergence-driven selection of scalar buffer load intrinsics" This reverts commit r344696 for now (except for some test additions). See https://bugs.freedesktop.org/show_bug.cgi?id=108611. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346364 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-07 21:53:43 +00:00
Konstantin Zhuravlyov	7829a6dfd5	AMDGPU: Add sram-ecc feature Differential Revision: https://reviews.llvm.org/D53222 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346177 91177308-0d34-0410-b5e6-96231b3b80d8	2018-11-05 22:44:19 +00:00
Nicolai Haehnle	cc436fd266	AMDGPU: Divergence-driven selection of scalar buffer load intrinsics Summary: Moving SMRD to VMEM in SIFixSGPRCopies is rather bad for performance if the load is really uniform. So select the scalar load intrinsics directly to either VMEM or SMRD buffer loads based on divergence analysis. If an offset happens to end up in a VGPR -- either because a floating point calculation was involved, or due to other remaining deficiencies in SIFixSGPRCopies -- we use v_readfirstlane. There is some unrelated churn in tests since we now select MUBUF offsets in a unified way with non-scalar buffer loads. Change-Id: I170e6816323beb1348677b358c9d380865cd1a19 Reviewers: arsenm, alex-t, rampitec, tpr Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D53283 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344696 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-17 15:37:30 +00:00
Konstantin Zhuravlyov	4d82ce5c27	AMDGPU: Re-apply r341982 after fixing the layering issue Move isa version determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342069 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-12 18:50:47 +00:00
Ilya Biryukov	867f48781f	Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser." This reverts commit r341982. The change introduced a layering violation. Reverting to unbreak our integrate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342023 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-12 07:05:30 +00:00
Konstantin Zhuravlyov	b479681381	AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser. Also switch away from target features to CPU string when determining isa version. This fixes an issue when we output wrong isa version in the object code when features of a particular CPU are altered (i.e. gfx902 w/o xnack used to result in gfx900). Differential Revision: https://reviews.llvm.org/D51890 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341982 91177308-0d34-0410-b5e6-96231b3b80d8	2018-09-11 18:56:51 +00:00
Matt Arsenault	7e212e4168	AMDGPU: Remove remnants of old address space mapping git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341165 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-31 05:49:54 +00:00
Tim Renouf	a53b9eb46e	[AMDGPU] New buffer intrinsics Summary: This commit adds new intrinsics llvm.amdgcn.raw.buffer.load llvm.amdgcn.raw.buffer.load.format llvm.amdgcn.raw.buffer.load.format.d16 llvm.amdgcn.struct.buffer.load llvm.amdgcn.struct.buffer.load.format llvm.amdgcn.struct.buffer.load.format.d16 llvm.amdgcn.raw.buffer.store llvm.amdgcn.raw.buffer.store.format llvm.amdgcn.raw.buffer.store.format.d16 llvm.amdgcn.struct.buffer.store llvm.amdgcn.struct.buffer.store.format llvm.amdgcn.struct.buffer.store.format.d16 llvm.amdgcn.raw.buffer.atomic.* llvm.amdgcn.struct.buffer.atomic.* with the following changes from the llvm.amdgcn.buffer.* intrinsics: * there are separate raw and struct versions: raw does not have an index arg and sets idxen=0 in the instruction, and struct always sets idxen=1 in the instruction even if the index is 0, to allow for the fact that gfx9 does bounds checking differently depending on whether idxen is set; * there is a combined cachepolicy arg (glc+slc) * there are now only two offset args: one for the offset that is included in bounds checking and swizzling, to be split between the instruction's voffset and immoffset fields, and one for the offset that is excluded from bounds checking and swizzling, to go into the instruction's soffset field. The AMDISD::BUFFER_* SD nodes always have an index operand, all three offset operands, combined cachepolicy operand, and an extra idxen operand. The obsolescent llvm.amdgcn.buffer.* intrinsics continue to work. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D50306 Change-Id: If897ea7dc34fcbf4d5496e98cc99a934f62fc205 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340269 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-21 11:07:10 +00:00
Ryan Taylor	a9d3893acc	[AMDGPU] Optimize _L image intrinsic to _LZ when lod is zero Summary: Add _L to _LZ image intrinsic table mapping to table gen. In ISelLowering check if image intrinsic has lod and if it's equal to zero, if so remove lod and change opcode to equivalent mapped _LZ. Change-Id: Ie24cd7e788e2195d846c7bd256151178cbb9ec71 Subscribers: arsenm, mehdi_amini, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49483 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338523 91177308-0d34-0410-b5e6-96231b3b80d8	2018-08-01 12:12:01 +00:00
Tom Stellard	cba2181e77	AMDGPU: Separate R600 and GCN TableGen files Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335942 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-28 23:47:12 +00:00
Scott Linder	43cbf8d92e	[AMDGPU] Update assembler for HSA Code Object v3 Update AMDGPU assembler syntax behind the code-object-v3 feature: * Replace/rename most AMDGPU assembler directives/symbols and document them. * Provide more diagnostics (e.g. values out of range, missing values, repeated values). * Provide path for backwards compatibility, even with underlying descriptor changes. Differential Revision: https://reviews.llvm.org/D47736 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335281 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 19:38:56 +00:00
Nicolai Haehnle	d42a61b0d4	AMDGPU: Select MIMG instructions manually in SITargetLowering Summary: Having TableGen patterns for image intrinsics is hitting limitations: for D16 we already have to manually pre-lower the packing of data values, and we will have to do the same for A16 eventually. Since there is already some custom C++ code anyway, it is arguably easier to just do everything in C++, now that we can use the beefed-up generic tables backend of TableGen to provide all the required metadata and map intrinsics to corresponding opcodes. With this approach, all image intrinsic lowering happens in SITargetLowering::lowerImage. That code is dense due to all the cases that it handles, but it should still be easier to follow than what we had before, by virtue of it all being done in a single location, and by virtue of not relying on the TableGen pattern magic that very few people really understand. This means that we will have MachineSDNodes with MIMG instructions during DAG combining, but that seems alright: previously we had intrinsic nodes instead, but those are similarly opaque to the generic CodeGen infrastructure, and the final pattern matching just did a 1:1 translation to machine instructions anyway. If anything, the fact that we now merge the address words into a vector before DAG combine should be an advantage. Change-Id: I417f26bd88f54ce9781c1668acc01f3f99774de6 Reviewers: arsenm, rampitec, rtaylor, tstellar Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48017 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335228 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 13:36:57 +00:00
Nicolai Haehnle	f6113ec066	AMDGPU: Refactor MIMG instruction TableGen using generic tables Summary: This allows us to access rich information about MIMG opcodes from C++ code. Simplifying the mapping between equivalent opcodes of different data size becomes quite natural. This also flattens the MIMG-related class and multiclass hierarchy a little, and collapses together some of the scaffolding for sample and gather4 opcodes. Change-Id: I1a2549fdc1e881ff100e5393d2d87e73729a0ccd Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48016 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335227 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 13:36:44 +00:00
Nicolai Haehnle	db5003ee85	AMDGPU: Use generic tables instead of SearchableTable Summary: Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48014 Change-Id: Ibb43f90d955275571aff17d0c3ecfb5e5b299641 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335226 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 13:36:33 +00:00
Konstantin Zhuravlyov	299cf5ff6a	AMDHSA: Code object v3 updates - Do not emit following assembler directives: - .hsa_code_object_version - .hsa_code_object_isa - .amd_amdgpu_isa - .amd_amdgpu_hsa_metadata - .amd_amdgpu_pal_metadata - Do not emit .note entries - Cleanup and bring in sync kernel descriptor header file - Emit kernel descriptor into .rodata with appropriate relocations and alignments git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334519 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-12 18:02:46 +00:00
Konstantin Zhuravlyov	7dc086936f	AMDGPU : Recalculate SGPRs when trap handler is supported Differential Revision: https://reviews.llvm.org/D29911 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332523 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-16 20:47:48 +00:00
Adrian Prantl	26b584c691	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-01 15:54:18 +00:00
Matt Arsenault	ac9b3ef76a	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331215 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-30 19:08:16 +00:00
Stanislav Mekhanoshin	eca9ecf8f9	[AMDGPU] Enabled v2.16 literals for VOP3P Literal encoding needs op_sel_hi to select low 16 bit in this case. Differential Revision: https://reviews.llvm.org/D45745 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330230 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-17 23:09:05 +00:00
Konstantin Zhuravlyov	419eb8a73e	AMDGPU: Remove max_scratch_backing_memory_byte_size from kernel header 1. Remove max_scratch_backing_memory_byte_size from kernel header 2. Make it a reserved field 3. Ignore it while parsing assembly for backwards compatibility 4. Bump up minor version of kernel header Differential Revision: https://reviews.llvm.org/D45452 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329620 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-09 20:47:22 +00:00
Nicolai Haehnle	2360e2aafa	AMDGPU: Make isIntrinsicSourceOfDivergence table-driven Summary: This is in preparation for the new dimension-aware image intrinsics, which I'd rather not have to list here by hand. Change-Id: Iaa16e3a635a11283918ce0d9e1e618591b0bf6fa Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44938 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328939 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-01 17:09:14 +00:00
Stanislav Mekhanoshin	0dc2b94c4e	[AMDGPU] Add default ISA version targets In case if -mattr used to modify feature set bits in llvm-mc call getIsaVersion can fail to identify specific ISA due to test mismatch. Adding default fallback tests which will always correctly report at least major version. Differential Revision: https://reviews.llvm.org/D44163 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326825 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-06 18:33:55 +00:00
Alexander Timofeev	77d8d0a7e7	Pass Divergence Analysis data to Selection DAG to drive divergence dependent instruction selection. Differential revision: https://reviews.llvm.org/D35267 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326703 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-05 15:12:21 +00:00
Konstantin Zhuravlyov	4bf0aa0c17	AMDGPU: Bring processors and features in sync with the spec - Remove gfx800 - Make iceland gfx802 - Add xnack to gfx902 Differential Revision: https://reviews.llvm.org/D43355 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325393 91177308-0d34-0410-b5e6-96231b3b80d8	2018-02-16 21:26:25 +00:00

1 2 3

131 Commits