archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Matt Arsenault	6e4fd1497d	AMDGPU: Add feature for unaligned access git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274398 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 23:03:44 +00:00
Duncan P. N. Exon Smith	a354e21338	Target: Remove unused arguments from overrideSchedPolicy, NFC TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr* arguments (begin and end) that invite implicit conversions from MachineInstrBundleIterator. One option would be to change their type to an iterator, but since they don't seem to have been used since the API was added in 2010, I'm deleting the dead code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274304 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 00:23:27 +00:00
Matt Arsenault	364fc298eb	AMDGPU: Fix global isel crashes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274039 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-28 17:42:09 +00:00
Matt Arsenault	b0b8d0af0c	AMDGPU: Fix global isel build git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273964 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-28 00:11:26 +00:00
Matt Arsenault	d35aece639	AMDGPU: Implement per-function subtargets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273940 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 20:48:03 +00:00
Matt Arsenault	dca409d5ad	AMDGPU: Move subtarget feature checks into passes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273937 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 20:32:13 +00:00
Konstantin Zhuravlyov	20c7a48718	[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue. Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format: - offset 0: work group ID x - offset 4: work group ID y - offset 8: work group ID z - offset 16: work item ID x - offset 20: work item ID y - offset 24: work item ID z Set - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled Differential Revision: http://reviews.llvm.org/D20335 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273769 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-25 03:11:28 +00:00
Matt Arsenault	9af2418e41	AMDGPU: Remove disable-irstructurizer subtarget feature The only real reason to use it is for testing, so replace it with a command line option instead of a potentially function dependent feature. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273653 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:30:22 +00:00
Matt Arsenault	759ed7e410	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:30:11 +00:00
Matt Arsenault	6ef04515db	AMDGPU: Make FrameLowering stack alignment 16 We don't need it to be that high. The natural alignment for a single workitem's stack is 16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273448 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-22 17:47:39 +00:00
Matt Arsenault	9d6fa96f46	AMDGPU: Fix crashes on unknown processor name If the processor name failed to parse for amdgcn, the resulting output would have R600 ISA in it. If the processor name was missing or invalid for R600, the wavefront size would not be set and there would be crashes from missing itinerary data. Fixes crashes in future commit caused by dividing by the unset/0 wavefront size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271561 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 18:37:16 +00:00
Changpeng Fang	faf7289db3	AMDGPU/SI: Enable load-store-opt by default. Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270894 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-26 19:35:29 +00:00
Konstantin Zhuravlyov	d7b9b912dd	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs Differential Revision: http://reviews.llvm.org/D20081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270594 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-24 18:37:18 +00:00
Matt Arsenault	7985e4be56	AMDGPU: Fix promote alloca pass creating huge arrays This was assuming it could use all memory before, which is a bad decision because it restricts occupancy. By default, only try to use enough space that could reduce occupancy to 7, an arbitrarily chosen limit. Based on the existing LDS usage, try to round up to the limit in the current tier instead of further hurting occupancy. This isn't ideal, because it doesn't accurately know how much space is going to be used for alignment padding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269708 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-16 21:19:59 +00:00
Matt Arsenault	3e43181f87	AMDGPU: Change private_element_size to 4 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269145 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-11 00:28:54 +00:00
Konstantin Zhuravlyov	d714ad3a0f	[AMDGPU] Reserve VGPRs for trap handler usage if instructed Differential Revision: http://reviews.llvm.org/D19235 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267563 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-26 15:43:14 +00:00
Konstantin Zhuravlyov	5d42fbaf4c	[AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt Also, - Skip pass if machine module does not have debug info - Minor comment changes - Added test Differential Revision: http://reviews.llvm.org/D19079 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266626 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-18 16:28:23 +00:00
Tom Stellard	65b55414cc	AMDGPU: Add skeleton GlobalIsel implementation Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266356 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-14 19:09:28 +00:00
Nicolai Haehnle	ea7a0c0467	AMDGPU: Add a shader calling convention This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-06 19:40:20 +00:00
Tom Stellard	d3adac51fc	AMDGPU/SI: Enable lanemask tracking in misched Summary: This results in higher register usage, but should make it easier for the compiler to hide latency. This pass is a prerequisite for some more scheduler improvements, and I think the increased register usage with this patch is acceptable, because when combined with the scheduler improvements, the total register usage will decrease. shader-db stats: 2382 shaders in 478 tests Totals: SGPRS: 48672 -> 49088 (0.85 %) VGPRS: 34148 -> 34847 (2.05 %) Code Size: 1285816 -> 1289128 (0.26 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 492544 -> 573440 (16.42 %) bytes per wave Max Waves: 6856 -> 6846 (-0.15 %) Wait states: 0 -> 0 (0.00 %) Depends on D18451 Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18452 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264876 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-30 16:35:09 +00:00
Matt Arsenault	26419a11ad	AMDGPU: More bits of frame index are known to be zero The maximum private allocation for the whole GPU is 4G, so the maximum possible index for a single workitem is the maximum size divided by the smallest granularity for a dispatch. This increases the number of known zero high bits, which enables more offset folding. The maximum private size per workitem with this is 128M but may be smaller still. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262153 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 20:26:57 +00:00
Matt Arsenault	8dfc553b91	AMDGPU: Split vi-insts subtarget feature This will be more useful for marking builtins acceptable for which subtargets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262121 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 08:53:55 +00:00
Matt Arsenault	a164276e20	AMDGPU: Implement readcyclecounter This matches the behavior of the HSAIL clock instruction. s_memrealtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 08:53:46 +00:00
Matt Arsenault	e3601c75c9	AMDGPU: Set element_size in private resource descriptor Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260651 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 02:40:47 +00:00
Matt Arsenault	6a2bf372b8	AMDGPU: Match some med3 patterns git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259089 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 20:53:42 +00:00
Matt Arsenault	de2c3bc98d	AMDGPU: Fix default device handling When no device name is specified, default to kaveri for HSA since SI is not supported and it would fail. Default to "tahiti" instead of "SI" since these are effectively the same, and tahiti is an actual device. Move default device handling to the TargetMachine rather than the AMDGPUSubtarget. The module ISA version is computed from the device name provided with the target machine, so the attributes printed by the AsmPrinter were inconsistent with those computed in the subtarget. Also remove DevName field from subtarget since it's redundant with getCPU() in the superclass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258901 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-27 02:17:49 +00:00
Matt Arsenault	2b74ecaae4	AMDGPU: Remove Feature64BitPtr This is a leftover from AMDIL that doesn't do anything and doesn't belong here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258606 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-23 05:32:14 +00:00
Tom Stellard	72304925ab	AMDGPU/SI: Pass whether to use the SI scheduler via Target Attribute Summary: Currently the SI scheduler can be selected via command line option, but it turned out it would be better if it was selectable via a Target Attribute. This patch adds "si-scheduler" attribute to the backend. Reviewers: tstellarAMD, echristo Subscribers: echristo, arsenm Differential Revision: http://reviews.llvm.org/D16192 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258386 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-21 04:28:34 +00:00
Matt Arsenault	5fc0b0f07e	AMDGPU: Add subtarget feature for instruction rates git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258085 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-18 21:13:50 +00:00
Nicolai Haehnle	fac1bfe37d	AMDGPU: add +xnack feature Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically. The hardware only requires this reservation when the XNACK feature is explicitly enabled. At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test). Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15869 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256794 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-04 23:35:53 +00:00
Changpeng Fang	89e60598f6	AMDGPU/SI: Use flat for global load/store when targeting HSA Summary: For some reason, executing an MUBUF instruction with the addr64 bit set and a zero base pointer in the resource descriptor causes the memory operation to be dropped when the shader is executed using the HSA runtime. This kind of MUBUF instruction is commonly used when the pointer is stored in VGPRs. The base pointer field in the resource descriptor is set to zero and the pointer is stored in the vaddr field. This patch resolves the issue by only using flat instructions for global memory operations when targeting HSA. This is an overly conservative fix as all other configurations of MUBUF instructions appear to work. NOTE: re-commit by fixing a failure in CodeGen/AMDGPU/llvm.dbg.value.ll Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15543 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256282 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-22 20:55:23 +00:00
Rafael Espindola	a00544a653	Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA" This reverts commit r256273. It broke CodeGen/AMDGPU/llvm.dbg.value.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256275 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-22 19:46:44 +00:00
Changpeng Fang	808f9643e6	AMDGPU/SI: Use flat for global load/store when targeting HSA Summary: For some reason, executing an MUBUF instruction with the addr64 bit set and a zero base pointer in the resource descriptor causes the memory operation to be dropped when the shader is executed using the HSA runtime. This kind of MUBUF instruction is commonly used when the pointer is stored in VGPRs. The base pointer field in the resource descriptor is set to zero and the pointer is stored in the vaddr field. This patch resolves the issue by only using flat instructions for global memory operations when targeting HSA. This is an overly conservative fix as all other configurations of MUBUF instructions appear to work. Reviewers: tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15543 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256273 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-22 19:32:28 +00:00
Matt Arsenault	fd920596a2	AMDGPU: Cleanup includes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252328 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-06 18:23:00 +00:00
Matt Arsenault	ade9b95acb	AMDGPU: Create emergency stack slots during frame lowering Test has a bogus verifier error which will be fixed by later commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252327 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-06 18:17:45 +00:00
Daniel Sanders	47b167dd84	Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Eric has replied and has demanded the patch be reverted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247702 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 16:17:27 +00:00
Daniel Sanders	9781f90c7e	Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triples (which are ambiguous) to TargetTuples (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ APIs have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Thanks go to Pavel Labath for fixing LLDB for me. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247692 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 14:08:28 +00:00
Daniel Sanders	a6aa0c3bcc	Revert r247684 - Replace Triple with a new TargetTuple ... LLDB needs to be updated in the same commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247686 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 13:46:21 +00:00
Daniel Sanders	7b82808e13	Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC. Summary: This is the first patch in the series to migrate Triples (which are ambiguous) to TargetTuples (which aren't). For the moment, TargetTuple simply passes all requests to the Triple object it holds. Once it has replaced Triple, it will start to implement the interface in a more suitable way. This change makes some changes to the public C++ API. In particular, InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer() now take TargetTuples instead of Triples. The other public C++ APIs have been left as-is for the moment to reduce patch size. This commit also contains a trivial patch to clang to account for the C++ API change. Reviewers: rengolin Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D10969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247683 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-15 13:17:40 +00:00
Tom Stellard	cac05d9b58	AMDGPU/SI: Use AssertZext node to mask high bit for scratch offsets Summary: We can safely assume that the high bit of scratch offsets will never be set, because this would require at least 128 GB of GPU memory. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11225 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242433 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-16 19:40:07 +00:00
Matt Arsenault	6fe7acaaf8	AMDGPU/SI: Add debugging subtarget feature for DS offsets We don't have a good way to detect most situations where DS offsets are usable on SI, so add an option to force using them even if unsafe for debugging performance problems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-06 16:01:58 +00:00
Tom Stellard	ac1a45e511	AMDGPU/SI: Add hsa code object directives Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10757 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240831 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-26 21:15:07 +00:00
Tom Stellard	953c681473	R600 -> AMDGPU rename git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-13 03:28:10 +00:00
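
Commit 20c7a48718 in the log above describes the debugger prologue as writing work group IDs and work item IDs to scratch memory at fixed offsets. The following is only an illustrative sketch of that layout, not code from the LLVM backend. The offsets are copied from the commit message; the 4-byte little-endian unsigned encoding, the zero-filled gap at bytes 12 to 15, and the names `DEBUGGER_PROLOGUE_OFFSETS` and `pack_debugger_ids` are assumptions made for the example.

```python
import struct

# Offsets (in bytes) copied from the commit message of 20c7a48718.
DEBUGGER_PROLOGUE_OFFSETS = {
    "work_group_id_x": 0,
    "work_group_id_y": 4,
    "work_group_id_z": 8,
    "work_item_id_x": 16,
    "work_item_id_y": 20,
    "work_item_id_z": 24,
}

# Assumption: each ID is a 32-bit little-endian unsigned integer,
# which is what the 4-byte spacing of the offsets suggests.
ID_FORMAT = "<I"
BUFFER_SIZE = 24 + struct.calcsize(ID_FORMAT)  # last offset plus one ID


def pack_debugger_ids(work_group_id, work_item_id):
    """Return a bytes object laid out like the described scratch region.

    work_group_id and work_item_id are (x, y, z) tuples. Bytes 12..15 are
    not mentioned in the commit message and are left as zero here.
    """
    buf = bytearray(BUFFER_SIZE)
    for axis, value in zip("xyz", work_group_id):
        offset = DEBUGGER_PROLOGUE_OFFSETS[f"work_group_id_{axis}"]
        struct.pack_into(ID_FORMAT, buf, offset, value)
    for axis, value in zip("xyz", work_item_id):
        offset = DEBUGGER_PROLOGUE_OFFSETS[f"work_item_id_{axis}"]
        struct.pack_into(ID_FORMAT, buf, offset, value)
    return bytes(buf)
```

For example, `pack_debugger_ids((1, 2, 3), (4, 5, 6))` yields a buffer in which the work item ID x can be read back with `struct.unpack_from("<I", buf, 16)`.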

43 Commits
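
Commit 26419a11ad in the log reasons that a 4G private allocation for the whole GPU, divided by the smallest dispatch granularity, bounds a single workitem's frame index at 128M, which makes more high bits known to be zero. The arithmetic can be sketched as below. The two sizes come from the commit message; the 32-bit index width and the helper name `known_zero_high_bits` are assumptions for illustration, and the divisor of 32 is merely the ratio of the two quoted figures, not a value stated in the commit.

```python
MAX_PRIVATE_TOTAL = 4 * 2**30           # 4G for the whole GPU (commit message)
MAX_PRIVATE_PER_WORKITEM = 128 * 2**20  # 128M per workitem (commit message)

# Ratio of the two quoted figures; the commit message does not state it.
IMPLIED_GRANULARITY = MAX_PRIVATE_TOTAL // MAX_PRIVATE_PER_WORKITEM


def known_zero_high_bits(max_size_bytes, index_bits=32):
    """Count the high bits that are zero in every index below max_size_bytes.

    index_bits=32 is an assumption about the width of the frame index.
    """
    largest_index = max_size_bytes - 1
    return index_bits - largest_index.bit_length()


per_workitem_zero_bits = known_zero_high_bits(MAX_PRIVATE_PER_WORKITEM)
whole_gpu_zero_bits = known_zero_high_bits(MAX_PRIVATE_TOTAL)
```

Under these assumptions the per-workitem bound leaves the top 5 bits of a 32-bit index known to be zero, whereas the whole-GPU bound alone leaves none.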