archived-llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2026-01-31 01:35:20 +01:00

Author	SHA1	Message	Date
Dmitry Preobrazhensky	be2eb2d0a8	[AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152 Reviewers: SamWot, artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D36674 llvm-svn: 311006	2017-08-16 13:51:56 +00:00
Quentin Colombet	862e639dc6	Reapply "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310425, thus reapplying r310335 with a fix for link issue of the AArch64 unittests on Linux bots when BUILD_SHARED_LIBS is ON. Original commit message: [GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism all together. NFC. ---- The fix for the link issue consists in adding the GlobalISel library in the list of dependencies for the AArch64 unittests. This dependency comes from the use of AArch64Subtarget that needs to know how to destruct the GISel related APIs when being detroyed. Thanks to Bill Seurer and Ahmed Bougacha for helping me reproducing and understand the problem. llvm-svn: 310969	2017-08-15 22:31:51 +00:00
Quentin Colombet	3f63039f98	Revert "[GlobalISel] Remove the GISelAccessor API." This reverts commit r310115. It causes a linker failure for the one of the unittests of AArch64 on one of the linux bot: http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/3429 : && /home/fedora/gcc/install/gcc-7.1.0/bin/g++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -std=c++11 -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -ffunction-sections -fdata-sections -O2 -L/home/fedora/gcc/install/gcc-7.1.0/lib64 -Wl,-allow-shlib-undefined -Wl,-O3 -Wl,--gc-sections unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o -o unittests/Target/AArch64/AArch64Tests lib/libLLVMAArch64CodeGen.so.6.0.0svn lib/libLLVMAArch64Desc.so.6.0.0svn lib/libLLVMAArch64Info.so.6.0.0svn lib/libLLVMCodeGen.so.6.0.0svn lib/libLLVMCore.so.6.0.0svn lib/libLLVMMC.so.6.0.0svn lib/libLLVMMIRParser.so.6.0.0svn lib/libLLVMSelectionDAG.so.6.0.0svn lib/libLLVMTarget.so.6.0.0svn lib/libLLVMSupport.so.6.0.0svn -lpthread lib/libgtest_main.so.6.0.0svn lib/libgtest.so.6.0.0svn -lpthread -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && : unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x0): undefined reference to `vtable for llvm::LegalizerInfo' unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/InstSizes.cpp.o:(.toc+0x8): undefined reference to `vtable for llvm::RegisterBankInfo' The particularity of this bot is that it is built with BUILD_SHARED_LIBS=ON However, I was not able to reproduce the problem so far. Reverting to unblock the bot. llvm-svn: 310425	2017-08-08 22:22:30 +00:00
Matt Arsenault	5dfc642fbd	AMDGPU: Cleanup subtarget features Try to avoid mutually exclusive features. Don't use a real default GPU, and use a fake "generic". The goal is to make it easier to see which set of features are incompatible between feature strings. Most of the test changes are due to random scheduling changes from not having a default fullspeed model. llvm-svn: 310258	2017-08-07 14:58:04 +00:00
Quentin Colombet	0a7c56803e	[GlobalISel] Remove the GISelAccessor API. Its sole purpose was to avoid spreading around ifdefs related to building global-isel. Since r309990, GlobalISel is not optional anymore, thus, we can get rid of this mechanism all together. NFC. llvm-svn: 310115	2017-08-04 20:15:46 +00:00
Quentin Colombet	7b3de01101	[GlobalISel] Make GlobalISel a non-optional library. With this change, the GlobalISel library gets always built. In particular, this is not possible to opt GlobalISel out of the build using the LLVM_BUILD_GLOBAL_ISEL variable any more. llvm-svn: 309990	2017-08-03 21:52:25 +00:00
Matt Arsenault	4165e37f67	AMDGPU: Add encoding for carryless add/sub instructions llvm-svn: 308639	2017-07-20 17:42:47 +00:00
Konstantin Zhuravlyov	12b4051c8d	AMDGPU: Fix amdgpu-flat-work-group-size/amdgpu-waves-per-eu check Differential Revision: https://reviews.llvm.org/D35433 llvm-svn: 308147	2017-07-16 19:38:47 +00:00
Simon Pilgrim	36101c148d	[AMDGPU] Fix -Wimplicit-fallthrough warnings. NFCI. llvm-svn: 307381	2017-07-07 10:18:57 +00:00
Quentin Colombet	053ca5f07d	[AMDGPU] Move GISel accessor initialization from TargetMachine to Subtarget. NFC llvm-svn: 307186	2017-07-05 18:40:56 +00:00
Sam Kolton	48e96ee80f	[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass. Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye Differential Revision: https://reviews.llvm.org/D34626 llvm-svn: 306413	2017-06-27 15:02:23 +00:00
Sam Kolton	076a1edc25	[AMDGPU] SDWA: add support for GFX9 in peephole pass Summary: Added support based on merged SDWA pseudo instructions. Now peephole allow one scalar operand, omod and clamp modifiers. Added several subtarget features for GFX9 SDWA. This diff also contains changes from D34026. Depends D34026 Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D34241 llvm-svn: 305986	2017-06-22 06:26:41 +00:00
Konstantin Zhuravlyov	55508a871a	AMDGPU: Make auto waitcnt before barrier a feature Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571	2017-06-02 17:40:26 +00:00
Matt Arsenault	d3375cc250	AMDGPU: Add new subtarget features for gfx9 flat instructions Flat instructions gain an immediate offset, and 2 new sets of segment specific flat instructions are added. llvm-svn: 302729	2017-05-10 21:19:05 +00:00
Stanislav Mekhanoshin	51dba4fa40	[AMDGPU] Generate range metadata for workitem id If workgroup size is known inform llvm about range returned by local id and local size queries. Differential Revision: https://reviews.llvm.org/D31804 llvm-svn: 300102	2017-04-12 20:48:56 +00:00
Matt Arsenault	8dbeb825a7	AMDGPU: Fix crash when disassembling VOP3 mac The unused dummy src2_modifiers is missing, so it crashes when trying to print it. I tried to fully remove src2_modifiers, but there are some irritations in the places where it is converted to mad since it starts to require modifying use lists while iterating over them. llvm-svn: 299861	2017-04-10 17:58:06 +00:00
Yaxun Liu	da52f0e643	[AMDGPU] Get address space mapping by target triple environment As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which are not depend on target triple, use static const members, so that they don't occupy extra memory space and is equivalent to a compile time constant. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846	2017-03-27 14:04:01 +00:00
Matt Arsenault	96b9e12990	AMDGPU: Add VOP3P instruction format Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler llvm-svn: 296368	2017-02-27 18:49:11 +00:00
Matt Arsenault	73b8eb1cc6	AMDGPU: Redefine clamp node as clamp 0.0-1.0 Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled. Also allow using clamp with f16, and use knowledge of dx10_clamp. llvm-svn: 295788	2017-02-21 23:35:48 +00:00
Matt Arsenault	1f7b67f9b4	AMDGPU: Fix assembler subtarget predicate for gfx9 This was accepting GFX9 instructions on VI. llvm-svn: 295557	2017-02-18 19:12:26 +00:00
Matt Arsenault	a207e31c14	AMDGPU: Merge initial gfx9 support llvm-svn: 295554	2017-02-18 18:29:53 +00:00
Alexander Timofeev	ede4a3e0f8	Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track" This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907. llvm-svn: 295054	2017-02-14 14:29:05 +00:00
Wei Ding	3609e1230f	AMDGPU : Add trap handler support. Differential Revision: http://reviews.llvm.org/D26010 llvm-svn: 294692	2017-02-10 02:15:29 +00:00
Konstantin Zhuravlyov	f4d4c79e96	[AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement Differential Revision: https://reviews.llvm.org/D29741 llvm-svn: 294627	2017-02-09 21:33:23 +00:00
Konstantin Zhuravlyov	12928e55f8	[AMDGPU] Add target information that is required by tools to metadata Differential Revision: https://reviews.llvm.org/D28760#fb670e28 llvm-svn: 294449	2017-02-08 14:05:23 +00:00
Konstantin Zhuravlyov	a802442d9d	[AMDGPU][NFC] De-tabify llvm-svn: 294445	2017-02-08 13:29:23 +00:00
Konstantin Zhuravlyov	9fccb13d0b	[AMDGPU] Move register related queries to subtarget class Differential Revision: https://reviews.llvm.org/D29318 llvm-svn: 294440	2017-02-08 13:02:33 +00:00
Alexander Timofeev	325b448b53	[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track lane masks. Differential revision: https://reviews.llvm.org/D29442 llvm-svn: 294324	2017-02-07 17:57:48 +00:00
Stanislav Mekhanoshin	19e9bdea6f	[AMDGPU] Account workgroup size in LDS occupancy limits Functions matching LDS use to occupancy return results for a workgroup of 64 workitems. The numbers has to be adjusted for bigger workgroups. For example a workgroup of size 256 already occupies 4 waves just by itself. Given that all numbers of LDS use in the compiler are per workgroup, occupancy shall be multiplied by 4 in this case. Each 64 workitems still limited by the same number, but 4 subrgoups 64 workitems each can afford 4 times more LDS to get the same occupancy. In addition change initializes LDS size in the subtarget to a real value for SI+ targets. This is required since LDS size is a variable in these calculations. Differential Revision: https://reviews.llvm.org/D29423 llvm-svn: 293837	2017-02-01 22:59:50 +00:00
Matt Arsenault	9317a1de75	AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands Accomplishes what r292982 was supposed to, which ended up only really making the necessary test changes. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 293310	2017-01-27 17:42:26 +00:00
Tom Stellard	bbf29e433b	AMDGPU add support for spilling to a user sgpr pointed buffers Summary: This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1]. Patch By: Dave Airlie Reviewers: nhaehnle, arsenm, tstellarAMD Reviewed By: arsenm Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D25428 llvm-svn: 293000	2017-01-25 01:25:13 +00:00
Matt Arsenault	81a9bfe915	Enable FeatureFlatForGlobal on Volcanic Islands This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982	2017-01-24 22:02:15 +00:00
Matt Arsenault	bd33194651	AMDGPU: Combine fp16/fp64 subtarget features The same control register controls both, and are set to the same defaults. Keep the old names around as aliases. llvm-svn: 292837	2017-01-23 22:31:03 +00:00
Benjamin Kramer	81b88ec2bc	Pacify -Wreorder. llvm-svn: 292599	2017-01-20 10:37:53 +00:00
Sam Kolton	1310b4c7b3	[AMDGPU] Add subtarget features for SDWA/DPP Reviewers: vpykhtin, artem.tamazov, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28900 llvm-svn: 292596	2017-01-20 10:01:25 +00:00
Eugene Zelenko	c816ae3436	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 289475	2016-12-12 22:23:53 +00:00
Alexander Timofeev	3a9e77fc0f	[AMDGPU] Scalarization of global uniform loads. Summary: LC can currently select scalar load for uniform memory access basing on readonly memory address space only. This restriction originated from the fact that in HW prior to VI vector and scalar caches are not coherent. With MemoryDependenceAnalysis we can check that the memory location corresponding to the memory operand of the LOAD is not clobbered along the all paths from the function entry. Reviewers: rampitec, tstellarAMD, arsenm Subscribers: wdng, arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D26917 llvm-svn: 289076	2016-12-08 17:28:47 +00:00
Konstantin Zhuravlyov	a5d550fe9d	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753	2016-11-13 07:01:11 +00:00
Matt Arsenault	3ee7b5cf1b	AMDGPU: Use 1/2pi inline imm on VI I'm guessing at how it is supposed to be printed llvm-svn: 285490	2016-10-29 04:05:06 +00:00
Matt Arsenault	d438707f8d	AMDGPU: Diagnose using too many SGPRs This is possible when using inline asm. llvm-svn: 285447	2016-10-28 20:31:47 +00:00
Tom Stellard	19c9255423	AMDGPU/SI: Don't allow unaligned scratch access Summary: The hardware doesn't support this. Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D25523 llvm-svn: 284257	2016-10-14 18:10:39 +00:00
Matt Arsenault	cb0c02c980	AMDGPU: Add instruction definitions for VGPR indexing VI added a second method of indexing into VGPRs besides using v_movrel* llvm-svn: 284027	2016-10-12 18:00:51 +00:00
Tom Stellard	a377ff1511	AMDGPU/SI: Include implicit arguments in kernarg_segment_byte_size Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D24835 llvm-svn: 282223	2016-09-23 01:33:26 +00:00
Konstantin Zhuravlyov	0da0753352	[AMDGPU] Wave and register controls - Implemented amdgpu-flat-work-group-size attribute - Implemented amdgpu-num-active-waves-per-eu attribute - Implemented amdgpu-num-sgpr attribute - Implemented amdgpu-num-vgpr attribute - Dynamic LDS constraints are in a separate patch Patch by Tom Stellard and Konstantin Zhuravlyov Differential Revision: https://reviews.llvm.org/D21562 llvm-svn: 280747	2016-09-06 20:22:28 +00:00
Tom Stellard	d773f533b9	AMDGPU/SI: Implement a custom MachineSchedStrategy Summary: GCNSchedStrategy re-uses most of GenericScheduler, it's just uses a different method to compute the excess and critical register pressure limits. It's not enabled by default, to enable it you need to pass -misched=gcn to llc. Shader DB stats: 32464 shaders in 17874 tests Totals: SGPRS: 1542846 -> 1643125 (6.50 %) VGPRS: 1005595 -> 904653 (-10.04 %) Spilled SGPRs: 29929 -> 27745 (-7.30 %) Spilled VGPRs: 334 -> 352 (5.39 %) Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread Code Size: 36688188 -> 37034900 (0.95 %) bytes LDS: 1913 -> 1913 (0.00 %) blocks Max Waves: 254101 -> 265125 (4.34 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 1338220 -> 1438499 (7.49 %) VGPRS: 886221 -> 785279 (-11.39 %) Spilled SGPRs: 29869 -> 27685 (-7.31 %) Spilled VGPRs: 334 -> 352 (5.39 %) Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread Code Size: 34315716 -> 34662428 (1.01 %) bytes LDS: 1551 -> 1551 (0.00 %) blocks Max Waves: 188127 -> 199151 (5.86 %) Wait states: 0 -> 0 (0.00 %) Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D23688 llvm-svn: 279995	2016-08-29 19:42:52 +00:00
Matt Arsenault	7a366c7517	AMDGPU: Fix crashes on memory functions llvm-svn: 278369	2016-08-11 17:31:42 +00:00
Matt Arsenault	842a619c4a	AMDGPU: Delete dead code llvm-svn: 276675	2016-07-25 19:06:25 +00:00
Matt Arsenault	164529a800	AMDGPU: Add feature for unaligned access llvm-svn: 274398	2016-07-01 23:03:44 +00:00
Duncan P. N. Exon Smith	c2d816d704	Target: Remove unused arguments from overrideSchedPolicy, NFC TargetSubtargetInfo::overrideSchedPolicy takes two MachineInstr* arguments (begin and end) that invite implicit conversions from MachineInstrBundleIterator. One option would be to change their type to an iterator, but since they don't seem to have been used since the API was added in 2010, I'm deleting the dead code. llvm-svn: 274304	2016-07-01 00:23:27 +00:00
Matt Arsenault	103dc36b00	AMDGPU: Fix global isel crashes llvm-svn: 274039	2016-06-28 17:42:09 +00:00

1 2

90 Commits