archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Matt Arsenault	6e1ede8ee8	AMDGPU: Re-use TM.getNullPointerValue git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297662 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 20:18:14 +00:00
Matt Arsenault	32cb946c46	AMDGPU: Treat 0 as private null pointer in addrspacecast lowering git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297658 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-13 19:47:31 +00:00
Matt Arsenault	a8ffe4b37c	AMDGPU: Remove packf16 intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297557 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-11 05:51:16 +00:00
Matt Arsenault	dbe625a311	AMDGPU: Keep track of modifiers when converting v_mac to v_mad Since v_max_f32_e64/v_max_f16_e64 can be folded if the target instruction supports the clamp bit, we also need to maintain modifiers when converting v_mac to v_mad. This fixes a rendering issue with Dirt Rally because a v_mac instruction with the clamp bit set was converted to a v_mad but that bit was lost during the conversion. Fixes: `e184e01dd7` ("AMDGPU: Fold FP clamp as modifier bit") Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297556 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-11 05:40:40 +00:00
Stanislav Mekhanoshin	3081264dbe	[AMDGPU] Remove getBidirectionalReasonRank This method inverts the Reason field of a scheduling candidate. It does right comparison between RegCritical and RegExcess, but everything else is broken. In fact it can prefer less strong reason such as Weak over RegCritical because Weak > -RegCritical. The CandReason enum is properly sorted, so just remove artificial ranking. Differential Revision: https://reviews.llvm.org/D30557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297536 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-11 00:29:27 +00:00
Konstantin Zhuravlyov	38890eb0c2	[AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI Differential Revision: https://reviews.llvm.org/D29674 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297499 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-10 19:39:07 +00:00
Yaxun Liu	ab0e0e2181	Rename PT_NOTE namespace name used in AMDGPUPTNote.h Patch by Guansong Zhang. Differential Revision: https://reviews.llvm.org/D30750 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297498 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-10 19:35:43 +00:00
Changpeng Fang	1970df176f	AMDGPU/SI: Disable unrolling in the loop vectorizer if the loop is not vectorized. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D30719 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297328 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-09 00:07:00 +00:00
Matt Arsenault	f06f68a796	AMDGPU: Don't wait at end of block with a trivial successor If there is only one successor, and that successor only has one predecessor the wait can obviously be delayed until uses or the end of the next block. This avoids code quality regressions when there are trivial fallthrough blocks inserted for structurization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297251 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 01:06:58 +00:00
Matt Arsenault	fc8387b8d1	AMDGPU: Constant fold rcp node When doing arcp optimization with a constant denominator, this was leaving behind rcps with constant inputs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297248 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-08 00:48:46 +00:00
Changpeng Fang	2e729706f1	AMDGPU/SI: Do not insert EndCf in an unreachable block Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D22025 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297243 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 23:29:36 +00:00
Daniel Sanders	35c6dd2400	Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297241 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 23:20:35 +00:00
Daniel Sanders	428e17c613	Revert r297177: Change LLT constructor string into an LLT-based object ... More module problems. This time it only showed up in the stage 2 compile of clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile. Somehow, this change causes the build to need Attributes.gen before it's been generated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297188 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 19:21:23 +00:00
Daniel Sanders	86bbf4372b	[globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297177 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 18:32:25 +00:00
Konstantin Zhuravlyov	58580c59ae	Revert "AMDGPU: Set MCAsmInfo::PointerSize" It breaks line tables because the patch is not complete, working on a complete one at the moment This reverts commit r294031. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297118 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-07 04:44:33 +00:00
Jan Vesely	ec8e013baa	AMDGPU/R600: Fix ALU clause markers use detection also exit early on kill instead of redefinition. Differential Revision: https://reviews.llvm.org/D30230 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297060 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-06 20:10:05 +00:00
Krzysztof Parzyszek	88a7ff46b1	Make TargetInstrInfo::isPredicable take a const reference, NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296901 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 18:30:54 +00:00
Dmitry Preobrazhensky	23bacbc32a	[AMDGPU][MC] Fix for Bug 30829 + LIT tests Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction). Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code). Added LIT tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296873 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 14:31:06 +00:00
Matt Arsenault	003f1a56c5	AMDGPU: Fix missing dominator tree dependency git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296842 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-02 23:50:51 +00:00
Matt Arsenault	1a0fc1885e	AMDGPU: Fix types for VOP_I16_I16_I16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296523 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 21:31:45 +00:00
Matt Arsenault	743da63164	AMDGPU: Add definition for v_swap_b32 This is somewhat tricky because there are two pairs of tied operands, and it isn't allowed to be VOP3 encoded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296519 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 21:09:04 +00:00
Matt Arsenault	0911246281	AMDGPU: Add definition for v_xad_u32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296515 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 20:27:30 +00:00
Matt Arsenault	c1a133abee	AMDGPU: Add ds_nop to assembler git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296513 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 20:15:46 +00:00
Matt Arsenault	efc7556475	AMDGPU: Add definitions for ds_{read|write}_b{96|128} It's not clear to me if this is always better than doing ds_write2_b64 This adds the constraint of a 128-bit register input instead of a pair of 64-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296512 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 20:15:43 +00:00
Stanislav Mekhanoshin	0248798c22	[AMDGPU] Add second pass of the scheduler If during scheduling we have identified that we cannot keep optimistic occupancy increase critical register pressure limit and try scheduling of the whole function again. In this case blocks with smaller pressure will have a chance for better scheduling. Differential Revision: https://reviews.llvm.org/D30442 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296506 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 19:20:33 +00:00
Stanislav Mekhanoshin	004075214e	[AMDGPU] New method to estimate register pressure This change introduces new method to estimate register pressure in GCNScheduler. Standard RPTracker gives huge error due to the following reasons: 1. It does not account for live-ins or live-outs if value is not used in the region itself. That creates a huge error in a very common case if there are a lot of live-thu registers. 2. It does not properly count subregs. 3. It assumes a register used as an input operand can be reused as an output. This is not always possible by itself, this is not what RA will finally do in many cases for various reasons not limited to RA's inability to do so, and this is not so if the value is actually a live-thu. In addition we can now see clear separation between live-in pressure which we cannot change with the scheduling and tentative pressure which we can change. Differential Revision: https://reviews.llvm.org/D30439 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296491 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 17:22:39 +00:00
Konstantin Zhuravlyov	b1d063dce5	[AMDGPU] Change amd_kernel_code_t's minor version to 1 - We do emit amd_kernel_code_t v1.1 Differential Revision: https://reviews.llvm.org/D30433 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296489 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 17:17:52 +00:00
Stanislav Mekhanoshin	b57bf30b47	[AMDGPU] Fix read-undef flags when schedule is reverted If two subregs of the same register are defined and we need to revert schedule changing def order, we will end up with both instructions having def,read-undef flags because adjustLaneLiveness() will only set this flag but will not remove it. Fix this by removing read-undef flags before calling adjustLaneLiveness. Differential Revision: https://reviews.llvm.org/D30428 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296484 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 16:26:27 +00:00
Daniel Sanders	1e598cbf73	Revert r296474 - [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it. There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296478 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 15:00:27 +00:00
Daniel Sanders	e0180ef4b8	[globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296474 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-28 14:21:31 +00:00
Matt Arsenault	dd2186aaab	AMDGPU: Use v_med3_{f16|i16|u16} git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296401 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 22:40:39 +00:00
Matt Arsenault	27f4f2f4bc	AMDGPU: Support v2i16/v2f16 packed operations git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 22:15:25 +00:00
Matt Arsenault	563a987b91	AMDGPU: Add some of the new gfx9 VOP3 instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296382 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 21:04:41 +00:00
Matt Arsenault	a4e4156e12	AMDGPU: Support inlineasm for packed instructions Add packed types as legal so they may be used with inlineasm. Keep all operations expanded for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296379 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 20:52:10 +00:00
Matt Arsenault	132ab30572	AMDGPU: Don't fold immediate if clamp/omod are set Doesn't fix any practical problems because clamp/omod are currently folded after peephole optimizer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296375 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 20:21:31 +00:00
Matt Arsenault	dd23defd5c	AMDGPU: Fold omod into instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296372 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 19:35:42 +00:00
Matt Arsenault	29df731fe5	AMDGPU: Add f16 to shader calling conventions Mostly useful for writing tests for f16 features. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296370 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 19:24:47 +00:00
Matt Arsenault	87fd70245a	AMDGPU: Add VOP3P instruction format Add a few non-VOP3P but instructions related to packed. Includes hack with dummy operands for the benefit of the assembler git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296368 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 18:49:11 +00:00
Konstantin Zhuravlyov	9995ddddac	[AMDGPU] Runtime metadata fixes: - Verify that runtime metadata is actually valid runtime metadata when assembling, otherwise we could accept the following when assembling, but ocl runtime will reject it: .amdgpu_runtime_metadata { amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ... - Make IsaInfo optional, and always emit it. Differential Revision: https://reviews.llvm.org/D30349 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296324 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-27 07:55:17 +00:00
Wei Ding	5d1e915557	AMDGPU : Replace FMAD with FMA when denormals are enabled. Differential Revision: http://reviews.llvm.org/D29958 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296186 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-24 23:00:29 +00:00
Stanislav Mekhanoshin	fef0dbe59c	Revert "Correct register pressure calculation in presence of subregs" This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296182 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-24 21:56:16 +00:00
Stanislav Mekhanoshin	186113f5c1	[AMDGPU] Shut the warning "getRegUnitWeight hides overload...". NFC. Clang issues warning about hidden overload. That was intended, so add "using AMDGPUGenRegisterInfo::getRegUnitWeight;" to mute it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296021 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 21:51:28 +00:00
Stanislav Mekhanoshin	0bf4d71d50	Correct register pressure calculation in presence of subregs If a subreg is used in an instruction it counts as a whole superreg for the purpose of register pressure calculation. This patch corrects improper register pressure calculation by examining operand's lane mask. Differential Revision: https://reviews.llvm.org/D29835 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296009 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 20:19:44 +00:00
Jan Vesely	dae323db22	AMDGPU/SI: Fix trunc i16 pattern Hit on ASICs that support 16bit instructions. Differential Revision: https://reviews.llvm.org/D30281 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295990 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 16:12:21 +00:00
Matt Arsenault	210095c5c9	LoadStoreVectorizer: Split even sized illegal chains properly Implement isLegalToVectorizeLoadChain for AMDGPU to avoid producing private address spaces accesses that will need to be split up later. This was doing the wrong thing in the case where the queried chain was an even number of elements. A possible <4 x i32> store was being split into store <2 x i32> store i32 store i32 rather than store <2 x i32> store <2 x i32> when legal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295933 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 03:58:53 +00:00
Matt Arsenault	32a81bbff2	AMDGPU: Add another BFE pattern This is the pattern that falls out of the instruction's definition if offset == 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295912 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-23 00:23:43 +00:00
Matt Arsenault	cd39b42cab	AMDGPU: Use clamp with f64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295908 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-22 23:53:37 +00:00
Matt Arsenault	e184e01dd7	AMDGPU: Fold FP clamp as modifier bit The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295905 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-22 23:27:53 +00:00
Wei Ding	1cfed01e02	AMDGPU : Update TrapCode based on Trap Handler ABI. Differential Revision: http://reviews.llvm.org/D30232 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295904 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-22 23:22:19 +00:00
Matt Arsenault	c2d34b5027	AMDGPU: Add replacement bfe intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295899 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-22 23:04:58 +00:00

1601 Commits