archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Matthias Braun	f79c57a412	MachineFunction: Return reference for getFrameInfo(); NFC getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277017 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 18:40:00 +00:00
Tom Stellard	04cc0adf58	AMDGPU/SI: Don't use reserved VGPRs for SGPR spilling Summary: We were using reserved VGPRs for SGPR spilling and this was causing some programs with a workgroup size of 1024 to use more than 64 registers, which is illegal. Reviewers: arsenm, mareko, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22032 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276980 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 14:30:43 +00:00
Matt Arsenault	d506595769	AMDGPU: Make AMDGPUMachineFunction fields private ABIArgOffset is a problem because properly setting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276766 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:58 +00:00
Matt Arsenault	4cead0b564	AMDGPU: Expand register indexing pseudos in custom inserter This is to help move SILowerControlFlow to before regalloc. There are a couple of tradeoffs with this. The complete CFG is visible to more passes, the loop body avoids an extra copy of m0, vcc isn't required, and immediate offsets can be shrunk into s_movk_i32. The disadvantage is the register allocator doesn't understand that the single lane's vector is dead within the loop body, so an extra register is used to outlive the loop block when expanding the VGPR -> m0 loop. This also now results in worse waitcnt insertion before the loop instead of after for pending operations at the point of the indexing, but that should be fixed by future improvements to cross block waitcnt insertion. v_movreld_b32's operands are now modeled more correctly since vdst is not a true output. This is kind of a hack to treat vdst as a use operand. Extra checking is required in the verifier since I can't seem to get tablegen to emit an implicit operand for a virtual register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275934 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 00:35:03 +00:00
Matt Arsenault	e066e581b1	AMDGPU: Fix verifier error from partially undef copy In this situation: %VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11, %VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use> %VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3 %VGPR4<def> = COPY %VGPR2 The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1, but VGPR4 is defined immediately after this copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275635 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 22:32:02 +00:00
Jacques Pienaar	48ed4ab2d6	Rename AnalyzeBranch* to analyzeBranch*. Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect. Reviewers: tstellarAMD, mcrosier Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai Differential Revision: https://reviews.llvm.org/D22409 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275564 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 14:41:04 +00:00
Matt Arsenault	27c36c3430	AMDGPU: Cleanup pseudoinstructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275133 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-12 00:23:17 +00:00
Matt Arsenault	8711de225b	AMDGPU: Move R600 only pieces into R600 classes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274979 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-09 18:11:15 +00:00
Matt Arsenault	c39550268e	AMDGPU: Improve offset folding for register indexing git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274954 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-09 01:13:56 +00:00
Matt Arsenault	cb5d6ec789	AMDGPU: Simplify isSchedulingBoundary git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274953 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-09 01:13:51 +00:00
Duncan P. N. Exon Smith	83b2ab7c4c	AMDGPU: Remove implicit iterator conversions, NFC Remove remaining implicit conversions from MachineInstrBundleIterator to MachineInstr* from the AMDGPU backend. In most cases, I made them less attractive by preferring MachineInstr& or using a range-based for loop. Once all the backends are fixed I'll make the operator explicit so that this doesn't bitrot back. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274906 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-08 19:16:05 +00:00
Matt Arsenault	6de48c50fe	AMDGPU: Fix folding SGPRs into madak/madmk src0 Because of the special immediate operand, the constant bus is already used so SGPRs are never useful. r263212 changed the name of the immediate operand, which broke the verifier check for the restriction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274564 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-05 17:09:01 +00:00
Duncan P. N. Exon Smith	567409db69	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274189 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-30 00:01:54 +00:00
Matt Arsenault	bd1991ecf2	AMDGPU: Remove unused function git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274033 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-28 16:59:49 +00:00
Matt Arsenault	759ed7e410	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:30:11 +00:00
Matt Arsenault	8552cfd5ca	AMDGPU: readlane/writelane do not read exec git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273525 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-23 01:26:16 +00:00
NAKAMURA Takumi	96b66d10fe	Reformat blank lines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273131 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 01:05:15 +00:00
NAKAMURA Takumi	82f8dab579	Untabify. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273129 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 00:37:41 +00:00
Changpeng Fang	a3e290966a	AMDGPU/SI: Propagate the Kill flag in storeRegToStackSlot and eliminateFrameIndex Reviewers: arsenm, tstellarAMD Differential Revision: http://reviews.llvm.org/21438 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272958 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-16 21:20:47 +00:00
Tom Stellard	a09ba98fef	AMDGPU/SI: Refactor fixup handling for constant addrspace variables Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Re-commit this after fixing a bug where we were trying to use a reference to a Triple object that had already been destroyed. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272705 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-14 20:29:59 +00:00
Tom Stellard	d8ffcd8311	Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables" This reverts commit r272675. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272677 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-14 15:16:35 +00:00
Tom Stellard	1a5003c59b	AMDGPU/SI: Refactor fixup handling for constant addrspace variables Summary: We now use a standard fixup type applying the pc-relative address of constant address space variables, and we have the GlobalAddress lowering code add the required 4 byte offset to the global address rather than doing it as part of the fixup. This refactoring will make it easier to use the same code for global address space variables and also simplifies the code. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21154 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272675 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-14 15:11:01 +00:00
Marek Olsak	760c36c5ae	AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272556 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-13 16:05:57 +00:00
Matt Arsenault	8cd24fa644	AMDGPU: Fix post-RA verifier errors with trackLivenessAfterRegAlloc The condition reg of the cndmask_b64 expansion can't be killed by the first one, and the implicit super register implicit def is needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272554 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-13 15:53:52 +00:00
Benjamin Kramer	af18e017d2	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272512 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-12 15:39:02 +00:00
Matt Arsenault	f4135c634c	AMDGPU: Add function for getting instruction size git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271936 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 20:10:33 +00:00
Matt Arsenault	66d860c198	AMDGPU: Handle flat in getMemOpBaseRegImmOfs It can still report the base register, and the uses give up when it fails. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271575 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 20:05:20 +00:00
Matt Arsenault	25641e07d0	AMDGPU: Fix incorrectly setting kill flag when copying register tuples This fixes some verifier errors when trackLivenessAfterRegAlloc is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271446 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 00:04:30 +00:00
Matt Arsenault	6416e4c521	AMDGPU: Fix verifier error when spilling SGPRs The current SGPR spilling test does not stress this because it is using s_buffer_load instructions to increase SGPR pressure and spill, but their output operands have the same SReg_32_XM0 constraint. This fixes an error when the SReg_32 output from most instructions is spilled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270301 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:53:42 +00:00
Matt Arsenault	9d922be248	AMDGPU: Handle cbranch vccz/vccnz git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270297 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:29:40 +00:00
Matt Arsenault	dcb6543de5	AMDGPU: Implement ReverseBranchCondition git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270296 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:29:34 +00:00
Matt Arsenault	f91238f391	AMDGPU: Implement AnalyzeBranch Original patch by Tom Stellard git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270295 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:29:27 +00:00
Matt Arsenault	1e466e511b	AMDGPU: Remove verifier check for scc live ins We only really need this to be true for SIFixSGPRCopies. I'm not sure there's any way this could happen before that point. Fixes a case where MachineCSE could introduce a cross block scc use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269391 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-13 04:15:48 +00:00
Tom Stellard	83f4a25d58	AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260 We can't use MI->getDebugLoc() when MI is an iterator that could be MBB.end(). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268265 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-02 18:02:24 +00:00
Tom Stellard	6ab99c7ca6	AMDGPU/SI: Enable the post-ra scheduler Summary: This includes a hazard recognizer implementation to replace some of the hazard handling we had during frame index elimination. Reviewers: arsenm Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18602 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268143 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-30 00:23:06 +00:00
Tom Stellard	ac19ae8d63	AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions Summary: These instructions can add an immediate offset to the address, like other ds instructions. Reviewers: arsenm Subscribers: arsenm, scchan Differential Revision: http://reviews.llvm.org/D19233 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268043 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-29 14:34:26 +00:00
Etienne Bergeron	3f30242bbd	Fix incorrect redundant expression in target AMDGPU. Summary: The expression is detected as a redundant expression. It turns out this is probably a bug. ``` /home/etienneb/llvm/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:306:26: warning: both side of operator are equivalent [misc-redundant-expression] if (isSMRD(FirstLdSt) && isSMRD(FirstLdSt)) { ``` Reviewers: rnk, tstellarAMD Subscribers: arsenm, cfe-commits Differential Revision: http://reviews.llvm.org/D19460 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267415 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-25 15:06:33 +00:00
Nicolai Haehnle	3441786e27	AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic Summary: This intrinsic returns true if the current thread belongs to a live pixel and false if it belongs to a pixel that we are executing only for derivative computation. It will be used by Mesa to implement gl_HelperInvocation. Note that for pixels that are killed during the shader, this implementation also returns true, but it doesn't matter because those pixels are always disabled in the EXEC mask. This unearthed a corner case in the instruction verifier, which complained about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but correct code, so make the verifier accept it as such. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19191 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267102 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-22 04:04:08 +00:00
Nicolai Haehnle	fea41fb59c	AMDGPU: Guard VOPC instructions against incorrect commute Summary: The added testcase, which triggered this, was derived from a shader-db case via bugpoint. A separate question is why scalar branching wasn't used. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19208 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266825 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-19 21:58:22 +00:00
Jun Bum Lim	232aafceb5	[MachineScheduler]Add support for store clustering Perform store clustering just like load clustering. This change adds StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now. This change also adds support for unscaled stores which were not handled in getMemOpBaseRegImmOfs(). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266437 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-15 14:58:38 +00:00
Matt Arsenault	6d955a1d6d	AMDGPU: Run SIFoldOperands after PeepholeOptimizer PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective. shader-db stats: Totals: SGPRS: 34200 -> 34336 (0.40 %) VGPRS: 22118 -> 21655 (-2.09 %) Code Size: 632144 -> 633460 (0.21 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 10240 -> 11264 (10.00 %) bytes per wave Max Waves: 8822 -> 8918 (1.09 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 7704 -> 7840 (1.77 %) VGPRS: 5169 -> 4706 (-8.96 %) Code Size: 234444 -> 235760 (0.56 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 0 -> 1024 (0.00 %) bytes per wave Max Waves: 1188 -> 1284 (8.08 %) Wait states: 0 -> 0 (0.00 %) Increases: SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %) Decreases: SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266378 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-14 21:58:24 +00:00
Tom Stellard	90d87693b6	AMDGPU/SI: Fix spilling of 96-bit registers Summary: It seems like this was broken in r252327. I thought we had test cases for this, but it's really hard to trigger spills of this exact register size since they aren't used very much. Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19021 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266152 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-12 23:57:30 +00:00
Tom Stellard	c2d9280e43	AMDGPU/SI: Add MachineBasicBlock parameter to SIInstrInfo::insertWaitStates Summary: This makes it possible to insert nops at the end of blocks. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18549 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265678 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-07 14:47:07 +00:00
Nicolai Haehnle	ea7a0c0467	AMDGPU: Add a shader calling convention This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-06 19:40:20 +00:00
Matthias Braun	dc2f859a3f	RegisterScavenger: Take a reference as enterBasicBlock() argument. Make it obvious that the argument cannot be nullptr. Remove an unnecessary nullptr check in initRegState. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265511 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-06 02:47:09 +00:00
Tom Stellard	4a79dec8e2	AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructions Summary: This helps prevent load clustering from drastically increasing register pressure by trying to cluster 4 SMRDx8 loads together. The limit of 16 bytes was chosen, because it seems like that was the original intent of setting the limit to 4 instructions, but more analysis could show that a different limit is better. This yields small decreases in register usage with shader-db, but also helps avoid a large increase in register usage when lane mask tracking is enabled in the machine scheduler, because lane mask tracking enables more opportunities for load clustering. shader-db stats: 2379 shaders in 477 tests Totals: SGPRS: 49744 -> 48600 (-2.30 %) VGPRS: 34120 -> 34076 (-0.13 %) Code Size: 1282888 -> 1283184 (0.02 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 495616 -> 492544 (-0.62 %) bytes per wave Max Waves: 6843 -> 6853 (0.15 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18451 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-28 16:10:13 +00:00
Nicolai Haehnle	f0b7f107b9	AMDGPU: Add SIWholeQuadMode pass Summary: Whole quad mode is already enabled for pixel shaders that compute derivatives, but it must be suspended for instructions that cause a shader to have side effects (i.e. stores and atomics). This pass addresses the issue by storing the real (initial) live mask in a register, masking EXEC before instructions that require exact execution and (re-)enabling WQM where required. This pass is run before register coalescing so that we can use machine SSA for analysis. The changes in this patch expose a problem with the second machine scheduling pass: target independent instructions like COPY implicitly use EXEC when they operate on VGPRs, but this fact is not encoded in the MIR. This can lead to miscompilation because instructions are moved past changes to EXEC. This patch fixes the problem by adding use-implicit operands to target independent instructions. Some general codegen passes are relaxed to work with such implicit use operands. Reviewers: arsenm, tstellarAMD, mareko Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18162 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263982 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-21 20:28:33 +00:00
Michel Danzer	eda4451117	AMDGPU/SI: Clean up indentation in SIInstrInfo::getDefaultRsrcDataFormat Reviewed-by: Tom Stellard <thomas.stellard@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263626 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-16 09:10:35 +00:00
Nikolay Haustov	63cffd62c1	[AMDGPU] Assembler: change v_madmk operands to have same order as mad. The constant is now at source operand 1 (previously at 2). This is also how it is in legacy AMD sp3 assembler. Update tests. Differential Revision: http://reviews.llvm.org/D17984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263212 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-11 09:27:25 +00:00
Chad Rosier	cd3a68c781	[TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC. http://reviews.llvm.org/D17967 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263021 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-09 16:00:35 +00:00


126 Commits