Commit Graph

115 Commits

Author SHA1 Message Date
Matt Arsenault
6de48c50fe AMDGPU: Fix folding SGPRs into madak/madmk src0
Because of the special immediate operand, the constant
bus is already used so SGPRs are never useful.

r263212 changed the name of the immediate operand, which
broke the verifier check for the restriction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274564 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-05 17:09:01 +00:00
Duncan P. N. Exon Smith
567409db69 CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr.  This is a
general API improvement.

Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other.  Instead I've done everything as a block and just
updated what was necessary.

This is mostly mechanical fixes: adding and removing `*` and `&`
operators.  The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency.  Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy.  I couldn't run tests
for AVR since llc doesn't link with it turned on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274189 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 00:01:54 +00:00
Matt Arsenault
bd1991ecf2 AMDGPU: Remove unused function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 16:59:49 +00:00
Matt Arsenault
759ed7e410 AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:30:11 +00:00
Matt Arsenault
8552cfd5ca AMDGPU: readlane/writelane do not read exec
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273525 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 01:26:16 +00:00
NAKAMURA Takumi
96b66d10fe Reformat blank lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273131 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 01:05:15 +00:00
NAKAMURA Takumi
82f8dab579 Untabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273129 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 00:37:41 +00:00
Changpeng Fang
a3e290966a AMDGPU/SI: Propagate the Kill flag in storeRegToStackSlot and eliminateFrameIndex
Reviewers: arsenm, tstellarAMD

Differential Revision:  http://reviews.llvm.org/21438

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272958 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-16 21:20:47 +00:00
Tom Stellard
a09ba98fef AMDGPU/SI: Refactor fixup handling for constant addrspace variables
Summary:
We now use a standard fixup type applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.

This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.

Re-commit this after fixing a bug where we were trying to use a
reference to a Triple object that had already been destroyed.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21154

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272705 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 20:29:59 +00:00
Tom Stellard
d8ffcd8311 Revert "AMDGPU/SI: Refactor fixup handling for constant addrspace variables"
This reverts commit r272675.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272677 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 15:16:35 +00:00
Tom Stellard
1a5003c59b AMDGPU/SI: Refactor fixup handling for constant addrspace variables
Summary:
We now use a standard fixup type applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.

This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21154

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272675 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 15:11:01 +00:00
Marek Olsak
760c36c5ae AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing
Summary:
Mesa and other users must set this to enable coalescing:
- STRIDE = 0
- SWIZZLE_ENABLE = 1

This makes one particular compute shader 8x faster.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D21136

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272556 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 16:05:57 +00:00
Matt Arsenault
8cd24fa644 AMDGPU: Fix post-RA verifier errors with trackLivenessAfterRegAlloc
The condition reg of the cndmask_b64 expansion can't be killed by
the first one, and the implicit super register implicit def is needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272554 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-13 15:53:52 +00:00
Benjamin Kramer
af18e017d2 Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272512 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 15:39:02 +00:00
Matt Arsenault
f4135c634c AMDGPU: Add function for getting instruction size
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271936 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-06 20:10:33 +00:00
Matt Arsenault
66d860c198 AMDGPU: Handle flat in getMemOpBaseRegImmOfs
It can still report the base register, and the uses give up when it
fails.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271575 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-02 20:05:20 +00:00
Matt Arsenault
25641e07d0 AMDGPU: Fix incorrectly setting kill flag when copying register tuples
This fixes some verifier errors when trackLivenessAfterRegAlloc is
enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271446 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-02 00:04:30 +00:00
Matt Arsenault
6416e4c521 AMDGPU: Fix verifier error when spilling SGPRs
The current SGPR spilling test does not stress this
because it is using s_buffer_load instructions to
increase SGPR pressure and spill, but their output
operands have the same SReg_32_XM0 constraint. This fixes
an error when the SReg_32 output from most instructions
is spilled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270301 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-21 00:53:42 +00:00
Matt Arsenault
9d922be248 AMDGPU: Handle cbranch vccz/vccnz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270297 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-21 00:29:40 +00:00
Matt Arsenault
dcb6543de5 AMDGPU: Implement ReverseBranchCondition
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270296 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-21 00:29:34 +00:00
Matt Arsenault
f91238f391 AMDGPU: Implement AnalyzeBranch
Original patch by Tom Stellard

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270295 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-21 00:29:27 +00:00
Matt Arsenault
1e466e511b AMDGPU: Remove verifier check for scc live ins
We only really need this to be true for SIFixSGPRCopies.
I'm not sure there's any way this could happen before that point.

Fixes a case where MachineCSE could introduce a cross block
scc use.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269391 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 04:15:48 +00:00
Tom Stellard
83f4a25d58 AMDGPU/SI: Fix bug in SIInstrInfo::insertWaitStates() uncovered by r268260
We can't use MI->getDebugLoc() when MI is an iterator that could be
MBB.end().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268265 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-02 18:02:24 +00:00
Tom Stellard
6ab99c7ca6 AMDGPU/SI: Enable the post-ra scheduler
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.

Reviewers: arsenm

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268143 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-30 00:23:06 +00:00
Tom Stellard
ac19ae8d63 AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions
Summary:
These instructions can add an immediate offset to the address, like other
ds instructions.

Reviewers: arsenm

Subscribers: arsenm, scchan

Differential Revision: http://reviews.llvm.org/D19233

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 14:34:26 +00:00
Etienne Bergeron
3f30242bbd Fix incorrect redundant expression in target AMDGPU.
Summary:
The expression is detected as a redundant expression.
Turn out, this is probably a bug.

```
/home/etienneb/llvm/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:306:26: warning: both side of operator are equivalent [misc-redundant-expression]
  if (isSMRD(*FirstLdSt) && isSMRD(*FirstLdSt)) {
```

Reviewers: rnk, tstellarAMD

Subscribers: arsenm, cfe-commits

Differential Revision: http://reviews.llvm.org/D19460

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267415 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-25 15:06:33 +00:00
Nicolai Haehnle
3441786e27 AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.

Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.

This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267102 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 04:04:08 +00:00
Nicolai Haehnle
fea41fb59c AMDGPU: Guard VOPC instructions against incorrect commute
Summary:
The added testcase, which triggered this, was derived from a shader-db case
via bugpoint. A separate question is why scalar branching wasn't used.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19208

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266825 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-19 21:58:22 +00:00
Jun Bum Lim
232aafceb5 [MachineScheduler]Add support for store clustering
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.

This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-15 14:58:38 +00:00
Matt Arsenault
6d955a1d6d AMDGPU: Run SIFoldOperands after PeepholeOptimizer
PeepholeOptimizer cleans up redundant copies, which makes
the operand folding more effective.

shader-db stats:

Totals:
SGPRS: 34200 -> 34336 (0.40 %)
VGPRS: 22118 -> 21655 (-2.09 %)
Code Size: 632144 -> 633460 (0.21 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 10240 -> 11264 (10.00 %) bytes per wave
Max Waves: 8822 -> 8918 (1.09 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 7704 -> 7840 (1.77 %)
VGPRS: 5169 -> 4706 (-8.96 %)
Code Size: 234444 -> 235760 (0.56 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 0 -> 1024 (0.00 %) bytes per wave
Max Waves: 1188 -> 1284 (8.08 %)
Wait states: 0 -> 0 (0.00 %)

Increases:
SGPRS: 35 (0.01 %)
VGPRS: 1 (0.00 %)
Code Size: 59 (0.02 %)
LDS: 0 (0.00 %)
Scratch: 1 (0.00 %)
Max Waves: 48 (0.02 %)
Wait states: 0 (0.00 %)

Decreases:
SGPRS: 26 (0.01 %)
VGPRS: 54 (0.02 %)
Code Size: 68 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)
Max Waves: 4 (0.00 %)
Wait states: 0 (0.00 %)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266378 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 21:58:24 +00:00
Tom Stellard
90d87693b6 AMDGPU/SI: Fix spilling of 96-bit registers
Summary:
It seems like this was broken in r252327.  I thought we had test cases
for this, but it's really hard to tirgger spills of this exact register
size since they aren't used very much.

Reviewers: arsenm, nhaehnle

Subscribers: nhaehnle, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19021

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266152 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 23:57:30 +00:00
Tom Stellard
c2d9280e43 AMDGPU/SI: Add MachineBasicBlock parameter to SIInstrInfo::insertWaitStates
Summary: This makes it possible to insert nops at the end of blocks.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265678 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-07 14:47:07 +00:00
Nicolai Haehnle
ea7a0c0467 AMDGPU: Add a shader calling convention
This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.

Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Differential Revision: http://reviews.llvm.org/D18559

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 19:40:20 +00:00
Matthias Braun
dc2f859a3f RegisterScavenger: Take a reference as enterBasicBlock() argument.
Make it obvious that the argument cannot be nullptr.
Remove an unnecessary nullptr check in initRegState.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265511 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 02:47:09 +00:00
Tom Stellard
4a79dec8e2 AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructions
Summary:
This helps prevent load clustering from drastically increasing register
pressure by trying to cluster 4 SMRDx8 loads together.  The limit of 16
bytes was chosen, because it seems like that was the original intent
of setting the limit to 4 instructions, but more analysis could show
that a different limit is better.

This fixes yields small decreases in register usage with shader-db, but
also helps avoid a large increase in register usage when lane mask
tracking is enabled in the machine scheduler, because lane mask tracking
enables more opportunities for load clustering.

shader-db stats:

2379 shaders in 477 tests
Totals:
SGPRS: 49744 -> 48600 (-2.30 %)
VGPRS: 34120 -> 34076 (-0.13 %)
Code Size: 1282888 -> 1283184 (0.02 %) bytes
LDS: 28 -> 28 (0.00 %) blocks
Scratch: 495616 -> 492544 (-0.62 %) bytes per wave
Max Waves: 6843 -> 6853 (0.15 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: nhaehnle, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18451

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264589 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 16:10:13 +00:00
Nicolai Haehnle
f0b7f107b9 AMDGPU: Add SIWholeQuadMode pass
Summary:
Whole quad mode is already enabled for pixel shaders that compute
derivatives, but it must be suspended for instructions that cause a
shader to have side effects (i.e. stores and atomics).

This pass addresses the issue by storing the real (initial) live mask
in a register, masking EXEC before instructions that require exact
execution and (re-)enabling WQM where required.

This pass is run before register coalescing so that we can use
machine SSA for analysis.

The changes in this patch expose a problem with the second machine
scheduling pass: target independent instructions like COPY implicitly
use EXEC when they operate on VGPRs, but this fact is not encoded in
the MIR. This can lead to miscompilation because instructions are
moved past changes to EXEC.

This patch fixes the problem by adding use-implicit operands to
target independent instructions. Some general codegen passes are
relaxed to work with such implicit use operands.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263982 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-21 20:28:33 +00:00
Michel Danzer
eda4451117 AMDGPU/SI: Clean up indentation in SIInstrInfo::getDefaultRsrcDataFormat
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263626 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-16 09:10:35 +00:00
Nikolay Haustov
63cffd62c1 [AMDGPU] Assembler: change v_madmk operands to have same order as mad.
The constant is now at source operand 1 (previously at 2).
This is also how it is in legacy AMD sp3 assembler.
Update tests.

Differential Revision: http://reviews.llvm.org/D17984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263212 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-11 09:27:25 +00:00
Chad Rosier
cd3a68c781 [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.
http://reviews.llvm.org/D17967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263021 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-09 16:00:35 +00:00
Tom Stellard
6bf8b0e0f7 AMDGPU/SI: Add support for spiling SGPRs to scratch buffer
Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262732 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-04 18:31:18 +00:00
Matt Arsenault
e407a109c1 AMDGPU: Simplify boolean conditional return statements
Patch by Richard Thomson

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262536 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 23:00:21 +00:00
Matt Arsenault
3c8187f7a7 AMDGPU: Cleanup suggested in bug 23960
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262456 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 04:05:14 +00:00
Changpeng Fang
5bbcde0b92 AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and Intrinsics
Summary:
  This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics,
which are new since VI.

Reviewers: tstellarAMD, arsenm

Subscribers: llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D17614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262356 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 17:51:23 +00:00
Tom Stellard
090553bea6 AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer
Summary:
Instead of trying to replace SMRD instructions with a VGPR base pointer
with an equivalent MUBUF instruction, we now copy the base pointer to
SGPRs using v_readfirstlane.

This is safe to do, because any load selected as an SMRD instruction
has been proven to have a uniform base pointer, so each thread in the
wave will have the same pointer value in VGPRs.

This will fix some errors on VI from trying to replace SMRD instructions
with addr64-enabled MUBUF instructions that don't exist.

Reviewers: arsenm, cfang, nhaehnle

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17305

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261385 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-20 00:37:25 +00:00
Tom Stellard
f4f96b9ce5 [AMDGPU] Rename $dst operand to $vdst for VOP instructions.
Summary: This change renames output operand for VOP instructions from dst to vdst. This is needed to enable decoding named operands for disassembler.

Reviewers: vpykhtin, tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits, nhaustov

Projects: #llvm-amdgpu-spb

Differential Revision: http://reviews.llvm.org/D16920

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260986 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-16 18:14:56 +00:00
Tom Stellard
98ef447825 AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
Reviewers: arsenm

Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 23:45:29 +00:00
Matt Arsenault
e3601c75c9 AMDGPU: Set element_size in private resource descriptor
Introduce a subtarget feature for this, and leave the default with
the current behavior which assumes up to 16-byte loads/stores can
be used. The field also seems to have the ability to be set to 2 bytes,
but I'm not sure what that would be used for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260651 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:40:47 +00:00
Tom Stellard
dfa41e52a9 AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRs
Summary:
It's possible to have resource descriptors and samplers stored in
VGPRs, either by a VMEM instruction or in the case of samplers,
floating-point calculations.  When this happens, we need to use
v_readfirstlane to copy these values back to sgprs.

Reviewers: mareko, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260599 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 21:45:07 +00:00
Tom Stellard
d64ea69f34 AMDGPU/SI: When splitting SMRD instructions, add its users to VALU worklist
Summary:
When we split SMRD instructions into two MUBUFs we were adding the users
of the newly created MUBUFs to the VALU worklist.  However, the only
users these instructions had was the REG_SEQUENCE that was inserted
by splitSMRD when the original SMRD instruction was split.

We need to make sure to add the users of the original SMRD to the VALU
worklist before it is split.

I have a test case, but it requires one other bug fix, so it will be
added in a later commt.

Reviewers: mareko, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17101

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260588 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 21:14:34 +00:00
Matt Arsenault
d581c66591 AMDGPU: Fix constant bus use check with subregisters
If the two operands to an instruction were both
subregisters of the same super register, it would incorrectly
think this counted as the same constant bus use.

This fixes the verifier error in fmin_legacy.ll which
was missing -verify-machineinstrs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 06:15:39 +00:00