Commit Graph

35 Commits

Author SHA1 Message Date
Matt Arsenault
beff7fe056 AMDGPU: Fix not expanding control flow after some kill blocks
Also stop trying to insert skip blocks at end_cf. This
was inserting them at the end of the block which doesn't make
sense. The skip should be inserted at the beginning of the block
right after the end cf. Just remove this for now since no tests
seem to stress this and I think this can be handled more generally
later.

Fixes bug 28550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275510 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:15 +00:00
Matt Arsenault
011dcf3d90 AMDGPU: Fix trying to skip from a block with no successors
Found while reducing bug 28550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275509 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:13 +00:00
Matt Arsenault
8a85be7236 AMDGPU: Follow up to r275203
I meant to squash this into it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275220 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-12 21:41:32 +00:00
Matt Arsenault
3f7a1b2f11 AMDGPU: Fix verifier error with kill intrinsic
Don't create a terminator in the middle of the block.
We should probably get rid of this intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-12 19:01:23 +00:00
Matt Arsenault
762cdd4ae8 Revert "AMDGPU: Remove unused control flow intrinsic"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274978 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 17:18:39 +00:00
Matt Arsenault
c39550268e AMDGPU: Improve offset folding for register indexing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274954 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 01:13:56 +00:00
Matt Arsenault
5e2ec03cf4 AMDGPU: Remove unused control flow intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274939 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 21:39:44 +00:00
Matt Arsenault
161b6ac663 AMDGPU: Minor adjustment to r274817
The commit message is inaccurate, modifiesRegister
will check for partial defs of exec.

We currently don't ever emit partial defs of exec,
so it doesn't really matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274886 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 17:06:48 +00:00
Matt Arsenault
b13c425ee9 AMDGPU: Move si_mask_branch register operand to be a use
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 00:55:44 +00:00
Matt Arsenault
d810dc4d0c AMDGPU: Cleanup. Use definesRegister instead of manual loop
Also this will be more precise since it will check
exec_lo/exec_hi writes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274817 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 00:55:39 +00:00
Nicolai Haehnle
a2a1a4f194 AMDGPU: Fix return of non-void-returning shaders
Summary:
Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that
ensures that a non-void-returning shader falls off the end of the last
basic block was effectively disabled, since SI_RETURN is now used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274612 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-06 08:35:17 +00:00
Matt Arsenault
7b7d4b781c AMDGPU: Add m0 vgpr load loop block as successor
This shows up as a verifier error when I move this
earlier, not sure why it didn't before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274275 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 20:49:28 +00:00
Matt Arsenault
1bf162a64a AMDGPU: Fix out of bounds indirect indexing errors
This was producing acceses to registers beyond the super
register's limits, resulting in verifier failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273977 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 01:09:00 +00:00
Matt Arsenault
5123c149e7 AMDGPU: Fix verifier errors with undef vector indices
Also fix pointlessly adding exec to liveins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273916 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 19:57:44 +00:00
Matt Arsenault
759ed7e410 AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:30:11 +00:00
Matt Arsenault
fddf7f599f AMDGPU: Fix liveness when expanding m0 loop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273514 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 23:40:57 +00:00
Matt Arsenault
e22857013f AMDGPU: Fix verifier errors in SILowerControlFlow
The main sin this was committing was using terminator
instructions in the middle of the block, and then
not updating the block successors / predecessors.
Split the blocks up to avoid this and introduce new
pseudo instructions for branches taken with exec masking.

Also use a pseudo instead of emitting s_endpgm and erasing
it in the special case of a non-void return.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273467 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 20:15:28 +00:00
Matt Arsenault
bbac091f56 AMDGPU: Also look for s_cbranch_vccz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270091 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-19 18:20:25 +00:00
Matt Arsenault
c10caa3301 AMDGPU: Fix crash with unreachable terminators.
If a block has no successors because it ends in unreachable,
this was accessing an invalid iterator.

Also stop counting instructions that don't emit any
real instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268119 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 21:52:13 +00:00
Nicolai Haehnle
ea7a0c0467 AMDGPU: Add a shader calling convention
This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.

Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Differential Revision: http://reviews.llvm.org/D18559

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 19:40:20 +00:00
Nicolai Haehnle
f0b7f107b9 AMDGPU: Add SIWholeQuadMode pass
Summary:
Whole quad mode is already enabled for pixel shaders that compute
derivatives, but it must be suspended for instructions that cause a
shader to have side effects (i.e. stores and atomics).

This pass addresses the issue by storing the real (initial) live mask
in a register, masking EXEC before instructions that require exact
execution and (re-)enabling WQM where required.

This pass is run before register coalescing so that we can use
machine SSA for analysis.

The changes in this patch expose a problem with the second machine
scheduling pass: target independent instructions like COPY implicitly
use EXEC when they operate on VGPRs, but this fact is not encoded in
the MIR. This can lead to miscompilation because instructions are
moved past changes to EXEC.

This patch fixes the problem by adding use-implicit operands to
target independent instructions. Some general codegen passes are
relaxed to work with such implicit use operands.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263982 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-21 20:28:33 +00:00
Tom Stellard
1f213b9b37 AMDGPU/SI: Fix threshold calculation for branching when exec is zero
Summary:
When control flow is implemented using the exec mask, the compiler will
insert branch instructions to skip over the masked section when exec is
zero if the section contains more than a certain number of instructions.

The previous code would only count instructions in successor blocks,
and this patch modifies the code to start counting instructions in all
blocks between the start and end of the branch.

Reviewers: nhaehnle, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18282

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263969 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-21 18:56:58 +00:00
Nicolai Haehnle
1dd82f9e2a AMDGPU: add missing braces around multi-line if block
This fixes an issue with rL263658 pointed out by Tom Stellard.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263823 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-18 20:32:04 +00:00
Nicolai Haehnle
d22ce50fea AMDGPU: Prevent uniform loops from becoming infinite
Summary:
Uniform loops where the branch leaving the loop is predicated on VCCNZ
must be skipped if EXEC = 0, otherwise they will be infinite.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263658 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-16 20:14:33 +00:00
Marek Olsak
01d3696081 AMDGPU/SI: Incomplete shader binaries need to finish execution at the end
Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D18058

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263441 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-14 15:57:14 +00:00
Matt Arsenault
c3aa2775c2 AMDGPU: Set flat_scratch from flat_scratch_init reg
This was hardcoded to the static private size, but this
would be missing the offset and additional size for someday
when we have dynamic sizing.

Also stops always initializing flat_scratch even when unused.

In the future we should stop emitting this unless flat instructions
are used to access private memory. For example this will initialize
it almost always on VI because flat is used for global access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260658 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 06:31:30 +00:00
Matt Arsenault
85b3e06674 AMDGPU: Initialize SILowerControlFlow
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260645 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:16:10 +00:00
Matt Arsenault
cf344bf8c1 AMDGPU: Remove trailing whitespace
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260644 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:16:07 +00:00
Matt Arsenault
96d418302e AMDGPU: Fix adding redundant m0 uses
BuildMI already adds these since they are defined correctly now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250961 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-21 22:37:51 +00:00
Matt Arsenault
d2643e2ff9 AMDGPU: Add MachineInstr overloads for instruction format tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250797 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-20 04:35:43 +00:00
Matt Arsenault
3f7c35a966 AMDGPU: Use explicit register size indirect pseudos
This stops using an unknown reg class operand.

Currently build_vector selection has a broken looking check
where it tries to use a VGPR reg class and an SGPR one if it
sees an SGPR use.

With the source operand has an explicit VGPR class,
illegal copies will be inserted that SIFixSGPRCopies will take care
of normally later, which will allow removing the weird check
of build_vector users. Without this, when removed v_movrels_b32 would
still be emitted even though all of the values were only stored in
SGPRs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249494 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 00:42:51 +00:00
Matt Arsenault
7a6a7f2409 AMDGPU: Fix recomputing dominator tree unnecessarily
SIFixSGPRCopies does not modify the CFG, but this was
being recomputed before running SIFoldOperands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248587 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:21:28 +00:00
Matt Arsenault
d3ff1cd1f5 AMDGPU/SI: Remove VCCReg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-08 00:41:48 +00:00
Matt Arsenault
e601cbe15d AMDGPU/SI: Remove EXECReg
For the same reasons as the other physical registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244062 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-05 16:42:57 +00:00
Tom Stellard
953c681473 R600 -> AMDGPU rename
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 03:28:10 +00:00