65 Commits

Author SHA1 Message Date
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Mark Searles
8f93b43810 [AMDGPU] prevent hitting Assertion `isReg() && "Wrong MachineOperand accessor"'
The use iterator, used within findMaskOperands(), can return anything which is
not a def. isUse() requires a register, so check isReg() before calling isUse().

Differential Revision: https://reviews.llvm.org/D48047

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334459 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-12 00:41:26 +00:00
Tim Renouf
dde1312300 [AMDGPU] Fixed incorrect break from loop
Summary:
Lower control flow did not correctly handle the case that a loop break
in if/else was on a condition that was not guaranteed to be masked by
exec. The first test kernel shows an example of this going wrong; after
exiting the loop, exec is all ones, even if it was not before the loop.

The fix is for lowering of if-break and else-break to insert an
S_AND_B64 to mask the break condition with exec. This commit also
includes the optimization of not inserting that S_AND_B64 if it is
obviously not needed because the break condition is the result of a
V_CMP in the same basic block.

V2: Addressed some review comments.
V3: Test fixes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44046

Change-Id: I0fc56a01209a9e99d1d5c9b0ffd16f111caf200c

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333258 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-25 07:55:04 +00:00
Tom Stellard
f02d6fd47c AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332930 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-22 02:03:23 +00:00
Adrian Prantl
26b584c691 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-01 15:54:18 +00:00
Matthias Braun
fa621d294f Rename LiveIntervalAnalysis.h to LiveIntervals.h
Headers/Implementation files should be named after the class they
declare/define.

Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in
favor of `class LiveIntarvals;`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320546 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 02:51:04 +00:00
Francis Visoiu Mistrih
a4ec08b6fd [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319187 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 17:15:09 +00:00
David Blaikie
e3a9b4ce3a Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318490 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-17 01:07:10 +00:00
Marek Olsak
4fda278e9b AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316427 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 10:27:13 +00:00
Stanislav Mekhanoshin
6069fa8531 [AMDGPU] Preserve inverted bit in SI_IF in presence of SI_KILL
In case if SI_KILL is in between of the SI_IF and SI_END_CF we need
to preserve the bits actually flipped by if rather then restoring
the original mask.

Differential Revision: https://reviews.llvm.org/D36299

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310031 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-04 06:58:42 +00:00
Stanislav Mekhanoshin
ddb10d2e51 [AMDGPU] Optimize SI_IF lowering for simple if regions
Currently SI_IF results in a s_and_saveexec_b64 followed by s_xor_b64.
The xor is used to extract only the changed bits. In case of a simple
if region where the only use of that value is in the SI_END_CF to
restore the old exec mask, we can omit the xor and perform an or of
the exec mask with the original exec value saved by the
s_and_saveexec_b64.

Differential Revision: https://reviews.llvm.org/D35861

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309185 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-26 21:29:15 +00:00
Chandler Carruth
e3e43d9d57 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-06 11:49:48 +00:00
Eugene Zelenko
68c521d030 [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292623 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 17:52:16 +00:00
Stanislav Mekhanoshin
b8fa7c40ea [AMDGPU] Add exec copy to LiveIntervals in SILowerControlFlow::emitElse
This instruction is missing from LiveIntervals.
I'm not aware of any problems because of this though.

Differential Revision: https://reviews.llvm.org/D28879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292521 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-19 21:26:22 +00:00
Diana Picus
8a47810cd6 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 09:58:52 +00:00
Stanislav Mekhanoshin
ab827bdc35 [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.

With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask_b32
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a propagation of source SGPR pair in place of v_cmp
is implemented. Additional side effect of this is that we may consume less VGPRs
at a cost of more SGPRs in case if holding of multiple conditions is needed, and
that is a clear win in most cases.

Differential Revision: https://reviews.llvm.org/D26114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288053 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-28 18:58:49 +00:00
Stanislav Mekhanoshin
64620b1c31 [AMDGPU] Fix multiple vreg definitions in si-lower-control-flow
Differential Revision: https://reviews.llvm.org/D26939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 01:42:34 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Matt Arsenault
0461ece2ce AMDGPU: Partially fix control flow at -O0
Fixes to allow spilling all registers at the end of the block
work with exec modifications. Don't emit s_and_saveexec_b64 for
if lowering, and instead emit copies. Mark control flow mask
instructions as terminators to get correct spill code placement
with fast regalloc, and then have a separate optimization pass
form the saveexec.

This should work if SGPRs are spilled to VGPRs, but
will likely fail in the case that an SGPR spills to memory
and no workitem takes a divergent branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282667 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-29 01:44:16 +00:00
Matt Arsenault
36a8c3e60f AMDGPU: Remove register operand from si_mask_branch
It isn't used for anything, and is also misleading since
it could be spilled at the end of the block, so it can't be relied
on. There ends up being a verifier error about using an undefined
register since the spill kills the register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279899 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-27 00:42:21 +00:00
Matt Arsenault
7517ed227a AMDGPU: Split SILowerControlFlow into two pieces
Do most of the lowering in a pre-RA pass. Keep the skip jump
insertion late, plus a few other things that require more
work to move out.

One concern I have is now there may be COPY instructions
which do not have the necessary implicit exec uses
if they will be lowered to v_mov_b32.

This has a positive effect on SGPR usage in shader-db.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279464 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-22 19:33:16 +00:00
Matt Arsenault
ece2d8b253 AMDGPU: Remove unused tracking of flat instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278361 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 17:15:28 +00:00
Matt Arsenault
34c6b123f7 AMDGPU: Change insertion point of si_mask_branch
Insert before the skip branch if one is created.
This is a somewhat more natural placement relative
to the skip branches, and makes it possible to implement
analyzeBranch for skip blocks.

The test changes are mostly due to a quirk where
the block label is not emitted if there is a terminator
that is not also a branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278273 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-10 19:11:42 +00:00
Nicolai Haehnle
b18ca96c79 AMDGPU: add execfix flag to SI_ELSE
Summary:
SI_ELSE is lowered into two parts:

s_or_saveexec_b64 dst, src (at the start of the basic block)

s_xor_b64 exec, exec, dst (at the end of the basic block)

The idea is that dst contains the exec mask of the preceding IF block. It can
happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside
the basic block that contains SI_ELSE, in which case it introduces an instruction

s_and_b64 exec, exec, s[...]

which masks out bits that can correspond to both the IF and the ELSE paths.
So the resulting sequence must be:

s_or_savexec_b64 dst, src

s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode
s_and_b64 dst, dst, exec <-- added by SILowerControlFlow

s_xor_b64 exec, exec, dst

Whether to add the additional s_and_b64 dst, dst, exec is currently determined
via the ExecModified tracking. With this change, it is instead determined by
an additional flag on SI_ELSE which is set by SIWholeQuadMode.

Finally: It also occured to me that an alternative approach for the long run
is for SILowerControlFlow to unconditionally emit

s_or_saveexec_b64 dst, src

...

s_and_b64 dst, dst, exec
s_xor_b64 exec, exec, dst

and have a pass that detects and cleans up the "redundant AND with exec"
pattern where possible. This could be useful anyway, because we also add
instructions

s_and_b64 vcc, exec, vcc

before s_cbranch_scc (in moveToALU), and those are often redundant. I have
some pending changes to how KILL is lowered that could also benefit from
such a cleanup pass.

In any case, this current patch could help in the short term with the whole
ExecModified business.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276972 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 11:39:24 +00:00
Reid Kleckner
12e910f70a Remove MCAsmInfo.h include from TargetOptions.h
TargetOptions wants the ExceptionHandling enum. Move that to
MCTargetOptions.h to avoid transitively including Dwarf.h everywhere in
clang. Now you can add a DWARF tag without a full rebuild of clang
semantic analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276883 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-27 16:03:57 +00:00
Matt Arsenault
d506595769 AMDGPU: Make AMDGPUMachineFunction fields private
ABIArgOffset is a problem because properly fsetting the
KernArgSize requires that the reserved area before the
real kernel arguments be correctly aligned, which requires
fixing clover.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276766 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 16:45:58 +00:00
Matt Arsenault
21e0aa8d55 AMDGPU: Make skip threshold an option
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276680 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-25 19:48:29 +00:00
Davide Italiano
f36cce1574 [AMDGPU] Remove spurious line (should've been removed in r276029).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276030 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 21:16:30 +00:00
Davide Italiano
5012465830 [AMDGPU] Remove dead code.
LGTM'd by Matt Arsenault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276029 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 21:10:49 +00:00
Matt Arsenault
4cead0b564 AMDGPU: Expand register indexing pseudos in custom inserter
This is to help moveSILowerControlFlow to before regalloc.
There are a couple of tradeoffs with this. The complete CFG
is visible to more passes, the loop body avoids an extra copy of m0,
vcc isn't required, and immediate offsets can be shrunk into s_movk_i32.

The disadvantage is the register allocator doesn't understand that
the single lane's vector is dead within the loop body, so an extra
register is used to outlive the loop block when expanding the
VGPR -> m0 loop. This also now results in worse waitcnt insertion
before the loop instead of after for pending operations at the point
of the indexing, but that should be fixed by future improvements to
cross block waitcnt insertion.

v_movreld_b32's operands are now modeled more correctly since vdst
is not a true output. This is kind of a hack to treat vdst as a
use operand. Extra checking is required in the verifier since
I can't seem to get tablegen to emit an implicit operand for a
virtual register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275934 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 00:35:03 +00:00
Matt Arsenault
beff7fe056 AMDGPU: Fix not expanding control flow after some kill blocks
Also stop trying to insert skip blocks at end_cf. This
was inserting them at the end of the block which doesn't make
sense. The skip should be inserted at the beginning of the block
right after the end cf. Just remove this for now since no tests
seem to stress this and I think this can be handled more generally
later.

Fixes bug 28550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275510 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:15 +00:00
Matt Arsenault
011dcf3d90 AMDGPU: Fix trying to skip from a block with no successors
Found while reducing bug 28550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275509 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:13 +00:00
Matt Arsenault
8a85be7236 AMDGPU: Follow up to r275203
I meant to squash this into it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275220 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-12 21:41:32 +00:00
Matt Arsenault
3f7a1b2f11 AMDGPU: Fix verifier error with kill intrinsic
Don't create a terminator in the middle of the block.
We should probably get rid of this intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-12 19:01:23 +00:00
Matt Arsenault
762cdd4ae8 Revert "AMDGPU: Remove unused control flow intrinsic"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274978 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 17:18:39 +00:00
Matt Arsenault
c39550268e AMDGPU: Improve offset folding for register indexing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274954 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 01:13:56 +00:00
Matt Arsenault
5e2ec03cf4 AMDGPU: Remove unused control flow intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274939 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 21:39:44 +00:00
Matt Arsenault
161b6ac663 AMDGPU: Minor adjustment to r274817
The commit message is inaccurate, modifiesRegister
will check for partial defs of exec.

We currently don't ever emit partial defs of exec,
so it doesn't really matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274886 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 17:06:48 +00:00
Matt Arsenault
b13c425ee9 AMDGPU: Move si_mask_branch register operand to be a use
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 00:55:44 +00:00
Matt Arsenault
d810dc4d0c AMDGPU: Cleanup. Use definesRegister instead of manual loop
Also this will be more precise since it will check
exec_lo/exec_hi writes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274817 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-08 00:55:39 +00:00
Nicolai Haehnle
a2a1a4f194 AMDGPU: Fix return of non-void-returning shaders
Summary:
Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that
ensures that a non-void-returning shader falls off the end of the last
basic block was effectively disabled, since SI_RETURN is now used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274612 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-06 08:35:17 +00:00
Matt Arsenault
7b7d4b781c AMDGPU: Add m0 vgpr load loop block as successor
This shows up as a verifier error when I move this
earlier, not sure why it didn't before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274275 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 20:49:28 +00:00
Matt Arsenault
1bf162a64a AMDGPU: Fix out of bounds indirect indexing errors
This was producing acceses to registers beyond the super
register's limits, resulting in verifier failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273977 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 01:09:00 +00:00
Matt Arsenault
5123c149e7 AMDGPU: Fix verifier errors with undef vector indices
Also fix pointlessly adding exec to liveins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273916 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 19:57:44 +00:00
Matt Arsenault
759ed7e410 AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:30:11 +00:00
Matt Arsenault
fddf7f599f AMDGPU: Fix liveness when expanding m0 loop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273514 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 23:40:57 +00:00
Matt Arsenault
e22857013f AMDGPU: Fix verifier errors in SILowerControlFlow
The main sin this was committing was using terminator
instructions in the middle of the block, and then
not updating the block successors / predecessors.
Split the blocks up to avoid this and introduce new
pseudo instructions for branches taken with exec masking.

Also use a pseudo instead of emitting s_endpgm and erasing
it in the special case of a non-void return.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273467 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 20:15:28 +00:00
Matt Arsenault
bbac091f56 AMDGPU: Also look for s_cbranch_vccz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270091 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-19 18:20:25 +00:00
Matt Arsenault
c10caa3301 AMDGPU: Fix crash with unreachable terminators.
If a block has no successors because it ends in unreachable,
this was accessing an invalid iterator.

Also stop counting instructions that don't emit any
real instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268119 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 21:52:13 +00:00
Nicolai Haehnle
ea7a0c0467 AMDGPU: Add a shader calling convention
This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.

Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Differential Revision: http://reviews.llvm.org/D18559

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 19:40:20 +00:00