52 Commits

Author SHA1 Message Date
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Matt Arsenault
36d1b4fe6f AMDGPU: Pass function directly instead of MachineFunction
These functions just query the underlying IR function,
so pass it directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333442 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-29 17:42:50 +00:00
Tom Stellard
f02d6fd47c AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332930 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-22 02:03:23 +00:00
Matt Arsenault
7c20cc0669 AMDGPU: Assign enum name to stack ID
Also assert that it is correct for SGPRs. There is currently a bug
where stack slot coloring replaces SGPR spill FIs with one with
the default ID, which results in a more confusing assert later
about a dead object.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330607 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-23 15:51:26 +00:00
Tim Renouf
22c1a15758 [AMDGPU] For OS type AMDPAL, fixed scratch on compute shader
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).

This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.

V2: Ensure s0 (s8 for gfx9 merged shader) is marked live-in when loading
scratch descriptor from GIT.

Reviewers: kzhuravl, nhaehnle, timcorringham

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D44468

Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329690 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-10 11:25:15 +00:00
Matt Arsenault
13e18b3d4b AMDGPU: Fix build warning in release
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328832 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-29 21:44:44 +00:00
Matt Arsenault
b18554c107 AMDGPU: Support realigning stack
While the stack access instructions don't care about
alignment > 4, some transformations on the pointer calculation
do make assumptions based on knowing the low bits of a pointer
are 0. If a stack object ends up being accessed through its
absolute address (relative to the kernel scratch wave offset),
the addressing expression may depend on the stack frame being
properly aligned. This was breaking in a testcase due to the
add->or combine.

I think some of the SP/FP handling logic is still backwards,
and overly simplistic to support all of the stack features.
Code which tries to modify the SP with inline asm for example
or variable sized objects will probably require redoing this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328831 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-29 21:30:06 +00:00
Tim Renouf
9f475f3a91 Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"
This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981.

It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a
release-with-asserts build. I will resubmit the change when I have fixed
that.

Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328695 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-28 11:21:07 +00:00
Tim Renouf
0daf86291d [AMDGPU] For OS type AMDPAL, fixed scratch on compute shader
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).

This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.

Reviewers: kzhuravl, nhaehnle, timcorringham

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm

Differential Revision: https://reviews.llvm.org/D44468

Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328673 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-27 21:35:00 +00:00
Tim Renouf
f157381aa2 [AMDGPU] Scratch setup fix on AMDPAL gfx9+ merge shader
Summary:
With OS type AMDPAL, the scratch descriptor is hardwired to be loaded
from offset 0 of the global information table, whose low pointer is
passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as
the hardware reserves s0-s7.

Reviewers: kzhuravl

Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl

Differential Revision: https://reviews.llvm.org/D42203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326088 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-26 14:46:43 +00:00
Matthias Braun
d318139827 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:58 +00:00
Konstantin Zhuravlyov
63dcaeaca4 AMDGPU: Fix set but not used warnings related to AMDGPUAS
Differential Revision: https://reviews.llvm.org/D39499


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317114 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 19:12:38 +00:00
Tim Renouf
2532de2a09 [AMDGPU] AMDPAL scratch buffer support
Summary:
Added support for scratch (including spilling) for OS type amdpal:
generates code to set up the scratch descriptor if it is needed.

With amdpal, the scratch resource descriptor is loaded from offset 0 of
the global information table. The low 32 bits of the address of the
global information table is passed in s0.

Added amdgpu-git-ptr-high function attribute to hard-wire the high 32
bits of the address of the global information table. If the function
attribute is not specified, or is 0xffffffff, then the backend generates
code to use the high 32 bits of pc.

The documentation for the AMDPAL ABI will be added in a later commit.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye

Differential Revision: https://reviews.llvm.org/D37483

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314501 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 09:49:35 +00:00
Matt Arsenault
a1a416812a AMDGPU: Don't spill SP reg like a normal CSR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313217 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 23:47:01 +00:00
Matt Arsenault
c60159767d AMDGPU: Pass special input registers to functions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309998 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-03 23:00:29 +00:00
Matt Arsenault
6023e68dae AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 01:52:45 +00:00
Matt Arsenault
43950949ad AMDGPU: Initial implementation of calls
Includes a hack to fix the type selected for
the GlobalAddress of the function, which will be
fixed by changing the default datalayout to use
generic pointers for 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 19:54:18 +00:00
Matt Arsenault
5472b31175 AMDGPU: Annotate necessity of flat-scratch-init
As an approximation of the existing handling to avoid
regressions. Fixes using too many registers with calls
on subtargets with the SGPR allocation bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308326 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-18 16:44:58 +00:00
Matt Arsenault
da7ac1f435 AMDGPU: Figure out private memory regs after lowering
Introduce pseudo-registers for registers needed for stack
access, which are replaced during finalizeLowering.
Note these pseudo-registers are currently only used for the
used register location, and not for determining their
input argument register.

This is better because it avoids the need to try to predict
whether a call will be emitted from the IR, and also
detects stack objects introduced by legalization.

Test changes are from the HasStackObjects check being more
accurate since stack objects introduced during legalization
are now known.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308325 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-18 16:44:56 +00:00
Matt Arsenault
8e828b87b2 AMDGPU: Setup SP/FP in callee function prolog/epilog
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306312 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 17:53:59 +00:00
Matt Arsenault
ec6175c524 AMDGPU: Partially fix implicit.buffer.ptr intrinsic handling
This should not be treated as a different version of
private_segment_buffer. These are distinct things with
different uses and register classes, and requires the
function argument info to have more context about the
function's type and environment.

Also add missing test coverage for the intrinsic, and
emit an error for HSA. This also encovers that the intrinsic
is broken unless there happen to be stack objects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306264 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-26 03:01:31 +00:00
Chandler Carruth
e3e43d9d57 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-06 11:49:48 +00:00
Matt Arsenault
a0540d3468 AMDGPU: Start defining a calling convention
Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303308 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 21:56:25 +00:00
Marek Olsak
24aaeeb480 AMDGPU: GFX9 GS and HS shaders always have the scratch wave offset in SGPR5
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D32645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302200 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 22:25:20 +00:00
Matt Arsenault
3676647251 AMDGPU: Shift down reserved SP register like scratch wave offset
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301367 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-25 23:40:57 +00:00
Matt Arsenault
c6fb348394 AMDGPU: Slightly simplify prolog reserved register handling
Rely on MachineRegisterInfo's knowledge of used physical
registers.

Move flat_scratch initialization earlier, so the uses are visible
when making these decisions.

This will make it easier to add another reserved register
at the end for the stack pointer rather than handling another
special case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301254 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 21:08:32 +00:00
Krzysztof Parzyszek
36d7c2b2e5 Move size and alignment information of regclass to TargetRegisterInfo
1. RegisterClass::getSize() is split into two functions:
   - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
   - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
   - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301221 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 18:55:33 +00:00
Yaxun Liu
ab3be33d40 [AMDGPU] Get address space mapping by target triple environment
As we introduced target triple environment amdgiz and amdgizcl, the address
space values are no longer enums. We have to decide the value by target triple.

The basic idea is to use struct AMDGPUAS to represent address space values.
For address space values which are not depend on target triple, use static
const members, so that they don't occupy extra memory space and is equivalent
to a compile time constant.

Since the struct is lightweight and cheap, it can be created on the fly at
the point of usage. Or it can be added as member to a pass and created at
the beginning of the run* function.

Differential Revision: https://reviews.llvm.org/D31284


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298846 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 14:04:01 +00:00
Konstantin Zhuravlyov
38890eb0c2 [AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI
Differential Revision: https://reviews.llvm.org/D29674


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297499 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 19:39:07 +00:00
Matt Arsenault
206dfa3c0d AMDGPU: Don't add emergency stack slot if all spills are SGPR->VGPR
This should avoid reporting any stack needs to be allocated in the
case where no stack is truly used. An unused stack slot is still
left around in other cases where there are real stack objects
but no spilling occurs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:23:32 +00:00
Matt Arsenault
138d429065 AMDGPU: Always allocate emergency stack slot at offset 0
This allows us to ensure that 0 is never a valid pointer
to a user object, and ensures that the offset is always legal
without needing a register to access it. This comes at the cost
of usable offsets and wasted stack space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295877 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 21:05:25 +00:00
Matt Arsenault
bcb6a77aca AMDGPU: Don't use stack space for SGPR->VGPR spills
Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.

I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.

The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295753 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 19:12:08 +00:00
Matt Arsenault
83c857cd3a AMDGPU: Merge initial gfx9 support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 18:29:53 +00:00
Konstantin Zhuravlyov
c478d3544a [AMDGPU] Move register related queries to subtarget class
Differential Revision: https://reviews.llvm.org/D29318


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294440 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 13:02:33 +00:00
Tom Stellard
d367f44048 AMDGPU add support for spilling to a user sgpr pointed buffers
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].

Patch By: Dave Airlie

Reviewers: nhaehnle, arsenm, tstellarAMD

Reviewed By: arsenm

Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293000 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-25 01:25:13 +00:00
Matt Arsenault
2d7bc6b1e1 AMDGPU: Fix using incorrect private resource with no allocation
It's possible to have a use of the private resource descriptor or
scratch wave offset registers even though there are no allocated
stack objects. This would result in continuing to use the maximum
number reserved registers. This could go over the number of SGPRs
available on VI, or violate the SGPR limit requested by
the function attributes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285435 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 19:43:31 +00:00
Tom Stellard
1961591989 AMDGPU/SI: Add support for triples with the mesa3d operating system
Summary:
mesa3d will use the same kernel calling convention as amdhsa, but it will
handle everything else like the default 'unknown' OS type.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281779 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 21:34:26 +00:00
Konstantin Zhuravlyov
1f99c41083 [AMDGPU] Wave and register controls
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:22:28 +00:00
Matt Arsenault
6b025a011f AMDGPU: Use copy instead of mov during frame lowering
This occurs before RA pseudos are expanded. It's less
code to emit the copy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280297 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 21:52:25 +00:00
Matt Arsenault
8cf15c673c AMDGPU: Refactor frame lowering
This will make future changes easier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280296 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 21:52:21 +00:00
Matt Arsenault
898f0e0994 AMDGPU: Use CreateStackObject instead of CreateSpillStackObject
I'm not sure what the difference is, but no other target
uses this for emergency spill slots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278272 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-10 19:11:36 +00:00
Matthias Braun
f79c57a412 MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277017 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 18:40:00 +00:00
Konstantin Zhuravlyov
20c7a48718 [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.

Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
  - offset 0: work group ID x
  - offset 4: work group ID y
  - offset 8: work group ID z
  - offset 16: work item ID x
  - offset 20: work item ID y
  - offset 24: work item ID z

Set
  - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
  - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
  - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled

Differential Revision: http://reviews.llvm.org/D20335


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-25 03:11:28 +00:00
Matt Arsenault
759ed7e410 AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:30:11 +00:00
Nirav Dave
0e9b526d1b Soften assertion in AMDGPU emitPrologue.
[AMDGPU] emitPrologue looks for an unused unallocated SGPR that is not
the scratch descriptor. Continue search if unused register found fails
other requirements.

Reviewers: arsenm, tstellarAMD, nhaehnle

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D20526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270646 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 01:45:42 +00:00
Matt Arsenault
71a492ec1e AMDGPU: Fix assert on ttmp registers
Use register class that does not include them when looking
for unallocated registers.

This is hit by the udiv v8i64 test in the opencl integer
conformance test, and takes a few seconds to compile in
a debug build so no test included.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269938 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 15:19:50 +00:00
Tom Stellard
ca5193215b AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs
Summary:
When there were no free SGPRs, we were trying to move this value into
some of the reserved registers which was causing a segmentation fault.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262577 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-03 03:45:09 +00:00
Matt Arsenault
c3aa2775c2 AMDGPU: Set flat_scratch from flat_scratch_init reg
This was hardcoded to the static private size, but this
would be missing the offset and additional size for someday
when we have dynamic sizing.

Also stops always initializing flat_scratch even when unused.

In the future we should stop emitting this unless flat instructions
are used to access private memory. For example this will initialize
it almost always on VI because flat is used for global access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260658 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 06:31:30 +00:00
Nicolai Haehnle
cb2ad1e249 AMDGPU/SI: Do not move scratch resource register on Tonga & Iceland
Due to the SGPR init bug, every program claims to use the same number
of SGPRs anyway, so there's no point in trying to shift those registers
down from their initial spot of reservation.

Add a test that uses VGPR spilling and blocks most SGPRs from being used for
the scratch resource register. Previously, this would run into an assertion.

Differential Revision: http://reviews.llvm.org/D15724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256870 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 20:42:49 +00:00
Matt Arsenault
0f1b95f818 AMDGPU: Rework how private buffer passed for HSA
If we know we have stack objects, we reserve the registers
that the private buffer resource and wave offset are passed
and use them directly.

If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the progloue.

This also only selectively enables all of the input registers
which are really required instead of always enabling them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254331 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30 21:16:03 +00:00