11 Commits

Author SHA1 Message Date
Matt Arsenault
138d429065 AMDGPU: Always allocate emergency stack slot at offset 0
This allows us to ensure that 0 is never a valid pointer
to a user object, and ensures that the offset is always legal
without needing a register to access it. This comes at the cost
of usable offsets and wasted stack space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295877 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 21:05:25 +00:00
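
The layout idea in 138d429065 can be illustrated with a small sketch. This is a hypothetical model, not the AMDGPU frame lowering code: `FrameLayout`, `allocate`, and the slot size passed to the constructor are names and values assumed for the example. It only shows why reserving the emergency slot at offset 0 means no user object can ever have offset 0, and why that costs some stack space.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stack layout model: the emergency slot occupies
// [0, EmergencySlotSize), so user objects always start after it.
struct FrameLayout {
  std::uint32_t NextOffset;

  explicit FrameLayout(std::uint32_t EmergencySlotSize)
      : NextOffset(EmergencySlotSize) {
    assert(EmergencySlotSize > 0 && "slot must occupy offset 0");
  }

  // Returns the offset of a new user object. Align must be a power of two.
  std::uint32_t allocate(std::uint32_t Size, std::uint32_t Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0);
    std::uint32_t Offset = (NextOffset + Align - 1) & ~(Align - 1);
    NextOffset = Offset + Size;
    return Offset;
  }
};
```

With a 4-byte slot (an assumed size), the first user object lands at offset 4 or higher, so a zero offset can be treated as "not a user object".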
Matt Arsenault
d019e8638a Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292982 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-24 22:02:15 +00:00
Matt Arsenault
fa5aafaac2 DAG: Avoid OOB when legalizing vector indexing
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291604 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 22:02:30 +00:00
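
The contract described in fa5aafaac2 (an out-of-bounds index may yield any value, but must not touch memory outside the vector's stack copy) can be sketched on plain arrays. This is a scalar analogy under stated assumptions, not the SelectionDAG legalization code: `clampIndex` and `extractElement` are hypothetical helpers, and clamping to the last element is just one way to keep the access in bounds.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: force Idx into the range [0, NumElts - 1].
inline std::size_t clampIndex(std::size_t Idx, std::size_t NumElts) {
  assert(NumElts > 0 && "vector must have at least one element");
  return std::min(Idx, NumElts - 1);
}

// Reads an element from the in-memory copy of a vector. An out-of-bounds
// index returns some element of the vector instead of reading past the end.
template <typename T, std::size_t N>
T extractElement(const std::array<T, N> &StackCopy, std::size_t Idx) {
  return StackCopy[clampIndex(Idx, N)];
}
```

The value returned for a bad index is arbitrary from the caller's point of view, which matches "undefined result" without becoming an out-of-bounds memory access.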
Nicolai Haehnle
e299df1799 AMDGPU: Properly implement SIRegisterInfo::isFrameOffsetLegal and needsFrameBaseReg
Summary:
Without the fix to isFrameOffsetLegal to consider the instruction's
immediate offset, the new test case hits the corresponding assertion in
resolveFrameIndex, because the LocalStackSlotAllocation pass re-uses a
different base register.

With only the fix to isFrameOffsetLegal, code quality reduces in a bunch of
places because frame base registers are added where they're not needed.
This is addressed by properly implementing needsFrameBaseReg, which also
helps to avoid unnecessary zero frame indices in a bunch of other places.

Fixes piglit glsl-1.50/execution/variable-indexing/gs-output-array-vec4-index-wr.shader_test

Reviewers: arsenm, tstellarAMD

Subscribers: qcolombet, kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D27344

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289048 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-08 14:08:02 +00:00
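
The isFrameOffsetLegal fix in e299df1799 comes down to checking the sum of the instruction's existing immediate and the new offset, rather than the new offset alone. The sketch below illustrates that difference. It is not the SIRegisterInfo implementation: the function names are hypothetical, and the 12-bit unsigned offset field is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed width of the unsigned immediate offset field.
constexpr unsigned OffsetBits = 12;

// True if V is representable as an unsigned integer of the given width.
inline bool fitsUnsigned(std::int64_t V, unsigned Bits) {
  assert(Bits > 0 && Bits < 63);
  return V >= 0 && V < (std::int64_t{1} << Bits);
}

// Naive check: ignores the immediate already encoded in the instruction.
inline bool naiveIsFrameOffsetLegal(std::int64_t NewOffset) {
  return fitsUnsigned(NewOffset, OffsetBits);
}

// Corrected check: the combined offset must fit in the field.
inline bool isFrameOffsetLegalSketch(std::int64_t InstrImm,
                                     std::int64_t NewOffset) {
  return fitsUnsigned(InstrImm + NewOffset, OffsetBits);
}
```

For an instruction that already carries an immediate of 16, a new offset of 4090 passes the naive check but fails the combined one, which is the kind of case that would otherwise surface later as an assertion.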
Matt Arsenault
8d2aadbfac AMDGPU: Materialize frame index before add
It isn't generally safe to fold the frame index
directly into the operand since it will possibly
not be an inline immediate after it is expanded.

This surprisingly seems to produce better code, since
the FI doesn't prevent folding other immediate operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288185 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-29 19:20:48 +00:00
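
The hazard described in 8d2aadbfac is that a frame index looks like a small number until it is expanded into a real stack offset. A minimal sketch, assuming the integer inline-immediate range of -16 to 64 and a hypothetical table mapping frame indices to byte offsets (`isInlineIntImmediate` and `expandFrameIndex` are illustrative names, not LLVM functions):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Integer values assumed to be encodable as inline immediates.
inline bool isInlineIntImmediate(std::int64_t V) {
  return V >= -16 && V <= 64;
}

// Hypothetical expansion: look up the byte offset assigned to a frame index.
inline std::int64_t expandFrameIndex(const std::vector<std::int64_t> &Offsets,
                                     std::size_t FrameIndex) {
  assert(FrameIndex < Offsets.size());
  return Offsets[FrameIndex];
}
```

Frame index 1 is itself within the inline range, but if it expands to offset 256 the folded operand is no longer an inline immediate. Materializing the value into a register before the add avoids depending on the final offset.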
Matt Arsenault
f63894ba9e Reapply "AMDGPU: Don't use offen if it is 0"
This reverts r283003

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-26 15:08:16 +00:00
Mehdi Amini
3e821f8cd8 Revert "AMDGPU: Don't use offen if it is 0"
This reverts commit r282999.
Tests are not passing: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/20038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283003 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:35:24 +00:00
Matt Arsenault
494146de48 AMDGPU: Don't use offen if it is 0
This removes many re-initializations of a base register to 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282999 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 01:37:15 +00:00
Matt Arsenault
de82da5521 AMDGPU: Fix broken FrameIndex handling
We were trying to avoid using a FrameIndex operand in non-pointer
operands in a convoluted way, and would break because of
using TargetFrameIndex. The TargetFrameIndex should only be used
in the case where it makes sense to fold it as part of the addressing
mode, otherwise it requires materialization like a normal constant.
This wasn't working reliably and failed in the added testcase, hitting
the assert when processing the frame index.

The TargetFrameIndex was coming from trying to produce an AssertZext
limiting the maximum stack size. I'm not sure this was correct to begin
with, because it is apparently possible to have a single workitem
dispatch that requires all 4G of private memory.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281824 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-17 16:09:55 +00:00
Matt Arsenault
0f7125844e AMDGPU: Support folding FrameIndex operands
This avoids test regressions in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281491 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-14 15:51:33 +00:00
Nicolai Haehnle
a48da984f5 AMDGPU: fix local stack slot allocation bugs
Summary:
The main bug fix here is using the 32-bit encoding of V_ADD_I32 in
materializeFrameBaseRegister and resolveFrameIndex, so that arbitrary
immediates work.

The second part is that we may now require the SegmentWaveByteOffset
even when there are initially no stack objects and VGPR spilling isn't
enabled, for stack slots that are allocated later. This means that some
bits become effectively dead and can be cleaned up.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96602
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21551

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275108 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 21:44:40 +00:00