Commit Graph

209 Commits

Author SHA1 Message Date
Matt Arsenault
e3601c75c9 AMDGPU: Set element_size in private resource descriptor
Introduce a subtarget feature for this, and leave the default with
the current behavior which assumes up to 16-byte loads/stores can
be used. The field also seems to have the ability to be set to 2 bytes,
but I'm not sure what that would be used for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260651 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:40:47 +00:00
Nicolai Haehnle
e22674efca AMDGPU: Quick fix for extreme slowness in spill-scavenge-offset.ll test
Summary: Also, some cosmetic fixes.

Reviewers: arsenm, tstellarAMD

Subscribers: qcolombet, llvm-commits

Differential Revision: http://reviews.llvm.org/D17161

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260625 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 00:05:34 +00:00
Tom Stellard
dfa41e52a9 AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRs
Summary:
It's possible to have resource descriptors and samplers stored in
VGPRs, either by a VMEM instruction or in the case of samplers,
floating-point calculations.  When this happens, we need to use
v_readfirstlane to copy these values back to sgprs.
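
As a rough illustration of why the copy back to SGPRs is only meaningful for uniform values, the sketch below models a wave as a Python list of per-lane values plus an execution mask. The function name and the list-based model are assumptions made for illustration; they are not part of the patch.

```python
def readfirstlane(vgpr, exec_mask):
    """Return the value held by the lowest active lane.

    vgpr      -- list of per-lane values (toy model of a vector register)
    exec_mask -- integer bitmask, bit i set means lane i is active
    Falls back to lane 0 when no lane is active (modelling assumption).
    """
    for lane, value in enumerate(vgpr):
        if (exec_mask >> lane) & 1:
            return value
    return vgpr[0]


# A descriptor word that is the same in every lane survives the copy intact.
uniform = [7, 7, 7, 7]
# A divergent value does not: only one lane's copy is kept.
divergent = [1, 2, 3, 4]
```

With `exec_mask = 0b0100`, the divergent register yields only lane 2's value, which is why this approach suits descriptors and samplers that are expected to be uniform across the wave.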

Reviewers: mareko, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260599 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 21:45:07 +00:00
Matt Arsenault
d581c66591 AMDGPU: Fix constant bus use check with subregisters
If the two operands to an instruction were both
subregisters of the same super register, it would incorrectly
think this counted as the same constant bus use.

This fixes the verifier error in fmin_legacy.ll which
was missing -verify-machineinstrs.
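
The bug described above can be sketched as a counting problem. The snippet below is a toy model, not the verifier's actual code: the `Operand` type, the register naming, and both functions are hypothetical. It only shows how comparing super registers alone makes two different subregisters look like a single constant bus use.

```python
from typing import NamedTuple


class Operand(NamedTuple):
    reg: str     # super register name (hypothetical naming, e.g. "s[0:1]")
    subreg: str  # subregister index, e.g. "sub0"; "" means the whole register


def count_bus_uses_buggy(operands):
    # Compares only the super register, so sub0 and sub1 collapse into one use.
    return len({op.reg for op in operands})


def count_bus_uses_fixed(operands):
    # Treats different subregisters of the same super register as distinct.
    return len({(op.reg, op.subreg) for op in operands})


ops = [Operand("s[0:1]", "sub0"), Operand("s[0:1]", "sub1")]
```

Here the buggy count is 1 while the fixed count is 2, so a check that allows only one constant bus read per instruction (an assumption about the limit) would wrongly accept the instruction in the first case.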

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 06:15:39 +00:00
Matt Arsenault
fae18e933b AMDGPU: Remove some old intrinsic uses from tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260493 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 06:02:01 +00:00
Nicolai Haehnle
ac2300f5ed AMDGPU: Release the scavenged offset register during VGPR spill
Summary:
This fixes a crash where subsequent spills would be unable to scavenge
a register. In particular, it fixes a crash in piglit's
spec@glsl-1.50@execution@geometry@max-input-components (the test still
has a shader that fails to compile because of too many SGPR spills, but
at least it doesn't crash any more).

This is a candidate for the release branch.

Reviewers: arsenm, tstellarAMD

Subscribers: qcolombet, arsenm

Differential Revision: http://reviews.llvm.org/D16558

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260427 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-10 20:13:58 +00:00
Matt Arsenault
60a32b5936 AMDGPU: Remove bfi and bfm intrinsics
Nothing is using them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260123 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-08 19:06:01 +00:00
Matt Arsenault
98d69cc318 SelectionDAG: Lower some range metadata to AssertZext
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.

This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.
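
A minimal sketch of the idea, assuming the asserted width is the number of bits needed to hold the largest value in a half-open range `[lo, hi)`. The helper names and the exact rounding rule are assumptions for illustration and may differ from what SelectionDAG does.

```python
def assert_zext_bits(lo, hi_exclusive):
    """Bit width that could be asserted for range [lo, hi_exclusive).

    Returns None when the lower bound is not 0 or the range is empty,
    since the commit only describes the lower-bound-of-0 case.
    """
    if lo != 0 or hi_exclusive <= lo:
        return None
    max_value = hi_exclusive - 1
    return max(max_value.bit_length(), 1)


def fits_mul24(bits_a, bits_b):
    # Both operands must be known to fit in 24 bits for a 24-bit multiply.
    return (bits_a is not None and bits_b is not None
            and bits_a <= 24 and bits_b <= 24)
```

For example, a workitem ID annotated with the range `[0, 1024)` needs 10 bits, so a product of two such values qualifies for the 24-bit multiply under this model.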

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260109 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-08 16:28:19 +00:00
Matt Arsenault
c0410c6002 AMDGPU: Account for LDS alignment
The current situation isn't great, because the amount of padding
required is determined by the inverse order of the first encountered
use. We should eventually somehow sort these to minimize wasted space.

Another problem is the alignment of kernel arguments isn't
respected. The group_segment_alignment is always emitted as
the default 16, and typed arguments with higher alignments
or an explicitly set alignment are also ignored.
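
To illustrate why allocation order affects wasted space, here is a small sketch that lays out `(size, alignment)` pairs sequentially. The layout function and the sample sizes are hypothetical; sorting by descending alignment is shown only as one possible ordering, not as what the backend does.

```python
def align_up(offset, alignment):
    # Round offset up to the next multiple of alignment.
    return (offset + alignment - 1) // alignment * alignment


def lds_size(globals_in_order):
    """Total bytes used when (size, alignment) pairs are placed in order."""
    offset = 0
    for size, alignment in globals_in_order:
        offset = align_up(offset, alignment) + size
    return offset


# Hypothetical LDS globals: two 4-byte values and one 16-byte aligned vector.
encountered = [(4, 4), (16, 16), (4, 4)]
by_alignment = sorted(encountered, key=lambda g: g[1], reverse=True)
```

In this example the encountered order uses 36 bytes because 12 bytes of padding precede the 16-byte aligned global, while the sorted order uses 24 bytes.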

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259912 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-05 19:47:29 +00:00
Matt Arsenault
d1d0a1a39d AMDGPU: Preserve alignments on newly created globals
Also switch to internal linkage, and include the name of the function in
the name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259911 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-05 19:47:23 +00:00
Jonas Paulsson
8e9339f574 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependencies rewritten.
Recommitted, after some fixing with test cases.

Updated test cases:
test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
test/CodeGen/AArch64/tailcall_misched_graph.ll

Temporarily disabled test cases:
test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll
test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated)
test/CodeGen/PowerPC/vsx-fma-m.ll
test/CodeGen/PowerPC/vsx-fma-sp.ll

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259673 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-03 17:52:29 +00:00
Matt Arsenault
f1f2dd4ca2 AMDGPU: Do not promote allocas with non-inbounds GEPs
If we can't assume the pointer value is within the bounds
of the object, it seems risky to try to replace the pointer
calculations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259573 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 21:16:12 +00:00
Matt Arsenault
551787639e AMDGPU: Handle promoting memmove
Also add missing tests for the others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259558 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 20:28:10 +00:00
Matt Arsenault
ec856e4504 AMDGPU: Skip promote alloca with no optimizations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259551 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:32:42 +00:00
Matt Arsenault
53b80ebb13 AMDGPU: Whitelist handled intrinsics
We shouldn't crash on unhandled intrinsics.
Also simplify failure handling in loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259546 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:18:53 +00:00
Matt Arsenault
374613d697 AMDGPU: Use inbounds when calculating workitem offset
When promoting allocas to LDS, we know we are indexing
into a specific area just created, and the calculation
will also never overflow.

Also emit some of the muls as nsw nuw, because instcombine
infers this already from the range metadata. I think
putting this on the other adds and muls might be OK too,
but I'm not 100% sure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259545 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:18:48 +00:00
Oliver Stannard
9ed9eb72f4 Refactor backend diagnostics for unsupported features
Re-commit of r258951 after fixing layering violation.

The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259498 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 13:52:43 +00:00
Matt Arsenault
084a86c382 AMDGPU: Fix emitting invalid workitem intrinsics for HSA
The AMDGPUPromoteAlloca pass was emitting the read.local.size
calls, which with HSA were incorrectly selected to read from
the offset mesa uses off of the kernarg pointer.

Error on intrinsics which aren't supported by HSA, and start
emitting the correct IR to read the workgroup size
out of the dispatch pointer.

Also initialize the pass so it can be tested with opt, and
start moving towards not depending on the subtarget as an
argument.

Start emitting errors for the intrinsics not handled with HSA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259297 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-30 05:19:45 +00:00
Matt Arsenault
9a30bf46a7 AMDGPU: Stop checking intrinsics not used by HSA for dispatch-ptr
Only the dispatch.ptr intrinsic is supposed to be used now to get
the workgroup size, and the read.local.size intrinsics do not
work correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259296 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-30 05:10:59 +00:00
Matt Arsenault
27e98d1c21 AMDGPU: Add new amdgcn workitem intrinsics
These use the correct prefix and follow the HSA naming convention
rather than the config register option names.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259293 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-30 04:25:19 +00:00
Matt Arsenault
3d679fa973 AMDGPU: Remove 24-bit intrinsics
The known bit matching code seems to work reasonably well,
so these shouldn't really be needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259180 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-29 10:05:16 +00:00
Matt Arsenault
e26a9f0de4 AMDGPU: Match fmed3 patterns with legacy fmin/fmax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259090 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 20:53:48 +00:00
Matt Arsenault
6a2bf372b8 AMDGPU: Match some med3 patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259089 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 20:53:42 +00:00
Matt Arsenault
ed6685cf17 AMDGPU: Set DX10Clamp bit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259088 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 20:53:35 +00:00
Oliver Stannard
b95072ef89 Revert r259035, it introduces a cyclic library dependency
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259045 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 13:19:47 +00:00
Oliver Stannard
ef19a274ad Add backend diagnostic printer for unsupported features
Re-commit of r258951 after fixing layering violation.

The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.

In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.

Differential Revision: http://reviews.llvm.org/D16590



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259035 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 10:07:27 +00:00
NAKAMURA Takumi
c1aeea845d Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features"
It introduced a layering violation in LLVMIR.

clang r258950 "Add backend diagnostic printer for unsupported features"
llvm  r258951 "Refactor backend diagnostics for unsupported features"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259016 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 04:41:32 +00:00
Oliver Stannard
bf8415a84d Refactor backend diagnostics for unsupported features
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.

Differential Revision: http://reviews.llvm.org/D16590



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258951 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 17:30:33 +00:00
Marek Olsak
73be6ab813 AMDGPU/SI: Stoney has only 16 LDS banks
Summary:
This is a candidate for stable, along with all patches that add the "stoney"
processor.

Reviewers: tstellarAMD

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D16485

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258922 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 11:19:45 +00:00
Matt Arsenault
de2c3bc98d AMDGPU: Fix default device handling
When no device name is specified, default to kaveri
for HSA since SI is not supported and it would fail.

Default to "tahiti" instead of "SI" since these are
effectively the same, and tahiti is an actual device.

Move default device handling to the TargetMachine
rather than the AMDGPUSubtarget. The module ISA version
is computed from the device name provided with the target
machine, so the attributes printed by the AsmPrinter were
inconsistent with those computed in the subtarget.

Also remove DevName field from subtarget since it's redundant
with getCPU() in the superclass.
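
The defaulting rule described above can be sketched as a single helper. This is an illustrative model only: the function name and the boolean `is_hsa` parameter are hypothetical, and it covers just the two defaults named in this message.

```python
def gpu_or_default(gpu_name, is_hsa):
    """Pick the device name to use when none was given explicitly.

    gpu_name -- device name from the command line, "" if unspecified
    is_hsa   -- True when targeting HSA (hypothetical flag for this sketch)
    """
    if gpu_name:
        return gpu_name
    # SI is not supported with HSA, so fall back to a device that is.
    return "kaveri" if is_hsa else "tahiti"
```

Resolving the name once, in one place, is what keeps the ISA version printed by the AsmPrinter consistent with the one the subtarget computes.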

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258901 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 02:17:49 +00:00
Matt Arsenault
ce00361269 AMDGPU: Make v32i8/v64i8 illegal types
Old intrinsics were forcing these, but they have now all
been removed. This fixes large i8 vector operations generally
being broken.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258788 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:43:48 +00:00
Matt Arsenault
337264a351 AMDGPU: Remove old sample intrinsics
I did my best to try to update all the uses in tests that
just happened to use the old ones to the newer intrinsics.

I'm not sure I got all of the immediate operand conversions
correct, since the value seems to have been ignored by the
old pattern but I don't think it really matters.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258787 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:38:08 +00:00
Matt Arsenault
2aa06ab7ea AMDGPU: Add new amdgcn intrinsics for cube instructions
More cleanup to try to get all intrinsics using the correct
amdgcn prefix that are as close to the instruction as possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258786 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:29:56 +00:00
Matt Arsenault
c024d32472 AMDGPU: Implement read_register and write_register intrinsics
Some of the special intrinsics that now correspond to an instruction
also have special setting of some registers, e.g. llvm.SI.sendmsg sets
m0 as well as using s_sendmsg. Using these explicit register intrinsics
may be a better option.

Reading the exec mask and others may be useful for debugging. For this
I'm not sure this is entirely correct because we would want this to
be convergent, although it's possible this is already treated
sufficiently conservatively.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258785 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:29:24 +00:00
Matt Arsenault
ae4d40b742 AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now
Also move into backend intrinsics to discourage use of the old name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258783 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:14:16 +00:00
Dan Gohman
f4e788949d [MC] Use .p2align instead of .align
For historic reasons, the behavior of .align differs between targets.
Fortunately, there are alternatives, .p2align and .balign, which make the
interpretation of the parameter explicit, and which behave consistently across
targets.

This patch teaches MC to use .p2align instead of .align, so that people reading
code for multiple architectures don't have to remember which way each platform
does its .align directive.
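
The difference between the two unambiguous directives can be shown with a short sketch: `.p2align` takes a power-of-two exponent, while `.balign` takes a byte count. The helper names below are made up for illustration.

```python
def p2align_bytes(n):
    # ".p2align n" requests alignment to 2**n bytes.
    return 1 << n


def balign_bytes(n):
    # ".balign n" requests alignment to n bytes.
    return n


def to_p2align(byte_alignment):
    """Convert a power-of-two byte alignment to a .p2align operand."""
    if byte_alignment <= 0 or byte_alignment & (byte_alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return byte_alignment.bit_length() - 1
```

So `.p2align 4` and `.balign 16` both request 16-byte alignment, whereas a bare `.align 4` could mean either 4 or 16 bytes depending on the target.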

Differential Revision: http://reviews.llvm.org/D16549


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258750 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 00:03:25 +00:00
Matt Arsenault
ba78f314e9 AMDGPU: Replace some deprecated intrinsic uses in tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258614 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 05:42:49 +00:00
Matt Arsenault
ed7be7aac6 AMDGPU: Run instnamer on a few tests
This will make future test updates easier

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258613 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 05:42:43 +00:00
Matt Arsenault
78c5400038 AMDGPU: Remove more unused intrinsics
Replace tests with lrp with basic IR expansion

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258612 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 05:42:38 +00:00
Matt Arsenault
c5d9da7bab AMDGPU: Add new name for barrier intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258558 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 21:30:43 +00:00
Matt Arsenault
faf8ffaefd AMDGPU: Rename intrinsics to use amdgcn prefix
The intrinsic target prefix should match the target name
as it appears in the triple.

This is not yet complete, but gets most of the important ones.
llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled
for compatibility for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258557 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 21:30:34 +00:00
Matt Arsenault
584bbb20e9 AMDGPU: Fix crash with invariant markers
The promote alloca pass didn't handle these intrinsics and crashed.
These intrinsics should accept any address space, but for now just
erase them to avoid breaking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258537 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 19:47:54 +00:00
Matt Arsenault
a75de8d6ed AMDGPU: Rename some r600 intrinsics to use correct TargetPrefix
These aren't emitted directly by mesa; they are inserted by a pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258523 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 19:00:09 +00:00
Matt Arsenault
eb3f71fe95 AMDGPU: Remove AMDGPU.fract intrinsic
Mesa doesn't use this, and this is pattern matched already
from fsub x, (ffloor x)
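
The matched pattern corresponds to the usual definition of the fractional part. A minimal sketch in plain Python floats, not the backend's pattern-matching code:

```python
import math


def fract(x):
    # Fractional part expressed as x - floor(x), mirroring fsub x, (ffloor x).
    return x - math.floor(x)
```

Because floor rounds toward negative infinity, the result is non-negative for negative inputs as well; for example, `fract(-0.25)` gives `0.75`.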

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258513 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 18:42:38 +00:00
Tom Stellard
221c8b9773 AMDGPU/SI: Promote i1 SETCC operations
Summary:
While working on uniform branching, I've hit a few cases where we emit
i1 SETCC operations.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16233

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258352 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 21:48:24 +00:00
Matt Arsenault
a98abc22cb AMDGPU: Remove AMDGPU.trunc intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 21:05:53 +00:00
Matt Arsenault
1cbcadaf34 AMDGPU: Remove AMDIL.fraction intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258347 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 21:05:49 +00:00
Matt Arsenault
8cafd8eeaa AMDGPU: Remove AMDIL.round.nearest intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258346 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 21:05:40 +00:00
Matt Arsenault
f70b08b399 AMDGPU: Remove abs intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258343 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 20:58:29 +00:00
Matt Arsenault
68886ef2dc AMDGPU: Remove min/max intrinsics
This removes support for mesa 11.0.x

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 20:50:19 +00:00