Commit Graph

60 Commits

Author SHA1 Message Date
Matt Arsenault
e407a109c1 AMDGPU: Simplify boolean conditional return statements
Patch by Richard Thomson

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262536 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 23:00:21 +00:00
Matt Arsenault
7ff5c71a61 AMDGPU: Don't emit build_pair during udivrem legalization
Technically you aren't supposed to emit these after type legalization
for some reason, and we use vector extracts of bitcasted integers
as the canonical way to do this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262298 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 05:06:05 +00:00
Matt Arsenault
12cb9f057f AMDGPU: Set HasExtractBitInsn
This currently does not have the control over the bitwidth,
and there are missing optimizations to reduce the integer to
32-bit if it can be.

But in most situations we do want the sinking to occur.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262296 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-01 04:58:17 +00:00
Matt Arsenault
a4c1dc826a AMDGPU: Rename intrinsic to better match instruction name
Also fixes missing f32 test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260780 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 01:03:00 +00:00
Matt Arsenault
197cdba264 AMDGPU: Fix mishandling alignment when scalarizing vector loads/stores
I don't think this was causing any real problems, so I'm not sure
how to test for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260646 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:22:21 +00:00
Matt Arsenault
b49a0edca2 AMDGPU: Split R600 and SI store lowering
These were only sharing some somewhat incorrect
logic for when to scalarize or split vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260490 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-11 05:32:46 +00:00
Matt Arsenault
e6640ee461 AMDGPU: Split R600 and SI load lowering
These weren't actually sharing anything in the common
LowerLOAD.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260398 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-10 18:21:39 +00:00
Ahmed Bougacha
3248c624fa [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260316 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 22:54:12 +00:00
Matt Arsenault
60a32b5936 AMDGPU: Remove bfi and bfm intrinsics
Nothing is using them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260123 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-08 19:06:01 +00:00
Matt Arsenault
c0410c6002 AMDGPU: Account for LDS alignment
The current situation isn't great, because the amount of padding
requires is determined by the inverse order of the first encountered
use. We should eventually somehow sort these to minimize wasted space.

Another problem is the alignment of kernel arguments isn't
respected. The group_segment_alignment is always emitted as
the default 16, and typed arguments with higher alignments
or an explicitly set alignment are also ignored.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259912 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-05 19:47:29 +00:00
Oliver Stannard
9ed9eb72f4 Refactor backend diagnostics for unsupported features
Re-commit of r258951 after fixing layering violation.

The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259498 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 13:52:43 +00:00
Matt Arsenault
3d679fa973 AMDGPU: Remove 24-bit intrinsics
The known bit matching code seems to work reasonably well,
so these shouldn't really be needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259180 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-29 10:05:16 +00:00
Matt Arsenault
e26a9f0de4 AMDGPU: Match fmed3 patterns with legacy fmin/fmax
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259090 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 20:53:48 +00:00
Matt Arsenault
6a2bf372b8 AMDGPU: Match some med3 patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259089 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 20:53:42 +00:00
Oliver Stannard
b95072ef89 Revert r259035, it introduces a cyclic library dependency
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259045 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 13:19:47 +00:00
Oliver Stannard
ef19a274ad Add backend dignostic printer for unsupported features
Re-commit of r258951 after fixing layering violation.

The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.

In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.

Differential Revision: http://reviews.llvm.org/D16590



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259035 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 10:07:27 +00:00
NAKAMURA Takumi
c1aeea845d Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features"
It broke layering violation in LLVMIR.

clang r258950 "Add backend dignostic printer for unsupported features"
llvm  r258951 "Refactor backend diagnostics for unsupported features"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259016 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-28 04:41:32 +00:00
Oliver Stannard
bf8415a84d Refactor backend diagnostics for unsupported features
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.

Differential Revision: http://reviews.llvm.org/D16590



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258951 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 17:30:33 +00:00
Matt Arsenault
ae4d40b742 AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now
Also move into backend intrinsics to discourage use of the old name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258783 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:14:16 +00:00
Matt Arsenault
78c5400038 AMDGPU: Remove more unused intrinsics
Replace tests with lrp with basic IR expansion

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258612 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 05:42:38 +00:00
Matt Arsenault
79f77a9c0d AMDGPU: Move amdgcn intrinsic handling into SITargetLowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-23 05:32:20 +00:00
Matt Arsenault
faf8ffaefd AMDGPU: Rename intrinsics to use amdgcn prefix
The intrinsic target prefix should match the target name
as it appears in the triple.

This is not yet complete, but gets most of the important ones.
llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled
for compatability for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258557 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 21:30:34 +00:00
Matt Arsenault
a98abc22cb AMDGPU: Remove AMDGPU.trunc intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 21:05:53 +00:00
Matt Arsenault
8cafd8eeaa AMDGPU: Remove AMDIL.round.nearest intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258346 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 21:05:40 +00:00
Matt Arsenault
f70b08b399 AMDGPU: Remove abs intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258343 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 20:58:29 +00:00
Matt Arsenault
68886ef2dc AMDGPU: Remove min/max intrinsics
This removes support for mesa 11.0.x

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 20:50:19 +00:00
Matt Arsenault
67893e08de AMDGPU: Reduce 64-bit SRAs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258096 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-18 22:09:04 +00:00
Matt Arsenault
a60d27cb35 AMDGPU: Split 64-bit and of constant up
This breaks the tests that were meant for testing
64-bit inline immediates, so move those to shl where
they won't be broken up.

This should be repeated for the other related bit ops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258095 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-18 22:01:13 +00:00
Matt Arsenault
deaace45d6 AMDGPU: Generalize shl combine
Reduce 64-bit shl with constant > 32. We already special cased
this for the == 32 case, but this also works for any >= 32 constant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258092 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-18 21:55:14 +00:00
Matt Arsenault
cc893f0656 AMDGPU: Reduce 64-bit lshr by constant to 32-bit
64-bit shifts are very slow on some subtargets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258090 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-18 21:43:36 +00:00
Manuel Jacob
75e1cfb035 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257999 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-16 20:30:46 +00:00
Marek Olsak
7207597936 AMDGPU/SI: Add support for non-void functions
Summary:
Return values can be stored in SGPRs (i32) and VGPRs (f32).

This will be used by functions which expect some bytecode or other binary to
be appended at the end. It allows defining in which registers the return
values will be stored.

v2: don't do this for compute shaders

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D16033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257621 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-13 17:23:04 +00:00
Matt Arsenault
6e3a667705 AMDGPU: Implement {{s|u}}int_to_fp i64 -> f32
The old lowering for uint_to_fp failed opencl conformance.
It might be OK for fast math mode, but I'm not sure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257393 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 22:01:48 +00:00
Matt Arsenault
68f559ea61 AMDGPU: Fix ctlz combine for sub 32-bit types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257353 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 17:02:06 +00:00
Matt Arsenault
f12a12cd25 AMDGPU: Pattern match ffbh pattern to instruction.
The hardware instruction's output on 0 is -1 rather than 32.
Eliminate a test and select to -1. This removes an extra instruction
from the compatability function with HSAIL's firstbit instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257352 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 17:02:00 +00:00
Matt Arsenault
01a6cb6ce3 AMDGPU: Custom lower i64 ctlz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 16:50:29 +00:00
Matt Arsenault
3bbc287300 LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 16:37:46 +00:00
Matt Arsenault
1451e94ee0 AMDGPU: Use generic bitreverse intrinsic
Also fix bug in vector legalization for bitreverse.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255512 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 17:25:38 +00:00
Matt Arsenault
4f0027139e AMDGPU: Fix splitting vector loads with existing offsets
If the original MMO had an offset, it was dropped.
Also use the correct alignment after adding the new offset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255508 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 16:59:40 +00:00
Artyom Skrobov
824e14ddab Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.

Reviewers: MatzeB, andreadb, tstellarAMD

Subscribers: arsenm, jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D14945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254085 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-25 19:41:11 +00:00
Matt Arsenault
25a68d8d25 AMDGPU: Split LDS vector loads
If properly aligned this could allow using ds_read_b64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253975 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-24 12:18:54 +00:00
Matt Arsenault
04abf1ee5f AMDGPU: Split x8 and x16 vector loads instead of scalarize
The one regression in the builtin tests is in the read2 test which now
(again) has many extra copies, but this should be solved once the pass
is replaced with a DAG combine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253974 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-24 12:05:03 +00:00
Matt Arsenault
07437a8e79 AMDGPU: Split DiagnosticInfoUnsupported into its own file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250959 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-21 22:37:46 +00:00
Artyom Skrobov
a8c4b61591 Don't pretend AMDGPU backend knows how to custom-lower UDIVREM for vector types; it can't
Reviewers: arsenm, jvesely, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13734


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250384 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-15 09:18:47 +00:00
Matt Arsenault
c08fe15c4f DAGCombiner: Combine extract_vector_elt from build_vector
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.

InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-12 23:59:50 +00:00
Sanjay Patel
39490133e4 propagate fast-math-flags on DAG nodes
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, 
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: 
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.

This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I 
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.

This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.

Differential Revision: http://reviews.llvm.org/D12095



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247815 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 16:31:21 +00:00
Matt Arsenault
ca675427d2 AMDGPU: Produce error on dynamic_stackalloc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246048 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 18:37:13 +00:00
Michael Kuperstein
a4fe29414d [TLI] Refactor "is integer division cheap" queries.
This removes the isPow2SDivCheap() query, as it is not currently used in
any meaningful way. isIntDivCheap() no longer relies on a state variable
(as all in-tree target set it to false), but the interface allows querying
based on the type optimization level.

NFC.

Differential Revision: http://reviews.llvm.org/D12082

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245430 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-19 11:17:59 +00:00
James Y Knight
276ea22dc9 Remove redundant TargetFrameLowering::getFrameIndexOffset virtual
function.

This was the same as getFrameIndexReference, but without the FrameReg
output.

Differential Revision: http://reviews.llvm.org/D12042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245148 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-15 02:32:35 +00:00
Simon Pilgrim
bf616c106e [AMDGPU] Use the general SMAX/SMIN/UMAX/UMIN pattern matching and remove the AMDGPU implementation
D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect.

Differential Revision: http://reviews.llvm.org/D12007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244960 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-13 21:40:02 +00:00