90 Commits

Author SHA1 Message Date
Dmitry Preobrazhensky
cb5431a931 [AMDGPU][MC] Fix for Bug 28207 + LIT tests
Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type

Reviewers: vpykhtin, arsenm

Differential Revision: https://reviews.llvm.org/D31327

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 15:57:17 +00:00
Yaxun Liu
ab3be33d40 [AMDGPU] Get address space mapping by target triple environment
As we introduced target triple environment amdgiz and amdgizcl, the address
space values are no longer enums. We have to decide the value by target triple.

The basic idea is to use struct AMDGPUAS to represent address space values.
For address space values which do not depend on the target triple, use static
const members, so that they don't occupy extra memory space and are equivalent
to compile-time constants.

Since the struct is lightweight and cheap, it can be created on the fly at
the point of usage. Or it can be added as member to a pass and created at
the beginning of the run* function.

Differential Revision: https://reviews.llvm.org/D31284


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298846 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 14:04:01 +00:00
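The layout described in ab3be33d40 can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the names `AddressSpaces` and `getAddressSpaces` are hypothetical, and the numeric values are assumptions chosen only to show the split between triple-dependent members and static constants.

```cpp
#include <string>

// Hypothetical stand-in for the struct described in the commit message.
// All numeric values are illustrative assumptions.
struct AddressSpaces {
  // Values that depend on the target triple environment: ordinary members,
  // filled in when the struct is created.
  unsigned Private;
  unsigned Flat;

  // Values that are the same for every triple: static const members, so
  // they take no per-object storage and behave like compile-time constants.
  static const unsigned Global = 1;
  static const unsigned Local = 3;
};

// Hypothetical helper: choose the triple-dependent values from the
// environment component of the target triple.
inline AddressSpaces getAddressSpaces(const std::string &Environment) {
  AddressSpaces AS{};
  if (Environment == "amdgiz" || Environment == "amdgizcl") {
    AS.Private = 5;
    AS.Flat = 0;
  } else {
    AS.Private = 0;
    AS.Flat = 4;
  }
  return AS;
}
```

Because the struct holds only a couple of integers, creating it at the point of use, as the message suggests, costs little.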
Matt Arsenault
27f4f2f4bc AMDGPU: Support v2i16/v2f16 packed operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 22:15:25 +00:00
Matt Arsenault
3b595d2304 AMDGPU: Generalize matching of v_med3_f32
I think this is safe as long as all inputs are known to never
be NaNs.

Also add an intrinsic for fmed3 to be able to handle all safe
math cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293598 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 03:07:46 +00:00
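For readers unfamiliar with the operation in 3b595d2304, a median of three can be written with min/max alone, and a clamp of the form min(max(x, lo), hi) gives the same result when lo <= hi. The sketch below assumes no input is NaN, as the commit message does; the function names are hypothetical and this is not the matching code from the patch.

```cpp
#include <algorithm>

// Median of three values using only min/max. Assumes no input is NaN.
inline float med3(float A, float B, float C) {
  return std::max(std::min(A, B), std::min(std::max(A, B), C));
}

// A clamp written as min(max(x, lo), hi). For lo <= hi and NaN-free
// inputs this equals med3(x, lo, hi).
inline float clampMinMax(float X, float Lo, float Hi) {
  return std::min(std::max(X, Lo), Hi);
}
```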
Matt Arsenault
f39022545d AMDGPU: Make i32 uaddo/usubo legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293514 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 18:11:38 +00:00
Tom Stellard
1f91c2f5d6 AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D29068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293321 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 18:41:14 +00:00
Matt Arsenault
cfe56d7c95 AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.

This should be applied to the release branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292472 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-19 06:04:12 +00:00
Jan Vesely
0835374acb AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes
This will make transition to SCRATCH_MEMORY easier

Differential Revision: https://reviews.llvm.org/D24746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 21:00:46 +00:00
Matt Arsenault
0b698fea77 AMDGPU: Select branch on undef to uniform scc branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289877 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 21:57:11 +00:00
Eugene Zelenko
43dec7d682 [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289282 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-09 22:06:55 +00:00
Tom Stellard
c53f76cc0b AMDGPU: Add S_SETREG instructions to fix fdiv precision issues.
Patch By: Wei Ding

Summary: This patch fixes the fdiv precision issues.

Reviewers: b-sumner, cfang, wdng, arsenm

Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D26424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288879 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 02:42:15 +00:00
Marek Olsak
2a24827c23 AMDGPU/SI: Add back reverted SGPR spilling code, but disable it
suggested as a better solution by Matt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287942 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-25 17:37:09 +00:00
Marek Olsak
2acdc08776 Revert "AMDGPU: Make m0 unallocatable"
This reverts commit 124ad83dae04514f943902446520c859adee0e96.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287932 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-25 16:03:15 +00:00
Matt Arsenault
124ad83dae AMDGPU: Make m0 unallocatable
m0 may need to be written for spill code, so
we don't want general code uses relying on the
value stored in it.

This introduces a few code quality regressions where copies
from m0 are not coalesced into copies of a copy of m0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287841 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-24 00:26:40 +00:00
Matt Arsenault
f577de357a AMDGPU: Remove unnecessary and on conditional branch
The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286134 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 19:09:33 +00:00
Matt Arsenault
ac4d1bb2a0 AMDGPU: Handle CopyToReg in getOperandRegClass
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285768 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-01 23:22:17 +00:00
Nicolai Haehnle
877e3beed6 AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes
Summary:
This will be used for 64-bit MULHU, which is in turn used for the 64-bit
divide-by-constant optimization (see D24822).

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25289

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284224 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 10:30:00 +00:00
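As background for 877e3beed6, the sketch below shows the general idea behind add-with-carry chains: a wide addition built from narrower adds, where the first step produces a carry (ADDC-like) and the second consumes it (ADDE-like). It models the arithmetic only; the names are hypothetical and it does not reproduce the selection code in the patch.

```cpp
#include <cstdint>

struct AddResult {
  uint64_t Sum;
  bool CarryOut;
};

// Add two 64-bit values using only 32-bit additions and explicit carries.
inline AddResult add64ViaHalves(uint64_t A, uint64_t B) {
  uint32_t ALo = static_cast<uint32_t>(A);
  uint32_t AHi = static_cast<uint32_t>(A >> 32);
  uint32_t BLo = static_cast<uint32_t>(B);
  uint32_t BHi = static_cast<uint32_t>(B >> 32);

  // ADDC-like step: add the low halves and record the carry-out.
  uint32_t Lo = ALo + BLo;
  uint32_t CarryLo = (Lo < ALo) ? 1u : 0u;

  // ADDE-like step: add the high halves plus the incoming carry.
  uint32_t HiPartial = AHi + BHi;
  bool Carry1 = HiPartial < AHi;
  uint32_t Hi = HiPartial + CarryLo;
  bool Carry2 = Hi < HiPartial;

  uint64_t Sum = (static_cast<uint64_t>(Hi) << 32) | Lo;
  return {Sum, Carry1 || Carry2};
}
```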
Konstantin Zhuravlyov
49e7805871 [AMDGPU] Pass optimization level to SelectionDAGISel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283133 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 18:47:26 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Matt Arsenault
de82da5521 AMDGPU: Fix broken FrameIndex handling
We were trying to avoid using a FrameIndex operand in non-pointer
operands in a convoluted way, and would break because of
using TargetFrameIndex. The TargetFrameIndex should only be used
in the case where it makes sense to fold it as part of the addressing
mode, otherwise it requires materialization like a normal constant.
This wasn't working reliably and failed in the added testcase, hitting
the assert when processing the frame index.

The TargetFrameIndex was coming from trying to produce an AssertZext
limiting the maximum stack size. I'm not sure this was correct to begin
with, because it is apparently possible to have a single workitem
dispatch that requires all 4G of private memory.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281824 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-17 16:09:55 +00:00
Matt Arsenault
982faf27a3 AMDGPU: Use i64 scalar compare instructions
VI added eq/ne for i64, so use them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281800 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-17 02:02:19 +00:00
Matt Arsenault
d5a5e9043a AMDGPU: Run LoadStoreVectorizer pass by default
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281112 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 22:29:28 +00:00
Matthias Braun
f79c57a412 MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277017 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 18:40:00 +00:00
Matt Arsenault
b5a809e37c AMDGPU: Remove analyzeImmediate
This no longer uses the more complicated classification
of constants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276945 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 00:32:02 +00:00
Nicolai Haehnle
4d01feb4ad AMDGPU: Unify MOVRELSOffset and MOVRELDOffset
Summary:
Previously, constant index insertelements would be turned into SI_INDIRECT_DST,
which is bound to prevent some optimization opportunities. Worse, it misled
the heuristic that decides whether immediates should be lowered to S_MOV_B32
or V_MOV_B32 in a way that resulted in unnecessary v_readfirstlanes.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D22217

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275160 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-12 08:12:16 +00:00
Matt Arsenault
c39550268e AMDGPU: Improve offset folding for register indexing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274954 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 01:13:56 +00:00
Tom Stellard
db74125d0c AMDGPU/SI: Remove address space query functions from AMDGPUDAGToDAGISel
Summary:
These have been replaced with TableGen code (except for isConstantLoad,
which is still used for R600).  The queries were broken for cases
where MemOperand was a PseudoSourceValue.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21684

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274561 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-05 16:10:44 +00:00
Tom Stellard
25cd212829 AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads
This moves the r600 logic out of isGlobalLoad() and into the
TableGen files.

Differential Revision: http://reviews.llvm.org/D21710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274527 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-05 00:12:51 +00:00
Tom Stellard
b60390bf60 AMDGPU/SI: Remove hack for selecting < 32-bit loads to MUBUF instructions
Summary:
The isGlobalLoad() query was returning true for constant address space loads
with memory types smaller than 32 bits, which is wrong.  This logic has been
replaced with PatFrag in the TableGen files, to provide the same functionality.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274521 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-04 20:41:48 +00:00
Matt Arsenault
759ed7e410 AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:30:11 +00:00
Matt Arsenault
2816573cb7 AMDGPU: Fix gcc warnings
Mostly removing dead code. Apparently gcc's warning
for unused functions is better

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273363 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 01:53:49 +00:00
Rafael Espindola
95ba82925b Delete more dead code.
Found by gcc 6.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273322 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-21 21:51:41 +00:00
Rafael Espindola
1963865e9d Delete some dead code.
Found by gcc 6.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273303 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-21 19:48:12 +00:00
NAKAMURA Takumi
96b66d10fe Reformat blank lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273131 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 01:05:15 +00:00
NAKAMURA Takumi
82f8dab579 Untabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273129 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 00:37:41 +00:00
Nicolai Haehnle
682fc3e780 AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics
Summary:
This fixes two related bugs. First, the generic optimization passes
unfortunately generate negative constant offsets but the hardware treats
SOffset as an unsigned value.

Second, there is a hardware bug on SI and CI, where address clamping in MUBUF
instructions does not work correctly when SOffset is larger than the buffer
size. This patch works around this bug by never using SOffset.

An alternative workaround would be to do the clamping manually when SOffset
is too large, but generating the required code sequence during instruction
selection would be rather involved, and in any case the resulting code would
probably be worse.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21326

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272761 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 07:13:05 +00:00
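The first bug described in 682fc3e780 comes down to a signed offset being read as unsigned. The sketch below is a deliberately simplified model of that mismatch, not of the actual MUBUF address calculation; the function name and the flat base-plus-offset arithmetic are assumptions made for illustration.

```cpp
#include <cstdint>

// Simplified model: the compiler produced a signed offset, but the
// consumer treats the same 32 bits as an unsigned quantity.
inline uint64_t addressWithUnsignedOffset(uint64_t Base,
                                          int32_t CompilerOffset) {
  uint32_t AsUnsigned = static_cast<uint32_t>(CompilerOffset);
  return Base + AsUnsigned;
}
```

With a base of 1000 and an offset of -16, the intended address is 984, but this model yields 1000 + 4294967280, far outside the intended range.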
Benjamin Kramer
af18e017d2 Pass DebugLoc and SDLoc by const ref.
This used to be free, but copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272512 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 15:39:02 +00:00
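The effect described in af18e017d2 can be illustrated with a small stand-in type that counts its own copies. `TrackedLoc` is hypothetical and is not LLVM's DebugLoc; the counter only stands in for whatever bookkeeping a real copy would trigger.

```cpp
// Hypothetical type whose copy constructor has an observable cost.
struct TrackedLoc {
  static int Copies;
  int Line = 0;

  TrackedLoc() = default;
  explicit TrackedLoc(int L) : Line(L) {}
  TrackedLoc(const TrackedLoc &Other) : Line(Other.Line) { ++Copies; }
};
int TrackedLoc::Copies = 0;

// Passing by value copies the argument on every call.
inline int lineByValue(TrackedLoc Loc) { return Loc.Line; }

// Passing by const reference does not copy.
inline int lineByConstRef(const TrackedLoc &Loc) { return Loc.Line; }
```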
Tom Stellard
60f588f570 AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_*permute
Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.

Reviewers: scchan, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272349 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 00:01:04 +00:00
Matt Arsenault
4080a06a24 AMDGPU: Fix flat atomics
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-09 23:42:54 +00:00
Matt Arsenault
bada556f73 AMDGPU: Fix i64 global cmpxchg
This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.

There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272344 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-09 23:42:48 +00:00
Jan Vesely
fbff874b03 AMDGPU/R600: Implement memory loads from constant AS
Reviewers: tstellard

Subscribers: arsenm

Differential Revision: http://reviews.llvm.org/D19792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269479 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 20:39:29 +00:00
Justin Bogner
42e9bbeb4c SDAG: Implement Select instead of SelectImpl in AMDGPUDAGToDAGISel
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
  the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.

Part of llvm.org/pr26808.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269349 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-12 21:03:32 +00:00
Simon Pilgrim
826093255c Fixed unused but set variable warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268931 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 16:42:23 +00:00
Justin Bogner
9ed38db20e SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268693 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-05 23:19:08 +00:00
Matt Arsenault
7614ec6431 AMDGPU: Make i64 loads/stores promote to v2i32
Now that unaligned access expansion should not attempt
to produce i64 accesses, we can remove the hack in
PreprocessISelDAG where this is done.

This allows splitting i64 private accesses while
allowing the new add nodes indexing the vector components
to be folded with the base pointer arithmetic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268293 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-02 20:07:26 +00:00
Tom Stellard
ac19ae8d63 AMDGPU/SI: Add offset field to ds_permute/ds_bpermute instructions
Summary:
These instructions can add an immediate offset to the address, like other
ds instructions.

Reviewers: arsenm

Subscribers: arsenm, scchan

Differential Revision: http://reviews.llvm.org/D19233

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 14:34:26 +00:00
Matt Arsenault
f9fe659922 AMDGPU: Implement addrspacecast
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267452 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-25 19:27:24 +00:00
Matt Arsenault
4bfa27af78 AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267244 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 22:59:16 +00:00
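The rewrite named in the subject of 4bfa27af78 relies on the fact that shifting right by K and then sign-extending the low Width bits is the same as a signed bitfield extract at offset K. The sketch below checks that equivalence on plain integers; the function names are hypothetical, it assumes K + Width <= 32, and `signedBFE` relies on arithmetic right shift of negative values (guaranteed from C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Sign-extend the low Width bits of V (1 <= Width <= 32).
inline int32_t signExtendInReg(uint32_t V, unsigned Width) {
  uint32_t Mask = (Width >= 32) ? 0xFFFFFFFFu : ((1u << Width) - 1u);
  uint32_t Low = V & Mask;
  uint32_t Sign = 1u << (Width - 1);
  return static_cast<int32_t>((Low ^ Sign) - Sign);
}

// Left-hand side of the rewrite: sext_inreg (srl x, K), Width.
inline int32_t sextInRegOfSrl(uint32_t X, unsigned K, unsigned Width) {
  return signExtendInReg(X >> K, Width);
}

// Right-hand side: signed bitfield extract of Width bits at Offset,
// modeled by moving the field to the top and shifting back down.
inline int32_t signedBFE(uint32_t X, unsigned Offset, unsigned Width) {
  uint32_t AtTop = X << (32 - Offset - Width);
  return static_cast<int32_t>(AtTop) >> (32 - Width);
}
```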
Nicolai Haehnle
44aa3537dd [StructurizeCFG] Annotate branches that were treated as uniform
Summary:
This fully solves the problem where the StructurizeCFG pass does not
consider the same branches as uniform as the SIAnnotateControlFlow pass.
The patch in D19013 helps with this problem, but is not sufficient
(and, interestingly, causes a "regression" with one of the existing
test cases).

No tests included here, because tests in D19013 already cover this.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19018

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266346 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 17:42:35 +00:00
Matt Arsenault
bc0aee542f AMDGPU: Add atomic_inc + atomic_dec intrinsics
These are different than atomicrmw add 1 because they have
an additional input value to clamp the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 14:05:04 +00:00
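To show why the operations in bc0aee542f differ from a plain atomic add or subtract of 1, the sketch below models an increment and decrement that take an extra limit operand. The wrap-around rules used here are an assumption based on how such inc/dec instructions are commonly documented; the function names are hypothetical and this is not the intrinsic lowering from the patch.

```cpp
#include <atomic>
#include <cstdint>

// Increment with a limit: once the old value reaches Limit, store 0.
// Returns the value that was in memory before the update.
inline uint32_t atomicIncWithLimit(std::atomic<uint32_t> &Mem,
                                   uint32_t Limit) {
  uint32_t Old = Mem.load();
  uint32_t New;
  do {
    New = (Old >= Limit) ? 0u : Old + 1u;
  } while (!Mem.compare_exchange_weak(Old, New));
  return Old;
}

// Decrement with a limit: if the old value is 0 or above Limit, store Limit.
// Returns the value that was in memory before the update.
inline uint32_t atomicDecWithLimit(std::atomic<uint32_t> &Mem,
                                   uint32_t Limit) {
  uint32_t Old = Mem.load();
  uint32_t New;
  do {
    New = (Old == 0u || Old > Limit) ? Limit : Old - 1u;
  } while (!Mem.compare_exchange_weak(Old, New));
  return Old;
}
```

An atomicrmw add of 1 has no second operand, so it cannot express the comparison against the limit.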