122 Commits

Author SHA1 Message Date
Matt Arsenault
bd04b64cd1 AMDGPU: Select v_mad_u64_u32 and v_mad_i64_i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317492 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 17:04:37 +00:00
Marek Olsak
0ce6825d9e AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset
Summary:
Apps that benefit:
- alien isolation
- bioshock infinite
- civilization: beyond earth
- company of heroes 2
- dirt showdown
- dota 2
- F1 2015
- grid autosport
- hitman
- legend of grimrock
- serious sam 3: bfe
- shadow warrior
- talos principle
- total war: warhammer
- UE4 demos: effects cave, elemental, sun temple

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 21:06:42 +00:00
NAKAMURA Takumi
d8ff2f49ce Untabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316079 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 13:31:28 +00:00
Vitaly Buka
eec5b16c88 Remove unused variables
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315847 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-15 05:35:02 +00:00
Matt Arsenault
b1763ab78e AMDGPU: Look for src mods before fp_extend
When selecting modifiers for mad_mix instructions,
look at fneg/fabs that occur before the conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315748 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 20:45:49 +00:00
Matt Arsenault
16961e0820 AMDGPU: Fix failure to select branch with optnone
opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass.
The heuristic in the custom selector for brcond deferred the
branch uniformity check to the pattern, which would fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315360 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 20:34:49 +00:00
Matt Arsenault
d341fb0564 AMDGPU: Fix incorrect selection of pseudo-branches
These should only be used if the machine structurizer is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315357 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 20:22:07 +00:00
Nicolai Haehnle
73de2f1810 AMDGPU: Split MUBUF offset into aligned components
Summary:
Atomic buffer operations do not work (and trap on gfx9) when the
components are unaligned, even if their sum is aligned.

Previously, we generated an offset of 4156 without an SGPR by
splitting it as 4095 + 61 (immediate + inline constant). The
highest offset for which we can do this correctly is 4156 = 4092 + 64.

Fixes dEQP-GLES31.functional.ssbo.atomic.*

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 12:22:23 +00:00
Matt Arsenault
7287fcb5d5 AMDGPU: Start selecting v_mad_mixlo_f16
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:28:39 +00:00
Matt Arsenault
a942315e5f AMDGPU: Match load d16 hi instructions
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.

We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 05:01:53 +00:00
Davide Italiano
1546bf0dba [AMDGPU] Remove unused function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312836 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 23:54:11 +00:00
Matt Arsenault
0bb6355f63 AMDGPU: Start selecting v_mad_mix_f32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 18:05:07 +00:00
Tom Stellard
56199e7135 AMDGPU: Fix warnings introduced by r310336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310337 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-08 05:52:00 +00:00
Tom Stellard
39aad0ab08 AMDGPU: Move R600 parts of AMDGPUISelDAGToDAG into their own class
Summary: This refactoring is required in order to split the R600 and GCN tablegen files.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D36286

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310336 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-08 04:57:55 +00:00
Matt Arsenault
c8c75789a0 AMDGPU: Add analysis pass for function argument info
This will allow only adding necessary inputs to callee functions
that need special inputs forwarded from the kernel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309996 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-03 22:30:46 +00:00
Matt Arsenault
d74d012b62 AMDGPU: Start selecting global instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309470 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-29 01:03:53 +00:00
Dmitry Preobrazhensky
3fa112e645 [AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier
See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591

Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm

Differential Revision: https://reviews.llvm.org/D35424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308740 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-21 13:54:11 +00:00
Matt Arsenault
524fde4af1 AMDGPU: Rename _RTN atomic instructions
Move the _RTN to the end of the name. It reads
better if the other addressing mode components
line up with the non-RTN version. It is also
more convenient to define saddr variants of
FLAT atomics to have the RTN last, and it is
good to have a consistent naming scheme.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308674 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-20 21:06:04 +00:00
Matt Arsenault
23ef7ef4e3 AMDGPU: Start selecting flat instruction offsets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305201 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-12 16:53:51 +00:00
Matt Arsenault
4923776ab2 AMDGPU: Start adding offset fields to flat instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305194 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-12 15:55:58 +00:00
Chandler Carruth
e3e43d9d57 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-06 11:49:48 +00:00
Marek Olsak
0a21c3c299 Revert "AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns"
This reverts commit e065977c4b.

It doesn't work. S_LOAD_DWORD_IMM_ci and friends aren't selected by any of
the patterns, so it was putting 32-bit literals into the 8-bit field.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-24 14:53:50 +00:00
Marek Olsak
e065977c4b AMDGPU: Fold CI-specific complex SMRD patterns into existing complex patterns
This is just a cleanup. Also, it adds checking that ByteCount is aligned to 4.

Reviewers: arsenm, nhaehnle, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303658 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 17:14:34 +00:00
Matt Arsenault
64444b4dcc AMDGPU: Change mubuf soffset register when SP relative
Check the MachinePointerInfo for whether the access is
supposed to be relative to the stack pointer.

No tests because this is used in later commits implementing
calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303301 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 21:02:58 +00:00
Matt Arsenault
81c9a2995b AMDGPU: Make better use of op_sel with high components
Handle more general swizzles.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303296 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 20:30:58 +00:00
Matt Arsenault
572b72726d AMDGPU: Try to use op_sel when selecting packed instructions
Avoids instructions to pack a vector when the source is really
a scalar being broadcast.

Also be smarter and look for per-component fneg.

Doesn't yet handle scalar from upper half of register
or other swizzles.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303291 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 20:00:00 +00:00
Matt Arsenault
9f4e5a06c6 AMDGPU: Remove tfe bit from flat instruction definitions
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.

Additionally actually using it requires changing the output register
class, which wasn't done anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-11 17:38:33 +00:00
Amara Emerson
195d3fa988 Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.
This removes BinaryWithFlagsSDNode, and flags are now all passed by value.

Differential Revision: https://reviews.llvm.org/D32527



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301803 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-01 15:17:51 +00:00
Davide Italiano
9636fedac3 [AMDGPU] Garbage collect dead code. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301375 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-26 01:00:52 +00:00
Matt Arsenault
185d3194ea AMDGPU: Clean up VOP3NoMods pattern
There is no need to copy the operands or inspect the sources.
Also remove some unnecessary clamp/omod usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301363 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-25 21:17:38 +00:00
Matt Arsenault
38bd5524b0 AMDGPU: Select scratch mubuf offsets when pointer is a constant
In call sequence setups, there may not be a frame index base
and the pointer is a constant offset from the frame
pointer / scratch wave offset register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301230 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 19:40:59 +00:00
Matt Arsenault
ab28f3b39e AMDGPU: Fix invalid copies when copying i1 to phys reg
Insert a VReg_1 virtual register so the i1 workaround pass
can handle it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300113 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 21:58:23 +00:00
Dmitry Preobrazhensky
cb5431a931 [AMDGPU][MC] Fix for Bug 28207 + LIT tests
Enabled clamp and omod for v_cvt_* opcodes which have src0 of an integer type

Reviewers: vpykhtin, arsenm

Differential Revision: https://reviews.llvm.org/D31327

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 15:57:17 +00:00
Yaxun Liu
ab3be33d40 [AMDGPU] Get address space mapping by target triple environment
As we introduced target triple environment amdgiz and amdgizcl, the address
space values are no longer enums. We have to decide the value by target triple.

The basic idea is to use struct AMDGPUAS to represent address space values.
For address space values which are not depend on target triple, use static
const members, so that they don't occupy extra memory space and is equivalent
to a compile time constant.

Since the struct is lightweight and cheap, it can be created on the fly at
the point of usage. Or it can be added as member to a pass and created at
the beginning of the run* function.

Differential Revision: https://reviews.llvm.org/D31284


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298846 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 14:04:01 +00:00
Matt Arsenault
27f4f2f4bc AMDGPU: Support v2i16/v2f16 packed operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 22:15:25 +00:00
Matt Arsenault
3b595d2304 AMDGPU: Generalize matching of v_med3_f32
I think this is safe as long as no inputs are known to ever
be nans.

Also add an intrinsic for fmed3 to be able to handle all safe
math cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293598 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 03:07:46 +00:00
Matt Arsenault
f39022545d AMDGPU: Make i32 uaddo/usubo legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293514 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 18:11:38 +00:00
Tom Stellard
1f91c2f5d6 AMDGPU/SI: Move some ISel helpers into utils so they can be shared with GISel
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D29068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293321 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 18:41:14 +00:00
Matt Arsenault
cfe56d7c95 AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.

This should be applied to the release branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292472 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-19 06:04:12 +00:00
Jan Vesely
0835374acb AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes
This will make transition to SCRATCH_MEMORY easier

Differential Revision: https://reviews.llvm.org/D24746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 21:00:46 +00:00
Matt Arsenault
0b698fea77 AMDGPU: Select branch on undef to uniform scc branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289877 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 21:57:11 +00:00
Eugene Zelenko
43dec7d682 [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289282 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-09 22:06:55 +00:00
Tom Stellard
c53f76cc0b AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.
Patch By: Wei Ding

Summary: This patch fixes the fdiv precision issues.

Reviewers: b-sumner, cfang, wdng, arsenm

Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D26424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288879 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 02:42:15 +00:00
Marek Olsak
2a24827c23 AMDGPU/SI: Add back reverted SGPR spilling code, but disable it
suggested as a better solution by Matt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287942 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-25 17:37:09 +00:00
Marek Olsak
2acdc08776 Revert "AMDGPU: Make m0 unallocatable"
This reverts commit 124ad83dae.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287932 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-25 16:03:15 +00:00
Matt Arsenault
124ad83dae AMDGPU: Make m0 unallocatable
m0 may need to be written for spill code, so
we don't want general code uses relying on the
value stored in it.

This introduces a few code quality regressions where copies
from m0 are not coalesced into copies of a copy of m0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287841 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-24 00:26:40 +00:00
Matt Arsenault
f577de357a AMDGPU: Remove unnecessary and on conditional branch
The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286134 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 19:09:33 +00:00
Matt Arsenault
ac4d1bb2a0 AMDGPU: Handle CopyToReg in getOperandRegClass
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285768 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-01 23:22:17 +00:00
Nicolai Haehnle
877e3beed6 AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes
Summary:
This will be used for 64-bit MULHU, which is in turn used for the 64-bit
divide-by-constant optimization (see D24822).

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25289

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284224 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 10:30:00 +00:00
Konstantin Zhuravlyov
49e7805871 [AMDGPU] Pass optimization level to SelectionDAGISel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283133 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 18:47:26 +00:00