2108 Commits

Author SHA1 Message Date
Matt Arsenault
ec61af4bcc AMDGPU: Fix crash on immediate operand
We can have a v_mac with an immediate src0.
We can still fold if it's an inline immediate,
otherwise it already uses the constant bus.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-21 00:45:59 +00:00
Matt Arsenault
820b8a54fb AMDGPU: Start selecting v_mad_mixhi_f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:01:24 +00:00
Matt Arsenault
4739f7353d AMDGPU: Add tied operands to v_mad_mix{lo|hi}_f16
These write to the low and high half of the destination
register and leave the other 16-bits unchanged. This is true
for most 16-bit instructions on gfx9, but we don't use that
now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313812 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:53:49 +00:00
Matt Arsenault
7287fcb5d5 AMDGPU: Start selecting v_mad_mixlo_f16
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:28:39 +00:00
Matt Arsenault
ae40a10420 AMDGPU: Fix encoding of op_sel for mad_mix* opcodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313797 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 19:09:28 +00:00
Stanislav Mekhanoshin
b5a9104224 [AMDGPU] Fixed memory leak with inliner replaced
Delete inliner before replacing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313723 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:34:28 +00:00
Matt Arsenault
e232c83060 AMDGPU: Move r600 only code into r600 only td file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313719 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:11:25 +00:00
Stanislav Mekhanoshin
2e5d75b42d [AMDGPU] Fix regression in test clang/test/CodeGen/backend-unsupported-error.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313718 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:10:15 +00:00
Matt Arsenault
a942315e5f AMDGPU: Match load d16 hi instructions
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.

We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 05:01:53 +00:00
Stanislav Mekhanoshin
fbf0e1603c [AMDGPU] Port of HSAIL inliner
Differential Revision: https://reviews.llvm.org/D36849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313714 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 04:25:58 +00:00
Matt Arsenault
6a28475ea4 AMDGPU: Cleanup load/store PatFrags
Try to use a consistent naming scheme.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313713 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 03:43:35 +00:00
Matt Arsenault
8e11a03a95 AMDGPU: Match store d16_hi instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313712 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 03:20:09 +00:00
Stanislav Mekhanoshin
60873e298c [AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.

Differential Revision: https://reviews.llvm.org/D38014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313670 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 20:54:38 +00:00
Matt Arsenault
4b385be048 AMDGPU: Run internalize symbols at -O0
The relocations used for externally visible functions
aren't supported, so the direct call emitted ends
up hitting a linker error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 07:40:11 +00:00
Konstantin Zhuravlyov
fe0a82a17c AMDGPU: Start selecting s_xnor_{b32, b64}
Differential Revision: https://reviews.llvm.org/D37981


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 21:22:45 +00:00
Jan Sjodin
ac413e0287 Fix warnings in r313297.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 21:49:52 +00:00
Matt Arsenault
11283fb2c8 AMDGPU: Fix violating constant bus restriction
You can't use madak/madmk if it already uses an SGPR input.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313298 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 20:54:29 +00:00
Jan Sjodin
028255f1f7 Add AddressSpace to PseudoSourceValue.
Differential Revision: https://reviews.llvm.org/D35089



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313297 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 20:53:51 +00:00
Matt Arsenault
4d43fa8b05 AMDGPU: Fix assert on alloca of array of struct
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313282 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 18:02:29 +00:00
Matt Arsenault
c6fa88b24e AMDGPU: Stop modifying SP in call sequences
Because the stack growth direction and addressing is done
in the same direction, modifying SP at the beginning of the
call sequence was incorrect. If we had a stack passed argument,
we would end up skipping that number of bytes before pushing
arguments, leaving unused/inconsistent space.

The callee creates fixed stack objects in its frame, so
the space necessary for these is already logically allocated
in the callee, so we just let the callee increment SP if
it really requires it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 17:37:40 +00:00
Matt Arsenault
a15bb16493 AMDGPU: Make frame register caller preserved
Using SplitCSR for the frame register was very broken. Often
the copies in the prolog and epilog were optimized out, in addition
to them being inserted after the true prolog where the FP
was clobbered.

I have a hacky solution which works that continues to use
split CSR, but for now this is simpler and will get to working
programs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313274 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 17:14:57 +00:00
Matt Arsenault
a1a416812a AMDGPU: Don't spill SP reg like a normal CSR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313217 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 23:47:01 +00:00
Stanislav Mekhanoshin
63c545da3a Allow target to decide when to cluster loads/stores in misched
When clustering loads or stores, MachineScheduler checks whether the
base pointers point to the same memory. This check is done by comparing
the base registers of the two memory instructions. This works fine when
the instructions have a separate offset operand. If they require a fully
calculated pointer, such instructions can never be clustered according
to this logic.

Changed shouldClusterMemOps to accept base registers as well and
let it decide what to do about it.

Differential Revision: https://reviews.llvm.org/D37698

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313208 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 22:20:47 +00:00
Matt Arsenault
93b87c84ea AMDGPU: Handle coldcc in more places
Missed in r312936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313205 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-13 21:55:52 +00:00
Matt Arsenault
e4e1eed1d7 AMDGPU: Allow coldcc calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312936 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 18:54:20 +00:00
Stanislav Mekhanoshin
46582be974 [AMDGPU] Produce madak and madmk from the two-address pass
These two instructions are normally selected, but when the
two address pass converts mac into mad we end up with the
mad where we could have one of these.

Differential Revision: https://reviews.llvm.org/D37389

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312928 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 17:13:57 +00:00
Tim Renouf
2c5cb5f335 [AMDGPU] exp should not be in WQM mode
A mrt exp with vm=1 must be in exact (non-WQM) mode, as it also exports
the exec mask as the valid mask to determine which pixels to render.

This commit marks any exp as needing to be in exact mode.

Actually, if there are multiple mrt exps, only one needs to have vm=1,
and only that one needs to be in exact mode. But that is an optimization
for another day.

Differential Revision: https://reviews.llvm.org/D36305

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312915 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 13:55:39 +00:00
Tim Renouf
8b9e95cb87 AMDGPU: trivial comment change
... to check commit access for new committer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312900 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-11 08:31:32 +00:00
Davide Italiano
1546bf0dba [AMDGPU] Remove unused function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312836 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 23:54:11 +00:00
Matt Arsenault
75448f1d3b AMDGPU: Start using !con operator
We have a lot of operand definition work essentially producing
every valid permutation of operands to work around building
operand lists based on the instruction features. Apparently tablegen
already has a mostly undocumented operator to concat dags which
simplifies this.

Convert one simple place to use this. The BUF instruction definitions
have much more complicated logic that can be totally rewritten now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312822 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 19:09:13 +00:00
Matt Arsenault
fadb61df65 AMDGPU: Recompute scc liveness
The various scalar bit operations set SCC,
so if one is erased or moved, it needs to be recomputed.
Not sure why the existing tests don't fail on this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 18:51:26 +00:00
Matt Arsenault
0bb6355f63 AMDGPU: Start selecting v_mad_mix_f32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 18:05:07 +00:00
Konstantin Zhuravlyov
3964b8bfc8 AMDGPU: Handle non-temporal loads and stores
Differential Revision: https://reviews.llvm.org/D36862


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312729 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 17:14:54 +00:00
Konstantin Zhuravlyov
b6f64be453 AMDGPU: Handle more than one memory operand in SIMemoryLegalizer
Differential Revision: https://reviews.llvm.org/D37397


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312725 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 16:14:21 +00:00
Matt Arsenault
ca22b05483 AMDGPU: Don't legalize i16 extloads to i32 with legal i16
Keeping non-i16 extloads makes it easier to match some new
gfx9 load instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312699 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 05:37:34 +00:00
Stanislav Mekhanoshin
6148c30603 [AMDGPU] Use v_pk_max_f16 for fcanonicalize
Differential Revision: https://reviews.llvm.org/D37325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312676 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 22:27:29 +00:00
Stanislav Mekhanoshin
953b70393a [AMDGPU] Fixed encoding of v_pk_mul_f16 in fcanonicalize
Differential Revision: https://reviews.llvm.org/D37522

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312660 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 18:29:51 +00:00
Stanislav Mekhanoshin
651c4efd77 [AMDGPU] Fix shouldClusterMemOps to process flat loads
Flat loads do not have a vdata operand but have vdst instead.

Differential Revision: https://reviews.llvm.org/D37502

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312640 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 15:31:30 +00:00
Nicolai Haehnle
adf1cb63f2 AMDGPU: Make worst-case assumption about the wait states in inline assembly
Summary:
Mesa still uses a hack where empty inline assembly is used as a kind of
optimization barrier. This exposed a problem where not enough wait states
were inserted, because the hazard recognizer implicitly assumed that each
inline assembly "instruction" has at least one wait state.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312635 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 13:50:13 +00:00
Yaxun Liu
1e1d0b01c1 [AMDGPU] Transform __read_pipe_* and __write_pipe_*
When the packet size equals the packet alignment and is a power of 2,
transform __read_pipe* and __write_pipe* to a specialized library function.

Differential Revision: https://reviews.llvm.org/D36831


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312598 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-06 00:30:27 +00:00
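Illustrative note on 1e1d0b01c1: the condition in the message above (packet size equal to packet alignment, and a power of 2) can be sketched as a small predicate. This is a minimal sketch only, not the code from the commit; the function names below are hypothetical, and the name of the specialized library function is not given in the message, so it is not modeled here.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and (n & (n - 1)) == 0


def can_specialize_pipe_call(packet_size: int, packet_align: int) -> bool:
    """Hypothetical helper: the eligibility check described in the commit message.

    A __read_pipe_* / __write_pipe_* call qualifies only when the packet size
    equals the packet alignment and that common value is a power of two.
    """
    return packet_size == packet_align and is_power_of_two(packet_size)


if __name__ == "__main__":
    for size, align in [(4, 4), (4, 8), (12, 12), (16, 16)]:
        print(size, align, can_specialize_pipe_call(size, align))
```

Under these assumptions, a 4-byte packet with 4-byte alignment qualifies, while a 12-byte packet with 12-byte alignment does not, because 12 is not a power of 2.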
Konstantin Zhuravlyov
9e6f849b2e AMDGPU: Cleanup/refactor SIMemoryLegalizer [3]:
- Refactor SIMemOpInfo's constructors
  - Allow construction of NotAtomic SIMemOpInfo

Differential Revision: https://reviews.llvm.org/D37396


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312563 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 19:01:10 +00:00
Matt Arsenault
4e0c4fb9c1 AMDGPU: Fix not accounting for tail call resource usage
If the only call in a function is a tail call, the
function isn't considered to have a call since it's a
type of return.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312561 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 18:36:36 +00:00
Konstantin Zhuravlyov
f9ab88e18d AMDGPU/NFC: Cleanup/refactor SIMemoryLegalizer [2]:
- Make SIMemOpInfo a class
  - Add accessor methods to SIMemOpInfo
  - Move get*Info methods to SIMemOpInfo

Differential Revision: https://reviews.llvm.org/D37395


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312541 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 16:41:25 +00:00
Konstantin Zhuravlyov
c0c4768b6b AMDGPU/NFC: Cleanup/refactor SIMemoryLegalizer [1]:
- Rename MemOpInfo -> SIMemOpInfo
  - Move SIMemOpInfo class out of SIMemoryLegalizer class

Differential Revision: https://reviews.llvm.org/D37394


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312540 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-05 16:18:05 +00:00
Stanislav Mekhanoshin
f3b5f2ad4a [AMDGPU] Prevent infinite recursion in DAG.computeKnownBits()
Differential Revision: https://reviews.llvm.org/D37392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312364 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 20:43:20 +00:00
Matt Arsenault
757642511d AMDGPU: Add ds_{read|write}_addtid_b32 definitions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312349 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 18:38:02 +00:00
Matt Arsenault
6a29a225d2 AMDGPU: Add most d16 load/store instruction definitions
Doesn't include the tied operand necessary for the loads,
but is enough for the assembler to work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312347 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 18:36:06 +00:00
Nicolai Haehnle
96b6414540 AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states
Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312337 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 16:56:32 +00:00
Matt Arsenault
fcd77e8a04 AMDGPU: Fold clamp modifier for packed instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312297 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 23:53:50 +00:00
Eugene Zelenko
046ca04445 [Analysis] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes. Also affected in files (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312289 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 21:56:16 +00:00