5 Commits

Author SHA1 Message Date
Nicolai Haehnle
330c65751e AMDGPU: Convert test cases to the dimension-aware intrinsics
Summary:
Also explicitly port over some tests in llvm.amdgcn.image.* that were
missing. Some tests are removed because they no longer apply (i.e.
explicitly testing building an address vector via insertelement).

This is in preparation for the eventual removal of the old-style
intrinsics.

Some additional notes:
- constant-address-space-32bit.ll: change some GCN-NEXT to GCN because
  the instruction schedule was subtly altered
- insert_vector_elt.ll: the old test didn't actually test anything,
  because %tmp1 was not used; remove the load, because it doesn't work
  (Because of the amdgpu_ps calling convention? In any case, it's
  orthogonal to what the test claims to be testing.)

Change-Id: Idfa99b6512ad139e755e82b8b89548ab08f0afcf

Reviewers: arsenm, rampitec

Subscribers: MatzeB, qcolombet, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D48018

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335229 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 13:37:19 +00:00
Francis Visoiu Mistrih
ca0df55065 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
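
For illustration only, the three sed substitutions above can be sketched with Python's `re` module. The function name `rewrite_mbb_refs` and the sample strings are hypothetical and not part of the commit; the patterns are copied from the sed expressions.

```python
import re

def rewrite_mbb_refs(text):
    """Apply the three substitutions from the sed commands, in the same order."""
    # 1. "BB#" << MBB->getNumber()  becomes  "" << printMBBReference(*MBB)
    text = re.sub(r'BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)',
                  r'" << printMBBReference(*\1)', text)
    # 2. "BB#" << MBB.getNumber()   becomes  "" << printMBBReference(MBB)
    text = re.sub(r'BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)',
                  r'" << printMBBReference(\1)', text)
    # 3. Any remaining literal reference such as BB#5 becomes %bb.5
    text = re.sub(r'BB#([0-9]+)', r'%bb.\1', text)
    return text

# Hypothetical sample inputs, one per substitution.
print(rewrite_mbb_refs('dbgs() << "Entering BB#" << MBB->getNumber();'))
print(rewrite_mbb_refs('dbgs() << "BB#" << MBB.getNumber();'))
print(rewrite_mbb_refs('; CHECK: BB#5:'))
```

The order matters: the third pattern only matches `BB#` followed by digits, so it leaves the C++ streaming expressions to the first two patterns, and anything none of them match is what the final `grep` step is for.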

Differential Revision: https://reviews.llvm.org/D40422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319665 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 17:18:51 +00:00
Matt Arsenault
b216ff22ea AMDGPU: Convert image intrinsic uses in tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298386 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 16:24:12 +00:00
Matt Arsenault
7517ed227a AMDGPU: Split SILowerControlFlow into two pieces
Do most of the lowering in a pre-RA pass. Keep the skip jump
insertion late, plus a few other things that require more
work to move out.

One concern is that there may now be COPY instructions that
lack the necessary implicit exec uses if they are later
lowered to v_mov_b32.

This has a positive effect on SGPR usage in shader-db.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279464 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-22 19:33:16 +00:00
Nicolai Haehnle
b18ca96c79 AMDGPU: add execfix flag to SI_ELSE
Summary:
SI_ELSE is lowered into two parts:

s_or_saveexec_b64 dst, src (at the start of the basic block)

s_xor_b64 exec, exec, dst (at the end of the basic block)

The idea is that dst contains the exec mask of the preceding IF block. It can
happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside
the basic block that contains SI_ELSE, in which case it introduces an instruction

s_and_b64 exec, exec, s[...]

which masks out bits that can correspond to both the IF and the ELSE paths.
So the resulting sequence must be:

s_or_saveexec_b64 dst, src

s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode
s_and_b64 dst, dst, exec <-- added by SILowerControlFlow

s_xor_b64 exec, exec, dst

Whether to add the additional s_and_b64 dst, dst, exec is currently determined
via the ExecModified tracking. With this change, it is instead determined by
an additional flag on SI_ELSE which is set by SIWholeQuadMode.
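
To see why the extra AND is needed, here is a minimal sketch that models the exec mask as a Python integer with one bit per lane. The function `lower_else`, its parameters, and the 4-lane masks are hypothetical illustrations, not code from the patch; it assumes `s_or_saveexec_b64 dst, src` saves exec into dst and then sets exec to `src | exec`.

```python
def lower_else(then_mask, else_mask, exact_mask=None, execfix=False):
    """Return the exec mask left after the modeled SI_ELSE sequence.

    then_mask:  lanes that ran the IF side (exec on entry to the ELSE block)
    else_mask:  lanes that should run the ELSE side (the src operand)
    exact_mask: if given, models the s_and_b64 added by SIWholeQuadMode
    execfix:    if True, models the extra s_and_b64 dst, dst, exec
    """
    exec_mask = then_mask
    # s_or_saveexec_b64 dst, src
    dst = exec_mask
    exec_mask = else_mask | exec_mask
    if exact_mask is not None:
        # s_and_b64 exec, exec, s[...]   (WQM -> Exact switch)
        exec_mask &= exact_mask
    if execfix:
        # s_and_b64 dst, dst, exec
        dst &= exec_mask
    # s_xor_b64 exec, exec, dst
    exec_mask ^= dst
    return exec_mask

THEN, ELSE, EXACT = 0b0011, 0b1100, 0b0110

# No mode switch: only the ELSE lanes remain enabled.
print(bin(lower_else(THEN, ELSE)))                        # 0b1100
# Mode switch without the fix: lane 0 (an IF lane) is wrongly re-enabled.
print(bin(lower_else(THEN, ELSE, EXACT)))                 # 0b101
# Mode switch with the fix: exactly ELSE & EXACT.
print(bin(lower_else(THEN, ELSE, EXACT, execfix=True)))   # 0b100
```

In this model the extra AND is a no-op when exec was not narrowed, because dst is then already a subset of exec.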

Finally: It also occurred to me that an alternative approach for the long run
is for SILowerControlFlow to unconditionally emit

s_or_saveexec_b64 dst, src

...

s_and_b64 dst, dst, exec
s_xor_b64 exec, exec, dst

and have a pass that detects and cleans up the "redundant AND with exec"
pattern where possible. This could be useful anyway, because we also add
instructions

s_and_b64 vcc, exec, vcc

before s_cbranch_scc (in moveToVALU), and those are often redundant. I have
some pending changes to how KILL is lowered that could also benefit from
such a cleanup pass.

In any case, this current patch could help in the short term with the whole
ExecModified business.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276972 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 11:39:24 +00:00