71 Commits

Author SHA1 Message Date
Konstantin Zhuravlyov
4d82ce5c27 AMDGPU: Re-apply r341982 after fixing the layering issue
Move isa version determination into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue where we
output the wrong isa version in the object code when features
of a particular CPU are altered (e.g. gfx902 w/o xnack
used to result in gfx900).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342069 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-12 18:50:47 +00:00
Ilya Biryukov
867f48781f Revert "AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination into TargetParser."
This reverts commit r341982.

The change introduced a layering violation. Reverting to unbreak
our integrate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342023 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-12 07:05:30 +00:00
Konstantin Zhuravlyov
b479681381 AMDGPU: Move isa version and EF_AMDGPU_MACH_* determination
into TargetParser.

Also switch away from target features to CPU string when
determining isa version. This fixes an issue where we
output the wrong isa version in the object code when features
of a particular CPU are altered (e.g. gfx902 w/o xnack
used to result in gfx900).
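
As a rough illustration of why keying on the CPU string is more robust than keying on features, here is a minimal sketch. The names `IsaVersion` and `isaVersionFromCpu` and the three-entry table are hypothetical and are not taken from TargetParser; the point is only that the lookup ignores feature toggles such as xnack.

```cpp
#include <cstring>

// Hypothetical ISA version triple; not the real TargetParser type.
struct IsaVersion {
  unsigned Major;
  unsigned Minor;
  unsigned Stepping;
};

// Look up the ISA version from the CPU name alone. Because target
// features are not consulted, disabling xnack on gfx902 cannot make
// the result fall back to gfx900.
inline IsaVersion isaVersionFromCpu(const char *Cpu) {
  struct Entry {
    const char *Name;
    IsaVersion Version;
  };
  // Illustrative subset only.
  static const Entry Table[] = {
      {"gfx803", {8, 0, 3}},
      {"gfx900", {9, 0, 0}},
      {"gfx902", {9, 0, 2}},
  };
  for (const Entry &E : Table)
    if (std::strcmp(E.Name, Cpu) == 0)
      return E.Version;
  return {0, 0, 0}; // Unknown CPU.
}
```

With this shape, `isaVersionFromCpu("gfx902")` yields 9.0.2 regardless of which features were altered on the command line.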

Differential Revision: https://reviews.llvm.org/D51890



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341982 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-11 18:56:51 +00:00
Ryan Taylor
da360ea600 [AMDGPU] Add support for a16 modifier for gfx9
Summary:
Add support for a16 on gfx9. The A16 bit replaces the r128 bit on gfx9.

Change-Id: Ie8b881e4e6d2f023fb5e0150420893513e5f4841

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50575

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340831 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 15:07:30 +00:00
Tim Renouf
12c4e30c78 [AMDGPU] New tbuffer intrinsics
Summary:
This commit adds new intrinsics
  llvm.amdgcn.raw.tbuffer.load
  llvm.amdgcn.struct.tbuffer.load
  llvm.amdgcn.raw.tbuffer.store
  llvm.amdgcn.struct.tbuffer.store

with the following changes from the llvm.amdgcn.tbuffer.* intrinsics:

* there are separate raw and struct versions: raw does not have an index
  arg and sets idxen=0 in the instruction, and struct always sets
  idxen=1 in the instruction even if the index is 0, to allow for the
  fact that gfx9 does bounds checking differently depending on whether
  idxen is set;

* there is a combined format arg (dfmt+nfmt)

* there is a combined cachepolicy arg (glc+slc)

* there are now only two offset args: one for the offset that is
  included in bounds checking and swizzling, to be split between the
  instruction's voffset and immoffset fields, and one for the offset
  that is excluded from bounds checking and swizzling, to go into the
  instruction's soffset field.
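
The combined arguments described above can be sketched as simple bit packing. This is an illustration under assumptions, not the backend's code: the helper names are hypothetical, and the layouts used here (dfmt in bits 3..0 and nfmt in bits 6..4 of the format arg, glc in bit 0 and slc in bit 1 of the cachepolicy arg, a 12-bit immoffset field) are assumptions that should be checked against the intrinsic definitions.

```cpp
#include <cstdint>

// Assumed layout: bits 3..0 = dfmt, bits 6..4 = nfmt.
inline uint32_t packFormat(uint32_t Dfmt, uint32_t Nfmt) {
  return (Dfmt & 0xFu) | ((Nfmt & 0x7u) << 4);
}

// Assumed layout: bit 0 = glc, bit 1 = slc.
inline uint32_t packCachePolicy(bool Glc, bool Slc) {
  return (Glc ? 1u : 0u) | (Slc ? 2u : 0u);
}

// Split of a constant, bounds-checked offset between the instruction's
// voffset and immoffset fields, assuming immoffset holds 12 bits.
struct SplitOffset {
  uint32_t VOffset;
  uint32_t ImmOffset;
};

inline SplitOffset splitConstOffset(uint32_t Offset) {
  const uint32_t MaxImm = 4095; // Largest value of a 12-bit field.
  SplitOffset S;
  S.ImmOffset = Offset & MaxImm;
  S.VOffset = Offset - S.ImmOffset;
  return S;
}
```

The offset that is excluded from bounds checking and swizzling needs no split, since it goes to soffset unchanged.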

The AMDISD::TBUFFER_* SD nodes always have an index operand, all three
offset operands, combined format operand, combined cachepolicy operand,
and an extra idxen operand.

The tbuffer pseudo- and real instructions now also have a combined
format operand.

The obsolescent llvm.amdgcn.tbuffer.* and llvm.SI.tbuffer.store
intrinsics continue to work.

V2: Separate raw and struct intrinsics.
V3: Moved extract_glc and extract_slc defs to a more sensible place.
V4: Rebased on D49995.
V5: Only two separate offset args instead of three.
V6: Pseudo- and real instructions have joint format operand.
V7: Restored optionality of dfmt and nfmt in assembler.
V8: Addressed minor review comments.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D49026

Change-Id: If22ad77e349fac3a5d2f72dda53c010377d470d4

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340268 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-21 11:06:05 +00:00
Tom Stellard
cba2181e77 AMDGPU: Separate R600 and GCN TableGen files
Summary:
We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc.  This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself.  This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.

Reviewers: arsenm, nhaehnle, jvesely

Reviewed By: arsenm

Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46365

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335942 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 23:47:12 +00:00
Nicolai Haehnle
3bd8feb970 AMDGPU: Turn D16 for MIMG instructions into a regular operand
Summary:
This allows us to reduce the number of different machine instruction
opcodes, which reduces the table sizes and helps flatten the TableGen
multiclass hierarchies.

We can do this because for each hardware MIMG opcode, we have a full set
of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata
and vaddr registers. Instead of having separate D16 machine instructions,
a packed D16 instruction loading e.g. 4 components can simply use the
same V2 opcode variant that non-D16 instructions use.

We still require a TSFlag for D16 buffer instructions, because the
D16-ness of buffer instructions is part of the opcode. Renaming the flag
should help avoid future confusion.

The one non-obvious code change is that for gather4 instructions, the
disassembler can no longer automatically decide whether to use a V2 or
a V4 variant. The existing logic which chooses the correct variant for
other MIMG instructions is extended to cover gather4 as well.

As a bonus, some of the assembler error messages are now more helpful
(e.g., complaining about a wrong data size instead of a non-existing
instruction).

While we're at it, delete a whole bunch of dead legacy TableGen code.

Change-Id: I89b02c2841c06f95e662541433e597f5d4553978

Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor

Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D47434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335222 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 13:36:01 +00:00
Stanislav Mekhanoshin
7de8fd2b74 [AMDGPU] Added checks for dpp_ctrl value
- Report error for invalid dpp_ctrl values.
- Changed the way it is reported, now the error will be emitted into
  asm and will work with release build as well.
- Added dpp_ctrl value verifier for codegen.
- Added symbolic constants for dpp_ctrl.

Differential Revision: https://reviews.llvm.org/D46565

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331775 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-08 16:53:02 +00:00
Dmitry Preobrazhensky
a5e8c708f7 [AMDGPU][MC][GFX8][GFX9][DISASSEMBLER] Added "_e32" suffix to 32-bit VINTRP opcodes
See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751

Differential Revision: https://reviews.llvm.org/D44529

Reviewers: artem.tamazov, arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327723 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-16 16:38:04 +00:00
Stanislav Mekhanoshin
f20c58e01b [AMDGPU] Add HW_REG_SH_MEM_BASES symbolic name for s_getreg_b32
Differential Revision: https://reviews.llvm.org/D41617

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322500 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-15 18:49:15 +00:00
Dmitry Preobrazhensky
18ab0b4852 [AMDGPU][MC][GFX8][GFX9] Added XNACK_MASK support
See bug 35764: https://bugs.llvm.org/show_bug.cgi?id=35764

Differential Revision: https://reviews.llvm.org/D41614

Reviewers: vpykhtin, artem.tamazov, arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322189 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-10 14:22:19 +00:00
Dmitry Preobrazhensky
c428d374bf [AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registers
See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561

This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic.

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41437

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321359 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 15:18:06 +00:00
Dmitry Preobrazhensky
7ebf60fab4 [AMDGPU][MC][GFX9] Corrected encoding of ttmp registers, disabled tba/tma
See bugs 35494 and 35559:
https://bugs.llvm.org/show_bug.cgi?id=35494
https://bugs.llvm.org/show_bug.cgi?id=35559

Reviewers: vpykhtin, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320375 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-11 15:23:20 +00:00
Dmitry Preobrazhensky
6f67bb8d45 [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodes
See bug 35433: https://bugs.llvm.org/show_bug.cgi?id=35433

Differential Revision: https://reviews.llvm.org/D40493

Reviewers: artem.tamazov, SamWot, arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319050 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-27 17:14:35 +00:00
Dmitry Preobrazhensky
fa8708611d [AMDGPU][MC][GFX9][disassembler] Corrected decoding of op_sel_hi for v_mad_mix*
See bug 35148: https://bugs.llvm.org//show_bug.cgi?id=35148

Reviewers: tamazov, SamWot, arsenm

Differential Revision: https://reviews.llvm.org/D39492

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318526 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-17 15:15:40 +00:00
Tom Stellard
7dcd9e77f5 AMDGPU: Add R600InstPrinter class
Summary:
This is a step towards separating the GCN and R600 tablegen'd code.

This is a little awkward for now, because the R600 functions won't have the
MCSubtargetInfo parameter, so we need to have AMDGPUInstPrinter
delegate to R600InstPrinter, but once the tablegen'd code is split,
we will be able to drop the delegation and use R600InstPrinter directly.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311128 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 22:20:04 +00:00
Dmitry Preobrazhensky
9676036f42 [AMDGPU][MC] Corrected VOP3 version of v_interp_* instructions for VI
See bug 32621: https://bugs.llvm.org//show_bug.cgi?id=32621

Reviewers: vpykhtin, SamWot, arsenm

Differential Revision: https://reviews.llvm.org/D35902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310251 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 13:14:12 +00:00
Tom Stellard
39aff8ce5a AMDGPU: Remove deadcode from AMDGPUInstPrinter
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D36034

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309477 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-29 03:56:53 +00:00
Matt Arsenault
c75bdb4f9e AMDGPU: Fix allocating pseudo-registers
There's no need for these to be part of a class since
they are immediately replaced. New unreachable hit in
existing tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308903 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-24 18:06:15 +00:00
Dmitry Preobrazhensky
3fa112e645 [AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier
See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591

Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm

Differential Revision: https://reviews.llvm.org/D35424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308740 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-21 13:54:11 +00:00
David Stuttard
dad6e61ce7 [AMDGPU] Add intrinsics for tbuffer load and store
Intrinsic already existed for llvm.SI.tbuffer.store

Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.*

Added CodeGen tests for the 2 new variants added.
Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr

Differential Revision: https://reviews.llvm.org/D30687

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306031 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-22 16:29:22 +00:00
Dmitry Preobrazhensky
80514214e1 [AMDGPU][MC][GFX9] Corrected VOP3P relevant code to fix disassembler failures
See Bug 33509: https://bugs.llvm.org//show_bug.cgi?id=33509

Reviewers: Sam Kolton, Artem Tamazov, Valery Pykhtin

Differential Revision: https://reviews.llvm.org/D34360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305923 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-21 16:00:54 +00:00
Matt Arsenault
f958f31ecf AMDGPU: Start adding global_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305838 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 19:54:14 +00:00
Chandler Carruth
e3e43d9d57 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-06 11:49:48 +00:00
Dmitry Preobrazhensky
4a31d77be2 [AMDGPU][MC] New syntax for ds_swizzle_b32 offset
See Bug 28601: https://bugs.llvm.org//show_bug.cgi?id=28601

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33542

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304309 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-31 16:26:47 +00:00
Dmitry Preobrazhensky
7bf2a5770d [AMDGPU][MC] Fix for Bug 28211 + LIT tests
- corrected DS_GWS_* opcodes (see VI_Shader_Programming#16.pdf for detailed description)
  - address operand is not used
  - several opcodes have data operand
  - all opcodes have offset modifier
- DS_AND_SRC2_B32: corrected typo in mnemo
- DS_WRAP_RTN_F32 replaced with DS_WRAP_RTN_B32
- added CI/VI opcodes:
  - DS_CONDXCHG32_RTN_B64
  - DS_GWS_SEMA_RELEASE_ALL
- added VI opcodes:
  - DS_CONSUME
  - DS_APPEND
  - DS_ORDERED_COUNT

Differential Revision: https://reviews.llvm.org/D31707

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299767 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-07 13:07:13 +00:00
Dmitry Preobrazhensky
194f24401f [AMDGPU][MC] Fix for Bugs 28200, 28202 + LIT tests
Fixed several related issues with VOP3 fp modifiers.

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D30821

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298255 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-20 14:50:35 +00:00
Matt Arsenault
87fd70245a AMDGPU: Add VOP3P instruction format
Add a few instructions that are not VOP3P but are related to packed operations.

Includes hack with dummy operands for the benefit of the assembler

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296368 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 18:49:11 +00:00
Matt Arsenault
1b020b3be5 AMDGPU: Change exp with compr bit printing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 20:37:12 +00:00
Konstantin Zhuravlyov
017228cd76 [AMDGPU] Add target information that is required by tools to metadata
Differential Revision: https://reviews.llvm.org/D28760#fb670e28


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294449 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 14:05:23 +00:00
Matt Arsenault
9bc1383d56 AMDGPU: Change vintrp printing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289664 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 16:36:12 +00:00
Eugene Zelenko
359c877504 [AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289475 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-12 22:23:53 +00:00
Matt Arsenault
8d631491b3 AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.

Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.

The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289306 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-10 00:39:12 +00:00
Matt Arsenault
425b3b69c9 AMDGPU: Change vintrp printing to better match sc
Some of the immediates need to be printed differently
eventually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289291 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-10 00:23:12 +00:00
Matt Arsenault
3acbc32a69 AMDGPU: Change how exp is printed
This is an improvement over a long list of unreadable numbers.
A follow up patch will try to match how sc formats these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288697 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-05 20:31:49 +00:00
Matt Arsenault
856f36957c AMDGPU: Fix formatting of 1/2pi immediate
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286912 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 00:04:33 +00:00
Artem Tamazov
86d93952ed [AMDGPU][MC][gfx8] Support 20-bit immediate offset in SMEM instructions.
Fixes Bug 30808.
Note that passing subtarget information to predicates seems too complicated, so a gfx8-specific def, smrd_offset_20, is introduced.
Old gfx6/7-specific def renamed to smrd_offset_8 for clarity.
Lit tests updated.

Differential Revision: https://reviews.llvm.org/D26085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285590 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-31 16:07:39 +00:00
Matt Arsenault
ac5efca3f0 AMDGPU: Use 1/2pi inline imm on VI
I'm guessing at how it is supposed to be printed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285490 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-29 04:05:06 +00:00
Krzysztof Parzyszek
735fbf86f3 [AMDGPU] Stop using MCRegisterClass::getSize()
Differential Review: https://reviews.llvm.org/D24675


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284619 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 17:40:36 +00:00
Matt Arsenault
7e2ade4213 AMDGPU: Add instruction definitions for VGPR indexing
VI added a second method of indexing into VGPRs
besides using v_movrel*

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284027 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 18:00:51 +00:00
Konstantin Zhuravlyov
c7a23a58d8 [AMDGPU] Refactor waitcnt encoding
- Refactor bit packing/unpacking
- Calculate bit mask given bit shift and bit width
- Introduce function for decoding bits of waitcnt
- Introduce function for encoding bits of waitcnt
- Introduce function for getting waitcnt mask (instead of using bare numbers)
- Introduce function for getting max waitcnt(s) (instead of using bare numbers)
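
The helpers listed above could look roughly like the following sketch. The function names and the field layout (vmcnt at shift 0, width 4; expcnt at shift 4, width 3; lgkmcnt at shift 8, width 4) are assumptions made for illustration; the commit itself should be consulted for the actual names and values.

```cpp
#include <cassert>
#include <cstdint>

// Mask covering Width bits starting at bit Shift.
inline uint32_t getBitMask(unsigned Shift, unsigned Width) {
  assert(Width < 32 && Shift + Width <= 32 && "field must fit in 32 bits");
  return ((1u << Width) - 1u) << Shift;
}

// Write the low Width bits of Src into Dst at position Shift.
inline uint32_t packBits(uint32_t Src, uint32_t Dst, unsigned Shift,
                         unsigned Width) {
  const uint32_t Mask = getBitMask(Shift, Width);
  return (Dst & ~Mask) | ((Src << Shift) & Mask);
}

// Read the Width-bit field at position Shift out of Src.
inline uint32_t unpackBits(uint32_t Src, unsigned Shift, unsigned Width) {
  return (Src & getBitMask(Shift, Width)) >> Shift;
}

// Assumed field layout, for illustration only.
enum : unsigned {
  VmcntShift = 0, VmcntWidth = 4,
  ExpcntShift = 4, ExpcntWidth = 3,
  LgkmcntShift = 8, LgkmcntWidth = 4
};

// Combine the three counters into one waitcnt immediate.
inline uint32_t encodeWaitcnt(uint32_t Vmcnt, uint32_t Expcnt,
                              uint32_t Lgkmcnt) {
  uint32_t Waitcnt = 0;
  Waitcnt = packBits(Vmcnt, Waitcnt, VmcntShift, VmcntWidth);
  Waitcnt = packBits(Expcnt, Waitcnt, ExpcntShift, ExpcntWidth);
  Waitcnt = packBits(Lgkmcnt, Waitcnt, LgkmcntShift, LgkmcntWidth);
  return Waitcnt;
}
```

Deriving every mask from a shift and a width keeps the layout in one place, which is what makes a later per-ISA choice of shifts and widths straightforward.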

Differential Revision: https://reviews.llvm.org/D25298


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283919 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 18:58:22 +00:00
Sam Kolton
a7de0c7962 [AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h
Reviewers: artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283560 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-07 14:46:06 +00:00
Konstantin Zhuravlyov
69560a642c [AMDGPU] Choose VMCNT, EXPCNT, LGKMCNT masks and shifts based on the isa version
Differential Revision: https://reviews.llvm.org/D24973


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282877 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 17:01:40 +00:00
Konstantin Zhuravlyov
c87be4a440 [AMDGPU] Enable changing instprinter's behavior based on the per-function
subtarget

This is a prerequisite for coming waitcnt changes

Differential Revision: https://reviews.llvm.org/D24939



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282489 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 14:42:48 +00:00
Sam Kolton
03d317688d [AMDGPU] Assembler: better support for immediate literals in assembler.
Summary:
Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals.
E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.

This change alters how immediate literals are parsed. All literals are now parsed as 64-bit values, either integer or floating-point. We then convert each parsed literal to the correct form based on how it was written (floating-point or binary) and on the operand type the instruction expects (f32/f64 or b32/b64).
Here are the rules for converting literals:
    - We parsed fp literal:
        - Instruction expects 64-bit operand:
            - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
                - then we do nothing with this literal
            - Else if literal is not-inlinable but the instruction requires it to be inlined (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
                - report error
            - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant
                - If instruction expects fp operand type (f64)
                    - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
                        - If so then do nothing
                    - Else (e.g. v_fract_f64 v[0:1], 3.1415)
                        - report warning that low 32 bits will be set to zeroes and precision will be lost
                        - set low 32 bits of literal to zeroes
                - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
                    - report error as it is unclear how to encode this literal
        - Instruction expects 32-bit operand:
            - Convert parsed 64 bit fp literal to 32 bit fp. Allow loss of precision but not overflow or underflow
            - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5)
                - do nothing
                - Else report error
            - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
    - Parsed binary literal:
        - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
            - do nothing
        - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
            - report error
        - Else, literal is not-inlinable and we are not required to inline it
            - Are the high 32 bits of the literal all zeroes or a sign extension of the low 32 bits
                - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
            - Else
                - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)
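
The arithmetic behind the 0.5 example and the "low 32 bits are zeroes" check can be reproduced with a few bit-level helpers. This is a sketch with hypothetical helper names, not the assembler's code; it assumes IEEE-754 doubles and that a 32-bit literal used by an f64 operand supplies the high half of the value with the low half zero, as the summary describes.

```cpp
#include <cstdint>
#include <cstring>

// Raw IEEE-754 bit pattern of a double.
inline uint64_t doubleBits(double Value) {
  uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// High 32 bits of a double: what a 32-bit literal for an f64 operand
// would have to contain.
inline uint32_t high32(double Value) {
  return static_cast<uint32_t>(doubleBits(Value) >> 32);
}

// True if encoding only the high 32 bits loses no precision.
inline bool low32IsZero(double Value) {
  return (doubleBits(Value) & 0xFFFFFFFFull) == 0;
}

// Value an f64 operand sees when Literal is placed in the high half
// and the low half is zero.
inline double f64FromHigh32(uint32_t Literal) {
  const uint64_t Bits = static_cast<uint64_t>(Literal) << 32;
  double Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}
```

Here `high32(0.5)` is 0x3FE00000, while `f64FromHigh32(0x3F000000)` is 2^-15 (3.0517578125E-5), matching the numbers quoted above. `low32IsZero(1.5)` holds, whereas `low32IsZero(3.1415)` does not, which is the case that triggers the precision warning.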

For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types:
'''
enum OperandType {
    OPERAND_REG_IMM32_INT,
    OPERAND_REG_IMM32_FP,
    OPERAND_REG_INLINE_C_INT,
    OPERAND_REG_INLINE_C_FP,
}
'''

This is not working yet:
    - Several tests are failing
    - Problems with predicate methods for inline immediates
    - LLVM generated assembler parts try to select e64 encoding before e32.
More changes are required for several AsmOperands.

Reviewers: vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, artem.tamazov

Differential Revision: https://reviews.llvm.org/D22922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281050 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 14:44:04 +00:00
Valery Pykhtin
9364829511 [AMDGPU] fix failure on printing of non-existing instruction operands.
Differential revision: https://reviews.llvm.org/D23323

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278665 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-15 10:56:48 +00:00
Valery Pykhtin
d27913ee68 Revert "[AMDGPU] fix failure on printing of non-existing instruction operands."
This reverts revision 278333, newly added test failed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278336 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 14:22:05 +00:00
Valery Pykhtin
3876e2e984 [AMDGPU] fix failure on printing of non-existing instruction operands.
Differential revision: https://reviews.llvm.org/D23323

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278333 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 13:49:46 +00:00
Matt Arsenault
1a8f788ef1 AMDGPU: Remove unnecessary string usage in AsmPrinter
Registers are printed a lot, so don't create temporary
std::strings. Using char instead of a string to an ostream
saves a function call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274581 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-05 22:06:56 +00:00
Sam Kolton
bc0e31263e [AMDGPU] AsmParser: Support for sext() modifier in SDWA. Some code cleaning in AMDGPUOperand.
Summary:
The sext() modifier is supported in SDWA instructions only for integer operands. The spec is unclear on whether integer operands should support abs and neg modifiers together with sext; for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleanup in the AMDGPUOperand class: organize methods into groups (render-, predicate-methods...).

Reviewers: vpykhtin, artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D20968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272384 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 09:57:59 +00:00