30 Commits

Author SHA1 Message Date
Valery Pykhtin
5d11fd54b0 [AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions
Differential revision: https://reviews.llvm.org/D24738

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282234 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-23 09:08:07 +00:00
Valery Pykhtin
91016854e7 [AMDGPU] Refactor VOP3 instruction TD definitions
Differential revision: https://reviews.llvm.org/D24664

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281965 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-20 10:41:16 +00:00
Valery Pykhtin
c220fde748 [AMDGPU] Refactor MUBUF/MTBUF instructions
Differential revision: https://reviews.llvm.org/D24295

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281137 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-10 13:09:16 +00:00
Wei Ding
8dae05acf4 AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type.
Differential Revision: http://reviews.llvm.org/D23700

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281081 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 19:31:51 +00:00
Valery Pykhtin
b01d8d2aaa [AMDGPU] Refactor FLAT TD instructions
Differential revision: https://reviews.llvm.org/D24072

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280655 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-05 11:22:51 +00:00
Valery Pykhtin
81276830ee [AMDGPU] Scalar Memory instructions TD refactoring
Differential revision: https://reviews.llvm.org/D23996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280349 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 09:56:47 +00:00
Wei Ding
443f72b62d AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type.
Differential Revision: http://reviews.llvm.org/D23689

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279126 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-18 19:51:14 +00:00
Wei Ding
9bcebab62b AMDGPU : Add LLVM intrinsics for SAD related instructions.
Differential Revision: http://reviews.llvm.org/D23133

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278354 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 16:33:53 +00:00
Valery Pykhtin
1704eb6864 [AMDGPU] refactor DS instruction definitions. NFC.
Differential revision: https://reviews.llvm.org/D22522

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277344 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-01 14:21:30 +00:00
Matt Arsenault
4080a06a24 AMDGPU: Fix flat atomics
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-09 23:42:54 +00:00
Matt Arsenault
14cb586d5e AMDGPU: Add fract intrinsic
Remove broken patterns matching it. This was matching the
unsafe math pattern and expanding the fix for the buggy instruction
from the pattern. The problems are also on CI. Remove the workarounds
and only use fract with unsafe math or from the intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271078 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-28 00:19:52 +00:00
Matt Arsenault
d8f221e6c0 AMDGPU: Implement i64 global atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266075 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 14:05:11 +00:00
Matt Arsenault
bc0aee542f AMDGPU: Add atomic_inc + atomic_dec intrinsics
These are different than atomicrmw add 1 because they have
an additional input value to clamp the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 14:05:04 +00:00
Jan Vesely
62aa62a6e9 AMDGPU/SI: Implement atomic load/store for i32 and i64
Standard load/store instructions with GLC bit set.

Reviewers: tstellardAMD, arsenm

Differential Revision: http://reviews.llvm.org/D18760

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265709 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-07 19:23:11 +00:00
Tom Stellard
059753cf8e AMDGPU: Implement {BUFFER,FLAT}_ATOMIC_CMPSWAP{,_X2}
Summary:
Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.

32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.

Patch by: Vedran Miletić

Reviewers: arsenm, tstellarAMD, nhaehnle

Subscribers: jvesely, scchan, kanarayan, arsenm

Differential Revision: http://reviews.llvm.org/D17280

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265170 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-01 18:27:37 +00:00
Tom Stellard
73fb824626 [AMDGPU] Disassembler: Added basic disassembler for AMDGPU target
Changes:

- Added disassembler project
- Fixed all decoding conflicts in .td files
- Added DecoderMethod=“NONE” option to Target.td that allows to
  disable decoder generation for an instruction.
- Created decoding functions for VS_32 and VReg_32 register classes.
- Added stubs for decoding all register classes.
- Added several tests for disassembler

Disassembler only supports:

- VI subtarget
- VOP1 instruction encoding
- 32-bit register operands and inline constants

[Valery]

One of the point that requires to pay attention to is how decoder
conflicts were resolved:

- Groups of target instructions were separated by using different
  DecoderNamespace (SICI, VI, CI) using similar to AssemblerPredicate
  approach.

- There were conflicts in IMAGE_<> instructions caused by two
  different reasons:

1. dmask wasn’t specified for the output (fixed)
2. There are image instructions that differ only by the number of
   the address components but have the same encoding by the HW spec. The
   actual number of address components is determined by the HW at runtime
   using image resource descriptor starting from the VGPR encoded in an
   IMAGE instruction. This means that we should choose only one instruction
   from conflicting group to be the rule for decoder. I didn’t find the way
   to disable decoder generation for an arbitrary instruction and therefore
   made a onelinear fix to tablegen generator that would suppress decoder
   generation when DecoderMethod is set to “NONE”. This is a change that
   should be reviewed and submitted first. Otherwise I would need to
   specify different DecoderNamespace for every instruction in the
   conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not
   used in other targets.
3. IMAGE_GATHER decoder generation is for now disabled and to be
   done later.

[/Valery]

Patch By: Sam Kolton

Differential Revision: http://reviews.llvm.org/D16723

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261185 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-18 03:42:32 +00:00
Tom Stellard
abf168408a [AMDGPU] Assembler: Swap operands of flat_store instructions to match AMD assembler
Historically, AMD internal sp3 assembler has flat_store* addr, data
format. To match existing code and to enable reuse, change LLVM
definitions to match.  Also update MC and CodeGen tests.

Differential Revision: http://reviews.llvm.org/D16927

Patch by: Nikolay Haustov

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260694 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 17:57:54 +00:00
Matt Arsenault
60a64d9460 AMDGPU: Tidy minor td file issues
Make comments and indentation more consistent.

Rearrange a few things to be in a more consistent order,
such as organizing subtarget features from those describing
an actual device property, and those used as options.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258789 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 04:49:22 +00:00
Tom Stellard
46850ab55f AMDGPU/SI: Consolidate FLAT patterns
Summary:
We had to sets of identical FLAT patterns one inside the
HasFlatAddressSpace predicate and one inside the useFlatForGloabl
predicate.  This patch merges these sets into a single pattern
under the isCIVI predicate.

The reason we can remove the predicates is that when MUBUF instructions
are legal, the instruction selector will prefer selecting those over
FLAT instructions because MUBUF patterns have a higher complexity score.
So, in this case having patterns for FLAT instructions will have no effect.

This change also simplifies the process for forcing global address space
loads to use FLAT instructions, since we no only have to disable the
MUBUF patterns instead of having to disable the MUBUF patterns and
enable the FLAT patterns.

Reviewers: arsenm, cfang

Subscribers: llvm-commits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256807 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 02:26:37 +00:00
Tom Stellard
c5273ddfec AMDGPU/SI: Move VI SMEM pattern back into VIInstructions.td
Summary: This was accidently moved to CIInstructions.td in r256282

Reviewers: cfang, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15763

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256775 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 20:23:10 +00:00
Tom Stellard
abdebe3a07 AMDGPU/SI: Fix encoding of flat instructions on VI
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15735

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256360 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 03:18:18 +00:00
Tom Stellard
3c0fc4c6bf AMDGPU/SI: Remove non-existent flat instructions
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 02:41:55 +00:00
Changpeng Fang
89e60598f6 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-commit by fixing a failure in Codegen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 20:55:23 +00:00
Rafael Espindola
a00544a653 Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256275 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:46:44 +00:00
Changpeng Fang
808f9643e6 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason doing executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256273 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:32:28 +00:00
Matt Arsenault
d0edb1f758 AMDGPU: Add s_dcache_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:27 +00:00
Matt Arsenault
1348e9d04d AMDGPU: Add cache invalidation instructions.
These are necessary for implementing mem_fence for
OpenCL 2.0.

The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:21 +00:00
Matt Arsenault
e48caeb48f AMDGPU: Improve accuracy of instruction rates for some FP instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245774 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-22 00:50:41 +00:00
Matt Arsenault
ffd72ef643 AMDGPU: Move CI instructions into CIInstructions.td
There are still a couple of CI patterns left in SIInstructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245767 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-22 00:16:34 +00:00
Tom Stellard
953c681473 R600 -> AMDGPU rename
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 03:28:10 +00:00