1285 Commits

Author SHA1 Message Date
Matt Arsenault
6939475a93 AMDGPU: Cleanup immediate folding code
Move code down to use, reorder to avoid hard to follow
immediate folding logic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 21:51:07 +00:00
Matt Arsenault
ef97727654 AMDGPU: Fix debug printing
The uint8_t was printed as a char which didn't really work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287817 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 21:51:05 +00:00
Matt Arsenault
057bbbe4ae AMDGPU: Fix not setting kill flag on temp reg when spilling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 21:00:12 +00:00
Matt Arsenault
e834ce5976 AMDGPU: Fix adding extra implicit def of register
In the scalar case, there's no reason to add an additional
def of the same register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287807 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 21:00:10 +00:00
Matt Arsenault
79d4f8b8b1 AMDGPU: Fix MMO when splitting spill
The size and offset were wrong. The size of the object was
being used for the size of the access, when here it is really
being split into 4-byte accesses. The underlying object size
is set in the MachinePointerInfo, which also didn't have the
offset set.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287806 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 20:52:53 +00:00
Stanislav Mekhanoshin
64620b1c31 [AMDGPU] Fix multiple vreg definitions in si-lower-control-flow
Differential Revision: https://reviews.llvm.org/D26939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 01:42:34 +00:00
Daniel Sanders
b313c742da Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86.
Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
  emission of instructions that don't satisfy their predicates. One deliberate
  use is the SYNC instruction where the version with an operand is correctly
  defined as requiring MIPS32 while the version without an operand is defined
  as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
  MCCodeEmitter infrastructure.

Patches for ARM and Mips will follow.

Depends on D25617

Reviewers: tstellarAMD, jmolloy

Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits

Differential Revision: https://reviews.llvm.org/D25618

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287439 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-19 13:05:44 +00:00
Konstantin Zhuravlyov
5527d64b74 [AMDGPU] Change frexp.exp intrinsic to return i16 for f16 input
Differential Revision: https://reviews.llvm.org/D26862


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287389 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 22:31:08 +00:00
Matt Arsenault
13892fc867 AMDGPU: Fix unused variable warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287362 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 18:33:36 +00:00
Tom Stellard
a006842d47 AMDGPU/SI: Remove zero_extend patterns for i16 ops selected to 32-bit insts
Summary:
The 32-bit instructions don't zero the high 16-bits like the 16-bit
instructions do.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26828

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 13:53:34 +00:00
Nicolai Haehnle
1e27a618c6 AMDGPU: Fix legalization of MUBUF instructions in shaders
Summary:
The addr64-based legalization is incorrect for MUBUF instructions with idxen
set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions.  This affects
e.g.  shaders that access buffer textures.

Since we never actually need the addr64-legalization in shaders, this patch
takes the easy route and keys off the calling convention.  If this ever
affects (non-OpenGL) compute, the type of legalization needs to be chosen
based on some TSFlag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D26747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287339 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 11:55:52 +00:00
Simon Pilgrim
9f23214cb5 Fix spelling mistakes in AMDGPU target comments. NFC.
Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287333 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 11:04:02 +00:00
Matt Arsenault
94dac3bd7b AMDGPU: Move redundant setting of inst properties
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287311 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 04:42:59 +00:00
Matt Arsenault
cbafc5829d AMDGPU: Fix crash on illegal type for inlineasm
There are still crashes on non-MVT types in other
places.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287310 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 04:42:57 +00:00
Konstantin Zhuravlyov
1d609512ed Revert "AMDGPU: Enable ConstrainCopy DAG mutation"
This reverts commit r287146.

This breaks few conformance tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287233 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-17 16:41:49 +00:00
Konstantin Zhuravlyov
0c92298282 [AMDGPU] Custom lower f16 = fp_round f64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-17 04:28:37 +00:00
Konstantin Zhuravlyov
8fd8772fcd [AMDGPU] Promote f16/i16 conversions to f32/i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287201 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-17 04:00:46 +00:00
Konstantin Zhuravlyov
54556b5c81 [AMDGPU] Expand br_cc for f16
Differential Revision: https://reviews.llvm.org/D26732


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287199 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-17 03:49:01 +00:00
Matt Arsenault
4fbd908949 AMDGPU: Enable ConstrainCopy DAG mutation
This fixes a probably unintended divergence from the default
scheduler behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287146 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 20:35:23 +00:00
Tom Stellard
ae5ecee3ec AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass
Summary:
1. Don't try to copy values to and from the same register class.
2. Replace copies with of registers with immediate values with v_mov/s_mov
   instructions.

The main purpose of this change is to make MachineSink do a better job of
determining when it is beneficial to split a critical edge, since the pass
assumes that copies will become move instructions.

This prevents a regression in uniform-cfg.ll if we enable critical edge
splitting for AMDGPU.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287131 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 18:42:17 +00:00
Konstantin Zhuravlyov
925ac794ff [AMDGPU] Refactor v_mac_{f16, f32} patterns into a class NFC
Differential Revision: https://reviews.llvm.org/D26711


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287077 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 03:39:12 +00:00
Konstantin Zhuravlyov
7f4d4fdf3f [AMDGPU] Handle f16 select{_cc}
- Select `select` to `v_cndmask_b32`
- Expand `select_cc`
- Refactor patterns

Differential Revision: https://reviews.llvm.org/D26714


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 03:16:26 +00:00
Jan Vesely
ea73d16588 AMDGPU/GCN: Exit early in hazard recognizer if there is no vreg argument
wbinvl.* are vector instruction that do not sue vector registers.

v2: check only M?BUF instructions

Differential Revision: https://reviews.llvm.org/D26633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287056 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 23:55:15 +00:00
Tom Stellard
1d353ca419 AMDGPU/SI: Fix pattern for i16 = sign_extend i1
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26670

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287035 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 21:25:56 +00:00
Matt Arsenault
7036e8dad1 AMDGPU: Enable store clustering
Also respect the TII hook for these like the generic code does
in case we want a flag later to disable this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287021 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 20:22:55 +00:00
Matt Arsenault
339cf19d8c AMDGPU: Analyze mubuf with immediate soffset
Fixes giving up on clustering common addr64 accesses with
constant 0 soffset.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287018 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 20:14:27 +00:00
Matt Arsenault
123bf00b81 AMDGPU: Fix return after else
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287015 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 19:58:54 +00:00
Matt Arsenault
7946b1057f AMDGPU: Replace assert(false) with unreachable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287013 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 19:34:37 +00:00
Stanislav Mekhanoshin
79f84ef39d [AMDGPU] Add wave barrier builtin
The wave barrier represents the discardable barrier. Its main purpose is to
carry convergent attribute, thus preventing illegal CFG optimizations. All lanes
in a wave come to convergence point simultaneously with SIMT, thus no special
instruction is needed in the ISA. The barrier is discarded during code generation.

Differential Revision: https://reviews.llvm.org/D26585

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287007 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 19:00:15 +00:00
Sam Kolton
718ab76006 [AMDGPU] TableGen: change individual instruction flags to bit type from bits<1>
Summary: This is needed to be able to use this flags in InstrMappings.

Reviewers: tstellarAMD, vpykhtin

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D26666

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286960 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 13:39:07 +00:00
Matt Arsenault
9de96caccf AMDGPU: Fix f16 fabs/fneg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286931 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 02:25:28 +00:00
Matt Arsenault
f32cff39c1 AMDGPU: Set hasExtraSrcRegAllocReq on v_div_scale_*
This doesn't solve any problems I know about, but this should have
more conservative assumptions about the operands'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286913 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 00:05:42 +00:00
Matt Arsenault
856f36957c AMDGPU: Fix formatting of 1/2pi immediate
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286912 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-15 00:04:33 +00:00
Changpeng Fang
af76145600 AMDGPU/SI: Support data types other than V4f32 in image intrinsics
Summary:
  Extend image intrinsics to support data types of V1F32 and V2F32.

  TODO: we should define a mapping table to change the opcode for data type of V2F32 but just one channel is active,
  even though such case should be very rare.

Reviewers:
  tstellarAMD

Differential Revision:
  http://reviews.llvm.org/D26472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286860 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-14 18:33:18 +00:00
Matt Arsenault
4404d0d6e3 AMDGPU: Implement SGPR spilling with scalar stores
nThis avoids the nasty problems caused by using
memory instructions that read the exec mask while
spilling / restoring registers used for control flow
masking, but only for VI when these were added.

This always uses the scalar stores when enabled currently,
but it may be better to still try to spill to a VGPR
and use this on the fallback memory path.

The cache also needs to be flushed before wave termination
if a scalar store is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286766 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-13 18:20:54 +00:00
Konstantin Zhuravlyov
9027123253 [AMDGPU] Add f16 support (VI+)
Differential Revision: https://reviews.llvm.org/D25975


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286753 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-13 07:01:11 +00:00
Tom Stellard
da5a5c79fa AMDGPU/SI: Promote i16 = fp_[us]int f32 for VI
Summary: This fixes a regression caused by r286464.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286687 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-12 00:19:11 +00:00
Tom Stellard
6687aabcf5 AMDGPU/SI: Fix visit order assumption in SIFixSGPRCopies
Summary:
This pass was assuming that when a PHI instruction defined a register
used by another PHI instruction that the defining insstruction would
be legalized before the using instruction.

This assumption was causing the pass to not legalize some PHI nodes
within divergent flow-control.

This fixes a bug that was uncovered by r285762.

Reviewers: nhaehnle, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D26303

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286676 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-11 23:35:42 +00:00
Sam Kolton
45849f5ed4 [AMDGPU] TargetStreamer: Fix .note section name
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286591 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-11 13:41:52 +00:00
Yaxun Liu
81a2553927 AMDGPU: Attempt to fix build failure on x86-64 selfhost build
Remove redundant include file.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286552 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-11 02:48:50 +00:00
Stanislav Mekhanoshin
a0c045c407 Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies"
This reverts commit r286171, it breaks piglit test fs-discard-exit-2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286530 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-11 00:22:34 +00:00
Joerg Sonnenberger
0e71d50f7e Fix requirements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286527 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-10 23:53:45 +00:00
Evandro Menezes
3f647d62d1 [DAG Combiner] Fix the native computation of the Newton series for reciprocals
The generic infrastructure to compute the Newton series for reciprocal and
reciprocal square root was conceived to allow a target to compute the series
itself.  However, the original code did not properly consider this condition
if returned by a target.  This patch addresses the issues to allow a target
to compute the series on its own.

Differential revision: https://reviews.llvm.org/D22975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286523 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-10 23:31:06 +00:00
Yaxun Liu
a2ee7d2991 AMDGPU: Emit runtime metadata as a note element in .note section
Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata.

However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html).

This patch lets AMDGPU backend emits runtime metadata as a note element in .note section.

Differential Revision: https://reviews.llvm.org/D25781


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286502 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-10 21:18:49 +00:00
Tom Stellard
0deee390af AMDGPU: Add VI i16 support
Patch By: Wei Ding

Differential Revision: https://reviews.llvm.org/D18049

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286464 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-10 16:02:37 +00:00
Stanislav Mekhanoshin
c312996e7a [AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.

With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a forward propagation of a v_cmp 64 bit result
to an user is implemented. Additional side effect of this is that we may
consume less VGPRs in a cost of more SGPRs in case if holding of multiple
conditions is needed, and that is a clear win in most cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286171 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 23:04:50 +00:00
Matt Arsenault
f577de357a AMDGPU: Remove unnecessary and on conditional branch
The comment explaining why this was necessary is incorrect
in its description of v_cmp's behavior for inactive workitems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286134 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 19:09:33 +00:00
Matt Arsenault
e5fd9c09ad AMDGPU: Preserve vcc undef flags when inverting branch
If the branch was on a read-undef of vcc, passes that used
analyzeBranch to invert the branch condition wouldn't preserve
the undef flag resulting in a verifier error.

Fixes verifier failures in a future commit.

Also fix verifier error when inserting copy for vccz
corruption bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286133 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 19:09:27 +00:00
Matt Arsenault
72dab34b83 AMDGPU: Try to fix (non-clang?) bot builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286120 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 16:52:50 +00:00
Matt Arsenault
c6533e305b AMDGPU: Refactor copyPhysReg
Separate the subregister splitting logic to re-use later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286118 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-07 16:39:22 +00:00