74 Commits

Author SHA1 Message Date
Tim Renouf
0b9f636469 [AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers
This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.

This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.

To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.

Differential Revision: https://reviews.llvm.org/D59191

Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356398 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-18 19:25:39 +00:00
Stanislav Mekhanoshin
a4638a6a3a [AMDGPU] Silence gcc 7 warnings
Differential Revision: https://reviews.llvm.org/D59330

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356100 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-13 21:15:52 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Alexander Timofeev
bf5049eaf1 [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Detailed description: SIFoldOperands::foldInstOperand iterates over the
operand uses calling the function that changes def-use iteratorson the
way. As a result loop exits immediately when def-use iterator is
changed. Hence, the operand is folded to the very first use instruction
only. This makes VGPR live along the whole basic block and increases
register pressure significantly. The performance drop observed in SHOC
DeviceMemory test is caused by this bug.

Proposed fix: collect uses to separate container for further processing
in another loop.

Testing: make check-llvm
SHOC performance test.

Reviewers: rampitec, ronlieb

Differential Revision: https://reviews.llvm.org/D56161

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350350 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-03 19:55:32 +00:00
Stanislav Mekhanoshin
45657b2c3b [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343249 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 18:55:20 +00:00
Alexander Timofeev
c851af3c8a [AMDGPU] Preliminary patch for divergence driven instruction selection. Operands Folding 1.
Reviewers: rampitec

Differential revision: https://reviews/llvm/org/D51316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341068 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-30 13:55:04 +00:00
Fangrui Song
18fbbd8771 [AMDGPU] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340868 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 19:19:03 +00:00
Matt Arsenault
0e21fb6c46 AMDGPU: Force shrinking of add/sub even if the carry is used
The original motivating example uses a 64-bit add, so the carry
is used. Insert a copy from VCC. This may allow shrinking of
the used carry instruction. At worst, we are replacing a
mov to materialize the constant with a copy of vcc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340862 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 18:44:16 +00:00
Matt Arsenault
ac00e8c95b AMDGPU: Shrink insts to fold immediates
This needs to be done in the SSA fold operands
pass to be effective, so there is a bit of overlap
with SIShrinkInstructions but I don't think this
is practically avoidable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340859 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 18:34:24 +00:00
Matt Arsenault
10398322af AMDGPU: Check NSZ MI flag when folding omod
I'm not sure the exact nsz flag combination that
is OK. I think as long as it's on either, this is OK.
For now just check it on the omod multiply.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339513 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-12 08:44:25 +00:00
Matt Arsenault
273374717e AMDGPU: Fold v_lshl_or_b32 with 0 src0
Appears from expansion of some packed cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339025 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-06 15:40:20 +00:00
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Tom Stellard
cba2181e77 AMDGPU: Separate R600 and GCN TableGen files
Summary:
We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc.  This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself.  This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.

Reviewers: arsenm, nhaehnle, jvesely

Reviewed By: arsenm

Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46365

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335942 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 23:47:12 +00:00
Tom Stellard
f02d6fd47c AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332930 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-22 02:03:23 +00:00
Nicola Zaghen
0818e789cb Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-14 12:53:11 +00:00
Matt Arsenault
ac9b3ef76a AMDGPU: Add Vega12 and Vega20
Changes by
  Matt Arsenault
  Konstantin Zhuravlyov

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331215 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-30 19:08:16 +00:00
Stanislav Mekhanoshin
dbe20f7ece [AMDGPU] Truncate packed inline constant
If a packed inline constant is sign extended it must be truncated
after the shift. I.e. a constant (0xH0000, 0xHBC00), will be represented
as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended
to 64 bit. After the value shifted right by 16 to use it in a low part
with op_sel_hi it becomes 0xFFFFFFFFBC00 and does not qualify as inline
constant any longer.

Fixed the error and added verification code. Without the fix and with
the verification bug is causing pk_max_f16_literal.ll to fail.

Differential Revision: https://reviews.llvm.org/D45987

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330752 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-24 18:17:55 +00:00
Stanislav Mekhanoshin
8aac337938 [AMDGPU] Use packed literals with zero either lower or hi part
Differential Revision: https://reviews.llvm.org/D45790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330365 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-19 21:16:50 +00:00
Stanislav Mekhanoshin
eca9ecf8f9 [AMDGPU] Enabled v2.16 literals for VOP3P
Literal encoding needs op_sel_hi to select low 16 bit in this case.

Differential Revision: https://reviews.llvm.org/D45745

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330230 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-17 23:09:05 +00:00
Matt Arsenault
5c56853ab7 AMDGPU: Fix crash when constant folding with physreg operand
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327209 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-10 16:05:35 +00:00
Matt Arsenault
0bd3aded70 AMDGPU: Don't crash when trying to fold implicit operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324550 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-08 01:12:46 +00:00
Matthias Braun
d318139827 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:58 +00:00
Matthias Braun
fa621d294f Rename LiveIntervalAnalysis.h to LiveIntervals.h
Headers/Implementation files should be named after the class they
declare/define.

Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in
favor of `class LiveIntarvals;`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320546 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 02:51:04 +00:00
Francis Visoiu Mistrih
fd11bc0813 [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320022 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 10:40:31 +00:00
Francis Visoiu Mistrih
7384652668 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319427 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 12:12:19 +00:00
Francis Visoiu Mistrih
a4ec08b6fd [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319187 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 17:15:09 +00:00
Vitaly Buka
eec5b16c88 Remove unused variables
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315847 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-15 05:35:02 +00:00
Matt Arsenault
ac8ec29c42 AMDGPU: Add comment about clamps
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314952 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-05 00:13:20 +00:00
Matt Arsenault
5eacfa4990 AMDGPU: Do not fold clamp instructions when sources are different
Patch by hakzsam (Samuel Pitoiset)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314951 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-05 00:13:17 +00:00
Matt Arsenault
820b8a54fb AMDGPU: Start selecting v_mad_mixhi_f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:01:24 +00:00
Matt Arsenault
fcd77e8a04 AMDGPU: Fold clamp modifier for packed instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312297 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 23:53:50 +00:00
Nicolai Haehnle
78554e8137 AMDGPU: Fix crash when folding immediates into multiple uses
Summary:
When an immediate is folded by constant folding, we re-scan the entire
use list for two reasons:

1. The constant folding may have created a new use of the same reg.
2. The constant folding may have removed an additional use in the list
   we're currently traversing (e.g., constant folding an S_ADD_I32 c, c).

However, this could previously lead to a crash when an unrelated use was
added twice into the FoldList. Since we re-scan the whole list anyway, we
might as well just clear the FoldList again before we do so.

Using a MIR test to show this because real code seems to trigger the issue
only in connection with some really subtle control flow structures.

Fixes GL45-CTS.shading_language_420pack.binding_images on gfx9.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D35416

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308314 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-18 14:54:41 +00:00
Simon Pilgrim
26aa51226a [AMDGPU] Fix -Wimplicit-fallthrough warnings. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307381 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-07 10:18:57 +00:00
Matt Arsenault
5516bd1387 AMDGPU: Do operand folding in program order
Before it was possible to partially fold use instructions
before the defs. After the xor is folded into a copy, the same
mov can end up in the fold list twice, so on the second attempt
it will fail expecting to see a register to fold.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305821 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 18:56:32 +00:00
Matt Arsenault
73854fd751 AMDGPU: Preserve undef when folding register operands
If the source was a copy of an undef register, this would
produce a read of an undefined register which is a verifier
error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 18:41:31 +00:00
Matt Arsenault
4cabef582f AMDGPU: Fix crash with undef vreg input operand
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-20 18:28:02 +00:00
Stanislav Mekhanoshin
ca0adcb320 [AMDGPU] Fix SIFoldOperands crash with clamp
Fixes bug #33302. Pass did not account that Src1 of max instruction
can be an immediate.

Differential Revision: https://reviews.llvm.org/D33884

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304696 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-05 01:03:04 +00:00
Stanislav Mekhanoshin
ced381c038 [AMDGPU] Preserve operand order in SIFoldOperands
SIFoldOperands can commute operands even if no folding was done.
This change is to preserve IR is no folding was done.

Differential Revision: https://reviews.llvm.org/D33802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304625 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-03 00:41:52 +00:00
Stanislav Mekhanoshin
f0c3d71794 [AMDGPU] Allow SDWA in instructions with immediates and SGPRs
An encoding does not allow to use SDWA in an instruction with
scalar operands, either literals or SGPRs. That is however possible
to copy these operands into a VGPR first.

Several copies of the value are produced if multiple SDWA conversions
were done. To cleanup MachineLICM (to hoist copies out of loops),
MachineCSE (to remove duplicate copies) and SIFoldOperands (to replace
SGPR to VGPR copy with immediate copy right to the VGPR) runs are added
after the SDWA pass.

Differential Revision: https://reviews.llvm.org/D33583

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304219 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 16:49:24 +00:00
Sam Kolton
ff45a18189 [AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns
Previously compiler often extracted common immediates into specific register, e.g.:
```
%vreg0 = S_MOV_B32 0xff;
%vreg2 = V_AND_B32_e32 %vreg0, %vreg1
%vreg4 = V_AND_B32_e32 %vreg0, %vreg3
```
Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands:
```
SDWA src: %vreg2 src_sel:BYTE_0
SDWA src: %vreg4 src_sel:BYTE_0
```
With this change peephole check if operand is either immediate or register that is copy of immediate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299202 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-31 11:42:43 +00:00
Stanislav Mekhanoshin
a332f465b2 [AMDGPU] Fold V_CNDMASK with identical source operands
Such instructions sometimes appear after lowering and folding.

Differential Revision: https://reviews.llvm.org/D31318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298723 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-24 18:55:20 +00:00
Matt Arsenault
27f4f2f4bc AMDGPU: Support v2i16/v2f16 packed operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 22:15:25 +00:00
Matt Arsenault
dd23defd5c AMDGPU: Fold omod into instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296372 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 19:35:42 +00:00
Matt Arsenault
cd39b42cab AMDGPU: Use clamp with f64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295908 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:53:37 +00:00
Matt Arsenault
e184e01dd7 AMDGPU: Fold FP clamp as modifier bit
The manual is unclear on the details of this. It's not
clear to me if denormals are not allowed with clamp,
or if that is only omod. Not allowing denorms for
fp16 or fp64 isn't useful so I also question if that
is really a restriction. Same with whether this is valid
without IEEE mode enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:27:53 +00:00
Matt Arsenault
c6b1aed80d AMDGPU: Fix folding immediates into mac src2
Whether it is legal or not needs to check for the instruction
it will be replaced with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291711 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:00:02 +00:00
Matt Arsenault
1639229587 AMDGPU: Constant fold when immediate is materialized
In future commits these patterns will appear after moveToVALU changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291615 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 23:32:04 +00:00
Matt Arsenault
8d631491b3 AMDGPU: Fix handling of 16-bit immediates
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.

Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.

The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289306 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-10 00:39:12 +00:00
Tom Stellard
c53f76cc0b AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.
Patch By: Wei Ding

Summary: This patch fixes the fdiv precision issues.

Reviewers: b-sumner, cfang, wdng, arsenm

Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D26424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288879 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 02:42:15 +00:00
Matt Arsenault
a333ef5119 AMDGPU: Refactor immediate folding logic
Change the logic for when to fold immediates to
consider the destination operand rather than the
source of the materializing mov instruction.

No change yet, but this will allow for correctly handling
i16/f16 operands. Since 32-bit moves are used to materialize
constants for these, the same bitvalue will not be in the
register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288184 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-29 19:20:42 +00:00