780 Commits

Author SHA1 Message Date
Hans Wennborg
fa13347f23 Merging r295990:
------------------------------------------------------------------------
r295990 | jvesely | 2017-02-23 08:12:21 -0800 (Thu, 23 Feb 2017) | 5 lines

AMDGPU/SI: Fix trunc i16 pattern

Hit on ASICs that support 16bit instructions.

Differential Revision: https://reviews.llvm.org/D30281
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@296158 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 18:56:41 +00:00
Hans Wennborg
18b90cd722 Merging r293635:
------------------------------------------------------------------------
r293635 | nha | 2017-01-31 06:35:37 -0800 (Tue, 31 Jan 2017) | 16 lines

[DAGCombine] require UnsafeFPMath for re-association of addition

Summary:
The affected transforms all implicitly use associativity of addition,
for which we usually require unsafe math to be enabled.

The "Aggressive" flag is only meant to convey information about the
performance of the fused ops relative to a fmul+fadd sequence.

Fixes Bug 31626.

Reviewers: spatel, hfinkel, mehdi_amini, arsenm, tstellarAMD

Subscribers: jholewinski, nemanjai, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D28675
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-02 21:34:25 +00:00
Matt Arsenault
974695d103 Merging r293310:
------------------------------------------------------------------------
r293310 | arsenm | 2017-01-27 09:42:26 -0800 (Fri, 27 Jan 2017) | 8 lines

AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands

Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293329 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 20:21:31 +00:00
Matt Arsenault
e536396218 Merging r292982:
------------------------------------------------------------------------
r292982 | arsenm | 2017-01-24 14:02:15 -0800 (Tue, 24 Jan 2017) | 8 lines

Enable FeatureFlatForGlobal on Volcanic Islands

This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293326 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 19:26:48 +00:00
Matt Arsenault
cb3fa250a4 Merging r292473:
------------------------------------------------------------------------
r292473 | arsenm | 2017-01-18 22:35:27 -0800 (Wed, 18 Jan 2017) | 9 lines

AMDGPU: Disable some fneg combines unless nsz

For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.

fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293319 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 18:39:19 +00:00
Matt Arsenault
802910f034 Merging r292472:
------------------------------------------------------------------------
r292472 | arsenm | 2017-01-18 22:04:12 -0800 (Wed, 18 Jan 2017) | 5 lines

AMDGPU: Remove modifiers from v_div_scale_*

They seem to produce nonsense results when used.

This should be applied to the release branch.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293317 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 18:31:33 +00:00
Hans Wennborg
4fba04fd96 Merging r292651:
------------------------------------------------------------------------
r292651 | jvesely | 2017-01-20 13:24:26 -0800 (Fri, 20 Jan 2017) | 8 lines

AMDGPU/R600: Serialize vector trunc stores to private AS

Add DUMMY_CHAIN SDNode to denote stores of interest

Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=28915
Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=30411

Differential Revision: https://reviews.llvm.org/D27964
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293118 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-26 00:26:36 +00:00
Matt Arsenault
cd002582ba AMDGPU: Skip fneg/select combine if it can fold into other
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291792 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:58:15 +00:00
Matt Arsenault
9db1ec3d4d AMDGPU: Fold free fneg into sin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291790 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:48:09 +00:00
Matt Arsenault
49dd8fcb21 AMDGPU: Fold fneg into fmul_legacy
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291784 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:26:30 +00:00
Matt Arsenault
bd870734a5 AMDGPU: Fold fneg into rcp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291779 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 17:46:35 +00:00
Matt Arsenault
cca494fd03 AMDGPU: Fold fneg into fp_round
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291778 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 17:46:33 +00:00
Matt Arsenault
e652041f69 AMDGPU: Fold fneg into fp_extend
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291777 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 17:46:28 +00:00
Matt Arsenault
94bf68d551 AMDGPU: Fold fneg into fma or fmad
Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291733 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 00:32:16 +00:00
Matt Arsenault
ef33822be5 AMDGPU: Fold fneg into fmul
Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 00:23:20 +00:00
Matt Arsenault
bcf34bbbdd AMDGPU: Fold fneg into fadd
Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291731 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 00:09:34 +00:00
Matt Arsenault
8694e2f853 AMDGPU: Pull fneg/fabs out of a select
Allows better source modifier usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291729 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 23:57:38 +00:00
Matt Arsenault
f1e95d3604 AMDGPU: Fix shrinking of addc/subb.
To shrink to VOP2 the input carry must also be VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291720 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:58:12 +00:00
Matt Arsenault
fac51240d9 AMDGPU: Fix sext_inreg for i1 in i16
This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:35:22 +00:00
Matt Arsenault
8c7e9845cf AMDGPU: Fix breaking VOP3 v_add_i32s
This was shrinking the instruction even though the carry output
register was a virtual register, not known VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:35:17 +00:00
Matt Arsenault
c6b1aed80d AMDGPU: Fix folding immediates into mac src2
Whether it is legal or not needs to check for the instruction
it will be replaced with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291711 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 22:00:02 +00:00
Kyle Butt
0aa7497cd7 Revert "CodeGen: Allow small copyable blocks to "break" the CFG."
This reverts commit ada6595a52.

This needs a simple probability check because there are some cases where it is
not profitable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291695 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 19:55:19 +00:00
Matt Arsenault
a40945ed88 DAGCombiner: Add hasOneUse checks to fadd/fma combine
Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291642 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 02:02:12 +00:00
Jan Vesely
53dcfdf89b AMDGPU/EG,CM: Add fp16 conversion instructions
Differential Revision: https://reviews.llvm.org/D28164

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291622 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 00:12:39 +00:00
Matt Arsenault
1639229587 AMDGPU: Constant fold when immediate is materialized
In future commits these patterns will appear after moveToVALU changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291615 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 23:32:04 +00:00
Kyle Butt
ada6595a52 CodeGen: Allow small copyable blocks to "break" the CFG.
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.

Differential revision: https://reviews.llvm.org/D27742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291609 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 23:04:30 +00:00
Matt Arsenault
fa5aafaac2 DAG: Avoid OOB when legalizing vector indexing
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291604 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 22:02:30 +00:00
Matt Arsenault
da59cd0847 AMDGPU: Add tests for HasMultipleConditionRegisters
This was enabled without many specific tests or the comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291586 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 19:08:15 +00:00
Matt Arsenault
f7c0d4013c AMDGPU: Add Assert[SZ]Ext during argument load creation
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.

Fixes code quality regressions vs. SI in min.ll test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291461 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-09 18:52:39 +00:00
Bjorn Pettersson
26c3968163 [SelectionDAG] Fix in legalization of UMAX/SMAX/UMIN/SMIN. Solves PR31486.
Summary:
Originally

 i64 = umax t8, Constant:i64<4>

was expanded into

 i32,i32 = umax Constant:i32<0>, Constant:i32<0>
 i32,i32 = umax t7, Constant:i32<4>

Now instead the two produced umax:es return i32 instead of i32, i32.

Thanks to Jan Vesely for help with the test case.

Patch by mikael.holmen at ericsson.com

Reviewers: bogner, jvesely, tstellarAMD, arsenm

Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D28135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291441 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-09 12:03:50 +00:00
Jan Vesely
0835374acb AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes
This will make transition to SCRATCH_MEMORY easier

Differential Revision: https://reviews.llvm.org/D24746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 21:00:46 +00:00
Konstantin Zhuravlyov
9060577664 [AMDGPU] Do not emit .AMDGPU.config section for amdhsa
Differential Revision: https://reviews.llvm.org/D27732


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291245 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-06 17:02:10 +00:00
Jan Vesely
bf64cb107c AMDGPU/SI: Implement sendmsghalt intrinsic
v2: expose using amdgcn prefix

Differential Revision: https://reviews.llvm.org/D23511

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290977 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-04 18:06:55 +00:00
Matt Arsenault
e2b3286a26 AMDGPU: Invert cmp + select with constant
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.

This seems to usually be better but causes some code size regressions
in some tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290372 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 21:40:08 +00:00
Matt Arsenault
ad47821c65 AMDGPU: Use i16 for i16 shift amount
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290351 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 16:36:25 +00:00
Matt Arsenault
8d973070a0 AMDGPU: Use i16 comparison instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 16:27:11 +00:00
Matt Arsenault
46e5f1c88d AMDGPU: Swap order of operands in fadd/fsub combine
FMA is canonicalized to constant in the middle operand. Do
the same so fmad matches and avoid an extra combine step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290313 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 04:03:40 +00:00
Matt Arsenault
121f8654d3 AMDGPU: Check fast math flags in fadd/fsub combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290312 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 04:03:35 +00:00
Matt Arsenault
ff4096b8f8 AMDGPU: Form more FMAs if fusion is allowed
Extend the existing fadd/fsub->fmad combines to produce
FMA if allowed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290311 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:55:35 +00:00
Matt Arsenault
75c32f5150 AMDGPU: Enable some f32 fadd/fsub combines for f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290308 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:40:39 +00:00
Matt Arsenault
cee1c4614a AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290307 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:21:48 +00:00
Matt Arsenault
998b18c570 AMDGPU: setcc test cleanup
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290306 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:21:45 +00:00
Matt Arsenault
a8dff18ebc AMDGPU: Allow rcp and rsq usage with f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290302 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:05:44 +00:00
Matt Arsenault
4bb99910b0 AMDGPU: Custom lower f16 fdiv
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290301 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:05:41 +00:00
Matt Arsenault
0bb2ef4a14 AMDGPU: Implement f16 fcanonicalize
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290300 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-22 03:05:37 +00:00
Matt Arsenault
256f8018fa AMDGPU: Allow 16-bit types in inline asm constraints
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290193 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 19:06:12 +00:00
Matt Arsenault
4bcae756d4 AMDGPU: Run fp combine tests on VI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290192 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 18:55:11 +00:00
Matt Arsenault
44e57608d4 AMDGPU: Don't add same instruction multiple times to worklist
When the instruction is processed the first time, it may be
deleted resulting in crashes. While the new test adds the same
user to the worklist twice, this particular case doesn't crash
but I'm not sure why.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290191 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 18:55:06 +00:00
Tom Stellard
38206ae07e AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.*
Reviewers: arsenm, nhaehnle, mareko

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290184 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 17:19:44 +00:00
Tom Stellard
11d071bf72 AMDGPU/SI: Add a MachineMemOperand to MIMG instructions
Summary:
Without a MachineMemOperand, the scheduler was assuming MIMG instructions
were ordered memory references, so no loads or stores could be reordered
across them.

Reviewers: arsenm

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290179 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 15:52:17 +00:00