11 Commits

Author SHA1 Message Date
Matt Arsenault
d706d030af AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298444 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 21:39:51 +00:00
Sanjay Patel
9a9478ccb0 [DAGCombiner] add missing folds for scalar select of {-1,0,1}
The motivation for filling out these select-of-constants cases goes back to D24480, 
where we discussed removing an IR fold from add(zext) --> select. And that goes back to:
https://reviews.llvm.org/rL75531
https://reviews.llvm.org/rL159230

The idea is that we should always canonicalize patterns like this to a select-of-constants 
in IR because that's the smallest IR and the best for value tracking. Note that we currently 
do the opposite in some cases (like the cases in *this* patch). Ie, the proposed folds in 
this patch already exist in InstCombine today:
https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151

As this patch shows, most targets generate better machine code for simple ext/add/not ops 
rather than a select of constants. So the follow-up steps to make this less of a patchwork 
of special-case folds and missing IR canonicalization:

1. Have DAGCombiner convert any select of constants into ext/add/not ops.
2  Have InstCombine canonicalize in the other direction (create more selects).

Differential Revision: https://reviews.llvm.org/D30180


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296137 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-24 17:17:33 +00:00
Jan Vesely
dae323db22 AMDGPU/SI: Fix trunc i16 pattern
Hit on ASICs that support 16bit instructions.

Differential Revision: https://reviews.llvm.org/D30281

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295990 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 16:12:21 +00:00
Konstantin Zhuravlyov
1d609512ed Revert "AMDGPU: Enable ConstrainCopy DAG mutation"
This reverts commit r287146.

This breaks few conformance tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287233 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-17 16:41:49 +00:00
Matt Arsenault
4fbd908949 AMDGPU: Enable ConstrainCopy DAG mutation
This fixes a probably unintended divergence from the default
scheduler behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287146 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 20:35:23 +00:00
Matt Arsenault
7eba65d30c AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except
for scalar 64-bit compares which only has u64. I'm not sure
why there are both (I'm guessing it's for the one bit inputs we
don't use), but for consistency always using the
unsigned one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282832 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 01:50:20 +00:00
Matt Arsenault
d764af3c4e AMDGPU: Support commuting with immediate in src0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 17:19:29 +00:00
Tom Stellard
4a5c408c28 AMDGPU/SI: Implement a custom MachineSchedStrategy
Summary:
GCNSchedStrategy re-uses most of GenericScheduler, it's just uses
a different method to compute the excess and critical register
pressure limits.

It's not enabled by default, to enable it you need to pass -misched=gcn
to llc.

Shader DB stats:

32464 shaders in 17874 tests
Totals:
SGPRS: 1542846 -> 1643125 (6.50 %)
VGPRS: 1005595 -> 904653 (-10.04 %)
Spilled SGPRs: 29929 -> 27745 (-7.30 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 36688188 -> 37034900 (0.95 %) bytes
LDS: 1913 -> 1913 (0.00 %) blocks
Max Waves: 254101 -> 265125 (4.34 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 1338220 -> 1438499 (7.49 %)
VGPRS: 886221 -> 785279 (-11.39 %)
Spilled SGPRs: 29869 -> 27685 (-7.31 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 34315716 -> 34662428 (1.01 %) bytes
LDS: 1551 -> 1551 (0.00 %) blocks
Max Waves: 188127 -> 199151 (5.86 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23688

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279995 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 19:42:52 +00:00
Tom Stellard
6ab99c7ca6 AMDGPU/SI: Enable the post-ra scheduler
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.

Reviewers: arsenm

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268143 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-30 00:23:06 +00:00
Marek Olsak
bbc32e3efd AMDGPU/SI: use S_AND for i1 trunc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251630 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:05:03 +00:00
Tom Stellard
953c681473 R600 -> AMDGPU rename
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 03:28:10 +00:00