17 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin
87fd46af90 [AMDGPU] Fix incorrect register pressure calculation
Earlier fix D32572 introduced a bug where live-ins were calculated
for basic block instead of scheduling region. This change fixes it.

Differential Revision: https://reviews.llvm.org/D33086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302812 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-11 17:16:55 +00:00
Konstantin Zhuravlyov
51ff1d79cd AMDGPU: Fix assert in scheduler
Assert is triggered if DBG_VALUE is first instruction in BB

Differential Revision: https://reviews.llvm.org/D32572


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301511 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 03:22:44 +00:00
Stanislav Mekhanoshin
7a9c51436f [AMDGPU] Fix recorded region boundaries in max-occupancy scheduler
This is incorrect to record region boundaries before scheduling,
it may change after scheduling. As a result second pass may see less
instructions to schedule than it should.

Differential Revision: https://reviews.llvm.org/D31434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298945 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 21:48:54 +00:00
Valery Pykhtin
4b7cf6c175 [AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler
Differential revision: https://reviews.llvm.org/D31046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298368 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 13:15:46 +00:00
Stanislav Mekhanoshin
3081264dbe [AMDGPU] Remove getBidirectionalReasonRank
This method inverts the Reason field of a scheduling candidate.
It does right comparison between RegCritical and RegExcess, but
everything else is broken. In fact it can prefer less strong reason
such as Weak over RegCritical because Weak > -RegCritical.

The CandReason enum is properly sorted, so just remove artificial
ranking.

Differential Revision: https://reviews.llvm.org/D30557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297536 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-11 00:29:27 +00:00
Stanislav Mekhanoshin
0248798c22 [AMDGPU] Add second pass of the scheduler
If during scheduling we have identified that we cannot keep optimistic
occupancy increase critical register pressure limit and try scheduling
of the whole function again. In this case blocks with smaller pressure
will have a chance for better scheduling.

Differential Revision: https://reviews.llvm.org/D30442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296506 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 19:20:33 +00:00
Stanislav Mekhanoshin
004075214e [AMDGPU] New method to estimate register pressure
This change introduces new method to estimate register pressure in
GCNScheduler. Standard RPTracker gives huge error due to the following
reasons:

1. It does not account for live-ins or live-outs if value is not used
in the region itself. That creates a huge error in a very common case
if there are a lot of live-thu registers.
2. It does not properly count subregs.
3. It assumes a register used as an input operand can be reused as an
output. This is not always possible by itself, this is not what RA
will finally do in many cases for various reasons not limited to RA's
inability to do so, and this is not so if the value is actually a
live-thu.

In addition we can now see clear separation between live-in pressure
which we cannot change with the scheduling and tentative pressure
which we can change.

Differential Revision: https://reviews.llvm.org/D30439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296491 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 17:22:39 +00:00
Stanislav Mekhanoshin
b57bf30b47 [AMDGPU] Fix read-undef flags when schedule is reverted
If two subregs of the same register are defined and we need to revert
schedule changing def order, we will end up with both instructions
having def,read-undef flags because adjustLaneLiveness() will only set
this flag but will not remove it.

Fix this by removing read-undef flags before calling adjustLaneLiveness.

Differential Revision: https://reviews.llvm.org/D30428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296484 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 16:26:27 +00:00
Stanislav Mekhanoshin
aae13371be [AMDGPU] Revert failed scheduling
This patch reverts region's scheduling to the original untouched state
in case if we have have decreased occupancy.

In addition it switches to use TargetRegisterInfo occupancy callback
for pressure limits instead of gradually increasing limits which were
just passed by. We are going to stay with the best schedule so we do
not need to tolerate worsened scheduling anymore.

Differential Revision: https://reviews.llvm.org/D29971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295206 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 17:19:50 +00:00
Konstantin Zhuravlyov
c478d3544a [AMDGPU] Move register related queries to subtarget class
Differential Revision: https://reviews.llvm.org/D29318


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294440 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 13:02:33 +00:00
Stanislav Mekhanoshin
0887cb0d09 [AMDGPU] Fix GCNSchedStrategy.cpp debug output
There is typo in the debug output: top and bottom candidates are switched.

Differential Revision: https://reviews.llvm.org/D29608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294257 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 23:16:51 +00:00
Stanislav Mekhanoshin
a1d4ee75a4 [AMDGPU] Account workgroup size in LDS occupancy limits
Functions matching LDS use to occupancy return results for a workgroup
of 64 workitems. The numbers has to be adjusted for bigger workgroups.
For example a workgroup of size 256 already occupies 4 waves just by
itself. Given that all numbers of LDS use in the compiler are per
workgroup, occupancy shall be multiplied by 4 in this case. Each 64
workitems still limited by the same number, but 4 subrgoups 64 workitems
each can afford 4 times more LDS to get the same occupancy.

In addition change initializes LDS size in the subtarget to a real value
for SI+ targets. This is required since LDS size is a variable in these
calculations.

Differential Revision: https://reviews.llvm.org/D29423

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293837 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-01 22:59:50 +00:00
Valery Pykhtin
9016ce4442 [AMDGPU] Fix typo in GCNSchedStrategy
Differential revision: https://reviews.llvm.org/D28980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293171 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-26 10:51:47 +00:00
Marek Olsak
ee082f5e52 AMDGPU/SI: Allow using SGPRs 96-101 on VI
Summary:
There is no point in setting SGPRS=104, because VI allocates SGPRs
in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs
for general purposes.

Reviewers: tstellarAMD

Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27149

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289260 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-09 19:49:40 +00:00
Matt Arsenault
783d67d503 AMDGPU: Whitespace fixes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285659 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-01 00:55:14 +00:00
Konstantin Zhuravlyov
1f99c41083 [AMDGPU] Wave and register controls
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:22:28 +00:00
Tom Stellard
4a5c408c28 AMDGPU/SI: Implement a custom MachineSchedStrategy
Summary:
GCNSchedStrategy re-uses most of GenericScheduler, it's just uses
a different method to compute the excess and critical register
pressure limits.

It's not enabled by default, to enable it you need to pass -misched=gcn
to llc.

Shader DB stats:

32464 shaders in 17874 tests
Totals:
SGPRS: 1542846 -> 1643125 (6.50 %)
VGPRS: 1005595 -> 904653 (-10.04 %)
Spilled SGPRs: 29929 -> 27745 (-7.30 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 36688188 -> 37034900 (0.95 %) bytes
LDS: 1913 -> 1913 (0.00 %) blocks
Max Waves: 254101 -> 265125 (4.34 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 1338220 -> 1438499 (7.49 %)
VGPRS: 886221 -> 785279 (-11.39 %)
Spilled SGPRs: 29869 -> 27685 (-7.31 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 34315716 -> 34662428 (1.01 %) bytes
LDS: 1551 -> 1551 (0.00 %) blocks
Max Waves: 188127 -> 199151 (5.86 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23688

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279995 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 19:42:52 +00:00