41 Commits

Author SHA1 Message Date
Jay Foad
303c51ac63 [AMDGPU] Add VerifyScheduling support.
Summary:
This is cut and pasted from the corresponding GenericScheduler
functions.

Reviewers: arsenm, atrick, tstellar, vpykhtin

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373346 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 15:45:47 +00:00
Jay Foad
9edd4decf5 Add tracing in pickNodeFromQueue.
This matches GenericScheduler::pickNodeFromQueue, from which this
function was mostly cut and pasted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372829 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-25 08:45:41 +00:00
Matt Arsenault
39b334b81d AMDGPU: Fix typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371249 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-06 20:00:22 +00:00
Matt Arsenault
cd091def41 AMDGPU: Avoid constructing new std::vector in initCandidate
Approximately 30% of the time was spent in the std::vector
constructor. In one testcase this pushes the scheduler to being the
second slowest pass.

I'm not sure I understand why these vector are necessary. The default
scheduler initCandidate seems to use some pre-existing vectors for the
pressure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371136 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-05 22:44:06 +00:00
Valery Pykhtin
b0b4bb7b62 [AMDGPU] Speed up live-in virtual register set computaion in GCNScheduleDAGMILive.
Differential revision: https://reviews.llvm.org/D62401

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363661 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-18 11:43:17 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Stanislav Mekhanoshin
9136eac979 [AMDGPU] Small refactoring in the scheduler
After last changes some code can be simplified.

Differential Revision: https://reviews.llvm.org/D47661

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333934 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-04 17:57:40 +00:00
Stanislav Mekhanoshin
10bf29495f [AMDGPU] Track occupancy in MFI
Keep track of achieved occupancy in SIMachineFunctionInfo.
At the moment we have a lot of duplicated or even missed code to
query and maintain occupancy info. Record it in the MFI and
query in a single call. Interfaces:

- getOccupancy() - returns current recorded achieved occupancy.
- getMinAllowedOccupancy() - returns lesser of the achieved occupancy
and the lowest occupancy we are ready to tolerate. For example if
a kernel is memory bound we are ready to tolerate 4 waves.
- limitOccupancy() - record occupancy level if we have to lower it.
- increaseOccupancy() - record occupancy if scheduler managed to
increase the occupancy.

MFI takes care of integrating different checks affecting occupancy,
including LDS use and waves-per-eu attribute. Note that scheduler
starts with not yet known register pressure, so has to record either
limit or increase in occupancy after it is done. Later passes can
just query a resulting value.

New interface is used in the active scheduler and NFC wrt its work.
Changes are also made to experimental schedulers to use it and record
an occupancy after they are done. Before the change waves-per-eu was
ignored by experimental schedulers and tolerance window for memory
bound kernels was not used.

Differential Revision: https://reviews.llvm.org/D47509

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333629 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-31 05:36:04 +00:00
Stanislav Mekhanoshin
277225f527 [AMDGPU] Add perf hints to functions
This is adoption of HSAIL perfhint pass. Two types of hints are produced:

1. Function is memory bound.
2. Kernel can use wave limiter.

Currently these hints are used in the scheduler. If a function is suspected
to be memory bound we allow occupancy to decrease to 4 waves in the course
of scheduling.

Differential Revision: https://reviews.llvm.org/D46992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333289 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-25 17:25:12 +00:00
Nicola Zaghen
0818e789cb Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-14 12:53:11 +00:00
Stanislav Mekhanoshin
cf1a1cbbf1 [AMDGPU] Fix amdgpu-waves-per-eu accounting in scheduler
We cannot query this attribute from a subtarget given a machine function.
At this point attribute itself is already unavailable and can only be
obtained through MFI.

Differential Revision: https://reviews.llvm.org/D46781

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332166 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-12 01:41:56 +00:00
Shiva Chen
24abe71d71 [DebugInfo] Examine all uses of isDebugValue() for debug instructions.
Because we create a new kind of debug instruction, DBG_LABEL, we need to
check all passes which use isDebugValue() to check MachineInstr is debug
instruction or not. When expelling debug instructions, we should expel
both DBG_VALUE and DBG_LABEL. So, I create a new function,
isDebugInstr(), in MachineInstr to check whether the MachineInstr is
debug instruction or not.

This patch has no new test case. I have run regression test and there is
no difference in regression test.

Differential Revision: https://reviews.llvm.org/D45342

Patch by Hsiangkai Wang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331844 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-09 02:42:00 +00:00
Hiroshi Inoue
25f9810059 [NFC] fix trivial typos in comments
"the the" -> "the"



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323074 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-22 05:54:46 +00:00
Matthias Braun
d318139827 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:58 +00:00
Yaxun Liu
06d39e2dc8 Recommit CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value
The regression on ppc64 was not due to this commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320788 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 03:56:57 +00:00
Yaxun Liu
92d81a8d46 Revert CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value
This commit might have caused regression on ppc64. Revert it to verify that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320712 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-14 16:12:04 +00:00
Yaxun Liu
084f87947f CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value
Two issues were found about machine inst scheduler when compiling ProRender
with -g for amdgcn target:

GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it
should not since DBG_VALUE is not mapped in LiveIntervals.

when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and
ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D41132


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320650 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 22:38:09 +00:00
Matt Arsenault
a50e269370 AMDGPU: Fix crash when scheduling DBG_VALUE
This calls handleMove with a DBG_VALUE instruction,
which isn't tracked by LiveIntervals. I'm not sure
this is the correct place to fix this. The generic
scheduler seems to have more deliberate region
selection that skips dbg_value.

The test is also really hard to reduce. I haven't been able
to figure out what exactly causes this particular case to
try moving the dbg_value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 03:09:23 +00:00
Francis Visoiu Mistrih
ca0df55065 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319665 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 17:18:51 +00:00
Evandro Menezes
fdda7ea9d5 [CodeGen] Rename DEBUG_TYPE to match passnames
Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were
absent from https://reviews.llvm.org/rL303921.

Differential revision: https://reviews.llvm.org/D35231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307719 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-11 22:08:28 +00:00
Stanislav Mekhanoshin
791f311a49 [AMDGPU] Use GCNRPTracker dumper methods in scheduler
Differential Revision: https://reviews.llvm.org/D33244

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303186 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 16:31:45 +00:00
Stanislav Mekhanoshin
a1445b5182 [AMDGPU] Cache live-ins and register pressure in scheduler
Using LIS can be quite expensive, so caching of calculated region
live-ins and pressure is implemented. It does two things:

1. Caches the info for the second stage when we schedule with
   decreased target occupancy.
2. Tracks the basic block from top to bottom thus eliminating the
   need to scan whole register file liveness at every region split
   in the middle of the block.

The scheduling is now done in 3 stages instead of two, with the first
one being really a no-op and only used to collect scheduling regions
as sent by the scheduler driver.

There is no functional change to the current behavior, only compilation
speed is affected. In general computeBlockPressure() could be simplified
if we switch to backward RP tracker, because scheduler sends regions
within a block starting from the last upward. We could use a natural
order of upward tracker to seamlessly change between regions of the same
block, since live reg set of a previous tracked region would become a
live-out of the next region. That however requires fixing upward tracker
to properly account defs and uses of the same instruction as both are
contributing to the current pressure. When we converge on the produced
pressure we should be able to switch between them back and forth. In
addition, backward tracker is less expensive as it uses LIS in recede
less often than forward uses it in advance.

At the moment the worst known case compilation time has improved from 26
minutes to 8.5.

Differential Revision: https://reviews.llvm.org/D33117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303184 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 16:11:26 +00:00
Stanislav Mekhanoshin
e61df31e20 [AMDGPU] Turn register pressure estimation into forward tracker
This factors register pressure estimation mechanism from the
GCNSchedStrategy into the forward tracker to unify interface
with other strategies and expose it to other interested phases.

Differential Revision: https://reviews.llvm.org/D33105

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303179 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 15:43:52 +00:00
Stanislav Mekhanoshin
87fd46af90 [AMDGPU] Fix incorrect register pressure calculation
Earlier fix D32572 introduced a bug where live-ins were calculated
for basic block instead of scheduling region. This change fixes it.

Differential Revision: https://reviews.llvm.org/D33086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302812 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-11 17:16:55 +00:00
Konstantin Zhuravlyov
51ff1d79cd AMDGPU: Fix assert in scheduler
Assert is triggered if DBG_VALUE is first instruction in BB

Differential Revision: https://reviews.llvm.org/D32572


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301511 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-27 03:22:44 +00:00
Stanislav Mekhanoshin
7a9c51436f [AMDGPU] Fix recorded region boundaries in max-occupancy scheduler
This is incorrect to record region boundaries before scheduling,
it may change after scheduling. As a result second pass may see less
instructions to schedule than it should.

Differential Revision: https://reviews.llvm.org/D31434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298945 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 21:48:54 +00:00
Valery Pykhtin
4b7cf6c175 [AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler
Differential revision: https://reviews.llvm.org/D31046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298368 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 13:15:46 +00:00
Stanislav Mekhanoshin
3081264dbe [AMDGPU] Remove getBidirectionalReasonRank
This method inverts the Reason field of a scheduling candidate.
It does right comparison between RegCritical and RegExcess, but
everything else is broken. In fact it can prefer less strong reason
such as Weak over RegCritical because Weak > -RegCritical.

The CandReason enum is properly sorted, so just remove artificial
ranking.

Differential Revision: https://reviews.llvm.org/D30557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297536 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-11 00:29:27 +00:00
Stanislav Mekhanoshin
0248798c22 [AMDGPU] Add second pass of the scheduler
If during scheduling we have identified that we cannot keep optimistic
occupancy increase critical register pressure limit and try scheduling
of the whole function again. In this case blocks with smaller pressure
will have a chance for better scheduling.

Differential Revision: https://reviews.llvm.org/D30442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296506 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 19:20:33 +00:00
Stanislav Mekhanoshin
004075214e [AMDGPU] New method to estimate register pressure
This change introduces new method to estimate register pressure in
GCNScheduler. Standard RPTracker gives huge error due to the following
reasons:

1. It does not account for live-ins or live-outs if value is not used
in the region itself. That creates a huge error in a very common case
if there are a lot of live-thu registers.
2. It does not properly count subregs.
3. It assumes a register used as an input operand can be reused as an
output. This is not always possible by itself, this is not what RA
will finally do in many cases for various reasons not limited to RA's
inability to do so, and this is not so if the value is actually a
live-thu.

In addition we can now see clear separation between live-in pressure
which we cannot change with the scheduling and tentative pressure
which we can change.

Differential Revision: https://reviews.llvm.org/D30439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296491 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 17:22:39 +00:00
Stanislav Mekhanoshin
b57bf30b47 [AMDGPU] Fix read-undef flags when schedule is reverted
If two subregs of the same register are defined and we need to revert
schedule changing def order, we will end up with both instructions
having def,read-undef flags because adjustLaneLiveness() will only set
this flag but will not remove it.

Fix this by removing read-undef flags before calling adjustLaneLiveness.

Differential Revision: https://reviews.llvm.org/D30428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296484 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 16:26:27 +00:00
Stanislav Mekhanoshin
aae13371be [AMDGPU] Revert failed scheduling
This patch reverts region's scheduling to the original untouched state
in case if we have have decreased occupancy.

In addition it switches to use TargetRegisterInfo occupancy callback
for pressure limits instead of gradually increasing limits which were
just passed by. We are going to stay with the best schedule so we do
not need to tolerate worsened scheduling anymore.

Differential Revision: https://reviews.llvm.org/D29971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295206 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 17:19:50 +00:00
Konstantin Zhuravlyov
c478d3544a [AMDGPU] Move register related queries to subtarget class
Differential Revision: https://reviews.llvm.org/D29318


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294440 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-08 13:02:33 +00:00
Stanislav Mekhanoshin
0887cb0d09 [AMDGPU] Fix GCNSchedStrategy.cpp debug output
There is typo in the debug output: top and bottom candidates are switched.

Differential Revision: https://reviews.llvm.org/D29608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294257 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 23:16:51 +00:00
Stanislav Mekhanoshin
a1d4ee75a4 [AMDGPU] Account workgroup size in LDS occupancy limits
Functions matching LDS use to occupancy return results for a workgroup
of 64 workitems. The numbers has to be adjusted for bigger workgroups.
For example a workgroup of size 256 already occupies 4 waves just by
itself. Given that all numbers of LDS use in the compiler are per
workgroup, occupancy shall be multiplied by 4 in this case. Each 64
workitems still limited by the same number, but 4 subrgoups 64 workitems
each can afford 4 times more LDS to get the same occupancy.

In addition change initializes LDS size in the subtarget to a real value
for SI+ targets. This is required since LDS size is a variable in these
calculations.

Differential Revision: https://reviews.llvm.org/D29423

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293837 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-01 22:59:50 +00:00
Valery Pykhtin
9016ce4442 [AMDGPU] Fix typo in GCNSchedStrategy
Differential revision: https://reviews.llvm.org/D28980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293171 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-26 10:51:47 +00:00
Marek Olsak
ee082f5e52 AMDGPU/SI: Allow using SGPRs 96-101 on VI
Summary:
There is no point in setting SGPRS=104, because VI allocates SGPRs
in multiples of 16, so 104 -> 112. That enables us to use all 102 SGPRs
for general purposes.

Reviewers: tstellarAMD

Subscribers: qcolombet, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D27149

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289260 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-09 19:49:40 +00:00
Matt Arsenault
783d67d503 AMDGPU: Whitespace fixes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285659 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-01 00:55:14 +00:00
Konstantin Zhuravlyov
1f99c41083 [AMDGPU] Wave and register controls
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:22:28 +00:00
Tom Stellard
4a5c408c28 AMDGPU/SI: Implement a custom MachineSchedStrategy
Summary:
GCNSchedStrategy re-uses most of GenericScheduler, it's just uses
a different method to compute the excess and critical register
pressure limits.

It's not enabled by default, to enable it you need to pass -misched=gcn
to llc.

Shader DB stats:

32464 shaders in 17874 tests
Totals:
SGPRS: 1542846 -> 1643125 (6.50 %)
VGPRS: 1005595 -> 904653 (-10.04 %)
Spilled SGPRs: 29929 -> 27745 (-7.30 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 36688188 -> 37034900 (0.95 %) bytes
LDS: 1913 -> 1913 (0.00 %) blocks
Max Waves: 254101 -> 265125 (4.34 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 1338220 -> 1438499 (7.49 %)
VGPRS: 886221 -> 785279 (-11.39 %)
Spilled SGPRs: 29869 -> 27685 (-7.31 %)
Spilled VGPRs: 334 -> 352 (5.39 %)
Scratch VGPRs: 1612 -> 1624 (0.74 %) dwords per thread
Code Size: 34315716 -> 34662428 (1.01 %) bytes
LDS: 1551 -> 1551 (0.00 %) blocks
Max Waves: 188127 -> 199151 (5.86 %)
Wait states: 0 -> 0 (0.00 %)

Reviewers: arsenm, mareko, nhaehnle, MatzeB, atrick

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: https://reviews.llvm.org/D23688

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279995 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 19:42:52 +00:00