This change relaxes the checks for hasOnlyUniformBranches such that our
region is uniform if:
1. All conditional branches that are direct children are uniform.
2. And either:
a. All sub-regions are uniform.
b. There is one or less conditional branches among the direct
children.
Differential Revision: https://reviews.llvm.org/D62198
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361610 91177308-0d34-0410-b5e6-96231b3b80d8
As it's causing some bot failures (and per request from kbarton).
This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358546 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
StructurizeCFG::orderNodes basically uses a reverse post-order (RPO) traversal of the region list to get the order.
The only problem with it is that sometimes backedges for outer loops will be visited before backedges for inner loops.
To solve this problem, a loop depth based approach has been used to make sure all blocks in this loop has been visited
before moving on to outer loop.
However, we found a problem for a SubRegion which is a loop itself:
--> BB1 --> BB2 --> BB3 -->
In this case, BB2 is a SubRegion (loop), and thus its loopdepth is different than that of BB1 and BB3. This fact will lead
BB2 to be placed in the wrong order.
In this work, we treat the SubRegion as a special case and use its exit block to determine the loop and its depth
to guard the sorting.
Reviewers:
arsenm, jlebar
Differential Revision:
https://reviews.llvm.org/D46912
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333111 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform, so the branch is
non-uniform.
This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.
As discovered after committing an earlier version of this change, this
exposes a subtle interaction between this pass and DivergenceAnalysis:
since we remove and re-create branch instructions, we can no longer rely
on DivergenceAnalysis for branches in subregions that were already
processed by the pass.
Explicitly remove branch instructions from DivergenceAnalysis to
avoid dangling pointers as a matter of defensive programming, and
change how we detect non-uniform subregions.
Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
Differential Revision: https://reviews.llvm.org/D43743
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329165 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform.
This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.
Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4
Reviewers: arsenm, rampitec, jlebar
Subscribers: wdng, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D40546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325881 91177308-0d34-0410-b5e6-96231b3b80d8
It fails with -verify-region-info. This seems to be a issue
with RegionInfo itself which existed before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321806 91177308-0d34-0410-b5e6-96231b3b80d8
The work order was changed in r228186 from SCC order
to RPO with an arbitrary sorting function. The sorting
function attempted to move inner loop nodes earlier. This
was was apparently relying on an assumption that every block
in a given loop / the same loop depth would be seen before
visiting another loop. In the broken testcase, a block
outside of the loop was encountered before moving onto
another block in the same loop. The testcase would then
structurize such that one blocks unconditional successor
could never be reached.
Revert to plain RPO for the analysis phase. This fixes
detecting edges as backedges that aren't really.
The processing phase does use another visited set, and
I'm unclear on whether the order there is as important.
An arbitrary order doesn't work, and triggers some infinite
loops. The reversed RPO list seems to work and is closer
to the order that was used before, minus the arbitary
custom sorting.
A few of the changed tests now produce smaller code,
and a few are slightly worse looking.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321751 91177308-0d34-0410-b5e6-96231b3b80d8