[LoopUtils] Better accuracy for getLoopEstimatedTripCount.

Summary: The current implementation of getLoopEstimatedTripCount returns one iteration fewer than it should. In a bottom-tested loop, the first iteration executes before the back branch is taken for the first time, so the trip count is the backedge-taken count plus one. For example, for a loop with !{!"branch_weights", i32 1 // taken, i32 1 // exit} metadata, getLoopEstimatedTripCount returns 1, while the actual number of iterations is 2.
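
The off-by-one can be illustrated with a minimal standalone C++ sketch. This is not the LLVM code: `estimateTripCount` and `divideNearestSketch` are hypothetical names, the weights are passed in directly instead of being read from `!prof` metadata, and `std::optional` (C++17) stands in for `llvm::Optional`. The rounding is assumed to match `llvm::divideNearest` for unsigned operands.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: unsigned division rounded to nearest.
// Assumes Denominator != 0 and that Numerator + Denominator / 2 does not
// overflow, which holds for 32-bit branch weights widened to 64 bits.
static uint64_t divideNearestSketch(uint64_t Numerator, uint64_t Denominator) {
  return (Numerator + Denominator / 2) / Denominator;
}

// Hypothetical stand-in for the patched estimate. In a bottom-tested loop the
// body runs once before the latch branch is evaluated, so the trip count is
// the estimated backedge-taken count plus one.
static std::optional<uint64_t> estimateTripCount(uint64_t BackedgeTakenWeight,
                                                 uint64_t LatchExitWeight) {
  // Without any recorded loop exit there is nothing to divide by.
  if (LatchExitWeight == 0)
    return std::nullopt;
  uint64_t BackedgeTakenCount =
      divideNearestSketch(BackedgeTakenWeight, LatchExitWeight);
  return BackedgeTakenCount + 1;
}
```

With weights 1 (taken) and 1 (exit), as in the summary's example, `estimateTripCount(1, 1)` yields 2; a zero exit weight yields an empty optional instead of 0.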

Reviewers: Ayal, fhahn

Reviewed By: Ayal

Subscribers: mgorny, hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71990
Evgeniy Brevnov 2020-01-16 17:34:02 +07:00
parent 437cb9c178
commit c5e9c8b1df
6 changed files with 13 additions and 10 deletions


@@ -723,12 +723,15 @@ Optional<unsigned> llvm::getLoopEstimatedTripCount(Loop *L) {
   if (LatchBR->getSuccessor(0) != L->getHeader())
     std::swap(BackedgeTakenWeight, LatchExitWeight);
-  if (!BackedgeTakenWeight || !LatchExitWeight)
-    return 0;
+  if (!LatchExitWeight)
+    return None;
-  // Divide the count of the backedge by the count of the edge exiting the loop,
-  // rounding to nearest.
-  return llvm::divideNearest(BackedgeTakenWeight, LatchExitWeight);
+  // Estimated backedge taken count is a ratio of the backedge taken weight by
+  // the edge exiting weight, rounded to nearest.
+  uint64_t BackedgeTakenCount =
+      llvm::divideNearest(BackedgeTakenWeight, LatchExitWeight);
+  // Estimated trip count is one plus estimated backedge taken count.
+  return BackedgeTakenCount + 1;
 }
 bool llvm::hasIterationCountInvariantInParent(Loop *InnerLoop,


@@ -11,7 +11,7 @@ declare void @f2()
 define void @test1(i32 %k) !prof !4 {
 ; CHECK: Loop Unroll: F[test1] Loop %for.body
 ; CHECK: PEELING loop %for.body with iteration count 2!
-; CHECK: PEELING loop %for.body with iteration count 4!
+; CHECK: PEELING loop %for.body with iteration count 5!
 ; CHECK: llvm.loop.unroll.disable
 for.body.lr.ph:
   br label %for.body


@@ -5,7 +5,7 @@
 ; Regression test for setting the correct idom for exit blocks.
 ; CHECK: Loop Unroll: F[basic]
-; CHECK: PEELING loop %for.body with iteration count 1!
+; CHECK: PEELING loop %for.body with iteration count 2!
 define i32 @basic(i32* %p, i32 %k, i1 %c1, i1 %c2) #0 !prof !3 {
 entry:


@@ -5,7 +5,7 @@
 ; Regression test for setting the correct idom for exit blocks.
 ; CHECK: Loop Unroll: F[basic]
-; CHECK: PEELING loop %for.body with iteration count 1!
+; CHECK: PEELING loop %for.body with iteration count 2!
 define i32 @basic(i32* %p, i32 %k, i1 %c1, i1 %c2) #0 !prof !3 {
 entry:


@@ -8,7 +8,7 @@
 ; All side exits to deopt does not change weights.
 ; CHECK: Loop Unroll: F[basic]
-; CHECK: PEELING loop %for.body with iteration count 3!
+; CHECK: PEELING loop %for.body with iteration count 4!
 ; CHECK-NO-PEEL-NOT: PEELING loop %for.body
 ; CHECK-LABEL: @basic
 ; CHECK: br i1 %c, label %{{.*}}, label %side_exit, !prof !15


@@ -9,7 +9,7 @@
 ; from the loop, and update the branch weights for the peeled loop properly.
 ; CHECK: Loop Unroll: F[basic]
-; CHECK: PEELING loop %for.body with iteration count 3!
+; CHECK: PEELING loop %for.body with iteration count 4!
 ; CHECK: Loop Unroll: F[optsize]
 ; CHECK-NOT: PEELING