[NFC][MCA] ZnVer1: Update RegisterFile to identify false dependencies on partially written registers.

Summary:
Pretty mechanical follow-up for D49196.

As microarchitecture.pdf notes, "20 AMD Ryzen pipeline",
"20.8 Register renaming and out-of-order schedulers":
  The integer register file has 168 physical registers of 64 bits each.
  The floating point register file has 160 registers of 128 bits each.
"20.14 Partial register access":
  The processor always keeps the different parts of an integer register together.
  ...
  An instruction that writes to part of a register will therefore have a false dependence
  on any previous write to the same register or any part of it.
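
The quoted rule can be illustrated with a toy model. This is only a sketch, not how llvm-mca implements RegisterFile: the PARENT and PARTIAL tables and the false_dependencies helper are hypothetical names, only destination registers are tracked, and true dependencies through source operands are ignored.

```python
# Toy model (NOT llvm-mca's implementation): report which partial register
# writes must wait for an earlier write to the same 64-bit register.

# Hypothetical alias table: every register name maps to its 64-bit parent.
PARENT = {
    "ax": "rax", "eax": "rax", "rax": "rax",
    "bx": "rbx", "ebx": "rbx", "rbx": "rbx",
    "cx": "rcx", "ecx": "rcx", "rcx": "rcx",
    "dx": "rdx", "edx": "rdx", "rdx": "rdx",
}

# Assumption: only 16-bit writes count as partial here; 32-bit writes
# zero-extend on x86-64 and overwrite the whole register.
PARTIAL = {"ax", "bx", "cx", "dx"}


def false_dependencies(dests, keep_parts_together):
    """Return (consumer, producer) index pairs for partial writes that
    depend on the previous write to the same full register."""
    last_writer = {}  # 64-bit parent -> index of the most recent writer
    edges = []
    for i, dst in enumerate(dests):
        parent = PARENT[dst]
        if keep_parts_together and dst in PARTIAL and parent in last_writer:
            edges.append((i, last_writer[parent]))
        last_writer[parent] = i
    return edges


# Destinations of: imul %rax, %rbx ; lzcnt %ax, %bx ; add %ecx, %ebx
sequence = ["rbx", "bx", "ebx"]
print(false_dependencies(sequence, keep_parts_together=True))
print(false_dependencies(sequence, keep_parts_together=False))
```

With the parts kept together, the write to %bx is tied to the earlier write to %rbx; with separately renamed parts, no such edge is reported.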

Reviewers: andreadb, courbet, RKSimon, craig.topper, GGanesh

Reviewed By: GGanesh

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D49393

llvm-svn: 337676
Roman Lebedev 2018-07-23 10:10:13 +00:00
parent d57bd45acc
commit 52b85377eb
6 changed files with 98 additions and 96 deletions

@ -93,7 +93,7 @@ def : ReadAdvance<ReadAfterLd, 4>;
// The Integer PRF for Zen is 168 entries, and it holds the architectural and
// speculative version of the 64-bit integer registers.
// Reference: "Software Optimization Guide for AMD Family 17h Processors"
-def ZnIntegerPRF : RegisterFile<168, [GR8, GR16, GR32, GR64, CCR]>;
+def ZnIntegerPRF : RegisterFile<168, [GR64, CCR]>;
// 36 Entry (9x4 entries) floating-point Scheduler
def ZnFPU : ProcResGroup<[ZnFPU0, ZnFPU1, ZnFPU2, ZnFPU3]> {

@ -7,9 +7,9 @@ add %ecx, %ebx
# CHECK: Iterations: 1
# CHECK-NEXT: Instructions: 3
-# CHECK-NEXT: Total Cycles: 8
+# CHECK-NEXT: Total Cycles: 9
# CHECK-NEXT: Dispatch Width: 4
-# CHECK-NEXT: IPC: 0.38
+# CHECK-NEXT: IPC: 0.33
# CHECK-NEXT: Block RThroughput: 1.0
# CHECK: Instruction Info:
@ -26,11 +26,11 @@ add %ecx, %ebx
# CHECK-NEXT: 1 1 0.25 addl %ecx, %ebx
# CHECK: Timeline view:
-# CHECK-NEXT: Index 01234567
+# CHECK-NEXT: Index 012345678
-# CHECK: [0,0] DeeeeER. imulq %rax, %rbx
-# CHECK-NEXT: [0,1] DeeE--R. lzcntw %ax, %bx
-# CHECK-NEXT: [0,2] D====eER addl %ecx, %ebx
+# CHECK: [0,0] DeeeeER . imulq %rax, %rbx
+# CHECK-NEXT: [0,1] D===eeER. lzcntw %ax, %bx
+# CHECK-NEXT: [0,2] D=====eER addl %ecx, %ebx
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
@ -40,5 +40,5 @@ add %ecx, %ebx
# CHECK: [0] [1] [2] [3]
# CHECK-NEXT: 0. 1 1.0 1.0 0.0 imulq %rax, %rbx
-# CHECK-NEXT: 1. 1 1.0 1.0 2.0 lzcntw %ax, %bx
-# CHECK-NEXT: 2. 1 5.0 0.0 0.0 addl %ecx, %ebx
+# CHECK-NEXT: 1. 1 4.0 0.0 0.0 lzcntw %ax, %bx
+# CHECK-NEXT: 2. 1 6.0 0.0 0.0 addl %ecx, %ebx

@ -10,9 +10,9 @@ xor %bx, %dx
# CHECK: Iterations: 1500
# CHECK-NEXT: Instructions: 4500
-# CHECK-NEXT: Total Cycles: 1129
+# CHECK-NEXT: Total Cycles: 4503
# CHECK-NEXT: Dispatch Width: 4
-# CHECK-NEXT: IPC: 3.99
+# CHECK-NEXT: IPC: 1.00
# CHECK-NEXT: Block RThroughput: 0.8
# CHECK: Instruction Info:
@ -48,31 +48,32 @@ xor %bx, %dx
# CHECK: Resource pressure by instruction:
# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] Instructions:
-# CHECK-NEXT: - - - 0.25 0.75 - - - - - - - addw %cx, %dx
-# CHECK-NEXT: - - 0.25 - - 0.75 - - - - - - movw %ax, %dx
-# CHECK-NEXT: - - 0.50 0.50 - - - - - - - - xorw %bx, %dx
+# CHECK-NEXT: - - 0.25 0.25 0.25 0.25 - - - - - - addw %cx, %dx
+# CHECK-NEXT: - - 0.25 0.25 0.25 0.25 - - - - - - movw %ax, %dx
+# CHECK-NEXT: - - 0.25 0.25 0.25 0.25 - - - - - - xorw %bx, %dx
# CHECK: Timeline view:
-# CHECK-NEXT: Index 012345678
+# CHECK-NEXT: 0123456789
+# CHECK-NEXT: Index 0123456789 0
-# CHECK: [0,0] DeER . . addw %cx, %dx
-# CHECK-NEXT: [0,1] DeER . . movw %ax, %dx
-# CHECK-NEXT: [0,2] D=eER. . xorw %bx, %dx
-# CHECK-NEXT: [1,0] D==eER . addw %cx, %dx
-# CHECK-NEXT: [1,1] .DeE-R . movw %ax, %dx
-# CHECK-NEXT: [1,2] .D=eER . xorw %bx, %dx
-# CHECK-NEXT: [2,0] .D==eER . addw %cx, %dx
-# CHECK-NEXT: [2,1] .DeE--R . movw %ax, %dx
-# CHECK-NEXT: [2,2] . DeE-R . xorw %bx, %dx
-# CHECK-NEXT: [3,0] . D=eER . addw %cx, %dx
-# CHECK-NEXT: [3,1] . DeE-R . movw %ax, %dx
-# CHECK-NEXT: [3,2] . D=eER . xorw %bx, %dx
-# CHECK-NEXT: [4,0] . D=eER. addw %cx, %dx
-# CHECK-NEXT: [4,1] . DeE-R. movw %ax, %dx
-# CHECK-NEXT: [4,2] . D=eER. xorw %bx, %dx
-# CHECK-NEXT: [5,0] . D==eER addw %cx, %dx
-# CHECK-NEXT: [5,1] . DeE-R movw %ax, %dx
-# CHECK-NEXT: [5,2] . D=eER xorw %bx, %dx
+# CHECK: [0,0] DeER . . . . addw %cx, %dx
+# CHECK-NEXT: [0,1] D=eER. . . . movw %ax, %dx
+# CHECK-NEXT: [0,2] D==eER . . . xorw %bx, %dx
+# CHECK-NEXT: [1,0] D===eER . . . addw %cx, %dx
+# CHECK-NEXT: [1,1] .D===eER . . . movw %ax, %dx
+# CHECK-NEXT: [1,2] .D====eER . . . xorw %bx, %dx
+# CHECK-NEXT: [2,0] .D=====eER. . . addw %cx, %dx
+# CHECK-NEXT: [2,1] .D======eER . . movw %ax, %dx
+# CHECK-NEXT: [2,2] . D======eER . . xorw %bx, %dx
+# CHECK-NEXT: [3,0] . D=======eER . . addw %cx, %dx
+# CHECK-NEXT: [3,1] . D========eER . . movw %ax, %dx
+# CHECK-NEXT: [3,2] . D=========eER. . xorw %bx, %dx
+# CHECK-NEXT: [4,0] . D=========eER . addw %cx, %dx
+# CHECK-NEXT: [4,1] . D==========eER . movw %ax, %dx
+# CHECK-NEXT: [4,2] . D===========eER . xorw %bx, %dx
+# CHECK-NEXT: [5,0] . D============eER . addw %cx, %dx
+# CHECK-NEXT: [5,1] . D============eER. movw %ax, %dx
+# CHECK-NEXT: [5,2] . D=============eER xorw %bx, %dx
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
@ -81,6 +82,6 @@ xor %bx, %dx
# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
# CHECK: [0] [1] [2] [3]
-# CHECK-NEXT: 0. 6 2.3 0.2 0.0 addw %cx, %dx
-# CHECK-NEXT: 1. 6 1.0 1.0 1.0 movw %ax, %dx
-# CHECK-NEXT: 2. 6 1.8 0.0 0.2 xorw %bx, %dx
+# CHECK-NEXT: 0. 6 7.0 0.2 0.0 addw %cx, %dx
+# CHECK-NEXT: 1. 6 7.7 0.0 0.0 movw %ax, %dx
+# CHECK-NEXT: 2. 6 8.5 0.0 0.0 xorw %bx, %dx

@ -10,9 +10,9 @@ add %cx, %bx
# CHECK: Iterations: 1500
# CHECK-NEXT: Instructions: 4500
-# CHECK-NEXT: Total Cycles: 1507
+# CHECK-NEXT: Total Cycles: 7503
# CHECK-NEXT: Dispatch Width: 4
-# CHECK-NEXT: IPC: 2.99
+# CHECK-NEXT: IPC: 0.60
# CHECK-NEXT: Block RThroughput: 1.0
# CHECK: Instruction Info:
@ -53,30 +53,30 @@ add %cx, %bx
# CHECK-NEXT: - - 0.33 - 0.33 0.33 - - - - - - addw %cx, %bx
# CHECK: Timeline view:
-# CHECK-NEXT: 0123
-# CHECK-NEXT: Index 0123456789
+# CHECK-NEXT: 0123456789 01234567
+# CHECK-NEXT: Index 0123456789 0123456789
-# CHECK: [0,0] DeeeER . . imulw %ax, %bx
-# CHECK-NEXT: [0,1] DeeE-R . . lzcntw %ax, %bx
-# CHECK-NEXT: [0,2] D==eER . . addw %cx, %bx
-# CHECK-NEXT: [1,0] D===eeeER . . imulw %ax, %bx
-# CHECK-NEXT: [1,1] .DeeE---R . . lzcntw %ax, %bx
-# CHECK-NEXT: [1,2] .D==eE--R . . addw %cx, %bx
-# CHECK-NEXT: [2,0] .D===eeeER. . imulw %ax, %bx
-# CHECK-NEXT: [2,1] .DeeE----R. . lzcntw %ax, %bx
-# CHECK-NEXT: [2,2] . D=eE---R. . addw %cx, %bx
-# CHECK-NEXT: [3,0] . D===eeeER . imulw %ax, %bx
-# CHECK-NEXT: [3,1] . DeeE----R . lzcntw %ax, %bx
-# CHECK-NEXT: [3,2] . D==eE---R . addw %cx, %bx
-# CHECK-NEXT: [4,0] . D===eeeER . imulw %ax, %bx
-# CHECK-NEXT: [4,1] . DeeE----R . lzcntw %ax, %bx
-# CHECK-NEXT: [4,2] . D==eE---R . addw %cx, %bx
-# CHECK-NEXT: [5,0] . D====eeeER. imulw %ax, %bx
-# CHECK-NEXT: [5,1] . DeeE----R. lzcntw %ax, %bx
-# CHECK-NEXT: [5,2] . D==eE---R. addw %cx, %bx
-# CHECK-NEXT: [6,0] . D====eeeER imulw %ax, %bx
-# CHECK-NEXT: [6,1] . DeeE-----R lzcntw %ax, %bx
-# CHECK-NEXT: [6,2] . D=eE----R addw %cx, %bx
+# CHECK: [0,0] DeeeER . . . . . . . imulw %ax, %bx
+# CHECK-NEXT: [0,1] D==eeER . . . . . . . lzcntw %ax, %bx
+# CHECK-NEXT: [0,2] D====eER . . . . . . . addw %cx, %bx
+# CHECK-NEXT: [1,0] D=====eeeER . . . . . . imulw %ax, %bx
+# CHECK-NEXT: [1,1] .D======eeER . . . . . . lzcntw %ax, %bx
+# CHECK-NEXT: [1,2] .D========eER . . . . . . addw %cx, %bx
+# CHECK-NEXT: [2,0] .D=========eeeER . . . . . imulw %ax, %bx
+# CHECK-NEXT: [2,1] .D===========eeER . . . . . lzcntw %ax, %bx
+# CHECK-NEXT: [2,2] . D============eER . . . . . addw %cx, %bx
+# CHECK-NEXT: [3,0] . D=============eeeER . . . . imulw %ax, %bx
+# CHECK-NEXT: [3,1] . D===============eeER . . . . lzcntw %ax, %bx
+# CHECK-NEXT: [3,2] . D=================eER . . . . addw %cx, %bx
+# CHECK-NEXT: [4,0] . D=================eeeER . . . imulw %ax, %bx
+# CHECK-NEXT: [4,1] . D===================eeER . . . lzcntw %ax, %bx
+# CHECK-NEXT: [4,2] . D=====================eER . . . addw %cx, %bx
+# CHECK-NEXT: [5,0] . D======================eeeER . . imulw %ax, %bx
+# CHECK-NEXT: [5,1] . D=======================eeER . . lzcntw %ax, %bx
+# CHECK-NEXT: [5,2] . D=========================eER . . addw %cx, %bx
+# CHECK-NEXT: [6,0] . D==========================eeeER . imulw %ax, %bx
+# CHECK-NEXT: [6,1] . D============================eeER. lzcntw %ax, %bx
+# CHECK-NEXT: [6,2] . D=============================eER addw %cx, %bx
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
@ -85,6 +85,6 @@ add %cx, %bx
# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
# CHECK: [0] [1] [2] [3]
-# CHECK-NEXT: 0. 7 3.9 0.7 0.0 imulw %ax, %bx
-# CHECK-NEXT: 1. 7 1.0 1.0 3.6 lzcntw %ax, %bx
-# CHECK-NEXT: 2. 7 2.7 0.0 2.6 addw %cx, %bx
+# CHECK-NEXT: 0. 7 14.1 0.1 0.0 imulw %ax, %bx
+# CHECK-NEXT: 1. 7 15.9 0.0 0.0 lzcntw %ax, %bx
+# CHECK-NEXT: 2. 7 17.6 0.0 0.0 addw %cx, %bx

@ -5,9 +5,9 @@ lzcnt %ax, %bx ## partial register stall.
# CHECK: Iterations: 1500
# CHECK-NEXT: Instructions: 1500
-# CHECK-NEXT: Total Cycles: 379
+# CHECK-NEXT: Total Cycles: 1504
# CHECK-NEXT: Dispatch Width: 4
-# CHECK-NEXT: IPC: 3.96
+# CHECK-NEXT: IPC: 1.00
# CHECK-NEXT: Block RThroughput: 0.3
# CHECK: Instruction Info:
@ -44,16 +44,17 @@ lzcnt %ax, %bx ## partial register stall.
# CHECK-NEXT: - - 0.25 0.25 0.25 0.25 - - - - - - lzcntw %ax, %bx
# CHECK: Timeline view:
-# CHECK-NEXT: Index 012345
+# CHECK-NEXT: 01
+# CHECK-NEXT: Index 0123456789
-# CHECK: [0,0] DeeER. lzcntw %ax, %bx
-# CHECK-NEXT: [1,0] DeeER. lzcntw %ax, %bx
-# CHECK-NEXT: [2,0] DeeER. lzcntw %ax, %bx
-# CHECK-NEXT: [3,0] DeeER. lzcntw %ax, %bx
-# CHECK-NEXT: [4,0] .DeeER lzcntw %ax, %bx
-# CHECK-NEXT: [5,0] .DeeER lzcntw %ax, %bx
-# CHECK-NEXT: [6,0] .DeeER lzcntw %ax, %bx
-# CHECK-NEXT: [7,0] .DeeER lzcntw %ax, %bx
+# CHECK: [0,0] DeeER. .. lzcntw %ax, %bx
+# CHECK-NEXT: [1,0] D=eeER .. lzcntw %ax, %bx
+# CHECK-NEXT: [2,0] D==eeER .. lzcntw %ax, %bx
+# CHECK-NEXT: [3,0] D===eeER .. lzcntw %ax, %bx
+# CHECK-NEXT: [4,0] .D===eeER .. lzcntw %ax, %bx
+# CHECK-NEXT: [5,0] .D====eeER.. lzcntw %ax, %bx
+# CHECK-NEXT: [6,0] .D=====eeER. lzcntw %ax, %bx
+# CHECK-NEXT: [7,0] .D======eeER lzcntw %ax, %bx
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
@ -62,4 +63,4 @@ lzcnt %ax, %bx ## partial register stall.
# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
# CHECK: [0] [1] [2] [3]
-# CHECK-NEXT: 0. 8 1.0 1.0 0.0 lzcntw %ax, %bx
+# CHECK-NEXT: 0. 8 4.0 0.1 0.0 lzcntw %ax, %bx

@ -12,9 +12,9 @@ lzcnt 2(%rsp), %cx
# CHECK: Iterations: 1500
# CHECK-NEXT: Instructions: 4500
-# CHECK-NEXT: Total Cycles: 4507
+# CHECK-NEXT: Total Cycles: 10503
# CHECK-NEXT: Dispatch Width: 4
-# CHECK-NEXT: IPC: 1.00
+# CHECK-NEXT: IPC: 0.43
# CHECK-NEXT: Block RThroughput: 1.3
# CHECK: Instruction Info:
@ -46,7 +46,7 @@ lzcnt 2(%rsp), %cx
# CHECK: Resource pressure per iteration:
# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
-# CHECK-NEXT: 1.00 1.00 0.66 1.00 0.67 0.67 - - - - - 1.00
+# CHECK-NEXT: 1.00 1.00 0.67 1.00 0.67 0.67 - - - - - 1.00
# CHECK: Resource pressure by instruction:
# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] Instructions:
@ -55,21 +55,21 @@ lzcnt 2(%rsp), %cx
# CHECK-NEXT: 1.00 - 0.33 - 0.33 0.33 - - - - - - lzcntw 2(%rsp), %cx
# CHECK: Timeline view:
-# CHECK-NEXT: 012345678
-# CHECK-NEXT: Index 0123456789
+# CHECK-NEXT: 0123456789 0
+# CHECK-NEXT: Index 0123456789 0123456789
-# CHECK: [0,0] DeeeER . . . imull %edx, %ecx
-# CHECK-NEXT: [0,1] DeeeeeeER . . . lzcntw (%rsp), %cx
-# CHECK-NEXT: [0,2] .DeeeeeeER. . . lzcntw 2(%rsp), %cx
-# CHECK-NEXT: [1,0] .D======eeeER . . imull %edx, %ecx
-# CHECK-NEXT: [1,1] . DeeeeeeE--R . . lzcntw (%rsp), %cx
-# CHECK-NEXT: [1,2] . DeeeeeeE--R . . lzcntw 2(%rsp), %cx
-# CHECK-NEXT: [2,0] . D=======eeeER . imull %edx, %ecx
-# CHECK-NEXT: [2,1] . DeeeeeeE----R . lzcntw (%rsp), %cx
-# CHECK-NEXT: [2,2] . DeeeeeeE---R . lzcntw 2(%rsp), %cx
-# CHECK-NEXT: [3,0] . D=========eeeER imull %edx, %ecx
-# CHECK-NEXT: [3,1] . DeeeeeeE-----R lzcntw (%rsp), %cx
-# CHECK-NEXT: [3,2] . DeeeeeeE-----R lzcntw 2(%rsp), %cx
+# CHECK: [0,0] DeeeER . . . . . imull %edx, %ecx
+# CHECK-NEXT: [0,1] DeeeeeeER . . . . . lzcntw (%rsp), %cx
+# CHECK-NEXT: [0,2] .DeeeeeeER. . . . . lzcntw 2(%rsp), %cx
+# CHECK-NEXT: [1,0] .D======eeeER . . . . imull %edx, %ecx
+# CHECK-NEXT: [1,1] . D=====eeeeeeER . . . lzcntw (%rsp), %cx
+# CHECK-NEXT: [1,2] . D======eeeeeeER . . . lzcntw 2(%rsp), %cx
+# CHECK-NEXT: [2,0] . D===========eeeER. . . imull %edx, %ecx
+# CHECK-NEXT: [2,1] . D===========eeeeeeER . . lzcntw (%rsp), %cx
+# CHECK-NEXT: [2,2] . D===========eeeeeeER . . lzcntw 2(%rsp), %cx
+# CHECK-NEXT: [3,0] . D=================eeeER . imull %edx, %ecx
+# CHECK-NEXT: [3,1] . D================eeeeeeER. lzcntw (%rsp), %cx
+# CHECK-NEXT: [3,2] . D=================eeeeeeER lzcntw 2(%rsp), %cx
# CHECK: Average Wait times (based on the timeline view):
# CHECK-NEXT: [0]: Executions
@ -78,6 +78,6 @@ lzcnt 2(%rsp), %cx
# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage
# CHECK: [0] [1] [2] [3]
-# CHECK-NEXT: 0. 4 6.5 0.3 0.0 imull %edx, %ecx
-# CHECK-NEXT: 1. 4 1.0 1.0 2.8 lzcntw (%rsp), %cx
-# CHECK-NEXT: 2. 4 1.0 1.0 2.5 lzcntw 2(%rsp), %cx
+# CHECK-NEXT: 0. 4 9.5 0.3 0.0 imull %edx, %ecx
+# CHECK-NEXT: 1. 4 9.0 0.0 0.0 lzcntw (%rsp), %cx
+# CHECK-NEXT: 2. 4 9.5 0.0 0.0 lzcntw 2(%rsp), %cx
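
As a sanity check on the updated summaries, each IPC figure should equal Instructions divided by Total Cycles, printed to two decimals. A small sketch, assuming that relationship; the ipc helper and SUMMARIES table are hypothetical names, and the numbers are copied from the CHECK lines in the five tests.

```python
def ipc(instructions: int, total_cycles: int) -> str:
    """Format instructions-per-cycle with two decimals, as in the summary."""
    return f"{instructions / total_cycles:.2f}"


# (instructions, total cycles before, total cycles after), one row per test.
SUMMARIES = [
    (3, 8, 9),
    (4500, 1129, 4503),
    (4500, 1507, 7503),
    (1500, 379, 1504),
    (4500, 4507, 10503),
]

for instructions, before, after in SUMMARIES:
    print(f"IPC {ipc(instructions, before)} -> {ipc(instructions, after)}")
```

Every test slows down under the new register file definition, which is the expected effect of serialising partial writes on the previous writer of the full register.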