unittests/ASM: Adds unittest found from Ender Lilies that crashed with NZCV

SHA instructions currently generate very large code sequences, which
causes register spilling. Ender Lilies has a very large block in a
function called `sha1_block_data_order` that was causing FEX to spill
NZCV flags incorrectly. Before the NZCV optimizations existed, the
assumption held that every flag was either 1 bit in an 8-bit
container, or a plain 8-bit value (the x87 TOP flag).

NZCV host flags broke this assumption because they are 32-bit, which
caused failures whenever a spilling situation was encountered.
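
The width mismatch can be illustrated with a minimal C++ sketch. This is
not FEX's actual spill code: `SpillFill8`, `SpillFill32` and
`CheckSpillWidths` are hypothetical names used only for illustration. The
one architectural fact relied on is that AArch64 keeps N, Z, C and V in
bits 31..28 of the NZCV register, so a spill slot that keeps only the
lower 8 bits drops every flag.

```cpp
#include <cassert>
#include <cstdint>

// AArch64 NZCV layout: N = bit 31, Z = bit 30, C = bit 29, V = bit 28.
constexpr uint32_t NZCV_Z = 1u << 30;

// Hypothetical helper: spill and fill through an 8-bit slot, mirroring
// the old "all flags fit in 8 bits" assumption.
constexpr uint32_t SpillFill8(uint32_t nzcv) {
  const uint8_t slot = static_cast<uint8_t>(nzcv); // keeps bits 7..0 only
  return slot;
}

// Hypothetical helper: spill and fill through a full 32-bit slot.
constexpr uint32_t SpillFill32(uint32_t nzcv) {
  const uint32_t slot = nzcv;
  return slot;
}

// Checks that Z is lost through the 8-bit slot and kept through the
// 32-bit slot.
inline void CheckSpillWidths() {
  assert((SpillFill8(NZCV_Z) & NZCV_Z) == 0);
  assert((SpillFill32(NZCV_Z) & NZCV_Z) == NZCV_Z);
}
```

In the unit test, `dec rax` sets the zero flag, so a fill that loses Z
would plausibly make both `cmovne` and `jne` behave as "not equal", which
matches the infinite loop and out-of-bounds read the test describes.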
Ryan Houdek 2023-11-08 04:52:22 -08:00
parent bf147f47b5
commit 5bdd422db6


@@ -0,0 +1,49 @@
%ifdef CONFIG
{
"RegData": {
"RAX": "0",
"XMM0": ["0", "0"]
}
}
%endif
; FEX-Emu had a bug around NZCV flags getting spilled and filled.
; NZCV is actually 32-bit, but our IR incorrectly assumed that all flags were 8-bit.
; Once a spill situation happened, it would only store and reload the lower 8 bits of the NZCV flag, which wasn't correct.
; This caused this code to loop infinitely, read past the end of its data, and crash.
; Code found in Ender Lilies in their `sha1_block_data_order` function, which is significantly longer than this snippet.
lea rsi, [rel .data_vecs]
mov rax, 1
; Break visibility
jmp loop_top
loop_top:
; Decrement counter.
dec rax
; Load rsi + 0x40 into rbx
lea rbx, [rsi+0x40]
; Move rbx into rsi, incrementing the pointer by 64 bytes if rax isn't zero.
cmovne rsi, rbx
; Do a sha1rnds4, which uses enough temporaries to spill NZCV, which triggers the crash.
sha1rnds4 xmm0, xmm0, 0x0
; This memory access will crash once we loop too many times.
movdqu xmm0, [rsi]
; Jump back to the top
jne loop_top
hlt
.data_vecs:
dq 0, 0, 0, 0
dq 0, 0, 0, 0
dq 0, 0, 0, 0
dq 0, 0, 0, 0
dq 0, 0, 0, 0
dq 0, 0, 0, 0