1519 Commits

Author SHA1 Message Date
Ryan Houdek
542ed8b6ad
Implement support for querying AES256 support
This is a different feature flag than regular AES as the default AES+AVX
only operates on 128-bit wide vectors.

With the newer `VAES` extension this is expanded to 256-bit.
2024-06-19 05:51:47 -07:00
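A minimal sketch of the distinction commit 542ed8b6ad describes, assuming the two features are read from the x86 CPUID bits for AES-NI and VAES. The struct and function names here are hypothetical and not taken from the FEX source.

```cpp
#include <cstdint>

// x86 CPUID feature bits:
//   AES-NI: CPUID.(EAX=1):ECX bit 25
//   VAES:   CPUID.(EAX=7,ECX=0):ECX bit 9
constexpr uint32_t kLeaf1EcxAES = 1u << 25;
constexpr uint32_t kLeaf7EcxVAES = 1u << 9;

// Hypothetical feature record, for illustration only.
struct ExampleAESFeatures {
  bool AES128; // AES on 128-bit vectors (AES-NI, also reachable through AVX encodings)
  bool AES256; // AES on 256-bit vectors (VAES)
};

// Decodes the two independent flags from raw CPUID register values.
constexpr ExampleAESFeatures DecodeAESFeatures(uint32_t Leaf1Ecx, uint32_t Leaf7Ecx) {
  return ExampleAESFeatures{
    (Leaf1Ecx & kLeaf1EcxAES) != 0,
    (Leaf7Ecx & kLeaf7EcxVAES) != 0,
  };
}
```

Keeping the flags separate means a caller can see 128-bit AES without assuming the 256-bit form is available.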
Paulo Matos
2483329ef6 Fixes AFP.NEP handling on scalar insertions
Fixes #3690

When doing scalar insertions, upper bits come from different arguments
depending on the operation. These are listed in the ARM spec under the
NEP bit documentation.
2024-06-19 10:02:54 +02:00
Paulo Matos
359221b379 Set tag properly in X87 FST(reg) 2024-06-19 10:02:05 +02:00
Paulo Matos
f9b38a1de7 FXCH should set C1 to zero 2024-06-19 08:57:48 +02:00
Ryan Houdek
67e1ac0442
Merge pull request #3725 from alyssarosenzweig/ir/vbic
IR: rename _VBic -> _VAndn
2024-06-18 16:34:26 -07:00
Ryan Houdek
c57e9e008f
Merge pull request #3723 from alyssarosenzweig/fexcore/zero-helper
OpcodeDispatcher: refactor zero vector loads
2024-06-18 16:34:15 -07:00
Ryan Houdek
b34c23fe3d
HostFeatures: Work around Qualcomm Oryon RNG errata
The Oryon is the first CPU we know of that implemented support for the
RNG extension. It also has an erratum where reading the RNDRRS register
never returns success. X86's RDSEED guarantees forward progress with
enough retries.

When an x86 processor messed this up at one point, some Linux systems
would infinite loop (presumably when something in boot was filling an
entropy pool). This required a microcode change to fix that processor.

The rdseed unittest infinite loops on this platform if RNG is exposed.
2024-06-18 16:29:53 -07:00
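The failure mode in commit b34c23fe3d can be illustrated with a retry loop. This sketch is not the workaround the commit applies (its title places that in HostFeatures); it only shows why a seed source that never reports success stalls an unbounded loop. `ReadSeedWithRetries` and the `TryRead` callable are hypothetical.

```cpp
#include <cstdint>
#include <optional>

// Calls TryRead up to MaxRetries times and returns the first successful value.
// TryRead is any callable returning std::optional<uint64_t>; an empty optional
// models a failed hardware read.
template <typename Source>
std::optional<uint64_t> ReadSeedWithRetries(Source&& TryRead, unsigned MaxRetries) {
  for (unsigned Attempt = 0; Attempt < MaxRetries; ++Attempt) {
    if (std::optional<uint64_t> Value = TryRead()) {
      return Value;
    }
  }
  // Without the bound, a source that never succeeds would loop forever.
  return std::nullopt;
}
```

A caller that assumes forward progress, as RDSEED users are described as doing above, has no such bound, which is why exposing the broken register hangs the test.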
Alyssa Rosenzweig
01da5972fc IR: rename _VBic -> _VAndn
to be consistent with the scalar _Andn opcode, which is specifically named _Andn
and not _Bic.

noticed while reviewing AVX patches

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-18 14:00:01 -04:00
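For readers unfamiliar with the two names in commit 01da5972fc: both describe an AND with one operand complemented. The sketch below shows the x86 ANDN and AArch64 BIC instruction semantics only; it makes no claim about the operand order of the FEX IR opcodes.

```cpp
#include <cstdint>

// x86 BMI1 ANDN: dest = ~src1 & src2 (the first source is inverted).
constexpr uint64_t X86Andn(uint64_t Src1, uint64_t Src2) {
  return ~Src1 & Src2;
}

// AArch64 BIC: Rd = Rn & ~Rm (the second source is inverted).
constexpr uint64_t Arm64Bic(uint64_t Rn, uint64_t Rm) {
  return Rn & ~Rm;
}

// Same operation with the operands swapped.
static_assert(X86Andn(0b1100, 0b1010) == 0b0010, "ANDN clears bits set in Src1");
static_assert(Arm64Bic(0b1010, 0b1100) == 0b0010, "BIC clears bits set in Rm");
```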
Ryan Houdek
bf812aae8f CoreState: Adds avx_high structure for tracking decoupled AVX halves.
Needed something in between the `InlineJITBlockHeader` and `avx_high` in
order to match alignment requirements of 16-byte for avx_high. Chose the
`DeferredSignalRefCount` because we hit it quite frequently and it is
basically the only 64-bit variable that we end up touching
significantly.

In the future the CPUState object is going to need to change its view of
the object depending on if the device supports SVE256 or not, but we
don't need to frontload the work right now. It'll become significantly
easier to support that path once the RCLSE pass gets deleted.
2024-06-18 12:00:45 -04:00
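The alignment constraint from commit bf812aae8f can be sketched as below. The field sizes, and the assumption that `InlineJITBlockHeader` occupies 8 bytes, are illustrative guesses rather than the real CoreState layout.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout; field sizes are assumptions made for illustration.
struct ExampleCoreState {
  uint64_t InlineJITBlockHeader;        // assumed 8 bytes, first in the struct
  uint64_t DeferredSignalRefCount;      // 64-bit field that fills the gap
  alignas(16) uint8_t avx_high[16][16]; // upper 128 bits of 16 AVX registers
};

// The 64-bit field brings the running offset to 16, so avx_high needs no padding.
static_assert(offsetof(ExampleCoreState, avx_high) % 16 == 0,
              "avx_high must be 16-byte aligned");
static_assert(offsetof(ExampleCoreState, avx_high) == 16,
              "no padding bytes before avx_high");
```

Placing a frequently used 8-byte field in the gap avoids wasting the bytes that padding would otherwise consume.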
Ryan Houdek
9a71443005 CoreState: Adds a gregs offset check
This is required to be less than the maximum range for LDP and STP in
the Arm64 Dispatcher, otherwise it breaks. Necessary to ensure this when
reorganizing the CoreState.
2024-06-18 12:00:45 -04:00
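A compile-time check along the lines of commit 9a71443005 might look like the sketch below. It assumes the dispatcher uses the 64-bit signed-offset form of LDP/STP, whose 7-bit immediate is scaled by 8 and so covers byte offsets from -512 to 504. The struct and the helper name are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real state layout.
struct ExampleStateWithGregs {
  uint64_t InlineJITBlockHeader;
  uint64_t DeferredSignalRefCount;
  alignas(16) uint8_t avx_high[16][16];
  uint64_t gregs[16];
};

// True if a non-negative byte offset is encodable by 64-bit LDP/STP
// (signed offset form: multiple of 8, at most 504).
constexpr bool FitsLdpStp64(std::size_t Offset) {
  return Offset % 8 == 0 && Offset <= 504;
}

// The last pair starts at gregs[14], so that is the largest offset to check.
static_assert(FitsLdpStp64(offsetof(ExampleStateWithGregs, gregs) + 14 * sizeof(uint64_t)),
              "gregs must stay reachable by LDP/STP immediate offsets");
```

With a check like this, reordering the struct so that `gregs` drifts out of range fails at build time rather than at runtime.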
Ryan Houdek
ee165249bc Dispatcher: Fix ARM64EC
We don't have CI for this and it was missed.
2024-06-18 12:00:45 -04:00
Alyssa Rosenzweig
af8cfb79e5 OpcodeDispatcher: refactor zero vector loads
AVX128 is going to slam this, so make it more ergonomic.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-18 11:44:46 -04:00
Ryan Houdek
13ebfb1a49
Merge pull request #3711 from Sonicadvance1/avx128_2
FEXCore: Disentangle the SVE256 feature from AVX
2024-06-17 17:35:15 -07:00
Ryan Houdek
f863b30951
Merge pull request #3716 from alyssarosenzweig/ir-dump/unrecoverable
json_ir_generator: don't print unrecoverable temps
2024-06-17 17:25:27 -07:00
Ryan Houdek
1ce27a5e6b
FEXCore: Disentangle the SVE256 feature from AVX
In quite a few locations we are mixing the case that SVE256 == AVX or
that AVX means the guest register size is 256-bit.

While this is true today, this entanglement is going to change very
quickly and cause confusion in follow-up PRs.

Now we have SVE128, SVE256, and SVE2 HostFeatures to disambiguate the
different features which mean different things.

This PR keeps the alias that `SupportsAVX` = `SupportsSVE256 && SupportsSVE2`
but that alias is going to very quickly change its definition.
2024-06-17 17:20:32 -07:00
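The alias described in commit 1ce27a5e6b can be sketched as a derived query over separate host flags. The struct below is hypothetical; in particular, expressing `SupportsAVX` as a member function rather than a stored field is an assumption made for illustration.

```cpp
// Hypothetical host feature record with the SVE flags kept separate.
struct ExampleHostFeatures {
  bool SupportsSVE128 = false;
  bool SupportsSVE256 = false;
  bool SupportsSVE2 = false;

  // Alias as stated in the commit message; expected to change definition later.
  constexpr bool SupportsAVX() const {
    return SupportsSVE256 && SupportsSVE2;
  }
};
```

Code that actually cares about the host vector width can then test `SupportsSVE256` directly instead of inferring it from AVX support.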
Ryan Houdek
933d622860
Merge pull request #3710 from Sonicadvance1/avx128_1
CoreState: Move `InlineJITBlockHeader` to the start of the struct
2024-06-17 17:17:56 -07:00
Alyssa Rosenzweig
29390b439a json_ir_generator: don't print unrecoverable temps
this makes the print more noisy for no benefit, don't do it.

before:

    %9(GPRFixed16) i32 = Add OpSize:Tmp:Size, %6(GPRFixed0) i64, %17(Invalid)
    %10(GPR0) i64 = Bfi OpSize:Tmp:Size, #0x10, #0x0, %6(GPRFixed0) i64, %9(GPRFixed16) i32
    (%11 i64) StoreRegister %6(GPRFixed0) i64, #0x11, GPR, u8:Tmp:Size
    (%12 i64) StoreRegister %9(GPRFixed16) i32, #0x10, GPR, u8:Tmp:Size
    (%13 i64) StoreRegister %10(GPR0) i64, #0x0, GPR, u8:Tmp:Size

after:

    %9(GPRFixed16) i32 = Add %6(GPRFixed0) i64, %17(Invalid)
    %10(GPR0) i64 = Bfi #0x10, #0x0, %6(GPRFixed0) i64, %9(GPRFixed16) i32
    (%11 i64) StoreRegister %6(GPRFixed0) i64, #0x11, GPR
    (%12 i64) StoreRegister %9(GPRFixed16) i32, #0x10, GPR
    (%13 i64) StoreRegister %10(GPR0) i64, #0x0, GPR

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 14:58:56 -04:00
Alyssa Rosenzweig
799c17eb90 Arm64Emitter: drop out of date comment
I fixed this when we landed the new RA

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 14:58:08 -04:00
Alyssa Rosenzweig
5fb84866e0 json_ir_generator: rework argument printing
for next commit

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 14:40:29 -04:00
Alyssa Rosenzweig
4965344ef5
Merge pull request #3705 from alyssarosenzweig/pre-rclse
Clean ups from my RCLSE branch
2024-06-17 14:22:01 -04:00
Alyssa Rosenzweig
46ca53ad0d
Merge pull request #3704 from alyssarosenzweig/ra/spill-better
RA: prioritize remat over spilling
2024-06-17 09:01:50 -04:00
Alyssa Rosenzweig
61ff1b3584
Merge pull request #3712 from alyssarosenzweig/jit/silly-assert
JIT: delete silly assert
2024-06-17 08:59:00 -04:00
Alyssa Rosenzweig
7c0c5de4bd JIT: delete silly assert
noticed in the area.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-17 08:51:22 -04:00
Ryan Houdek
a9bacc1b6b
CoreState: Move InlineJITBlockHeader to the start of the struct
This currently doesn't do much but soon this will be very important to
ensure the data prefetcher of Cortex keeps the cachelines following this
variable in L1.
2024-06-17 02:59:56 -07:00
Alyssa Rosenzweig
9443b18076 RegisterAllocationPass: optimize spill loop
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-16 08:15:15 -04:00
Alyssa Rosenzweig
4bd84eb523 OpcodeDispatcher: extract PF/AF invalidate helpers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
e2073dcd30 OpcodeDispatcher: extract safe Thunk
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
fd72669c7e OpcodeDispatcher: extract safe Break
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
81c144697b OpcodeDispatcher: extract safe ExitFunction
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
aecf180dfe OpcodeDispatcher: extract FlushRegisterCache
The "end the clause" signal. for now just flushes flags.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
10fa4a4f20 OpcodeDispatcher: remove never-gonna-be-done todo
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:47 -04:00
Alyssa Rosenzweig
534732564b OpcodeDispatcher: drop pointless thunks for packss
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:23:46 -04:00
Alyssa Rosenzweig
6a314bc9cd RegisterAllocationPass: prioritize remat over spilling
No instcountci changes yet, since nothing currently spills in instcountci. This
mitigates spilling later seen with #3703, and should help for certain
pathological blocks even without those changes (maybe we should try to get some
of those blocks in instcountci?).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-15 20:21:57 -04:00
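The idea behind commit 6a314bc9cd, preferring values that can be recomputed over values that must be stored to the stack, can be sketched as a victim-selection function. The data structure and the furthest-next-use tie-break are assumptions for illustration and are not taken from the FEX register allocator.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical description of a value currently held in a register.
struct LiveValue {
  unsigned Id;
  bool Rematerializable; // e.g. a constant that can be re-emitted instead of reloaded
  unsigned NextUse;      // distance to the next use; larger means later
};

// Picks the index of the value to evict: rematerializable values first,
// then the one whose next use is furthest away.
std::optional<std::size_t> PickEvictionCandidate(const std::vector<LiveValue>& Live) {
  std::optional<std::size_t> Best;
  for (std::size_t i = 0; i < Live.size(); ++i) {
    if (!Best) {
      Best = i;
      continue;
    }
    const LiveValue& B = Live[*Best];
    const LiveValue& C = Live[i];
    if (C.Rematerializable != B.Rematerializable) {
      if (C.Rematerializable) {
        Best = i;
      }
      continue;
    }
    if (C.NextUse > B.NextUse) {
      Best = i;
    }
  }
  return Best;
}
```

Evicting a rematerializable value costs no store, and the value can be recreated at its next use instead of being reloaded.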
Ryan Houdek
1d1ed012d8
FEXCore: Fixes Call with 32-bit displacement and address size override
FEX had a bug with this instruction where it was incorrectly using both
the address size override and operand size override to truncate the
immediate offset. This isn't how the instruction should behave as it
should actually ignore the address size override.

This now puts it correctly in line with how the jump instruction works
and adds a unit test to ensure it doesn't break again.

This fixes a crash from the Arch rootfs from the glibc dynamic linker
being compiled in a way where a call instruction was getting aligned
using this prefix (Since the compiler knew it does nothing).
2024-06-14 14:00:35 -07:00
Lioncache
d133fa6dc1 ASIMD Tests: Remove erroneous disassembly tests
The vixl disassembler has gotten more strict about certain instruction types, so these tests
aren't really needed.

Alternatively, we could mark them as unallocated, but we can opt to remove them here.
2024-06-14 16:12:21 -04:00
Ryan Houdek
184c9d21bb
Revert "OpcodeDispatcher: optimize logical flags"
This reverts commit bb8336fcad9cf5619215e5a9f765ca48c7d48970.
2024-06-13 19:28:16 -07:00
Alyssa Rosenzweig
a8bf3859ea ConstProp: rm pointless constant folding
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
aa7dcffcea ConstProp: drop const pool heuristic
slightly worse for compile time, slightly better output, honestly I'll take the
win because this is easier to reason about.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
be1a5cea8e ConstProp: drop addressgen const pool stuff
I don't get the point, it should be handled by a combination of existing
passes/techniques just fine. no instcountci changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
402ea84aa0 RedundantFlagCalculationElimination: cleanup DCE
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
19a7b06b91 ConstProp: swallow up LongDivideElimination
as usual.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
96bd643e5b ConstProp: always inline constants
x86/interpreter leftover, I think.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
6b9293979c ConstProp: swallow up InlineCallOptimization
No reason to have a separate pass for this, merging should be a bit faster since
it eliminates an IR walk.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
7d5cee4384 InlineCallOptimization: rm x86 leftover
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
32f5a28433 IR: use Ref instead of OrderedNode
find-and-replace across the tree, excluding IR.h itself.

also excluded IRValidation because its treatment of blocks blows up and will be
reformed in the new IR anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-03 12:19:34 -04:00
Alyssa Rosenzweig
ce30179ed1 IR: add Ref typedef
To put new IR lipstick on the old IR pig.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-03 12:19:34 -04:00
Alyssa Rosenzweig
a515b707f3
Merge pull request #3679 from Sonicadvance1/memory_model_emulation_programmer_documentation
FEXCore/docs: Adds programmer documentation about memory model emulation
2024-06-03 09:24:37 -04:00
Alyssa Rosenzweig
951fee361f OpcodeDispatcher: optimize shld
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 14:44:24 -04:00
Alyssa Rosenzweig
abfd974d70 OpcodeDispatcher: select hardware addressing modes
Now that we have a framework to do this in.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:50 -04:00
Alyssa Rosenzweig
97966930e9 OpcodeDispatcher/x87f64: fuse addr calc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-01 09:42:33 -04:00