1830 Commits

Author SHA1 Message Date
Ryan Houdek
a82fcdecd7
PR review comments 2024-08-16 10:21:18 -07:00
Ryan Houdek
27acbe305d
ArchHelpers: Adjust ClearICache for its usage
Instead of clearing a hardcoded 16 bytes, adjust for the actual number
of instructions modified. The implementation will still only clear a
single cacheline so it doesn't change behaviour.
2024-08-16 10:21:18 -07:00
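A minimal sketch of the idea in the commit above, assuming a GCC/Clang
toolchain (for `__builtin___clear_cache`) and 4-byte AArch64 instructions.
`PatchedBytes` and `ClearICacheFor` are hypothetical names, not FEX's actual
helpers, and the sketch does not model the single-cacheline behaviour the
commit mentions.

```cpp
#include <cstddef>

// AArch64 instructions are fixed-width, 4 bytes each.
constexpr std::size_t kInstructionSize = 4;

// Number of bytes covered by `instruction_count` patched instructions.
constexpr std::size_t PatchedBytes(std::size_t instruction_count) {
  return instruction_count * kInstructionSize;
}

// Hypothetical helper: flush only the range that was actually modified,
// instead of a hardcoded 16 bytes.
inline void ClearICacheFor(void* start, std::size_t instruction_count) {
  char* begin = static_cast<char*>(start);
  char* end = begin + PatchedBytes(instruction_count);
  __builtin___clear_cache(begin, end);  // GCC/Clang builtin
}
```

With this shape, a two-instruction backpatch requests 8 bytes rather than 16.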
Ryan Houdek
cd0739a534
Arm64Helpers: Moves instruction definitions to implementation
These used to exist in the FEXCore header since the unaligned handler was
done in the frontend. Once the handler got moved into FEXCore, the
definitions stayed in the header. Move them over now.
2024-08-16 10:21:18 -07:00
Ryan Houdek
f1055d0713
Arm64: On backpatch ensure DMB instructions get patched in first
In the case of a visibility tear when one thread is backpatching while
another is executing, the executing thread can /potentially/ see the
writing of instructions depending on coherency rules or filling of
cachelines.

Backpatching the DMB instructions over the NOP instructions first
ensures correct atomic visibility even on tear.
2024-08-16 10:21:18 -07:00
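The write ordering described in the commit above could look roughly like the
following sketch. `StoreSite` and `BackpatchStore` are hypothetical, the
two-word layout (barrier slot followed by the memory op) is an assumption
based on the store case, and the choice of `DMB ISH` is also assumed. C++
release stores only model the order of the writes here; they say nothing
about what the instruction fetch of another core observes.

```cpp
#include <atomic>
#include <cstdint>

constexpr uint32_t kNop = 0xD503201F;     // AArch64 NOP
constexpr uint32_t kDmbIsh = 0xD5033BBF;  // AArch64 DMB ISH (assumed barrier)

// Hypothetical patch site for a store: [barrier slot][memory op].
struct StoreSite {
  std::atomic<uint32_t> barrier;  // NOP before patching, DMB after
  std::atomic<uint32_t> memop;    // atomic store before, plain store after
};

// Write the DMB over the NOP first, then swap in the non-atomic store.
// An observer between the two writes sees (DMB, atomic store), which is
// still correct. The reverse order could expose (NOP, plain store).
inline void BackpatchStore(StoreSite& site, uint32_t plain_store) {
  site.barrier.store(kDmbIsh, std::memory_order_release);
  site.memop.store(plain_store, std::memory_order_release);
}
```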
Ryan Houdek
7875b20594
Arm64: Handle backpatching in a thread-safe manner
When code buffers are shared between threads, FEX needs to be careful
around backpatching its code buffers, since one thread might have
backpatched the code that another thread was also planning on
backpatching.

To handle this case, when the handler fails to find a backpatchable
instruction, check if it was already backpatched. This can be determined
by atomically reading the instructions back and seeing if they have
turned into the non-atomic variants.

In most cases we can just return saying that it has been handled. In the
case of a store, we need to back the PC up 4 bytes to ensure the DMB is
executed before the non-atomic store.
2024-08-16 10:21:18 -07:00
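The decision described in the commit above can be sketched as below. The
instruction decode is assumed to have happened already; `InstKind`, `Outcome`
and `ResolveAlreadyPatched` are hypothetical names, not FEX's real handler
interface.

```cpp
#include <cstdint>

// Hypothetical classification of the instruction read back at the fault PC.
enum class InstKind { AtomicLoad, AtomicStore, PlainLoad, PlainStore, Other };

struct Outcome {
  bool handled;        // true if another thread already backpatched the site
  uint64_t resume_pc;  // where execution should continue
};

// Called after the handler failed to find a backpatchable instruction.
inline Outcome ResolveAlreadyPatched(InstKind kind, uint64_t pc) {
  switch (kind) {
    case InstKind::PlainLoad:
      // Already patched: just report it as handled and retry in place.
      return {true, pc};
    case InstKind::PlainStore:
      // Already patched: back up one instruction (4 bytes) so the DMB
      // runs before the non-atomic store.
      return {true, pc - 4};
    default:
      // Not a non-atomic variant, so this was not a lost backpatch race.
      return {false, pc};
  }
}
```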
Ryan Houdek
25679cd319
Arm64: Moves some unaligned atomic handling before mutex lock
These handlers don't do any code backpatching so locking the spinlock
futex isn't necessary. Move them before the lock to make them a bit more
efficient once code buffers get shared.
2024-08-16 10:21:18 -07:00
Ryan Houdek
fa45ec32db
Arm64: Early exit in unaligned handler if not on arm64
NFC, just removes a level of indent
2024-08-16 10:21:18 -07:00
Ryan Houdek
ef4c4f6e9b
FEXCore: Disable vixl linking if vixl disasm or simulator is disabled
This was mostly there; it just needed some extraneous headers removed and
vixl only inserted into the library list if the options were enabled.
2024-08-16 07:29:41 -07:00
Davide Cavalca
75687070a5 Install libraries in the correct location 2024-08-16 10:04:25 -04:00
Ryan Houdek
2eb7a9ff28
OpcodeDispatcher: Remove old bad assumption in INC/DEC
Somewhere there was an assumption made that INC and DEC supported the
repeat prefix. This isn't actually the case. While the prefix can be
encoded, it is a nop and should only be expected as padding.

Adds a unittest to ensure that behaviour is as expected.
2024-08-15 10:22:31 -07:00
Ryan Houdek
933c65d805
Merge pull request #3956 from alyssarosenzweig/opt/pop-return
small optimizations for returns
2024-08-15 03:30:23 -07:00
Ryan Houdek
df0ecad15b
Merge pull request #3933 from bylaws/arm64-suspend
Support cooperative suspend on ARM64EC
2024-08-15 01:22:28 -07:00
Alyssa Rosenzweig
2dc92c122e BranchOps: micro-optimize ExitFunction
this should be slightly faster on Firestorm and no worse on recent cortex

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 16:59:34 -04:00
Alyssa Rosenzweig
0db17bd58a OpcodeDispatcher: optimize IRET
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 16:59:34 -04:00
Alyssa Rosenzweig
aac16493f1 OpcodeDispatcher: optimize RET
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 16:59:34 -04:00
Ryan Houdek
aa5d2ff31c
Merge pull request #3951 from alyssarosenzweig/opt/pops
Add a hack for multiple destinations & make good use of it
2024-08-14 12:03:00 -07:00
Alyssa Rosenzweig
c70e44cf05 OpcodeDispatcher: extract Push helper
Mirrors the Pop helper we added. This cleans up a bunch

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
ede13e37d9 OpcodeDispatcher: optimize Thunk
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
a7dada457a OpcodeDispatcher: optimize POPF
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
6367554d30 OpcodeDispatcher: optimize LEAVE
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
81969a684e OpcodeDispatcher: optimize pop segment
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
ee339b5960 OpcodeDispatcher: optimize POPA
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
881c940693 OpcodeDispatcher: optimize POP
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
900c62fa7b OpcodeDispatcher: add Pop helpers
hide away the allocate dance

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
200c6c054f IR: introduce POP operation
rmw on a source, kind of terrible.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
bc0927b7b1 JIT: optimize moves for cmpxchg
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
231c28395c RegisterAllocationPass: clean up after pair removal
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
64a45c0d29 IR: remove pairs
They're now unused. And won't be missed.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
cab02be637 IR: remove unused pair create/extract
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
74f341bc0e IR: remove pair from cpuid
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
feaa1af1a8 IR: remove pair from XGetBV
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:37:06 -04:00
Alyssa Rosenzweig
d59d040b4e IR: add design doc breadcrumb
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
a9c26cbf71 IR: add coalescing heuristics for pair replacements
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
f4b5c4e69a OpcodeDispatcher: allow upper garbage for cmpxchg
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
b8cac9f7d5 JIT: avoid some moves with caspal
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
13974df204 IR: drop CASPair pair result
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
fa6fe9bf06 IR: drop cmppairz pairs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
746be0824e IR: drop CASPair source pairs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
93120cabbb IR: drop pair from memcpy
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
04ae05f4ce IR: add hack for multiple destinations
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
83773dddc7 IR: fix size for cmppairz
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
Alyssa Rosenzweig
9813553f02 json_ir_generator: introduce multidestination hack
We now have two types of destinations:

* regular destinations. These are SSA. You get exactly 1 per instruction. This
  is what almost every instruction should use.

* special destinations, introduced here. These are *not* SSA. They must be
  allocated with a special instruction (added later in this PR), and then they
  are mutated by the instruction. There are two types, either pure destinations
  ("out") or read-modify-write source+destinations ("in-out"). The former are
  useful for instructions that return multiple destinations, like Memcpy. The
  latter are useful for instructions that need a source tied with a special
  destination (currently just Pop, introduced later in this series).

Special destinations reuse the mechanism of sources, to get around the
limitations on regular destinations in our current IR. Ops with special
destinations desugar to ops with no destination but extra sources prefixed Out
or Inout.

They further require HasSideEffects so we don't optimize ourselves into corners.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-14 09:17:23 -04:00
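The desugaring described in the commit above might be sketched like this. The
real generator is a script working on JSON op definitions; the `OpDesc` and
`LoweredOp` structs, the `Desugar` function, and the ordering of the appended
sources are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical description of an op that uses special destinations.
struct OpDesc {
  std::string name;
  std::vector<std::string> sources;
  std::vector<std::string> outs;    // pure special destinations ("out")
  std::vector<std::string> inouts;  // read-modify-write ("in-out")
};

// Hypothetical lowered form: no regular destination, extra sources instead.
struct LoweredOp {
  std::string name;
  std::vector<std::string> sources;
  bool has_dest;
  bool has_side_effects;
};

// Special destinations become sources prefixed "Out" or "Inout", the op
// loses its regular destination, and HasSideEffects is forced on.
inline LoweredOp Desugar(const OpDesc& op) {
  LoweredOp lowered{op.name, op.sources, false, true};
  for (const auto& out : op.outs) {
    lowered.sources.push_back("Out" + out);
  }
  for (const auto& inout : op.inouts) {
    lowered.sources.push_back("Inout" + inout);
  }
  return lowered;
}
```

For example, a hypothetical op with source `A`, out `X` and in-out `Y` would
lower to an op with sources `A`, `OutX`, `InoutY` and no destination.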
Billy Laws
890e5e1f0f FEXCore: Support disabling host cacheline clean/clear operations 2024-08-13 13:25:34 +00:00
Alyssa Rosenzweig
40812efaae OpcodeDispatcher: better handle SIB indexing
if we have shift and a constant, we can save an instruction

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-13 08:41:16 -04:00
Alyssa Rosenzweig
3429321d59 OpcodeDispatcher: allow upper garbage for MOVGPR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-13 08:41:16 -04:00
Alyssa Rosenzweig
6cddd6cbe7 OpcodeDispatcher: allow upper garbage for a2/a3
stores mask.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-13 08:41:16 -04:00
Alyssa Rosenzweig
23d07d7d0c OpcodeDispatcher: fix folding negative offsets for 32-bit
I don't know what I was thinking when I wrote that code. Drop the silly logic
and let ConstProp inline the immediates. This fixes a lot of silly code
generated for 32-bit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-13 08:41:16 -04:00
Ryan Houdek
8aa7d1a278
Merge pull request #3939 from alyssarosenzweig/opt/cfinv
Invert carry flag internally
2024-08-13 01:05:54 -07:00
Ryan Houdek
f3811f04bd
Config: Little assume non-null check
Removes a simple runtime nullcheck in Config::Layer::Set, since we never
pass a nullptr to this.
2024-08-11 10:25:53 -07:00
Alyssa Rosenzweig
91f4c54768 OpcodeDispatcher: optimize RDRAND on flagm
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-08-10 15:21:08 -04:00