Ryan Houdek
946c805d84
Merge pull request #3459 from Sonicadvance1/fix_591
...
Capture a 64-bit process trying to jump to 32-bit syscall handler
2024-02-26 21:57:03 -08:00
Ryan Houdek
0b34035085
Merge pull request #3439 from Sonicadvance1/allocate_first_4gb_of_64bit
...
FEXLoader: Allocate the second 4GB of virtual memory when executing 32-bit
2024-02-26 18:46:22 -08:00
Alyssa Rosenzweig
5f16f357af
Merge pull request #3456 from alyssarosenzweig/opt/adc
...
Optimize ADC
2024-02-26 09:39:37 -04:00
Ryan Houdek
4f028b8614
Capture a 64-bit process trying to jump to 32-bit syscall handler
...
Fixes #591
Adds a simple unittest
2024-02-26 05:37:29 -08:00
Alyssa Rosenzweig
7deb4976a3
RedundantFlagCalculationElimination: fix missing NEG case
...
can be predicated.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 17:37:16 -04:00
Alyssa Rosenzweig
1e153e0c81
OpcodeDispatcher: allow garbage with adcs
...
for the usual reasons
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:02 -04:00
Alyssa Rosenzweig
6994fc3a01
IR,OpcodeDispatcher,JIT: fuse adcs flags
...
The usual tricks, also requires introducing a bare adc op to optimize adcs to,
but we wanted that anyway!
Also support a zero source, so we can calculate "foo + CF" in one instruction to
optimize the "lock adc" cases.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:49:32 -04:00
Ryan Houdek
d703f3ccee
Fixes zero register flag generation
...
Fixes 140976d322
Adds a unit test to ensure it keeps working.
2024-02-24 16:32:25 -08:00
Alyssa Rosenzweig
80e632db8a
OpcodeDispatcher: garbage collect
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
852e3c4e93
OpcodeDispatcher: fuse XADD
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
e86547bbcb
OpcodeDispatcher: fuse INC
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
883cca2e8f
OpcodeDispatcher: use AddWithFlags
...
give it the same treatment we just gave sub.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
8540332520
IR: add AddWithFlags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
df5bdefb8a
OpcodeDispatcher: merge secondary ALU with primary ALU
...
It's the same, stop copypasting. This gets our flag and arithmetic opts (current
and future) applied to secondary ALU too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
3d1fb7701c
OpcodeDispatcher: optimize sub
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
140976d322
OpcodeDispatcher: prep primary ALU for better flags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
9a11d3b1a2
OpcodeDispatcher: fuse NEG
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
96e652879f
OpcodeDispatcher: fuse DEC
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
cc1c1dd047
OpcodeDispatcher: return result from SUB flag calculate
...
for fusion
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
c1d572951f
OpcodeDispatcher: drop unused GenerateFlags_SUB arg
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
dd9d3264dd
OpcodeDispatcher: smarten SUB flag generation
...
we don't need the result, we can use subs and come out ahead in practice. also a
step towards better fusion
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
2aaf957ad8
RedundantFlagCalculationElimination: DCE as we go
...
This is required to ensure single-iteration convergence with a sequence like:
write C
whatever = load C
rmif C, whatever
invalidate C
avoids regressing the "DEC dead" case with future work in the series.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
d459b2f9b5
IR: propagate 0 into sub
...
now that we have to handle it, we may as well take advantage of it.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
22ab7f2b3e
IR: add SubWithFlags op (arm64 subs)
...
with 8/16-bit handling to keep everything uniform.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
f7e32373ce
JIT: allow #0 in sub
...
turns into neg, this will be generated via SubWithFlags -> Sub opts.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
25d422e92b
JIT: use GetZeroableReg for NZCVSelect
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
cfbeece09f
JIT: use GetZeroableReg for CondAddNZCV
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
a597a09825
JIT: use GetZeroableReg for SubNZCV
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
99854ff310
JIT: add GetZeroableReg helper
...
for inlining constant zeroes in applicable sources
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Ryan Houdek
0a64f8a9c5
Moves SignalDelegator TLS tracking to the frontend
...
FEXCore doesn't need to track the TLS state of the SignalDelegator; this is
a frontend concept.
Removes the tracking from the backend and keeps it in the frontend.
2024-02-24 01:07:29 -08:00
Ryan Houdek
59ec88f48d
Merge pull request #3438 from Sonicadvance1/move_tls_allocation
...
Moves JITSymbol allocation
2024-02-23 14:49:04 -08:00
Ryan Houdek
6ec628fa31
Merge pull request #3433 from bylaws/arm64ec-pt1
...
Arm64Emitter: Introduce ARM64EC SRA mappings
2024-02-23 14:48:43 -08:00
Alyssa Rosenzweig
5378ae2e76
Merge pull request #3436 from alyssarosenzweig/ir/af-simplify
...
Simplify CalculateAF
2024-02-22 08:17:07 -04:00
Ryan Houdek
bd4a81a2a1
Moves JITSymbol allocation
...
This isn't actually using TLS allocations. Instead it is an allocation
tied to the InternalThreadState object.
2024-02-21 17:57:34 -08:00
Ryan Houdek
3d671cba10
Allocator: Early return if past the end of the allocation range
...
Fixes a bug where it would eventually hit the stack region and remap the
range as RW even for ranges that don't overlap the stack.
2024-02-21 15:37:46 -08:00
Ryan Houdek
d4be2dc636
Merge pull request #3434 from bylaws/arm64ec-pt3
...
FEXCore: Expose AbsoluteLoopTopAddress to the frontend
2024-02-21 14:31:04 -08:00
Alyssa Rosenzweig
2bcd285851
Merge pull request #3430 from Sonicadvance1/tsc_scale
...
Implement small TSC scaling
2024-02-21 13:16:27 -04:00
Alyssa Rosenzweig
8762bc1fa3
OpcodeDispatcher: simplify CalculateAF signature
...
- Res is unused
- SrcSize doesn't matter since we ignore the high bits, so always use 32-bit
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-21 12:48:15 -04:00
Billy Laws
5b4162b712
FEXCore: Expose AbsoluteLoopTopAddress to the frontend
...
ARM64EC has a shared SRA mapping between ARM64 and X64 code, so there
needs to be a public way to enter the dispatcher without refilling SRA
from the in-memory context struct.
2024-02-21 11:46:24 +00:00
Billy Laws
cb5c07f4b1
Arm64Emitter: Introduce ARM64EC SRA mappings
...
See https://learn.microsoft.com/en-us/cpp/build/arm64ec-windows-abi-conventions?view=msvc-170
note that since mm registers are volatile there is no need to match the
mapping for them when in JIT, so they can be used as scratch regs.
Disallowed regs are also wiped on context switches, so they cannot be
taken advantage of to e.g. avoid spilling.
2024-02-21 11:18:10 +00:00
Ryan Houdek
b902b8edab
Implement small TSC scaling
...
Game engines are expecting >1GHz cycle counters. Scale them to work
around the issue.
Resolves the excessive busy waiting in Unreal Engine 5 games.
2024-02-20 12:05:44 -08:00
Alyssa Rosenzweig
0503c89ff6
OpcodeDispatcher: use NZCV update helpers
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-19 14:12:54 -04:00
Alyssa Rosenzweig
6dd410698a
OpcodeDispatcher: add helpers for updating NZCV metadata
...
to reduce error-prone copypaste
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-19 14:12:54 -04:00
Ryan Houdek
808ced455d
FEXCore: Add a frontend pointer to InternalThreadState
...
FEXCore is guaranteed to not touch this pointer, so frontends can use it
to store thread-specific data.
2024-02-15 02:06:16 -08:00
Ryan Houdek
9cab746aa7
Merge pull request #3407 from neobrain/feature_libfwd_arguments_on_guest_stack
...
Library Forwarding: Allocate packed arguments on the guest stack if needed
2024-02-12 16:31:34 -08:00
Alyssa Rosenzweig
68232366e4
OpcodeDispatcher: don't mask add/sub sources
...
not needed in the new approach
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-12 12:36:28 -04:00
Alyssa Rosenzweig
d7ff1b78fb
IR: handle 8/16-bit AddNZCV/SubNZCV
...
we can do it more effectively than the current s/w lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-12 12:36:09 -04:00
Mai
780b48620b
Merge pull request #3420 from Sonicadvance1/preserve_all_3419
...
Fix #3419
2024-02-10 23:24:38 -05:00
Ryan Houdek
4a0878fa92
Fix #3419
2024-02-10 19:55:51 -08:00
Ryan Houdek
df3d6938ae
Merge pull request #3410 from alyssarosenzweig/opt/nzcv-pass-2
...
Add NZCV+PF/AF optimization pass
2024-02-10 05:03:12 -08:00