Commit Graph

978 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
49e798ab2b
Merge pull request #3461 from alyssarosenzweig/opt/sbc
Optimize SBC
2024-02-27 11:29:45 -04:00
Ryan Houdek
946c805d84
Merge pull request #3459 from Sonicadvance1/fix_591
Capture a 64-bit process trying to jump to 32-bit syscall handler
2024-02-26 21:57:03 -08:00
Ryan Houdek
0b34035085
Merge pull request #3439 from Sonicadvance1/allocate_first_4gb_of_64bit
FEXLoader: Allocate the second 4GB of virtual memory when executing 32-bit
2024-02-26 18:46:22 -08:00
Alyssa Rosenzweig
12cc980603 OpcodeDispatcher: shuffle adc flag order
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
f3d55dd721 OpcodeDispatcher: shuffle SBC flag order
avoids clobbering nzcv

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
91cef6b76f OpcodeDispatcher: use native ADC even for 8/16-bit
we mask off the upper bits, and they agree in the lower bits.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
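
The reasoning in 91cef6b76f ("we mask off the upper bits, and they agree in the lower bits") can be checked with a small model. This is only a sketch in Python, not FEX code: the function names are made up for illustration, and it covers the arithmetic result only, not how flags are derived for 8/16-bit operations.

```python
def adc_narrow(a, b, cf, bits):
    """Reference: add-with-carry performed directly at the narrow width."""
    mask = (1 << bits) - 1
    return ((a & mask) + (b & mask) + cf) & mask


def adc_wide_then_mask(a, b, cf, bits):
    """Add at 32 bits (upper garbage allowed), then keep only the low bits."""
    wide = (a + b + cf) & 0xFFFFFFFF
    return wide & ((1 << bits) - 1)


# The upper bits of both inputs are deliberately garbage.
a, b = 0xDEADBE7F, 0x12345681
assert adc_wide_then_mask(a, b, 1, 8) == adc_narrow(a, b, 1, 8)
assert adc_wide_then_mask(a, b, 1, 16) == adc_narrow(a, b, 1, 16)
```

Carries only propagate upward, so whatever sits above the operand width cannot change the low bits of the sum; that is why masking after a full-width add is enough.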
Alyssa Rosenzweig
270cbf39b5 OpcodeDispatcher: specialize SALC
this gets rid of the awkward non-flag SBB case, which streamlines SBB, while
getting better codegen for the demon opcode (-:

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:33:41 -04:00
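
For context on 270cbf39b5: SALC is the undocumented x86 opcode that sets AL to 0xFF when CF is set and to 0x00 otherwise, leaving flags alone. Its result matches `sbb al, al` without the flag update, which is presumably the "non-flag SBB case" the message refers to. The sketch below only illustrates that equivalence; the helper names are hypothetical and it says nothing about how FEX lowers either instruction.

```python
def sbb8(dst, src, cf):
    """8-bit subtract-with-borrow result (flags not modeled)."""
    return (dst - src - cf) & 0xFF


def salc(cf):
    """SALC: AL becomes 0xFF when CF is set, otherwise 0x00."""
    return 0xFF if cf else 0x00


# "sbb al, al" yields the same value as SALC for every AL and either CF.
for al in range(256):
    for cf in (0, 1):
        assert sbb8(al, al, cf) == salc(cf)
```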
Alyssa Rosenzweig
2e0be0a5e7 OpcodeDispatcher: allow more upper garbage with adc
missed this last series.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:42:13 -04:00
Alyssa Rosenzweig
d60c089697 OpcodeDispatcher: allow upper garbage with sbb
for the usual reasons

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
e76ebeab58 OpcodeDispatcher: use 1-op "src + CF"
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
333271d490 OpcodeDispatcher: fuse sbb when flags calculated
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
a750870abf OpcodeDispatcher: use fused sbcs calculations
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Alyssa Rosenzweig
15db72ef60 IR: add Sbb, SbbWithFlags ops
For fusing sbc+sbcs

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Alyssa Rosenzweig
5f16f357af
Merge pull request #3456 from alyssarosenzweig/opt/adc
Optimize ADC
2024-02-26 09:39:37 -04:00
Ryan Houdek
4f028b8614
Capture a 64-bit process trying to jump to 32-bit syscall handler
Fixes #591

Adds a simple unittest
2024-02-26 05:37:29 -08:00
Alyssa Rosenzweig
7deb4976a3 RedundantFlagCalculationElimination: fix missing NEG case
can be predicated.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 17:37:16 -04:00
Alyssa Rosenzweig
1e153e0c81 OpcodeDispatcher: allow garbage with adcs
for the usual reasons

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:02 -04:00
Alyssa Rosenzweig
6994fc3a01 IR,OpcodeDispatcher,JIT: fuse adcs flags
The usual tricks, also requires introducing a bare adc op to optimize adcs to,
but we wanted that anyway!

Also support a zero source, so we can calculate "foo + CF" in one instruction to
optimize the "lock adc" cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:49:32 -04:00
Ryan Houdek
d703f3ccee
Fixes zero register flag generation
Fixes 140976d322

Adds a unit test to ensure it keeps working.
2024-02-24 16:32:25 -08:00
Alyssa Rosenzweig
80e632db8a OpcodeDispatcher: garbage collect
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
852e3c4e93 OpcodeDispatcher: fuse XADD
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
e86547bbcb OpcodeDispatcher: fuse INC
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
883cca2e8f OpcodeDispatcher: use AddWithFlags
give it the same treatment we just gave sub.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
8540332520 IR: add AddWithFlags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
df5bdefb8a OpcodeDispatcher: merge secondary ALU with primary ALU
It's the same, stop copypasting. This gets our flag and arithmetic opts (current
and future) applied to secondary ALU too.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
3d1fb7701c OpcodeDispatcher: optimize sub
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
140976d322 OpcodeDispatcher: prep primary ALU for better flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
9a11d3b1a2 OpcodeDispatcher: fuse NEG
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
96e652879f OpcodeDispatcher: fuse DEC
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
cc1c1dd047 OpcodeDispatcher: return result from SUB flag calculate
for fusion

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
c1d572951f OpcodeDispatcher: drop unused GenerateFlags_SUB arg
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
dd9d3264dd OpcodeDispatcher: smarten SUB flag generation
we don't need the result, we can use subs and come out ahead in practice. also a
step towards better fusion

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
2aaf957ad8 RedundantFlagCalculationElimination: DCE as we go
This is required to ensure single-iteration convergence with a sequence like:

  write C
  whatever = load C
  rmif C, whatever
  invalidate C

avoids regressing the "DEC dead" case with future work in the series.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
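
The sequence in 2aaf957ad8 can be modeled with a toy backward pass. This is a sketch under loud assumptions, not the actual RedundantFlagCalculationElimination code: `Op`, `eliminate`, and the single-flag model are invented for illustration, every op is treated as side-effect free, and flag-writing ops are assumed to define no value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Op:
    name: str
    dest: Optional[str] = None      # value defined by this op, if any
    uses: Tuple[str, ...] = ()      # values read by this op
    reads_c: bool = False
    writes_c: bool = False
    invalidates_c: bool = False


def eliminate(ops, dce_as_we_go):
    """Walk the block backwards, dropping flag writes that nobody observes."""
    c_live = True    # conservatively assume C is read after the block
    used = set()     # values needed by ops we decided to keep
    kept = []
    for op in reversed(ops):
        if op.invalidates_c:
            c_live = False
            kept.append(op)
            continue
        if op.writes_c and not c_live:
            continue                 # dead flag write (toy: defines no value)
        if dce_as_we_go and op.dest is not None and op.dest not in used:
            continue                 # value nobody uses, so its flag read vanishes too
        if op.writes_c:
            c_live = False
        if op.reads_c:
            c_live = True
        used.update(op.uses)
        kept.append(op)
    return [op.name for op in reversed(kept)]


block = [
    Op("write C", writes_c=True),
    Op("whatever = load C", dest="whatever", reads_c=True),
    Op("rmif C, whatever", uses=("whatever",), writes_c=True),
    Op("invalidate C", invalidates_c=True),
]

print(eliminate(block, dce_as_we_go=True))
print(eliminate(block, dce_as_we_go=False))
```

In this model, removing the dead `rmif` leaves `load C` without users. If that load is dropped immediately, C stays dead and `write C` disappears in the same walk. Without the inline DCE, the orphaned load still counts as a reader of C, so `write C` survives until something else cleans up the load and the pass runs again.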
Alyssa Rosenzweig
d459b2f9b5 IR: propagate 0 into sub
now that we have to handle it, we may as well take advantage of it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
22ab7f2b3e IR: add SubWithFlags op (arm64 subs)
with 8/16-bit handling to keep everything uniform.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
f7e32373ce JIT: allow #0 in sub
turns into neg, this will be generated via SubWithFlags -> Sub opts.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
25d422e92b JIT: use GetZeroableReg for NZCVSelect
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
cfbeece09f JIT: use GetZeroableReg for CondAddNZCV
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
a597a09825 JIT: use GetZeroableReg for SubNZCV
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
99854ff310 JIT: add GetZeroableReg helper
for inlining constant zeroes in applicable sources

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Ryan Houdek
0a64f8a9c5
Moves SignalDelegator TLS tracking to the frontend
FEXCore doesn't need to track the TLS state of the SignalDelegator; this is
a frontend concept.

Removes the tracking from the backend and keeps it in the frontend.
2024-02-24 01:07:29 -08:00
Ryan Houdek
59ec88f48d
Merge pull request #3438 from Sonicadvance1/move_tls_allocation
Moves JITSymbol allocation
2024-02-23 14:49:04 -08:00
Ryan Houdek
6ec628fa31
Merge pull request #3433 from bylaws/arm64ec-pt1
Arm64Emitter: Introduce ARM64EC SRA mappings
2024-02-23 14:48:43 -08:00
Alyssa Rosenzweig
5378ae2e76
Merge pull request #3436 from alyssarosenzweig/ir/af-simplify
Simplify CalculateAF
2024-02-22 08:17:07 -04:00
Ryan Houdek
bd4a81a2a1
Moves JITSymbol allocation
This isn't actually using TLS allocations. Instead it is an allocation
tied to the InternalThreadState object.
2024-02-21 17:57:34 -08:00
Ryan Houdek
3d671cba10
Allocator: Early return if past the end of the allocation range
Fixes a bug where it would eventually hit the stack region and remap the
range as RW even for ranges that don't overlap the stack.
2024-02-21 15:37:46 -08:00
Ryan Houdek
d4be2dc636
Merge pull request #3434 from bylaws/arm64ec-pt3
FEXCore: Expose AbsoluteLoopTopAddress to the frontend
2024-02-21 14:31:04 -08:00
Alyssa Rosenzweig
2bcd285851
Merge pull request #3430 from Sonicadvance1/tsc_scale
Implement small TSC scaling
2024-02-21 13:16:27 -04:00
Alyssa Rosenzweig
8762bc1fa3 OpcodeDispatcher: simplify CalculateAF signature
- Res is unused
- SrcSize doesn't matter since we ignore the high bits, so we might as well
  always use 32-bit

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-21 12:48:15 -04:00
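
The point in 8762bc1fa3 that the source size is irrelevant follows from how the x86 auxiliary carry flag is defined: AF is the carry out of bit 3, which equals bit 4 of `a ^ b ^ result`. The sketch below checks that garbage in the upper bits does not change it. The helper name is hypothetical, and this is the textbook formula rather than a transcription of FEX's CalculateAF.

```python
def af_from_operands(a, b, res):
    """x86 auxiliary carry: bit 4 of a ^ b ^ res (carry out of bit 3)."""
    return ((a ^ b ^ res) >> 4) & 1


lo_a, lo_b = 0x0F, 0x01
garbage_a = 0xABCDEF00 | lo_a   # same low byte, junk above it
garbage_b = 0x12345600 | lo_b

assert af_from_operands(lo_a, lo_b, lo_a + lo_b) == 1
assert af_from_operands(garbage_a, garbage_b, garbage_a + garbage_b) == 1
```

Only the low nibbles of the operands feed the carry into bit 4, so evaluating the expression at 32 bits gives the same flag as evaluating it at 8 or 16 bits.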
Billy Laws
5b4162b712 FEXCore: Expose AbsoluteLoopTopAddress to the frontend
ARM64EC has a shared SRA mapping between ARM64 and X64 code, so there
needs to be a public way to enter the dispatcher without refilling SRA
from the in-memory context struct.
2024-02-21 11:46:24 +00:00