Alyssa Rosenzweig
49e798ab2b
Merge pull request #3461 from alyssarosenzweig/opt/sbc
...
Optimize SBC
2024-02-27 11:29:45 -04:00
Ryan Houdek
151e2279af
Linux: Converts passthrough syscalls to direct passthrough handlers
...
Reimagining of #3355 without any JSON generators or new concepts.
Fixes some mislabeling of system calls: some were getting inlined when they
shouldn't be, and a lot were not getting inlined when they could be.
This really cleans up the syscall implementation; all syscalls that can
be passthrough implementations require a very small two-line
declaration.
Additionally cleans up a bit of implementation cruft where some
passthrough syscalls were using the glibc syscall handler, and some were
using the glibc implementation. We have had multiple issues in the past
where the glibc implementation does something subtly different than the
raw syscall and breaks things. Now all passthrough handlers do a system
call directly, removing at least one indirection and some ambiguity.
This makes it significantly easier to add new passthrough syscalls as
well. We only need to do a version check and add the three lines per
syscall. There are new syscalls incoming that we will want to add.
Tangible improvements:
- Syscalls are lower overhead than ever.
- When I'm adding more syscalls I have less chance of mucking it up.
2024-02-27 02:40:53 -08:00
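As an illustration of the glibc-versus-raw difference mentioned in 151e2279af above, one well-known divergence is the error convention: the kernel returns a negative errno value, while glibc returns -1 and sets errno. The sketch below is an assumption-laden example, not FEX's handler code. The helper name RawStyleSyscall is hypothetical, and it still goes through glibc's generic syscall() entry point for portability, whereas the commit describes issuing the system call directly.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper for illustration only; this is not FEX's handler code.
// It invokes a syscall by number via glibc's generic syscall() and converts
// glibc's "-1 plus errno" convention back into the kernel's "-errno" form.
// Caveat: a syscall that legitimately returns -1 would be misread here.
inline long RawStyleSyscall(long nr, long a0 = 0, long a1 = 0, long a2 = 0,
                            long a3 = 0, long a4 = 0, long a5 = 0) {
  const long ret = ::syscall(nr, a0, a1, a2, a3, a4, a5);
  return ret == -1 ? -errno : ret;
}

// Small usage example (Linux only).
inline void RawStyleSyscallExample() {
  // Success path: same value as the glibc wrapper.
  assert(RawStyleSyscall(SYS_getpid) == static_cast<long>(::getpid()));
  // Failure path: the error comes back in the return value, kernel style.
  assert(RawStyleSyscall(SYS_close, -1) == -EBADF);
}
```

An emulator that forwards guest syscalls wants the kernel-style return value, since that is what the guest expects to see in its result register.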
Ryan Houdek
93ada89708
Linux: Move unimplemented ustat and sysfs
...
AArch64 doesn't implement these and will return ENOSYS.
Moving them to NotImplemented so we can get a log if an application
tries to use these.
2024-02-27 02:39:36 -08:00
Mai
f41674bb7d
Merge pull request #3464 from Sonicadvance1/psychonauts_block
...
InstcountCI: Add a monster of a game block
2024-02-27 04:59:20 -05:00
Ryan Houdek
854fd70735
InstcountCI: Add a monster of a game block
...
Doing very little work with a bunch of instructions.
Hottest block in the Windows version of Psychonauts; it's just doing a
matrix swizzle but in the worst possible way.
2024-02-27 01:51:20 -08:00
Ryan Houdek
78a362581d
Update xxhash to v0.8.2
...
Switches to using upstream cmake files.
2024-02-26 23:57:25 -08:00
Mai
8c0d5c6583
Merge pull request #3462 from Sonicadvance1/update_vixl4
...
Update vixl
2024-02-27 02:28:39 -05:00
Ryan Houdek
1c184997e7
InstcountCI: Update for vixl update
2024-02-26 23:17:52 -08:00
Ryan Houdek
ccc699444d
Update vixl
...
Removes a commit from our fork.
2024-02-26 23:16:31 -08:00
Ryan Houdek
946c805d84
Merge pull request #3459 from Sonicadvance1/fix_591
...
Capture a 64-bit process trying to jump to 32-bit syscall handler
2024-02-26 21:57:03 -08:00
Ryan Houdek
118b8b200e
Merge pull request #3458 from Sonicadvance1/fix_635
...
Track unittest dependencies through to the custom target
2024-02-26 21:56:45 -08:00
Ryan Houdek
aa9d7c5629
Merge pull request #3460 from Sonicadvance1/add_unittest_for_3421_bug
...
Adds a unittest for a bug from #3421
2024-02-26 21:56:17 -08:00
Ryan Houdek
0b34035085
Merge pull request #3439 from Sonicadvance1/allocate_first_4gb_of_64bit
...
FEXLoader: Allocate the second 4GB of virtual memory when executing 32-bit
2024-02-26 18:46:22 -08:00
Alyssa Rosenzweig
b6bd826014
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:15 -04:00
Alyssa Rosenzweig
12cc980603
OpcodeDispatcher: shuffle adc flag order
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
f3d55dd721
OpcodeDispatcher: shuffle SBC flag order
...
avoids clobbering nzcv
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
91cef6b76f
OpcodeDispatcher: use native ADC even for 8/16-bit
...
we mask off the upper bits, and the native and narrow results agree in the lower bits.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
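The reasoning in 91cef6b76f above can be checked with a small model. This is a sketch under stated assumptions: Adc32 stands in for a native 32-bit add-with-carry, Adc8 and Adc16 are reference narrow results, and all three names are hypothetical. It only covers the result value, not the flags.

```cpp
#include <cassert>
#include <cstdint>

// Model of a native 32-bit add-with-carry (result only, no flags).
constexpr uint32_t Adc32(uint32_t a, uint32_t b, bool cf) {
  return a + b + (cf ? 1u : 0u);
}

// Reference narrow results, computed at the narrow width.
constexpr uint8_t Adc8(uint8_t a, uint8_t b, bool cf) {
  return static_cast<uint8_t>(a + b + (cf ? 1 : 0));
}
constexpr uint16_t Adc16(uint16_t a, uint16_t b, bool cf) {
  return static_cast<uint16_t>(a + b + (cf ? 1 : 0));
}

// Upper bits of the inputs are arbitrary garbage; the low byte still matches.
static_assert(static_cast<uint8_t>(Adc32(0xDEADBEF0u, 0x12345620u, true)) ==
              Adc8(0xF0, 0x20, true));

// Exhaustive runtime check over every 8-bit input pair and both carry values.
inline void CheckAdc8AgainstAdc32() {
  for (uint32_t a = 0; a < 256; ++a) {
    for (uint32_t b = 0; b < 256; ++b) {
      for (int cf = 0; cf < 2; ++cf) {
        const uint32_t wide = Adc32(0xABCDEF00u | a, 0x13579B00u | b, cf != 0);
        assert(static_cast<uint8_t>(wide) ==
               Adc8(static_cast<uint8_t>(a), static_cast<uint8_t>(b), cf != 0));
      }
    }
  }
}
```

Carries only propagate upward, so bits above the operand width can never influence the bits that survive the mask.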
Alyssa Rosenzweig
270cbf39b5
OpcodeDispatcher: specialize SALC
...
this gets rid of the awkward non-flag SBB case, which streamlines SBB, while
getting better codegen for the demon opcode (-:
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:33:41 -04:00
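For context on 270cbf39b5 above: SALC sets AL to 0xFF when CF is set and to 0x00 otherwise, which matches the result of "SBB AL, AL" with no flag updates. The sketch below only demonstrates that equivalence; it does not claim to show how the specialized lowering is written, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Architectural result of SALC: AL becomes all ones if CF is set, else zero.
constexpr uint8_t Salc(bool cf) {
  return cf ? 0xFF : 0x00;
}

// Result of an 8-bit subtract-with-borrow, ignoring flags.
constexpr uint8_t Sbb8(uint8_t a, uint8_t b, bool cf) {
  return static_cast<uint8_t>(a - b - (cf ? 1 : 0));
}

static_assert(Salc(true) == Sbb8(0x5A, 0x5A, true));
static_assert(Salc(false) == Sbb8(0x5A, 0x5A, false));

// The previous AL value never matters, which is why a dedicated path can
// skip reading AL entirely.
inline void CheckSalcMatchesSbb() {
  for (uint32_t al = 0; al < 256; ++al) {
    const uint8_t v = static_cast<uint8_t>(al);
    assert(Sbb8(v, v, true) == Salc(true));
    assert(Sbb8(v, v, false) == Salc(false));
  }
}
```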
Alyssa Rosenzweig
2e0be0a5e7
OpcodeDispatcher: allow more upper garbage with adc
...
missed this in the last series.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:42:13 -04:00
Alyssa Rosenzweig
d60c089697
OpcodeDispatcher: allow upper garbage with sbb
...
for the usual reasons
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
e76ebeab58
OpcodeDispatcher: use 1-op "src + CF"
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
333271d490
OpcodeDispatcher: fuse sbb when flags calculated
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
a750870abf
OpcodeDispatcher: use fused sbcs calculations
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Alyssa Rosenzweig
15db72ef60
IR: add Sbb, SbbWithFlags ops
...
For fusing sbc+sbcs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Ryan Houdek
9687ac51f0
Merge pull request #3424 from Sonicadvance1/safer_clone_stack_handling
...
Linux: More safe stack cleanup for clone
2024-02-26 06:59:27 -08:00
Alyssa Rosenzweig
5f16f357af
Merge pull request #3456 from alyssarosenzweig/opt/adc
...
Optimize ADC
2024-02-26 09:39:37 -04:00
Ryan Houdek
4f028b8614
Capture a 64-bit process trying to jump to 32-bit syscall handler
...
Fixes #591
Adds a simple unittest
2024-02-26 05:37:29 -08:00
Alyssa Rosenzweig
32a4abbea7
Merge pull request #3457 from alyssarosenzweig/bug/nzcv
...
RedundantFlagCalculationElimination: fix missing NEG case
2024-02-26 09:31:06 -04:00
Ryan Houdek
c00c9b397e
Adds a unittest for a bug from #3421
...
When the source arguments for LoadMem/StoreMem have bit 31 set, they
are incorrectly sign-extended in some instances.
Detected this when testing #3421 but I don't have a proper fix for it.
2024-02-26 00:07:19 -08:00
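The class of bug described in c00c9b397e above can be illustrated generically. This is a sketch, not the FEX code path: it only shows how a 32-bit value with bit 31 set turns into two different 64-bit addresses depending on whether it is zero-extended or sign-extended. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Intended behaviour for a 32-bit address: upper 32 bits become zero.
constexpr uint64_t ZeroExtend32(uint32_t v) {
  return static_cast<uint64_t>(v);
}

// Buggy behaviour: bit 31 is copied into the upper 32 bits.
constexpr uint64_t SignExtend32(uint32_t v) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(v)));
}

// With bit 31 clear the two agree, so the bug stays hidden.
static_assert(ZeroExtend32(0x7FFF1000u) == SignExtend32(0x7FFF1000u));

// With bit 31 set they diverge and the access lands at the wrong address.
static_assert(ZeroExtend32(0x80001000u) == 0x0000000080001000ull);
static_assert(SignExtend32(0x80001000u) == 0xFFFFFFFF80001000ull);

inline void CheckExtensionDivergence() {
  assert(ZeroExtend32(0x80001000u) != SignExtend32(0x80001000u));
}
```

A regression test for this kind of issue therefore has to use an address at or above 0x80000000, since lower addresses behave identically under both extensions.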
Ryan Houdek
6b5d8bd8c0
Track unittest dependencies through to the custom target
...
Fixes #635
2024-02-25 19:27:52 -08:00
Alyssa Rosenzweig
7deb4976a3
RedundantFlagCalculationElimination: fix missing NEG case
...
can be predicated.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 17:37:16 -04:00
Alyssa Rosenzweig
2cfd71c159
InstCountCI: add dead ADC test
...
nothing else covers this case
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:25 -04:00
Alyssa Rosenzweig
0f26780de0
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:25 -04:00
Alyssa Rosenzweig
1e153e0c81
OpcodeDispatcher: allow garbage with adcs
...
for the usual reasons
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:02 -04:00
Alyssa Rosenzweig
6994fc3a01
IR,OpcodeDispatcher,JIT: fuse adcs flags
...
The usual tricks, also requires introducing a bare adc op to optimize adcs to,
but we wanted that anyway!
Also support a zero source, so we can calculate "foo + CF" in one instruction to
optimize the "lock adc" cases.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:49:32 -04:00
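To make the idea in 6994fc3a01 above concrete, here is a small model of a flag-setting add-with-carry that returns the result and the NZCV flags together, plus the zero-source case that yields "foo + CF" in one operation. It is a sketch of the arithmetic only; the struct and function names are hypothetical and say nothing about how the IR ops are actually defined.

```cpp
#include <cassert>
#include <cstdint>

// Result value plus Arm-style NZCV flags from a single operation.
struct AdcsResult {
  uint32_t value;
  bool n, z, c, v;
};

// Model of a 32-bit flag-setting add-with-carry.
constexpr AdcsResult Adcs32(uint32_t a, uint32_t b, bool cf) {
  const uint64_t wide = uint64_t{a} + uint64_t{b} + (cf ? 1u : 0u);
  const uint32_t value = static_cast<uint32_t>(wide);
  return {
      value,
      (value >> 31) != 0,                      // N: sign bit of the result
      value == 0,                              // Z: result is zero
      (wide >> 32) != 0,                       // C: unsigned carry out
      ((~(a ^ b) & (a ^ value)) >> 31) != 0,   // V: signed overflow
  };
}

// "foo + CF" with a zero second source: one operation, no separate add.
constexpr uint32_t AddCarryOnly(uint32_t foo, bool cf) {
  return Adcs32(foo, 0, cf).value;
}

static_assert(AddCarryOnly(41, true) == 42);
static_assert(AddCarryOnly(41, false) == 41);

inline void CheckAdcsFlags() {
  const AdcsResult wrap = Adcs32(0xFFFFFFFFu, 0, true);
  assert(wrap.value == 0 && wrap.z && wrap.c && !wrap.n && !wrap.v);
  const AdcsResult ovf = Adcs32(0x7FFFFFFFu, 0, true);
  assert(ovf.value == 0x80000000u && ovf.n && ovf.v && !ovf.c && !ovf.z);
}
```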
Alyssa Rosenzweig
0ef72bf118
Merge pull request #3451 from Sonicadvance1/fix_zero_reg_regression
...
Fixes zero register flag generation
2024-02-24 21:08:49 -04:00
Alyssa Rosenzweig
49ca0e2181
Merge pull request #3452 from Sonicadvance1/dxvk_mgrr_hotblock
...
Adds MGRR hottest block on render thread
2024-02-24 21:07:25 -04:00
Ryan Houdek
947ae1c243
Adds MGRR hottest block on render thread
...
Was about 7% CPU time in this looping block. Has some fairly obvious
performance improvements that can be done.
2024-02-24 16:49:46 -08:00
Ryan Houdek
d703f3ccee
Fixes zero register flag generation
...
Fixes 140976d322
Adds a unit test to ensure it keeps working.
2024-02-24 16:32:25 -08:00
Ryan Houdek
d8a18687e8
Merge pull request #3443 from alyssarosenzweig/opt/add-too
...
Fuse add + cmn -> adds
2024-02-24 15:12:16 -08:00
Alyssa Rosenzweig
045549f166
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:55:07 -04:00
Alyssa Rosenzweig
80e632db8a
OpcodeDispatcher: garbage collect
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
852e3c4e93
OpcodeDispatcher: fuse XADD
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
e86547bbcb
OpcodeDispatcher: fuse INC
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
883cca2e8f
OpcodeDispatcher: use AddWithFlags
...
give it the same treatment we just gave sub.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
8540332520
IR: add AddWithFlags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
df5bdefb8a
OpcodeDispatcher: merge secondary ALU with primary ALU
...
It's the same, stop copypasting. This gets our flag and arithmetic opts (current
and future) applied to secondary ALU too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
3d1fb7701c
OpcodeDispatcher: optimize sub
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
140976d322
OpcodeDispatcher: prep primary ALU for better flags
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
9a11d3b1a2
OpcodeDispatcher: fuse NEG
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00