Commit Graph

8907 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
49e798ab2b
Merge pull request #3461 from alyssarosenzweig/opt/sbc
Optimize SBC
2024-02-27 11:29:45 -04:00
Ryan Houdek
151e2279af
Linux: Converts passthrough syscalls to direct passthrough handlers
Reimagining of #3355 without any json generators or new concepts.

Fixes some mislabeling of system calls: some were getting inlined when they
shouldn't be, and a lot were not getting inlined when they could be.

This really cleans up the syscall implementation: all syscalls that can
be passthrough implementations require only a very small two-line
declaration.
Additionally cleans up a bit of implementation cruft where some
passthrough syscalls were using the glibc syscall handler, and some were
using the glibc implementation. We have had multiple issues in the past
where the glibc implementation does something subtly different than the
raw syscall and breaks things. Now all passthrough handlers do a system
call directly, removing at least one indirection and some ambiguity.

This makes it significantly easier to add new passthrough syscalls as
well: we only need to do a version check and add the three lines per
syscall. There are new syscalls incoming that we will want to add.

Tangible improvements:
- Syscalls are lower overhead than ever.
- When I'm adding more syscalls I have less chance of mucking it up.
2024-02-27 02:40:53 -08:00
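The idea in the commit message above (a short per-syscall declaration, gated by a version check, that forwards straight to a raw system call) can be sketched as a small table. This is only an illustration: `make_passthrough_table`, `register`, `raw_syscall`, and the syscall numbers are hypothetical names and values, not the project's actual API.

```python
from typing import Callable, Dict, Tuple

KernelVersion = Tuple[int, int]


def make_passthrough_table(host_kernel: KernelVersion,
                           raw_syscall: Callable[..., object]):
    """Build a table of passthrough handlers (hypothetical sketch).

    `raw_syscall(number, *args)` stands in for a direct system call that
    bypasses any libc wrapper.
    """
    table: Dict[int, Callable[..., object]] = {}

    def register(name: str, number: int,
                 min_kernel: KernelVersion = (0, 0)) -> None:
        # Version check: skip syscalls the host kernel is too old to provide.
        if host_kernel < min_kernel:
            return
        # The handler forwards its arguments unchanged.
        table[number] = lambda *args: raw_syscall(number, *args)

    return table, register


# Usage with a fake backend that records what would be issued.
def fake_raw_syscall(number, *args):
    return ("raw", number, args)


table, register = make_passthrough_table((6, 1), fake_raw_syscall)
register("example_old_call", 100)                     # always available
register("example_new_call", 200, min_kernel=(6, 8))  # too new for this host
```

Keeping each declaration to a name, a number, and an optional minimum version means adding a syscall cannot accidentally route it through a different code path than its neighbours.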
Ryan Houdek
93ada89708
Linux: Move unimplemented ustat and sysfs
AArch64 doesn't implement these and will return ENOSYS.
Moving them to NotImplemented so we can get a log if an application
tries to use these.
2024-02-27 02:39:36 -08:00
Mai
f41674bb7d
Merge pull request #3464 from Sonicadvance1/psychonauts_block
InstcountCI: Add a monster of a game block
2024-02-27 04:59:20 -05:00
Ryan Houdek
854fd70735
InstcountCI: Add a monster of a game block
Doing very little work with a bunch of instructions.
Hottest block in the Windows version of Psychonauts, it's just doing a
matrix swizzle but in the worst possible way.
2024-02-27 01:51:20 -08:00
Ryan Houdek
78a362581d
Update xxhash to v0.8.2
Switches to using upstream cmake files.
2024-02-26 23:57:25 -08:00
Mai
8c0d5c6583
Merge pull request #3462 from Sonicadvance1/update_vixl4
Update vixl
2024-02-27 02:28:39 -05:00
Ryan Houdek
1c184997e7
InstcountCI: Update for vixl update
2024-02-26 23:17:52 -08:00
Ryan Houdek
ccc699444d
Update vixl
Removes a commit from our fork.
2024-02-26 23:16:31 -08:00
Ryan Houdek
946c805d84
Merge pull request #3459 from Sonicadvance1/fix_591
Capture a 64-bit process trying to jump to 32-bit syscall handler
2024-02-26 21:57:03 -08:00
Ryan Houdek
118b8b200e
Merge pull request #3458 from Sonicadvance1/fix_635
Track unittest dependencies through to the custom target
2024-02-26 21:56:45 -08:00
Ryan Houdek
aa9d7c5629
Merge pull request #3460 from Sonicadvance1/add_unittest_for_3421_bug
Adds a unittest for a bug from #3421
2024-02-26 21:56:17 -08:00
Ryan Houdek
0b34035085
Merge pull request #3439 from Sonicadvance1/allocate_first_4gb_of_64bit
FEXLoader: Allocate the second 4GB of virtual memory when executing 32-bit
2024-02-26 18:46:22 -08:00
Alyssa Rosenzweig
b6bd826014
InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:15 -04:00
Alyssa Rosenzweig
12cc980603
OpcodeDispatcher: shuffle adc flag order
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
f3d55dd721
OpcodeDispatcher: shuffle SBC flag order
avoids clobbering nzcv

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
91cef6b76f
OpcodeDispatcher: use native ADC even for 8/16-bit
we mask off the upper bits, and they agree in the lower bits.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
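The reasoning in the commit above ("we mask off the upper bits, and they agree in the lower bits") rests on a general property of addition: the low N bits of a sum depend only on the low N bits of the operands and the carry-in. A minimal sketch of that property follows; the function names are made up for illustration, and it models only the result value, not flag calculation.

```python
def adc_narrow(a: int, b: int, carry_in: int, bits: int) -> int:
    """Reference: add only the low `bits` of each operand, plus the carry."""
    mask = (1 << bits) - 1
    return ((a & mask) + (b & mask) + carry_in) & mask


def adc_wide_then_mask(a: int, b: int, carry_in: int, bits: int) -> int:
    """Add at full 32-bit width (upper bits may hold garbage), then mask."""
    wide = (a + b + carry_in) & 0xFFFFFFFF
    return wide & ((1 << bits) - 1)


# Upper bits differ wildly, yet the low byte of the sum is unaffected.
low_byte = adc_wide_then_mask(0xDEAD00FF, 0xBEEF0001, 1, 8)
```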
Alyssa Rosenzweig
270cbf39b5
OpcodeDispatcher: specialize SALC
This gets rid of the awkward non-flag SBB case, which streamlines SBB, while
getting better codegen for the demon opcode (-:

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:33:41 -04:00
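For context, SALC is the undocumented x86 opcode that sets AL to 0xFF when CF is set and to 0x00 otherwise, leaving the flags alone. That matches the result of `sbb al, al` with the flag writes dropped, which is presumably the "non-flag SBB case" the message mentions. A small model of the equivalence, with illustrative function names:

```python
def salc(al: int, cf: int) -> int:
    """SALC: AL becomes 0xFF when CF is set, 0x00 otherwise; flags untouched."""
    return 0xFF if cf else 0x00


def sbb8_result(a: int, b: int, cf: int) -> int:
    """Result of an 8-bit x86 SBB (a - b - CF), ignoring its flag outputs."""
    return (a - b - cf) & 0xFF
```

Because `al - al` is always zero, the result depends only on CF, so a specialized lowering never needs to read the old AL value.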
Alyssa Rosenzweig
2e0be0a5e7
OpcodeDispatcher: allow more upper garbage with adc
missed this last series.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:42:13 -04:00
Alyssa Rosenzweig
d60c089697
OpcodeDispatcher: allow upper garbage with sbb
for the usual reasons

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
e76ebeab58
OpcodeDispatcher: use 1-op "src + CF"
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
333271d490
OpcodeDispatcher: fuse sbb when flags calculated
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
a750870abf
OpcodeDispatcher: use fused sbcs calculations
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
Alyssa Rosenzweig
15db72ef60
IR: add Sbb, SbbWithFlags ops
For fusing sbc+sbcs

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:25:03 -04:00
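One detail any mapping from x86 SBB onto AArch64 SBC/SBCS has to account for is carry polarity: x86 sets CF on a borrow, while AArch64 computes `a + ~b + C` and sets C when there is no borrow. The sketch below models both and shows that inverting the carry on the way in and out makes them agree. The log does not say how the emulator represents CF internally, so this illustrates the architectural difference rather than describing its implementation.

```python
MASK32 = 0xFFFFFFFF


def x86_sbb32(a: int, b: int, cf: int):
    """x86 SBB: a - b - CF. Returns (result, cf_out); cf_out == 1 means borrow."""
    full = (a & MASK32) - (b & MASK32) - cf
    return full & MASK32, 1 if full < 0 else 0


def arm_sbcs32(a: int, b: int, c: int):
    """AArch64 SBCS: a + ~b + C. Returns (result, c_out); c_out == 1 means no borrow."""
    full = (a & MASK32) + (~b & MASK32) + c
    return full & MASK32, (full >> 32) & 1


def x86_sbb_via_arm(a: int, b: int, cf: int):
    """Emulate x86 SBB with the ARM-style op by flipping the carry both ways."""
    result, c_out = arm_sbcs32(a, b, cf ^ 1)
    return result, c_out ^ 1
```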
Ryan Houdek
9687ac51f0
Merge pull request #3424 from Sonicadvance1/safer_clone_stack_handling
Linux: More safe stack cleanup for clone
2024-02-26 06:59:27 -08:00
Alyssa Rosenzweig
5f16f357af
Merge pull request #3456 from alyssarosenzweig/opt/adc
Optimize ADC
2024-02-26 09:39:37 -04:00
Ryan Houdek
4f028b8614
Capture a 64-bit process trying to jump to 32-bit syscall handler
Fixes #591

Adds a simple unittest
2024-02-26 05:37:29 -08:00
Alyssa Rosenzweig
32a4abbea7
Merge pull request #3457 from alyssarosenzweig/bug/nzcv
RedundantFlagCalculationElimination: fix missing NEG case
2024-02-26 09:31:06 -04:00
Ryan Houdek
c00c9b397e
Adds a unittest for a bug from #3421
When the source arguments for LoadMem/StoreMem have bit 31 set then they
are incorrectly sign extending in some instances.

Detected this when testing #3421 but I don't have a proper fix for it.
2024-02-26 00:07:19 -08:00
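To illustrate the class of bug described above: a 32-bit value with bit 31 set yields a very different 64-bit address depending on whether it is zero-extended or sign-extended. The helpers below are a generic model, not the emulator's code.

```python
def zero_extend32(value: int) -> int:
    """Treat the low 32 bits as unsigned when widening to 64 bits."""
    return value & 0xFFFFFFFF


def sign_extend32(value: int) -> int:
    """Replicate bit 31 into bits 32..63 when widening to 64 bits."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


addr = 0x80001000                 # bit 31 is set
intended = zero_extend32(addr)    # stays in the low 4GB
wrong = sign_extend32(addr)       # lands near the top of the 64-bit space
```

Values below 0x80000000 are unaffected, which is why this kind of error can go unnoticed until an address in the upper half of the 32-bit range is used.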
Ryan Houdek
6b5d8bd8c0
Track unittest dependencies through to the custom target
Fixes #635
2024-02-25 19:27:52 -08:00
Alyssa Rosenzweig
7deb4976a3
RedundantFlagCalculationElimination: fix missing NEG case
can be predicated.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 17:37:16 -04:00
Alyssa Rosenzweig
2cfd71c159
InstCountCI: add dead ADC test
nothing else covers this case

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:25 -04:00
Alyssa Rosenzweig
0f26780de0
InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:25 -04:00
Alyssa Rosenzweig
1e153e0c81
OpcodeDispatcher: allow garbage with adcs
for the usual reasons

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:50:02 -04:00
Alyssa Rosenzweig
6994fc3a01
IR,OpcodeDispatcher,JIT: fuse adcs flags
The usual tricks, also requires introducing a bare adc op to optimize adcs to,
but we wanted that anyway!

Also support a zero source, so we can calculate "foo + CF" in one instruction to
optimize the "lock adc" cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-25 10:49:32 -04:00
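As a rough model of what a fused add-with-carry produces, the sketch below computes the 32-bit result together with the N, Z, C, and V flags in one step, and shows the zero-source case ("foo + CF") mentioned above. The function names are illustrative and are not the IR op names.

```python
MASK32 = 0xFFFFFFFF


def adcs32(a: int, b: int, carry_in: int):
    """Return (result, n, z, c, v) for a 32-bit add with carry-in."""
    a &= MASK32
    b &= MASK32
    full = a + b + carry_in
    result = full & MASK32
    n = result >> 31                               # sign bit of the result
    z = int(result == 0)                           # result is zero
    c = full >> 32                                 # unsigned carry out
    v = (((a ^ result) & (b ^ result)) >> 31) & 1  # signed overflow
    return result, n, z, c, v


def add_carry_only(a: int, carry_in: int):
    """'foo + CF' in one operation: an add-with-carry whose second source is zero."""
    return adcs32(a, 0, carry_in)
```

With a zero second source the carry flag alone decides whether the value is incremented, so no separate step is needed to materialize CF as an integer first.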
Alyssa Rosenzweig
0ef72bf118
Merge pull request #3451 from Sonicadvance1/fix_zero_reg_regression
Fixes zero register flag generation
2024-02-24 21:08:49 -04:00
Alyssa Rosenzweig
49ca0e2181
Merge pull request #3452 from Sonicadvance1/dxvk_mgrr_hotblock
Adds MGRR hottest block on render thread
2024-02-24 21:07:25 -04:00
Ryan Houdek
947ae1c243
Adds MGRR hottest block on render thread
Was about 7% CPU time in this looping block. Has some fairly obvious
performance improvements that can be done.
2024-02-24 16:49:46 -08:00
Ryan Houdek
d703f3ccee
Fixes zero register flag generation
Fixes 140976d322

Adds a unit test to ensure it keeps working.
2024-02-24 16:32:25 -08:00
Ryan Houdek
d8a18687e8
Merge pull request #3443 from alyssarosenzweig/opt/add-too
Fuse add + cmn -> adds
2024-02-24 15:12:16 -08:00
Alyssa Rosenzweig
045549f166
InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:55:07 -04:00
Alyssa Rosenzweig
80e632db8a
OpcodeDispatcher: garbage collect
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
852e3c4e93
OpcodeDispatcher: fuse XADD
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
e86547bbcb
OpcodeDispatcher: fuse INC
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
883cca2e8f
OpcodeDispatcher: use AddWithFlags
give it the same treatment we just gave sub.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
8540332520
IR: add AddWithFlags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
df5bdefb8a
OpcodeDispatcher: merge secondary ALU with primary ALU
It's the same, stop copypasting. This gets our flag and arithmetic opts (current
and future) applied to secondary ALU too.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
3d1fb7701c
OpcodeDispatcher: optimize sub
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
140976d322
OpcodeDispatcher: prep primary ALU for better flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00
Alyssa Rosenzweig
9a11d3b1a2
OpcodeDispatcher: fuse NEG
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-24 15:54:49 -04:00