9086 Commits

Author SHA1 Message Date
Tony Wasserka
31e976a5bc Library Forwarding: Update Vulkan definitions to v1.3.278 2024-02-29 19:08:38 +01:00
Ryan Houdek
009ae55ff0
Merge pull request #3475 from alyssarosenzweig/opt/lock-dec
Optimize lock dec
2024-02-29 08:44:24 -08:00
Ryan Houdek
98572b9e23
Merge pull request #3473 from Sonicadvance1/remove_mov_swap
Arm64: Stop moving source in atomic swap
2024-02-29 08:44:16 -08:00
Alyssa Rosenzweig
1ab234f6dd InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-29 09:28:21 -04:00
Alyssa Rosenzweig
fed5e6d546 OpcodeDispatcher: use fetchadd for atomic DEC
Avoids a NEG.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-29 09:28:21 -04:00
Alyssa Rosenzweig
f27e2246e2
Merge pull request #3468 from alyssarosenzweig/opt/miscs
Misc little opts
2024-02-29 09:18:17 -04:00
Mai
4779fb74de
Merge pull request #3474 from Sonicadvance1/fix_early_exit_region_reserving
Fix reserving range check
2024-02-29 07:22:07 -05:00
Ryan Houdek
eaf83aa6b4
Fix reserving range check
Fixes an issue where TestHarnessRunner was managing to reserve the space
below stack again, resulting in stack growth breaking. Would typically
only show up when using the vixl simulator under gdb for some reason.

This is likely the last bandage on this code before it gets completely
rewritten to be more readable.
2024-02-29 04:02:05 -08:00
Ryan Houdek
4f8b28e83b
InstcountCI: Update for swap improvement 2024-02-29 03:07:19 -08:00
Ryan Houdek
c318947695
Arm64: Stop moving source in atomic swap
ldswpal doesn't overwrite the source register and only reads the bits
required for the sized operation.
Not sure exactly why we were doing a copy here.

Removing it means improving Skyrim's hottest code block, as seen in #3472
2024-02-29 03:07:05 -08:00
Ryan Houdek
b3489d7262
ASM: Another sign extend bug in #3421
This time found in MGRR. It flips the problem space on its head.
2024-02-29 02:06:57 -08:00
Alyssa Rosenzweig
44d738fa93 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:34 -04:00
Alyssa Rosenzweig
811487ad98 OpcodeDispatcher: use real branch for INT
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:12 -04:00
Alyssa Rosenzweig
4f4e38ace2 OpcodeDispatcher: use real branch for rep cmps
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:12 -04:00
Alyssa Rosenzweig
edd6becc56 OpcodeDispatcher: use real branch for rep scas
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:35:12 -04:00
Alyssa Rosenzweig
e47a94cae7 OpcodeDispatcher: skip mask with shld
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:18:29 -04:00
Alyssa Rosenzweig
cc82dba1ca OpcodeDispatcher: use mvn for AF with constants
This reduces pointless constant usage. For now, it's no net change to
instcountci, but it should make it easier to get wins later. Hopefully.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:18:28 -04:00
Alyssa Rosenzweig
8232669b22 OpcodeDispatcher: simplify
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 10:00:13 -04:00
Alyssa Rosenzweig
28936073c4 OpcodeDispatcher: allow upper garbage on NEG
like SUB.

due to RA silliness, this is a loss for inst count but a win for cycles.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 09:59:57 -04:00
Alyssa Rosenzweig
ef2559d911 OpcodeDispatcher: allow garbage with SCAS
it's just feeding SUB flags which allow it

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-28 09:16:08 -04:00
Ryan Houdek
d24446ed13
Merge pull request #3466 from Sonicadvance1/fixed_opt
OpcodeDispatcher: Don't use AddShift with no shift
2024-02-28 04:51:10 -08:00
Ryan Houdek
67f13ba927
Adds a test for a sign-extending address bug that was detected when testing #3421
Doesn't quite match the libc code directly because it uses `[gs:eax]`
with both having the sign bit set and we can't deal with that with ASM
tests. So match the behaviour in a different way.
2024-02-27 19:57:22 -08:00
Ryan Houdek
dc5239c003
InstCountCI: Update for previous fix 2024-02-27 19:57:06 -08:00
Ryan Houdek
f346f89678
OpcodeDispatcher: Don't use AddShift with no shift
This accidentally removed optimizations elsewhere that were only checking
for Add.
2024-02-27 19:56:12 -08:00
Ryan Houdek
2f9449cb5a
Merge pull request #3465 from alyssarosenzweig/icci/pa
InstCountCI: enable preserve_all
2024-02-27 16:39:46 -08:00
Ryan Houdek
139367d248
Merge pull request #3463 from Sonicadvance1/update_xxhash
Update xxhash to v0.8.2
2024-02-27 16:39:38 -08:00
Alyssa Rosenzweig
fcebad51bd InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-27 12:04:29 -04:00
Alyssa Rosenzweig
b50292493a InstCountCI: enable preserve_all ABI
This is what we'll actually ship (I hope), so that's the config we want to
track long-term. It's also a lot more manageable resulting asm.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-27 12:03:58 -04:00
Ryan Houdek
b74de53056
Merge pull request #3449 from Sonicadvance1/syscalls_passthrough_not_json
Linux: Converts passthrough syscalls to direct passthrough handlers
2024-02-27 07:50:04 -08:00
Alyssa Rosenzweig
49e798ab2b
Merge pull request #3461 from alyssarosenzweig/opt/sbc
Optimize SBC
2024-02-27 11:29:45 -04:00
Ryan Houdek
151e2279af
Linux: Converts passthrough syscalls to direct passthrough handlers
Reimagining of #3355 without any json generators or new concepts.

Fixes some mislabeling of system calls. Some were getting inlined when they
shouldn't be, and a lot were not getting inlined when they could be.

This really cleans up the syscall implementation, all syscalls that can
be passthrough implementations require a very small two line
declaration.
Additionally cleans up a bit of implementation cruft where some
passthrough syscalls were using the glibc syscall handler, and some were
using the glibc implementation. We have had multiple issues in the past
where the glibc implementation does something subtly different than the
raw syscall and breaks things. Now all passthrough handlers do a system
call directly, removing at least one indirection and some ambiguity.

This makes it significantly easier to add new passthrough syscalls as
well. Only need to do a version check and add the three lines per
syscall. There are new syscalls incoming that we will want to add.

Tangible improvements:
- Syscalls are lower overhead than ever.
- When I'm adding more syscalls I have less chance of mucking it up.
2024-02-27 02:40:53 -08:00
Ryan Houdek
93ada89708
Linux: Move unimplemented ustat and sysfs
AArch64 doesn't implement these and will return ENOSYS.
Moving them to NotImplemented so we can get a log if an application
tries to use these.
2024-02-27 02:39:36 -08:00
Mai
f41674bb7d
Merge pull request #3464 from Sonicadvance1/psychonauts_block
InstcountCI: Add a monster of a game block
2024-02-27 04:59:20 -05:00
Ryan Houdek
854fd70735
InstcountCI: Add a monster of a game block
Doing very little work with a bunch of instructions.
Hottest block in the Windows version of Psychonauts, it's just doing a
matrix swizzle but in the worst possible way.
2024-02-27 01:51:20 -08:00
Ryan Houdek
78a362581d
Update xxhash to v0.8.2
Switches to using upstream cmake files.
2024-02-26 23:57:25 -08:00
Mai
8c0d5c6583
Merge pull request #3462 from Sonicadvance1/update_vixl4
Update vixl
2024-02-27 02:28:39 -05:00
Ryan Houdek
1c184997e7
InstcountCI: Update for vixl update 2024-02-26 23:17:52 -08:00
Ryan Houdek
ccc699444d
Update vixl
Removes a commit from our fork.
2024-02-26 23:16:31 -08:00
Ryan Houdek
946c805d84
Merge pull request #3459 from Sonicadvance1/fix_591
Capture a 64-bit process trying to jump to 32-bit syscall handler
2024-02-26 21:57:03 -08:00
Ryan Houdek
118b8b200e
Merge pull request #3458 from Sonicadvance1/fix_635
Track unittest dependencies through to the custom target
2024-02-26 21:56:45 -08:00
Ryan Houdek
aa9d7c5629
Merge pull request #3460 from Sonicadvance1/add_unittest_for_3421_bug
Adds a unittest for a bug from #3421
2024-02-26 21:56:17 -08:00
Ryan Houdek
0b34035085
Merge pull request #3439 from Sonicadvance1/allocate_first_4gb_of_64bit
FEXLoader: Allocate the second 4GB of virtual memory when executing 32-bit
2024-02-26 18:46:22 -08:00
Alyssa Rosenzweig
b6bd826014 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:15 -04:00
Alyssa Rosenzweig
12cc980603 OpcodeDispatcher: shuffle adc flag order
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
f3d55dd721 OpcodeDispatcher: shuffle SBC flag order
avoids clobbering nzcv

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
91cef6b76f OpcodeDispatcher: use native ADC even for 8/16-bit
we mask off the upper bits, and they agree in the lower bits.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:48:07 -04:00
Alyssa Rosenzweig
270cbf39b5 OpcodeDispatcher: specialize SALC
this gets rid of the awkward non-flag SBB case, which streamlines SBB while
getting better codegen for the demon opcode (-:

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 16:33:41 -04:00
Alyssa Rosenzweig
2e0be0a5e7 OpcodeDispatcher: allow more upper garbage with adc
missed this last series.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:42:13 -04:00
Alyssa Rosenzweig
d60c089697 OpcodeDispatcher: allow upper garbage with sbb
for the usual reasons

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00
Alyssa Rosenzweig
e76ebeab58 OpcodeDispatcher: use 1-op "src + CF"
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-02-26 15:35:14 -04:00