8061 Commits

Author SHA1 Message Date
Ryan Houdek
ca87d8688d
Merge pull request #3153 from alyssarosenzweig/opt/adcs
Use adcs
2023-09-26 09:57:01 -07:00
Ryan Houdek
e32601f49d
Merge pull request #3161 from neobrain/fix_ctest_silent_failures
unittests: Instruct CTest to print output from tests on failure
2023-09-26 08:26:15 -07:00
Tony Wasserka
f4dd456c80 unittests: Instruct CTest to print output from tests on failure 2023-09-26 17:16:28 +02:00
Alyssa Rosenzweig
7b22dbfe24 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-26 10:05:59 -04:00
Alyssa Rosenzweig
7a06cc9727 IR: Use adcs/sbcs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-26 09:06:46 -04:00
Ryan Houdek
8b3881b5db
Merge pull request #3154 from alyssarosenzweig/opt/smol-carry
Optimize 8/16-bit CF calculation
2023-09-26 05:49:07 -07:00
Ryan Houdek
76d4637d9c
Merge pull request #3159 from neobrain/feature_update_vulkan
Thunks: Update Vulkan thunk to v1.3.261.1
2023-09-26 05:20:18 -07:00
Alyssa Rosenzweig
0d12cce74f
Merge pull request #3158 from Sonicadvance1/unittest_for_3153
unittests/ASM: Adds unit test caught by #3153
2023-09-26 08:15:40 -04:00
Tony Wasserka
04592af609 Thunks: Update Vulkan thunk to v1.3.261.1 2023-09-26 12:14:58 +02:00
Ryan Houdek
d8366c04dc unittests/ASM: Adds unit test caught by #3153 2023-09-26 00:28:45 -07:00
Ryan Houdek
533f35934c
Merge pull request #3155 from neobrain/opt_thunks_rebuilds
Thunks: Avoid recompiling thunk interfaces on FEXLoader changes
2023-09-25 19:21:09 -07:00
Alyssa Rosenzweig
35bb7cc801 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-25 19:41:31 -04:00
Alyssa Rosenzweig
5facb21d30 OpcodeDispatcher: Don't mask small add/sub carries
For the GPR result, the masking already happens as part of the bfi. So the only
point of masking is for the flag calculation. But actually, every flag except
carry will ignore the upper bits anyway. And the carry calculation actually
WANTS the upper bit as a faster impl.

Deletes a pile of code both in FEX and the output :-)

ADC/SBC could probably get similar treatment later.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-25 18:25:30 -04:00
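A rough illustration of why the carry calculation in 5facb21d30 wants the unmasked upper bit. This is a minimal sketch, not FEX code: the helper names are hypothetical, and it assumes both operands are already zero-extended into a 32-bit register.

```python
REG_MASK = 0xFFFFFFFF  # assumption: results live in a 32-bit register


def add_carry(a, b, bits):
    """CF of a `bits`-wide add, read from bit `bits` of the unmasked sum."""
    total = (a + b) & REG_MASK
    return (total >> bits) & 1


def sub_borrow(a, b, bits):
    """CF of a `bits`-wide sub: a borrow wraps the 32-bit result and sets bit `bits`."""
    diff = (a - b) & REG_MASK
    return (diff >> bits) & 1
```

Masking the result down to 8 or 16 bits first would discard exactly the bit these helpers read, so the carry would then need a separate comparison.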
Tony Wasserka
adead832a5 Thunks: Avoid recompiling thunk interfaces on FEXLoader changes
The interface files themselves don't use FEXLoader. Only the final library
does.
2023-09-25 23:04:09 +02:00
Ryan Houdek
5eed24a242
Merge pull request #3152 from Sonicadvance1/instcountci_x87_f64
InstCountCI: Support f64 reduced precision mode tests
2023-09-24 19:29:37 -07:00
Ryan Houdek
7907f70ed2 InstCountCI: Adds new x87 reduced precision mode tests 2023-09-24 18:50:05 -07:00
Ryan Houdek
7141332f6f InstCountCI: Support setting environment variables in tests
This will allow us to enable FEX options through environment variables
just like the ASM tests.
2023-09-24 18:50:01 -07:00
Ryan Houdek
234e029391
Merge pull request #3145 from Sonicadvance1/optimize_inline_calls
PassManager: Optimize out CPUID and XGetBV calls
2023-09-24 18:09:18 -07:00
Ryan Houdek
19a7b514e6
Merge pull request #3150 from alyssarosenzweig/opt/ornror
Optimize PF calculation in lahf
2023-09-24 18:05:57 -07:00
Ryan Houdek
220761a0e8
Merge pull request #3151 from Sonicadvance1/unique_name_workflow_jobs
Github: Changes jobs to have unique names
2023-09-24 18:03:57 -07:00
Alyssa Rosenzweig
cbd4daddff InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 20:59:28 -04:00
Alyssa Rosenzweig
c8519b0b87 OpcodeDispatcher: Remove LoadPF
Now unused. Its former users all prefer LoadPFRaw, since they can fold some of
this math into the use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 20:59:28 -04:00
Alyssa Rosenzweig
68d32ad70d OpcodeDispatcher: Optimize PF in lahf
Use the raw popcount rather than the final PF and use some sneaky bit math to
come out 1 instruction ahead.

Closes #3117

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 20:59:28 -04:00
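For context on 68d32ad70d, the sketch below shows how x86 PF relates to a raw popcount and where it lands in the byte that lahf produces. It only shows that the consumer can derive PF from the raw popcount itself. It does not reproduce the actual bit trick used in the commit, and the function names are hypothetical.

```python
def raw_popcount(result):
    """Number of set bits in the low byte of a result (the 'raw' parity input)."""
    return bin(result & 0xFF).count("1")


def lahf_byte(sf, zf, af, raw, cf):
    """Pack flags as lahf does: SF:ZF:0:AF:0:PF:1:CF.

    PF is 1 when the low byte has an even number of set bits, so it is the
    inverse of bit 0 of the raw popcount.
    """
    pf = (raw & 1) ^ 1
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | 0b10 | cf
```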
Ryan Houdek
62890f148f Github: Changes jobs to have unique names
These overlapping names make it impossible to ensure all checks are
required to pass before merge.

Unique names will fix this.
2023-09-24 17:52:47 -07:00
Alyssa Rosenzweig
1f02a6da34 IR: Add Ornror op
Mostly copypaste of Orlshl... we really should deduplicate this mess somehow.
Maybe a shift enum on the core Or op?

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 20:47:50 -04:00
Alyssa Rosenzweig
86063411dc Revert "OpcodeDispatcher: Use plain Lshl for flags"
This logic is unused since 8adfaa9aa ("OpcodeDispatcher: Use SelectCC for x87"),
which addressed the underlying issue.

This reverts commit df3833edbe3d34da4df28269f31340076238e420.
2023-09-24 20:47:50 -04:00
Ryan Houdek
9968e6431f Passes: Rename SyscallOptimization
This is now inlining multiple external calls out of the JIT. Rename it
to InlineCallOptimization.
2023-09-24 17:25:38 -07:00
Ryan Houdek
ff24f64b2a PassManager: Optimize out CPUID and XGetBV calls
If we const-prop the required functions and leafs then we can directly
encode the CPUID information rather than jumping out of the JIT.
In testing almost all CPUID executions const-prop which function is
getting called. Worst case that I found was only 85% const-prop rate.

This isn't quite 100% optimal since we need to call the RCLSE and
Constprop passes after we optimize these, which would remove some
redundant moves.

Sadly there seems to be a bug in the constprop pass that starts crashing
applications if that is done.
This is easy to reproduce: run Half-Life 2 and it immediately hits
SIGILL.

Even without this optimization, this is still a significant savings since
we aren't jumping out of the JIT anymore for these optimized CPUIDs.
2023-09-24 17:25:38 -07:00
Ryan Houdek
e9a7ef2534 CPUID: Describe CPUID functions if they return constant state or not
Most CPUID routines return constant data, there are four that don't.
Some CPUID functions also need the leaf descriptor, so we need to
describe that as well.

Functions that don't return constant data:
- function 1Ah - Returns different data depending on current CPU core
- function 8000_000{2,3,4} - Different data based on CPU core

Functions that need leaf constprop:
- 4h, 7h, Dh, 4000_0001h, 8000_001Dh
2023-09-24 17:25:38 -07:00
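The classification in e9a7ef2534 can be pictured as two lookup sets plus a check. This is a minimal sketch built only from the function numbers listed in that commit message. The set names and `can_inline_cpuid` are hypothetical and do not reflect the real CPUIDEmu interface.

```python
# Functions whose results depend on the current CPU core (from the commit message).
NON_CONSTANT_FUNCTIONS = {0x1A, 0x8000_0002, 0x8000_0003, 0x8000_0004}

# Functions whose results also depend on the leaf (from the commit message).
NEEDS_LEAF = {0x4, 0x7, 0xD, 0x4000_0001, 0x8000_001D}


def can_inline_cpuid(function, leaf=None):
    """Return True if the CPUID result could be encoded directly.

    `function` and `leaf` are ints when constant propagation resolved them,
    or None when the value is unknown at compile time.
    """
    if function is None:
        return False
    if function in NON_CONSTANT_FUNCTIONS:
        return False
    if function in NEEDS_LEAF and leaf is None:
        return False
    return True
```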
Ryan Houdek
842c57e221 CPUID: Constify some functions
These don't modify CPUIDEmu state.
2023-09-24 17:25:38 -07:00
Ryan Houdek
93aeb157b4
Merge pull request #3149 from Sonicadvance1/fail_on_change
InstCountCI: Fail CI if there was any difference.
2023-09-24 17:23:52 -07:00
Ryan Houdek
02ff9f200c InstCountCI: Upload diff and check for failure 2023-09-24 17:14:08 -07:00
Ryan Houdek
f65b40f298 InstCountCI: Fail if inst count has changed 2023-09-24 17:14:08 -07:00
Ryan Houdek
c38beff826
Merge pull request #3148 from Sonicadvance1/add_negative_primaries
InstCountCI: Adds negative immediate primary tests
2023-09-24 17:13:34 -07:00
Ryan Houdek
94c22b2269 InstCountCI: Adds negative immediate primary tests
Noticed these were missing
2023-09-24 17:02:58 -07:00
Ryan Houdek
bee97309f6
Merge pull request #3147 from alyssarosenzweig/opt/0924
More opts to the dispatcher + 1 to the JIT
2023-09-24 17:01:37 -07:00
Alyssa Rosenzweig
331941dec6 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 19:52:35 -04:00
Alyssa Rosenzweig
8798e0cba0 Arm64: Rewrite Set/GetRoundingMode
I went auditing for places to use cset and what I found was hot garbage.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 19:52:35 -04:00
Alyssa Rosenzweig
c5fc03dac4 OpcodeDispatcher: Use cset for blsr/etc flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 19:52:35 -04:00
Alyssa Rosenzweig
e63871ed2e OpcodeDispatcher: Handle sub in CalculateOF
Gets us the constant source optimization without more code duplication. And
honestly I prefer the combined presentation.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 19:52:35 -04:00
Alyssa Rosenzweig
ea8b7633eb OpcodeDispatcher: Optimize OF calc of immediates
If we know the sign of one of the sources, we can do better when calculating OF.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-24 18:16:09 -04:00
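To illustrate the idea behind ea8b7633eb: when one source of an add is an immediate with a known sign, the usual two-XOR overflow formula collapses to a single and-not. The sketch below shows this for add only. The helper names are hypothetical and this is not the dispatcher's actual code.

```python
def sign(x, bits):
    """Sign bit of a `bits`-wide value."""
    return (x >> (bits - 1)) & 1


def of_add_generic(a, b, bits):
    """Textbook OF for add: both sources differ in sign from the result."""
    res = (a + b) & ((1 << bits) - 1)
    return sign((a ^ res) & (b ^ res), bits)


def of_add_known_sign(a, imm, bits):
    """OF for add when the sign of `imm` is known ahead of time."""
    res = (a + imm) & ((1 << bits) - 1)
    if sign(imm, bits) == 0:
        # Non-negative immediate: overflow only if a >= 0 and the result < 0.
        return sign(~a & res, bits)
    # Negative immediate: overflow only if a < 0 and the result >= 0.
    return sign(a & ~res, bits)
```

In a JIT the branch on the immediate's sign happens at compile time, so only one of the two and-not forms is emitted.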
Ryan Houdek
e795ec683d
Merge pull request #3139 from Sonicadvance1/workaround
FEXServerClient: Adds back ServerSocketPath config option
2023-09-23 17:09:04 -07:00
Ryan Houdek
6dc5c0d3be
Merge pull request #3144 from Sonicadvance1/optimize_redundant_store_load
RCLSE: Optimize redundant store->load operations
2023-09-23 17:06:10 -07:00
Ryan Houdek
eb5e0be569 FEXServerClient: Adds back ServerSocketPath config option
This option was disabled a few months ago when we switched the server
socket from a filesystem unix socket to an abstract socket.
This partially broke our chroot scripts which relied on this option
existing.

Re-adds support for an explicitly named abstract socket, with the name taken
from config.

This is a workaround for dealing with chroots that change users.
They end up changing a user while doing operations and then can't
connect to the FEXServer anymore because environment variables have been
wiped away.
2023-09-23 16:59:58 -07:00
Ryan Houdek
be3ff804a6 InstCountCI: Update for optimization 2023-09-23 06:11:35 -07:00
Ryan Houdek
9ab2967d71 Arm64: Fixes wide shifts
movprfx is invalid to use when the source register matches the movprfx
destination.

This was picked up by `TwoByte/0F_D1.asm` now that RCLSE is
working better.
2023-09-23 06:06:18 -07:00
Ryan Houdek
d01b457727 RCLSE: Optimize redundant store->load operations
The bug that was causing crashes with this was due to inline syscalls.
Now that this is fixed we can re-enable store->load operations.

This allows constant propagation to work significantly better, which
means inline syscalls start working again. This can significantly
improve syscall performance in some cases.

This is most likely to improve performance in dxsetup and vc_redist but
hard to get a real profile.

Additionally this will let us inline cpuid results in the future which
is pretty nice.
2023-09-23 06:06:18 -07:00
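The store->load optimization described in d01b457727 can be sketched on a toy instruction list. This is an assumption-laden illustration, not the RCLSE pass: the tuple-based IR, the "call" barrier, and `forward_stores` are all hypothetical, and every access is assumed to be the same size with no partial overlaps.

```python
def forward_stores(ops):
    """Replace a load that follows a store to the same offset with a copy.

    ops is a list of tuples:
      ("store", offset, value_name)
      ("load", dest_name, offset)
      ("call",)  # hypothetical barrier that may clobber the stored state
    """
    known = {}  # offset -> name of the value currently held there
    out = []
    for op in ops:
        kind = op[0]
        if kind == "store":
            _, offset, value = op
            known[offset] = value
            out.append(op)
        elif kind == "load":
            _, dest, offset = op
            if offset in known:
                # The value is already in a register, so skip the memory access.
                out.append(("copy", dest, known[offset]))
            else:
                out.append(op)
                known[offset] = dest
        else:
            # Anything else is treated conservatively as a barrier.
            known.clear()
            out.append(op)
    return out
```

Once the load becomes a plain copy of a known value, constant propagation can see through it, which is the effect the commit message credits for inline syscalls working again.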
Mai
4e9a114858
Merge pull request #3142 from Sonicadvance1/inline_syscall_fix
Arm64: Fixes inline syscalls
2023-09-23 09:03:49 -04:00
Mai
72d092e951
Merge pull request #3141 from Sonicadvance1/fix_simm9_range
ConstProp: Fixes unscaled signed 9-bit range
2023-09-23 09:03:01 -04:00
Mai
da3e172857
Merge pull request #3140 from Sonicadvance1/fix_core_sanitization
Config: Fixes core sanitization
2023-09-23 09:01:42 -04:00