Commit Graph

10160 Commits

Author SHA1 Message Date
Billy Laws
9d9bd750e2 ARM64EC: Install a custom call checker to bypass NTDLL function patches
Some programs will hook the NTDLL exports that FEX depends on. The
regular ARM64EC call checker will detect such patches and invoke the
JIT to run them, which leads to infinite recursion if those same
exports are used during code compilation. Fix this by resolving all
patchable FFSs to their native ARM implementations for all indirect
calls performed by FEX, skipping any x86 patches.
2024-08-09 11:48:18 +00:00
Ryan Houdek
85d1b573ef
Merge pull request #3927 from bylaws/winafp
ARM64EC: Set appropriate AFP and SVE256 state on JIT entry/exit
2024-08-08 22:21:23 -07:00
Ryan Houdek
7b1d9540b7
Merge pull request #3925 from bylaws/arm64ecrt
ARM64EC: Introduce FEX-side CRT and Windows API replacements
2024-08-08 17:57:25 -07:00
Ryan Houdek
1007f874bf
Merge pull request #3926 from bylaws/windef
FEXCore: Drop deferred signal handling on Windows
2024-08-08 17:33:44 -07:00
Ryan Houdek
24ea4b7537
Merge pull request #3924 from bylaws/svc
ARM64EC: Handle direct syscall instructions
2024-08-08 17:32:56 -07:00
Mai
4882f10536
Merge pull request #3888 from Sonicadvance1/avx128_optimize_blends
AVX128: Optimize blends
2024-08-07 17:08:19 -04:00
Billy Laws
fe43a2bcb2 ARM64EC: Set appropriate AFP and SVE256 state on JIT entry/exit 2024-08-07 18:34:35 +00:00
Billy Laws
6700511cdf FEXCore: Drop deferred signal handling on Windows
The async signal issues this handles do not exist on Windows.
2024-08-07 18:31:48 +00:00
Billy Laws
e836639427 ARM64EC: Manually define the ARM64EC linker structures 2024-08-07 15:49:41 +00:00
Billy Laws
7eb3f33162 Update jemalloc submodule 2024-08-07 15:49:41 +00:00
Billy Laws
b35a514ea6 CMake: Don't link CMake's set of extra system libraries on Windows
These are already linked in by default with clang, so having these set
here only served to prevent the -nostdlib compiler option from having
any effect, as CMake will always explicitly include the libs in the
compiler cmdline.
2024-08-07 15:49:41 +00:00
Billy Laws
8f372821f8 Windows: Use the FEX CRT 2024-08-07 15:49:41 +00:00
Billy Laws
6376d06bed Windows: Introduce a minimal Windows API replacement 2024-08-07 15:49:41 +00:00
Billy Laws
1bb293d4be Windows: Pull in some math and formatting functions from Musl 2024-08-07 15:49:41 +00:00
Billy Laws
0e069f1e97 Windows: Introduce a minimal CRT replacement
It is dangerous for FEX to rely on the system CRT as calls can have side
effects that are also visible to the running application, and if patches
are applied to any CRT exports used during compilation the call checker
would trigger a reentry into the JIT to compile the patch and hence
deadlock. Only functions that FEX actively uses are implemented, with
the rest triggering an abort.
2024-08-07 15:49:41 +00:00
Billy Laws
9074b810a9 CMake: Always enable jemalloc for MinGW builds 2024-08-07 15:49:41 +00:00
Mai
c5fe8723d3
Merge pull request #3918 from Sonicadvance1/constexpr_config_maps
Config: Converts two LUT maps over linear scan arrays
2024-08-07 10:40:22 -04:00
Ryan Houdek
434bffac33
Merge pull request #3916 from Sonicadvance1/frontend_hostfeatures
FEX: Moves HostFeatures querying to the frontend
2024-08-07 05:33:55 -07:00
Ryan Houdek
e84848b16b
FEX: Moves HostFeatures querying to the frontend
This moves the CPU feature querying to the frontend. The primary purpose
here is for the wow64 frontend to not require linux-isms for querying
these features. This is required since non-Linux environments don't have
the "CPUID" feature for reading EL1 MSRs in EL0.

Wiring up the remaining wow64 registry querying is left for a future
exercise.

This also technically removes an xbyak requirement from FEXCore when
building the x86 Test harness runner, but that doesn't really matter for
regular use cases.
2024-08-07 05:26:02 -07:00
Ryan Houdek
69ed39d49e
Merge pull request #3892 from Sonicadvance1/optimize_vpermq
AVX128: Optimize all cases of vpermq
2024-08-06 20:07:28 -07:00
Ryan Houdek
230bde6aef
InstcountCI: Adds vpermq coverage 2024-08-06 09:08:30 -07:00
Ryan Houdek
c24d7aacba
unittests/ASM: Adds vpermq test that covers all immediate encodings
To ensure we cover all tests when optimizing.
2024-08-06 09:08:30 -07:00
Ryan Houdek
e613876e9d
AVX128: Optimize all cases of vpermq
Started by cherry-picking some cases from the variants that appeared when running
Steam, games, AV1 convolve tests, openssl, ffmpeg, libjpeg-turbo,
openh264, libvpx, gemmlowp, libyuv, and dav1d.

Then turned it around and optimized them all since all variants end up
needing to be split into two halves, which effectively means we need to
have 16 implementations, plus a couple of special cases for duplicated
results.

Fixes #3795
2024-08-06 09:08:30 -07:00
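The "16 implementations" figure in the commit above follows from how the vpermq immediate is laid out. The sketch below is a scalar reference model, not FEX's AVX128 code: each 2-bit field of the immediate picks one source quadword, so each 128-bit half of the result is fully described by 4 immediate bits (16 possibilities). The names `Vec256`, `VPermQ`, `LowHalfSelector`, and `HighHalfSelector` are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cstdint>

// Four 64-bit lanes standing in for one 256-bit register (illustrative only).
using Vec256 = std::array<uint64_t, 4>;

// Reference semantics of vpermq: result lane i takes the source lane
// selected by immediate bits [2*i+1 : 2*i].
constexpr Vec256 VPermQ(const Vec256& Src, uint8_t Imm) {
  Vec256 Result {};
  for (int i = 0; i < 4; ++i) {
    Result[i] = Src[(Imm >> (i * 2)) & 0b11];
  }
  return Result;
}

// When the operation is split into two 128-bit halves, each half only
// depends on one nibble of the immediate, giving 16 cases per half.
constexpr uint8_t LowHalfSelector(uint8_t Imm) {
  return Imm & 0xF;
}

constexpr uint8_t HighHalfSelector(uint8_t Imm) {
  return Imm >> 4;
}
```

An immediate such as 0x44 has identical nibbles, so both halves of the result are the same; this is one way to read the "duplicated results" special cases mentioned in the message, though the log does not say which immediates FEX actually special-cases.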
Mai
1473129a8f
Merge pull request #3920 from Sonicadvance1/fix_newline_asm
SpinWaitLock: Fixes missing newline in asm
2024-08-06 12:08:14 -04:00
Alyssa Rosenzweig
c42808cb70
Merge pull request #3904 from Sonicadvance1/packaging
Scripts: Workaround deprecated parse_version
2024-08-06 11:52:59 -04:00
Ryan Houdek
054c119e2e
Config: Converts two LUT maps over linear scan arrays
These two maps used for environment lookup translations were getting
globally initialized and then registered atexit handlers.

Switch over to a constexpr array and just do linear scans. This plus
short-circuiting the environment loader so it skips all entries that
don't start with `FEX_` has the side benefit of cutting the CPU time to
1/10th.

This plus #3917 removes the global static initializers entirely from
this file.
2024-08-06 07:44:08 -07:00
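The approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the enum, the table entries, and the `LookupEnv` name are all hypothetical. It shows a constexpr array that needs no runtime initializer or atexit handler, scanned linearly, with an early exit for entries that do not start with `FEX_`.

```cpp
#include <array>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical option identifiers; the real config enum is not shown in the log.
enum class ConfigOption { ExampleA, ExampleB, ExampleC };

// constexpr table: built at compile time, so there is no global
// constructor to run and no destructor to register with atexit.
constexpr std::array<std::pair<std::string_view, ConfigOption>, 3> EnvTable {{
  {"FEX_EXAMPLE_A", ConfigOption::ExampleA},
  {"FEX_EXAMPLE_B", ConfigOption::ExampleB},
  {"FEX_EXAMPLE_C", ConfigOption::ExampleC},
}};

// Takes one "NAME=value" environment entry and returns the matching option, if any.
constexpr std::optional<ConfigOption> LookupEnv(std::string_view Entry) {
  // Short-circuit: most environment entries are unrelated and are rejected here.
  if (Entry.substr(0, 4) != "FEX_") {
    return std::nullopt;
  }

  // Compare only the name part, up to '=' (or the whole entry if there is none).
  const std::string_view Name = Entry.substr(0, Entry.find('='));
  for (const auto& [Key, Value] : EnvTable) {
    if (Key == Name) {
      return Value;
    }
  }
  return std::nullopt;
}
```

For a table this small, a linear scan over contiguous data is usually cheap compared with building and tearing down a hash map at process start and exit.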
Alyssa Rosenzweig
f75bd2f09b
Merge pull request #3922 from bylaws/structs
Windows: Pull in additional method and structure definitions from wine
2024-08-06 09:29:03 -04:00
Alyssa Rosenzweig
a7424416d9
Merge pull request #3921 from bylaws/reloadf
Arm64Emitter: Reload STATE before SRA fill on ARM64EC
2024-08-06 09:28:23 -04:00
Alyssa Rosenzweig
cadb0a2ddb
Merge pull request #3923 from bylaws/except
ARM64EC: Improvements to exception flag handling
2024-08-06 09:27:50 -04:00
Alyssa Rosenzweig
2da819c0f3
Merge pull request #3919 from Sonicadvance1/remove_vestigial_vixl_usage
CodeEmitter: Removes vestigial vixl usage
2024-08-06 09:26:24 -04:00
Alyssa Rosenzweig
e0c783de74
Merge pull request #3917 from Sonicadvance1/remove_static_vector
Config: Removes a static vector initializer
2024-08-06 09:26:03 -04:00
Billy Laws
6c003fcb9a Windows: Pull in more method/structure definitions from wine 2024-08-05 19:23:18 +00:00
Billy Laws
c4faffc0e2 Windows: Add complete NTDLL export definitions
Generated from wine's ntdll.spec
2024-08-05 19:22:12 +00:00
Billy Laws
cc2d21f411 WOW64: Resolve the wine unix call dispatcher at runtime 2024-08-05 19:22:12 +00:00
Billy Laws
59686a6c60 ARM64EC: Clear TF in the exception resumption context after a trap
Matches Windows behaviour.
2024-08-05 17:38:45 +00:00
Billy Laws
0ab864da17 ARM64EC: Reset the CPU area JIT state before handling exceptions
An exception in JIT code acts as a transition to ARM64EC code (in
NTDLL for exception handling etc.), so, much like ExitFunction,
InSimulation must be unset. InSyscallCallback is unset for robustness
against exceptions in the JIT itself.
2024-08-05 17:38:45 +00:00
Billy Laws
3c32271dd0 ARM64EC: Merge EFlags with the current JIT flags state on a ctx sync
Only NZCV and TF are passed through to BeginSimulation as the rest are
lost when converting to a native context and back on the ntdll side. To
prevent thread suspension from wiping out the rest of the flags, only
copy these specific flags into the current JIT EFlags state.
2024-08-05 17:38:45 +00:00
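The selective copy described in the commit above amounts to a masked merge. The sketch below is an illustration under stated assumptions, not FEX's code: the bit positions are the architectural x86 EFLAGS positions, NZCV is assumed to correspond to SF, ZF, CF, and OF, and the names `SyncedMask` and `MergeEFlags` are hypothetical.

```cpp
#include <cstdint>

// Architectural x86 EFLAGS bit positions.
constexpr uint32_t EFLAGS_CF = 1u << 0;
constexpr uint32_t EFLAGS_ZF = 1u << 6;
constexpr uint32_t EFLAGS_SF = 1u << 7;
constexpr uint32_t EFLAGS_TF = 1u << 8;
constexpr uint32_t EFLAGS_OF = 1u << 11;

// Flags assumed to survive the round trip through a native context:
// the NZCV equivalents (SF, ZF, CF, OF) plus the trap flag.
constexpr uint32_t SyncedMask = EFLAGS_CF | EFLAGS_ZF | EFLAGS_SF | EFLAGS_TF | EFLAGS_OF;

// Take only the synced flags from the incoming context and keep every
// other bit (DF, PF, AF, ...) from the current JIT state.
constexpr uint32_t MergeEFlags(uint32_t Current, uint32_t Incoming) {
  return (Current & ~SyncedMask) | (Incoming & SyncedMask);
}
```

With this shape, a context sync that carries only the synced flags cannot clear unrelated bits such as DF in the current state.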
Billy Laws
507a95b817 ARM64EC: Map TF to PSTATE.SS when reconstructing a native context 2024-08-05 17:38:45 +00:00
Billy Laws
9bb9e954c2 ARM64EC: Spill EFlags when reconstructing state from within the JIT 2024-08-05 17:38:45 +00:00
Billy Laws
7f3582bb23 ARM64EC: Only clear the trap flag when handling an exception
Better matches Windows emulator behaviour.
2024-08-05 17:38:45 +00:00
Billy Laws
efe15ce336 ARM64EC: Handle direct syscall instructions
Most syscalls on Windows are done by calling into their NTDLL thunks,
however some DRMs parse out their numbers from NTDLL and directly call
them. Support this by redirecting to their entry thunks in the FEX
syscall handler.
2024-08-05 17:35:32 +00:00
Billy Laws
21b0f35ef4 ARM64EC: Populate a LUT mapping NTDLL FFS exports to their native impls
To prevent FEX from redirecting to x86 code when NTDLL exports it calls
into are patched, a custom call checker will be used that checks this LUT
to redirect calls rather than the FFS itself.
2024-08-05 17:35:32 +00:00
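The LUT idea in the commit above can be pictured as an address-to-address map that is consulted before an indirect call. The sketch below is a simplified model under assumptions, not FEX's implementation: the `NativeExportLUT` class and its methods are hypothetical, and plain integers stand in for export and implementation addresses.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical lookup table from the address of a patchable export
// (its FFS) to the address of the native implementation behind it.
class NativeExportLUT {
public:
  // Record one export at load time, before any program can patch it.
  void Add(uintptr_t ExportAddress, uintptr_t NativeImpl) {
    Map[ExportAddress] = NativeImpl;
  }

  // Resolve an indirect call target: known exports go straight to their
  // native implementation, anything else is returned unchanged.
  uintptr_t Resolve(uintptr_t Target) const {
    const auto It = Map.find(Target);
    return It != Map.end() ? It->second : Target;
  }

private:
  std::unordered_map<uintptr_t, uintptr_t> Map;
};
```

Because the lookup is keyed on the export's address rather than its current contents, a later x86 patch written over the export does not change where calls made through `Resolve` end up.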
Billy Laws
ccf332d48e Arm64Emitter: Reload STATE before SRA fill on ARM64EC
While ARM64EC code cannot use x28, it can be cleared by the kernel
when performing syscalls etc., so restore it from the TEB to be safe.
2024-08-05 17:31:01 +00:00
Ryan Houdek
802a32ce8a
SpinWaitLock: Fixes missing newline in asm
This would cause the atomic load after the wfe to be dropped,
effectively returning stale data.
2024-08-04 06:57:05 -07:00
Ryan Houdek
70c02d5c58
ARM64Emitter: Removes unused vixl CPU object 2024-08-03 22:26:00 -07:00
Ryan Houdek
2e4fb47848
HostFeatures: Read VL ourselves
Instead of calling out to vixl
2024-08-03 22:26:00 -07:00
Ryan Houdek
a4d5302369
Arm64: Adds Int helpers
One more vixl step removed.
2024-08-03 21:40:28 -07:00
Ryan Houdek
6ff3c90af3
CodeEmitter: Removes vestigial vixl usage
- IsImmLogical already existed in our CodeEmitter. We just forgot to
  allow nullptr arguments and to use it.
- Adds an equivalent IsImmAddSub helper and uses it

This gets us closer to removing vixl's global initializers from FEXCore.
2024-08-03 21:04:56 -07:00
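For context on the helper named in the commit above: an AArch64 ADD/SUB (immediate) instruction encodes an unsigned 12-bit value, optionally shifted left by 12 bits. The sketch below shows one way such a check can be written; it is an assumption about the general shape and is not taken from FEX's CodeEmitter, whose actual signature is not shown in the log.

```cpp
#include <cstdint>

// Returns true if Imm can be encoded directly in an AArch64 ADD/SUB
// (immediate) instruction: a 12-bit unsigned value, either unshifted
// or shifted left by 12.
constexpr bool IsImmAddSub(uint64_t Imm) {
  constexpr uint64_t Mask12 = 0xFFF;
  const bool FitsUnshifted = (Imm & ~Mask12) == 0;
  const bool FitsShifted = (Imm & ~(Mask12 << 12)) == 0;
  return FitsUnshifted || FitsShifted;
}

// Compile-time spot checks of the two encodable forms.
static_assert(IsImmAddSub(0xFFF));
static_assert(IsImmAddSub(0xFFF000));
static_assert(!IsImmAddSub(0x1001));
```

An emitter can use a predicate like this to choose between a single ADD/SUB with an immediate and materializing the constant in a register first.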
Ryan Houdek
c114279118
Config: Removes a static vector initializer
Saw this vector was getting initialized at runtime, sticking around, and
installing an atexit handler. This is completely unnecessary, just use
the OPT_BASE handler directly to walk the environment variable names.
2024-08-03 19:13:29 -07:00
Ryan Houdek
7ffd3e55d5
Merge pull request #3915 from bylaws/winbase
ARM64EC: Support the JIT API as is used by Windows
2024-08-02 10:56:46 -07:00