7955 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
b1231c24ef OpcodeDispatcher: Omit AF xor for common constants
The only reason we need to XOR arguments for AF is to get bit 4 correct. But if
the operand in question is known to have bit 4 clear, the XOR will be an
effective no-op and can be skipped. This saves an instruction in a bunch of
common cases, like inc/dec. If we dedicated a register to AF to eliminate the
store, we would not save an instruction from this but would still come out ahead
due to an eor turning into a (zero cycle?) mov that can be handled by the
renamer.
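
A minimal sketch of why the XOR can be dropped, assuming AF is taken from bit 4 of `a ^ b ^ result` as the message describes. The function names are hypothetical and are not FEX's own helpers:

```cpp
#include <cassert>
#include <cstdint>

// Reference form: AF is bit 4 of (a ^ b ^ result).
inline bool af_full(uint32_t a, uint32_t b, uint32_t result) {
  return ((a ^ b ^ result) >> 4) & 1;
}

// Shortcut form: valid only when bit 4 of b is known to be clear,
// because XORing with b then cannot change bit 4.
inline bool af_skip_xor(uint32_t a, uint32_t b, uint32_t result) {
  assert((b & 0x10) == 0 && "shortcut requires bit 4 of b to be clear");
  return ((a ^ result) >> 4) & 1;
}

// Checks both forms agree for an 8-bit inc (b = 1) over every input.
inline bool shortcut_matches_for_inc() {
  for (uint32_t a = 0; a < 256; ++a) {
    uint32_t result = (a + 1) & 0xFF;
    if (af_full(a, 1, result) != af_skip_xor(a, 1, result)) {
      return false;
    }
  }
  return true;
}
```

The same argument holds for any constant operand whose bit 4 is zero, which is why inc/dec (operand 1) qualify.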

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-22 19:08:26 -04:00
Alyssa Rosenzweig
699aa85c4b OpcodeDispatcher: Opt PF selection
Fold the and in.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-22 19:07:42 -04:00
Alyssa Rosenzweig
2d65a3677b OpcodeDispatcher: Optimize NZCV selects
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-22 19:07:42 -04:00
Alyssa Rosenzweig
2a2619c0f5 IR: Add bit masking selects
Add new synthetic condition codes that do an AND as their relational operator,
testing the result. This is 1 IR op for things like

  (A & B) == 0 ? C : D

This can translate to

  tst A, B
  csel C, D, eq

In the future, if A is the NZCV register and B is a supported immediate, eg

  (NZCV & 0x80000000) == 0 ? C : D

this will be able to translate to a single instruction with the appropriate
condition

  csel C, D, pl

but that needs RA support.
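
The semantics of such a select can be modelled in a few lines. This is only an illustrative sketch: the `Cond` enum and function names are hypothetical and do not reflect the actual IR definitions.

```cpp
#include <cstdint>

// Hypothetical condition codes whose relational operator is an AND.
enum class Cond { AndZero, AndNonZero };

// Models the single IR op: test (a & b), then pick c or d.
inline uint64_t select_mask(Cond cond, uint64_t a, uint64_t b,
                            uint64_t c, uint64_t d) {
  bool zero = (a & b) == 0;                              // tst a, b
  bool take = (cond == Cond::AndZero) ? zero : !zero;    // eq / ne
  return take ? c : d;                                   // csel
}

// The NZCV case from the message: bit 31 is the N flag, so
// "(NZCV & 0x80000000) == 0" is the same test as the "pl" condition.
inline uint64_t select_if_n_clear(uint32_t nzcv, uint64_t c, uint64_t d) {
  return select_mask(Cond::AndZero, nzcv, 0x80000000u, c, d);
}
```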

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-22 19:07:42 -04:00
Ryan Houdek
797c890ff6
Merge pull request #2874 from bylaws/wowfex
Add WOW64 JIT frontend
2023-09-22 15:47:59 -07:00
Ryan Houdek
879b41c184
Merge pull request #3134 from Sonicadvance1/remove_x86_jit
FEXCore: Removes x86 JIT.
2023-09-22 15:36:47 -07:00
Ryan Houdek
0fbf403787 Adds back in host testharnessrunner CI
Necessary for asm tests to still run in the host "core".
Useful for ensuring correct behaviour of our assembly tests.
2023-09-22 14:46:03 -07:00
Billy Laws
04cf418452 Windows: Add SPDX license identifiers 2023-09-22 10:12:40 -07:00
Billy Laws
057a7c6ee8 WOW64: Implement thread suspension handling
This provides more robust handling than a signal based approach, as the
suspender is able to wait for the suspendee to reach a suitable position and
flush its context to memory before returning.
2023-09-22 10:12:40 -07:00
Billy Laws
3d6955592b WOW64: Implement partial self-modifying code handling
This should support most simple cases of SMC, however programs which make use
of separate shared memory mappings for writing and execution are not handled.
The overall approach is the same as is done for linux, where RWX mappings are
protected to RX and then when a write occurs the signal handler invalidates the
faulting page and reprotects it to RWX until code in that page is jitted again.
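
The protect/invalidate/reprotect cycle described above can be modelled as a small per-page state machine. This sketch only simulates the bookkeeping; it makes no real protection calls, and `SmcTracker`, its methods, and the 4 KiB page size are assumptions for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

enum class Prot { RX, RWX };

// Hypothetical model of the SMC bookkeeping, not the real handler.
class SmcTracker {
public:
  static constexpr uint64_t kPageSize = 4096;

  // Code in this page was jitted: drop write access so writes fault.
  void OnCodeJitted(uint64_t addr) { pages_[addr / kPageSize] = Prot::RX; }

  // Models the write-fault path. Returns true when the fault was caused
  // by our protection, meaning cached code for the page is invalidated
  // and the page is restored to RWX.
  bool OnWriteFault(uint64_t addr) {
    auto it = pages_.find(addr / kPageSize);
    if (it == pages_.end() || it->second != Prot::RX) {
      return false;  // Not a page we protected; not an SMC fault.
    }
    it->second = Prot::RWX;
    ++invalidations_;
    return true;
  }

  std::size_t Invalidations() const { return invalidations_; }

private:
  std::unordered_map<uint64_t, Prot> pages_;
  std::size_t invalidations_ = 0;
};
```

In this model a page only costs a fault on the first write after it was jitted; further writes proceed at full speed until code from that page is compiled again.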
2023-09-22 10:12:40 -07:00
Billy Laws
f57aee0a62 WOW64: Add a templated interval list implementation
Stores binary intervals in a sorted vector container, to be used for SMC
handling.
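
For readers unfamiliar with the data structure, a sorted-vector interval list might look roughly like the following. This is a hypothetical sketch (class name, members, and the merge-on-touch policy are all assumptions), not the implementation added by this commit:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

template <typename T>
class IntervalList {
public:
  struct Interval {
    T Offset;  // inclusive start
    T End;     // exclusive end
  };

  // Inserts [Offset, End), merging with overlapping or touching entries
  // so the vector stays sorted and disjoint.
  void Insert(Interval entry) {
    if (entry.Offset >= entry.End) {
      return;  // Ignore empty intervals.
    }
    // First stored interval that ends at or after the new start.
    auto first = std::lower_bound(
        items_.begin(), items_.end(), entry.Offset,
        [](const Interval& i, T off) { return i.End < off; });
    auto last = first;
    while (last != items_.end() && last->Offset <= entry.End) {
      entry.Offset = std::min(entry.Offset, last->Offset);
      entry.End = std::max(entry.End, last->End);
      ++last;
    }
    first = items_.erase(first, last);
    items_.insert(first, entry);
  }

  // Binary search for the interval containing addr, if any.
  bool Contains(T addr) const {
    auto it = std::upper_bound(
        items_.begin(), items_.end(), addr,
        [](T a, const Interval& i) { return a < i.End; });
    return it != items_.end() && it->Offset <= addr;
  }

  std::size_t Size() const { return items_.size(); }

private:
  std::vector<Interval> items_;
};
```

Keeping the intervals in one contiguous sorted vector makes lookups a cache-friendly binary search, which suits a structure that is queried far more often than it is modified.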
2023-09-22 10:12:40 -07:00
Billy Laws
c978fdd12f WOW64: Implement basic code invalidation handling 2023-09-22 10:12:40 -07:00
Billy Laws
19713bd20a WOW64: Implement exception handling with context restoration
When an exception occurs, pretend that we were just at the point of JIT entry
so the stack can be unwound to the wow64 SEH handler, which then handles
dispatching the exception to the x86 guest with the restored context.
2023-09-22 10:12:40 -07:00
Billy Laws
22b1fea96d WOW64: Handle unaligned atomic accesses
This is done in EnsureConsistentState rather than as a VEH to avoid needing to
go through all of wine's exception handling logic for such a hot path.
2023-09-22 10:12:40 -07:00
Billy Laws
be4fcaf65c WOW64: Report CPU features based off of the emulated cpuid 2023-09-22 10:12:40 -07:00
Billy Laws
2add8a7751 Windows: Introduce a barebones FEXCore-based WOW64 BT module
This allows for running x86 applications under wine without having to run all
of wine under FEX. The JIT is invoked when running application code and then
left when handling NT syscalls or unix calls to e.g. the Vulkan driver.
2023-09-22 10:12:40 -07:00
Billy Laws
9612133088 Windows: Generate import libraries for private ntdll and wow64 APIs
The MinGW supplied import libraries are incomplete and miss a lot of
functions necessary to implement lower level windows code. To avoid
needing to manually resolve every function, pull in .def files from wine
that detail the entire ntdll and wow64 APIs.
2023-09-22 10:12:40 -07:00
Billy Laws
f46fd42977 Windows: Add a minimal set of wine-derived headers
These are cut down versions of wine headers containing only what is necessary
for WOW. This shouldn't carry any license implications for FEX, as per the
LGPLv3 license:

```
The object code form of an Application may incorporate material from a header
file that is part of the Library. You may convey such object code under terms
of your choice, provided that, if the incorporated material is not limited to
numerical parameters, data structure layouts and accessors, or small macros,
inline functions and templates (ten or fewer lines in length), you do both of
the following:

a) Give prominent notice with each copy of the object code that the Library is
used in it and that the Library and its use are covered by this License.
b) Accompany the object code with a copy of the GNU GPL and this license
document.
```
2023-09-22 10:12:40 -07:00
Billy Laws
51f8c83c76 Context: Add an alternative thread-oriented execute function 2023-09-22 10:12:40 -07:00
Billy Laws
d641d3f61e OpcodeDispatcher: Avoid redundantly passing args to WIN32 ABI syscalls 2023-09-22 10:12:39 -07:00
Ryan Houdek
02ae59a348 github: Disables default build test on x64 2023-09-21 18:30:03 -07:00
Ryan Houdek
64df9e31c6 github: Remove mingw tests from x86 CI 2023-09-21 18:30:03 -07:00
Ryan Houdek
d32bb993a8 github: Remove glibc fault tests from x86 CI 2023-09-21 18:30:03 -07:00
Ryan Houdek
b5cc9a12f2 FEXCore: Removes x86 JIT.
This is blocking performance improvements. This backend is almost
universally unused except for when I'm testing if games run on Radeon
video drivers.

Hopefully AmpereOne and Orin/Grace can fulfill this role when they
launch next year.
2023-09-21 18:30:02 -07:00
Ryan Houdek
65b6df9dbb
Merge pull request #3133 from Sonicadvance1/remove_vestigial_interpreter
FEXCore: Removes vestigial Interpreter code
2023-09-21 18:15:32 -07:00
Ryan Houdek
31564354b1 FEXCore: Removes vestigial Interpreter code 2023-09-21 15:49:49 -07:00
Ryan Houdek
fea72ce19c
Merge pull request #3120 from Sonicadvance1/more_optimal_x87
FEXCore: Support preserve_all ABI for interpreter fallbacks
2023-09-21 15:35:37 -07:00
Ryan Houdek
2b7e1d10ec
Merge pull request #3131 from Sonicadvance1/optimize_btr
OpcodeDispatcher: Optimize lock btr
2023-09-21 15:06:55 -07:00
Ryan Houdek
5444810d64
Merge pull request #3132 from alyssarosenzweig/opt/orlshl
Optimize reconstructing x87, harder
2023-09-21 15:02:37 -07:00
Ryan Houdek
4a2ceabfdd InstCountCI: Add atomic bit test instructions
These all can likely be more optimal.
2023-09-21 14:54:51 -07:00
Ryan Houdek
1a4d1d820b OpcodeDispatcher: Optimize lock btr
This is an AtomicFetchCLR, which removes two back-to-back mvn instructions
negating the source.

We didn't have this instruction combination in InstCountCI so will be a
bit hard to see.
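
A rough model of the idea, using `std::atomic` to stand in for the hardware operation. The helper names are hypothetical; the point is only that a "fetch and clear" primitive takes the bit mask directly:

```cpp
#include <atomic>
#include <cstdint>

// Models a fetch-and-clear: atomically clears `bits` in memory and
// returns the previous value. Arm's LDCLR behaves this way natively.
inline uint64_t atomic_fetch_clr(std::atomic<uint64_t>& mem, uint64_t bits) {
  return mem.fetch_and(~bits);
}

// lock btr: atomically clear one bit and report its old value (CF).
inline bool lock_btr(std::atomic<uint64_t>& mem, unsigned bit) {
  uint64_t mask = uint64_t{1} << (bit & 63);
  uint64_t old = atomic_fetch_clr(mem, mask);
  return (old & mask) != 0;
}
```

When the same operation is phrased as a fetch-and, the frontend must first invert the mask, and a backend that lowers fetch-and onto a clear-style instruction inverts it again. Those two inversions cancel, which is presumably the pair of mvn instructions this change removes.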
2023-09-21 14:54:51 -07:00
Ryan Houdek
0ae4bbb9c5 IR: Implements support for AtomicFetchCLR
This is the native ARM operation rather than fetchAnd. Will make an
instruction slightly more optimal.
2023-09-21 14:54:51 -07:00
Ryan Houdek
7d99eb05c6
Merge pull request #3128 from alyssarosenzweig/rm/interp
FEXCore: Gut interpreter
2023-09-21 14:51:44 -07:00
Alyssa Rosenzweig
8247ded2cf unittests: Remove stale comments
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 12:48:12 -04:00
Alyssa Rosenzweig
c52741c813 FEXCore: Gut interpreter
It is scarcely used today, and like the x86 jit, it is a significant
maintenance burden complicating work on FEXCore and arm64 optimization. Remove
it, bringing us down to 2 backends.

1 down, 1 to go.

Some interpreter scaffolding remains for x87 fallbacks. That is not a problem
here.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 12:48:12 -04:00
Alyssa Rosenzweig
75ffbc16f2 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 09:14:08 -04:00
Alyssa Rosenzweig
1596e33f58 OpcodeDispatcher: Remove pointless or
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 09:13:41 -04:00
Alyssa Rosenzweig
07d03f1610 OpcodeDispatcher: Don't opencode bfe, badly
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 09:13:41 -04:00
Alyssa Rosenzweig
a8b48dcacd OpcodeDispatcher: Swap some selects
...if it lets us use cset.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 09:13:41 -04:00
Alyssa Rosenzweig
bb87b2a19d OpcodeDispatcher: Use more Orlshl
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 09:13:41 -04:00
Alyssa Rosenzweig
19eff62c77 OpcodeDispatcher: Use orlshl for FCW
Potentially easier on the RA (bfi has a tied operand), mostly whatever here.
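
As a sketch of the operation involved: an "or with left-shifted operand" combines two independent sources into a fresh destination, whereas bfi must read and write the same register. The helper names and the choice of FCW fields below are illustrative assumptions, not FEX's code:

```cpp
#include <cstdint>

// Models `orr dst, a, b, lsl #shift`: no tied operand, dst is independent.
inline uint32_t orlshl(uint32_t a, uint32_t b, unsigned shift) {
  return a | (b << shift);
}

// Hypothetical FCW assembly from separately tracked fields.
// x87 FCW layout: precision control in bits 8-9, rounding control in
// bits 10-11; 0x007F covers the exception masks plus reserved bit 6.
inline uint16_t build_fcw(uint32_t precision, uint32_t rounding) {
  uint32_t fcw = 0x007F;
  fcw = orlshl(fcw, precision & 3, 8);
  fcw = orlshl(fcw, rounding & 3, 10);
  return static_cast<uint16_t>(fcw);
}
```

This is only equivalent to a bitfield insert when the target bits are already zero, which holds here because each field lands in a disjoint, initially clear range.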

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 08:55:25 -04:00
Mai
5fc8699db9
Merge pull request #3130 from Sonicadvance1/optimize_fsw
OpcodeDispatcher: Optimize reconstructing FSW
2023-09-21 08:35:16 -04:00
Mai
43fd159689
Merge pull request #3129 from Sonicadvance1/remove_non_explicit_selectcc
OpcodeDispatcher: Removes non-explicit SelectCC function
2023-09-21 08:33:30 -04:00
Ryan Houdek
758820ca86 InstCountCI: Update for optimized FSW reconstruction 2023-09-21 02:27:04 -07:00
Ryan Houdek
5664195e49 OpcodeDispatcher: Optimize reconstructing FSW
Minor optimization using Bfi to insert C0, C1, C2, & C3
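
For illustration, reconstructing the status word with bitfield inserts could look like the sketch below. It assumes the condition flags are tracked as separate values and uses the architectural FSW layout (C0 at bit 8, C1 at bit 9, C2 at bit 10, C3 at bit 14); the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Models Arm's bfi: insert the low `width` bits of src into dst at `lsb`.
inline uint32_t bfi(uint32_t dst, uint32_t src, unsigned lsb, unsigned width) {
  assert(width > 0 && width < 32 && lsb + width <= 32);
  uint32_t field = (uint32_t{1} << width) - 1;
  return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

// Inserts the four condition code bits into an existing status word.
inline uint16_t reconstruct_fsw(uint16_t base, bool c0, bool c1, bool c2,
                                bool c3) {
  uint32_t fsw = base;
  fsw = bfi(fsw, c0, 8, 1);
  fsw = bfi(fsw, c1, 9, 1);
  fsw = bfi(fsw, c2, 10, 1);
  fsw = bfi(fsw, c3, 14, 1);
  return static_cast<uint16_t>(fsw);
}
```

Each insert replaces what would otherwise be a shift followed by an or, so one instruction per flag is enough.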
2023-09-21 02:07:27 -07:00
Ryan Houdek
683daefc15 InstCountCI: Minor changes 2023-09-21 01:57:08 -07:00
Ryan Houdek
8e9e87f631 OpcodeDispatcher: Removes non-explicit SelectCC function
Renames the explicit sized one to `SelectCC`
Cleans up a bit of duplicated code.
2023-09-21 01:56:38 -07:00
Ryan Houdek
0a0865eb1c InstCountCI: Update for minor change 2023-09-20 18:51:18 -07:00
Ryan Houdek
d588d41ab9 InterpreterFallbacks: Converts X87 and String ops to preserve_all
This improves performance!
2023-09-20 18:51:18 -07:00
Ryan Houdek
8aa8d597f6 Arm64: Supports jumping out of the JIT with preserve_all ABI
This improves performance when jumping out of the Arm64 JIT by
reducing the number of registers we need to save.
2023-09-20 18:51:18 -07:00