mirror of https://github.com/FEX-Emu/FEX.git synced 2025-02-25 09:12:19 +00:00

Author	SHA1	Message	Date
Ryan Houdek	be3ff804a6	InstCountCI: Update for optimization	2023-09-23 06:11:35 -07:00
Ryan Houdek	9ab2967d71	Arm64: Fixes wide shifts movprfx is invalid to use when the source register matches the movprfx destination. This was getting picked up on by `TwoByte/0F_D1.asm` now that RCLSE is working better.	2023-09-23 06:06:18 -07:00
Ryan Houdek	d01b457727	RCLSE: Optimize redundant store->load operations The bug that was causing crashes with this was due to inline syscalls. Now that this is fixed we can re-enable store->load operations. This allows constant propagation to work significantly better, which means inline syscalls start working again. This can significantly improve syscall performance in some cases. This is most likely to improve performance in dxsetup and vc_redist but hard to get a real profile. Additionally this will let us inline cpuid results in the future which is pretty nice.	2023-09-23 06:06:18 -07:00
Mai	4e9a114858	Merge pull request #3142 from Sonicadvance1/inline_syscall_fix Arm64: Fixes inline syscalls	2023-09-23 09:03:49 -04:00
Mai	72d092e951	Merge pull request #3141 from Sonicadvance1/fix_simm9_range ConstProp: Fixes unscaled signed 9-bit range	2023-09-23 09:03:01 -04:00
Mai	da3e172857	Merge pull request #3140 from Sonicadvance1/fix_core_sanitization Config: Fixes core sanitization	2023-09-23 09:01:42 -04:00
Ryan Houdek	28fa0bda31	Arm64: Fixes inline syscalls Ever since we reordered registers in `X86Enums.h` this has silently been broken. This wasn't hit because RCLSE has been broken ever since SRA was added, so inline syscalls just weren't ever happening. Quick fix while I think of a way to more strictly correlate these registers so it doesn't happen again.	2023-09-23 02:56:32 -07:00
Ryan Houdek	1f2a3cfa8b	ConstProp: Fixes unscaled signed 9-bit range The range was slightly incorrect which mostly wouldn't have caused issues. The lowest byte would have just generated slightly less optimal code. The upper byte could have generated broken code, which our CI couldn't catch since TSO instructions only get enabled when multiple threads are in-flight. Easy enough to fix.	2023-09-23 01:13:54 -07:00
Ryan Houdek	571b0fe47e	Config: Fixes core sanitization This would have caused core to try and initialize a custom core on Arm64, which causes a std::function assert because it doesn't support that. Users would likely get hit by this immediately since we deleted the interpreter and shifted all the core numbers.	2023-09-23 00:52:23 -07:00
Ryan Houdek	86ad35c418	Merge pull request #3138 from alyssarosenzweig/opt/train Requiem for the x86 jit	2023-09-22 16:33:15 -07:00
Alyssa Rosenzweig	0b27029c3f	InstCountCI: Update Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-22 19:10:41 -04:00
Alyssa Rosenzweig	223a6562ff	IR: Support <32-bit TestNZ Originally this was going to use setf8/setf16, but it looks like the approach of shift-and-test turns out to be faster. As a bonus this is a nice delete-the-code win :-) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-22 19:08:26 -04:00
Alyssa Rosenzweig	b1231c24ef	OpcodeDispatcher: Omit AF xor for common constants The only reason we need to XOR arguments for AF is to get bit 4 correct. But if the operand in question is known to have bit 4 clear, the XOR will be an effective no-op and can be skipped. This saves an instruction in a bunch of common cases, like inc/dec. If we dedicated a register to AF to eliminate the store, we would not save an instruction from this but would still come out ahead due to an eor turning into a (zero cycle?) mov that can be handled by the renamer. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-22 19:08:26 -04:00
Alyssa Rosenzweig	699aa85c4b	OpcodeDispatcher: Opt PF selection Fold the and in. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-22 19:07:42 -04:00
Alyssa Rosenzweig	2d65a3677b	OpcodeDispatcher: Optimize NZCV selects Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-22 19:07:42 -04:00
Alyssa Rosenzweig	2a2619c0f5	IR: Add bit masking selects Add new synthetic condition codes that do an AND as their relational operator, testing the result. This is 1 IR op for things like `(A & B) == 0 ? C : D`. This can translate to `tst A, B; csel A, B, eq`. In the future, if A is the NZCV register and B is a supported immediate, e.g. `(NZCV & 0x80000000) == 0 ? C : D`, this will be able to translate to a single instruction with the appropriate condition, `csel A, B, pl`, but that needs RA support. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-22 19:07:42 -04:00
Ryan Houdek	797c890ff6	Merge pull request #2874 from bylaws/wowfex Add WOW64 JIT frontend	2023-09-22 15:47:59 -07:00
Ryan Houdek	879b41c184	Merge pull request #3134 from Sonicadvance1/remove_x86_jit FEXCore: Removes x86 JIT.	2023-09-22 15:36:47 -07:00
Ryan Houdek	0fbf403787	Adds back in host testharnessrunner CI Necessary for asm tests to still run in the host "core". Useful for ensuring correct behaviour of our assembly tests.	2023-09-22 14:46:03 -07:00
Billy Laws	04cf418452	Windows: Add SPDX license identifiers	2023-09-22 10:12:40 -07:00
Billy Laws	057a7c6ee8	WOW64: Implement thread suspension handling This provides more robust handling than a signal based approach, as the suspender is able to wait for the suspendee to reach a suitable position and flush its context to memory before returning.	2023-09-22 10:12:40 -07:00
Billy Laws	3d6955592b	WOW64: Implement partial self-modifying code handling This should support most simple cases of SMC, however programs which make use of separate shared memory mappings for writing and execution are not handled. The overall approach is the same as is done for linux, where RWX mappings are protected to RX and then when a write occurs the signal handler invalidates the faulting page and reprotects it to RWX until code in that page is jitted again.	2023-09-22 10:12:40 -07:00
Billy Laws	f57aee0a62	WOW64: Add a templated interval list implementation Stores binary intervals in a sorted vector container, to be used for SMC handling.	2023-09-22 10:12:40 -07:00
Billy Laws	c978fdd12f	WOW64: Implement basic code invalidation handling	2023-09-22 10:12:40 -07:00
Billy Laws	19713bd20a	WOW64: Implement exception handling with context restoration When an exception occurs, pretend that we were just at the point of JIT entry so the stack can be unwound to the wow64 SEH handler, which then handles dispatching the exception to the x86 guest with the restored context.	2023-09-22 10:12:40 -07:00
Billy Laws	22b1fea96d	WOW64: Handle unaligned atomic accesses This is done in EnsureConsistentState rather than as a VEH to avoid needing to go through all of wine's exception handling logic for such a hot path.	2023-09-22 10:12:40 -07:00
Billy Laws	be4fcaf65c	WOW64: Report CPU features based off of the emulated cpuid	2023-09-22 10:12:40 -07:00
Billy Laws	2add8a7751	Windows: Introduce a barebones FEXCore-based WOW64 BT module This allows for running x86 applications under wine without having to run all of wine under FEX. The JIT is invoked when running application code and then left when handling NT syscalls or unix calls to e.g. the Vulkan driver.	2023-09-22 10:12:40 -07:00
Billy Laws	9612133088	Windows: Generate import libraries for private ntdll and wow64 APIs The MinGW supplied import libraries are incomplete and miss a lot of functions necessary to implement lower level windows code. To avoid needing to manually resolve every function, pull in .def files from wine that detail the entire ntdll and wow64 APIs.	2023-09-22 10:12:40 -07:00
Billy Laws	f46fd42977	Windows: Add a minimal set of wine-derived headers These are cut down versions of wine headers containing only what is necessary for WOW. This shouldn't carry any license implications for FEX, as per the LGPLv3 license: ``` The object code form of an Application may incorporate material from a header file that is part of the Library. You may convey such object code under terms of your choice, provided that, if the incorporated material is not limited to numerical parameters, data structure layouts and accessors, or small macros, inline functions and templates (ten or fewer lines in length), you do both of the following: a) Give prominent notice with each copy of the object code that the Library is used in it and that the Library and its use are covered by this License. b) Accompany the object code with a copy of the GNU GPL and this license document. ```	2023-09-22 10:12:40 -07:00
Billy Laws	51f8c83c76	Context: Add an alternative thread-oriented execute function	2023-09-22 10:12:40 -07:00
Billy Laws	d641d3f61e	OpcodeDispatcher: Avoid redundantly passing args to WIN32 ABI syscalls	2023-09-22 10:12:39 -07:00
Ryan Houdek	02ae59a348	github: Disables default build test on x64	2023-09-21 18:30:03 -07:00
Ryan Houdek	64df9e31c6	github: Remove mingw tests from x86 CI	2023-09-21 18:30:03 -07:00
Ryan Houdek	d32bb993a8	github: Remove glibc fault tests from x86 CI	2023-09-21 18:30:03 -07:00
Ryan Houdek	b5cc9a12f2	FEXCore: Removes x86 JIT. This is blocking performance improvements. This backend is almost unilaterally unused except for when I'm testing if games run on Radeon video drivers. Hopefully AmpereOne and Orin/Grace can fulfill this role when they launch next year.	2023-09-21 18:30:02 -07:00
Ryan Houdek	65b6df9dbb	Merge pull request #3133 from Sonicadvance1/remove_vestigial_interpreter FEXCore: Removes vestigial Interpreter code	2023-09-21 18:15:32 -07:00
Ryan Houdek	31564354b1	FEXCore: Removes vestigial Interpreter code	2023-09-21 15:49:49 -07:00
Ryan Houdek	fea72ce19c	Merge pull request #3120 from Sonicadvance1/more_optimal_x87 FEXCore: Support preserve_all ABI for interpreter fallbacks	2023-09-21 15:35:37 -07:00
Ryan Houdek	2b7e1d10ec	Merge pull request #3131 from Sonicadvance1/optimize_btr OpcodeDispatcher: Optimize lock btr	2023-09-21 15:06:55 -07:00
Ryan Houdek	5444810d64	Merge pull request #3132 from alyssarosenzweig/opt/orlshl Optimize reconstructing x87, harder	2023-09-21 15:02:37 -07:00
Ryan Houdek	4a2ceabfdd	InstCountCI: Add atomic bit test instructions These all can likely be more optimal.	2023-09-21 14:54:51 -07:00
Ryan Houdek	1a4d1d820b	OpcodeDispatcher: Optimize lock btr This is an atomicFetchCLR, removes two mvn instructions that are back to back negating the source. We didn't have this instruction combination in InstCountCI so will be a bit hard to see.	2023-09-21 14:54:51 -07:00
Ryan Houdek	0ae4bbb9c5	IR: Implements support for AtomicFetchCLR This is the native ARM operation rather than fetchAnd. Will make an instruction slightly more optimal.	2023-09-21 14:54:51 -07:00
Ryan Houdek	7d99eb05c6	Merge pull request #3128 from alyssarosenzweig/rm/interp FEXCore: Gut interpreter	2023-09-21 14:51:44 -07:00
Alyssa Rosenzweig	8247ded2cf	unittests: Remove stale comments Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-21 12:48:12 -04:00
Alyssa Rosenzweig	c52741c813	FEXCore: Gut interpreter It is scarcely used today, and like the x86 jit, it is a significant maintenance burden complicating work on FEXCore and arm64 optimization. Remove it, bringing us down to 2 backends. 1 down, 1 to go. Some interpreter scaffolding remains for x87 fallbacks. That is not a problem here. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-21 12:48:12 -04:00
Alyssa Rosenzweig	75ffbc16f2	InstCountCI: Update Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-21 09:14:08 -04:00
Alyssa Rosenzweig	1596e33f58	OpcodeDispatcher: Remove pointless or Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-21 09:13:41 -04:00
Alyssa Rosenzweig	07d03f1610	OpcodeDispatcher: Don't opencode bfe, badly Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-09-21 09:13:41 -04:00
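
The reasoning in commit b1231c24ef (omitting the AF xor for common constants) can be checked with a small model. This is only a sketch, not FEX code: it assumes the usual x86 definition of AF as bit 4 of `a ^ b ^ result`, and the function names `af_full`, `af_skip_xor` and `check` are hypothetical. It verifies exhaustively for 8-bit values that the XOR with `b` has no effect on AF whenever `b` has bit 4 clear, as with the constant 1 used by inc/dec.

```python
def af_full(a, b, result):
    # Assumed x86 definition: AF is bit 4 of (a ^ b ^ result).
    return ((a ^ b ^ result) >> 4) & 1


def af_skip_xor(a, result):
    # Same computation with the XOR of b omitted.
    return ((a ^ result) >> 4) & 1


def check(width_bits=8):
    # Exhaustively compare both forms for every a and every b with bit 4 clear,
    # for both an addition and a subtraction result.
    mask = (1 << width_bits) - 1
    for a in range(mask + 1):
        for b in range(mask + 1):
            if b & 0x10:
                continue  # only operands known to have bit 4 clear qualify
            for result in ((a + b) & mask, (a - b) & mask):
                if af_full(a, b, result) != af_skip_xor(a, result):
                    return False
    return True


if __name__ == "__main__":
    print(check())
```

The check passes because XOR acts on each bit independently: a zero in bit 4 of `b` leaves bit 4 of the result untouched, so the extra instruction contributes nothing to AF.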

7867 Commits
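
Commits 0ae4bbb9c5 and 1a4d1d820b describe replacing a fetch-and with a fetch-clear for `lock btr`. The sketch below models that idea only; it is not atomic and is not FEX's implementation. The helpers `fetch_and`, `fetch_clr` and `lock_btr` are hypothetical names, and memory is modeled as a plain Python list. It shows why a fetch-clear with the mask gives the same result as a fetch-and with the inverted mask, which is what makes the two back-to-back inversions redundant.

```python
def fetch_and(memory, index, operand):
    # Model of fetch-and: store old & operand, return the old value.
    old = memory[index]
    memory[index] = old & operand
    return old


def fetch_clr(memory, index, mask):
    # Model of fetch-clear: clear the bits set in mask, return the old value.
    old = memory[index]
    memory[index] = old & ~mask
    return old


def lock_btr(memory, index, bit):
    # Model of btr on a memory word: clear one bit and report its old value.
    # Assumes 0 <= bit < word width; real atomicity is not modeled here.
    mask = 1 << bit
    old = fetch_clr(memory, index, mask)
    return (old >> bit) & 1


if __name__ == "__main__":
    word = [0b1010]
    print(lock_btr(word, 0, 1), bin(word[0]))
```

Expressing the operation as a clear means the mask `1 << bit` can be passed through directly, with no inversion needed on either side.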