FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2024-12-15 09:59:28 +00:00

Author	SHA1	Message	Date
Ryan Houdek	d4be2dc636	Merge pull request #3434 from bylaws/arm64ec-pt3 FEXCore: Expose AbsoluteLoopTopAddress to the frontend	2024-02-21 14:31:04 -08:00
Alyssa Rosenzweig	2bcd285851	Merge pull request #3430 from Sonicadvance1/tsc_scale Implement small TSC scaling	2024-02-21 13:16:27 -04:00
Alyssa Rosenzweig	8762bc1fa3	OpcodeDispatcher: simplify CalculateAF signature - Res is unused - SrcSize doesn't matter since we ignore the high bits, might as well always use 32-bit, it doesn't matter Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-21 12:48:15 -04:00
Billy Laws	5b4162b712	FEXCore: Expose AbsoluteLoopTopAddress to the frontend ARM64EC has a shared SRA mapping between ARM64 and X64 code, so there needs to be a public way to enter the dispatcher without refilling SRA from the in-memory context struct.	2024-02-21 11:46:24 +00:00
Billy Laws	cb5c07f4b1	Arm64Emitter: Introduce ARM64EC SRA mappings See https://learn.microsoft.com/en-us/cpp/build/arm64ec-windows-abi-conventions?view=msvc-170 note that since mm registers are volatile there is no need to match the mapping for them when in JIT, so they can be used as scratch regs. Disallowed regs are also wiped on context switches, so they cannot be taken advantage of to e.g. avoid spilling.	2024-02-21 11:18:10 +00:00
Ryan Houdek	b902b8edab	Implement small TSC scaling Game engines expect >1GHz cycle counters. Scale them to work around the issue. Resolves the excessive busy waiting in Unreal Engine 5 games.	2024-02-20 12:05:44 -08:00
Alyssa Rosenzweig	0503c89ff6	OpcodeDispatcher: use NZCV update helpers Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-19 14:12:54 -04:00
Alyssa Rosenzweig	6dd410698a	OpcodeDispatcher: add helpers for updating NZCV metadata to reduce error-prone copypaste Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-19 14:12:54 -04:00
Ryan Houdek	808ced455d	FEXCore: Add a frontend pointer to InternalThreadState FEXCore is guaranteed to not touch this pointer and can be used by frontends to store thread-specific data.	2024-02-15 02:06:16 -08:00
Ryan Houdek	9cab746aa7	Merge pull request #3407 from neobrain/feature_libfwd_arguments_on_guest_stack Library Forwarding: Allocate packed arguments on the guest stack if needed	2024-02-12 16:31:34 -08:00
Alyssa Rosenzweig	68232366e4	OpcodeDispatcher: don't mask add/sub sources not needed in the new approach Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-12 12:36:28 -04:00
Alyssa Rosenzweig	d7ff1b78fb	IR: handle 8/16-bit AddNZCV/SubNZCV we can do it more effectively than the current s/w lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-12 12:36:09 -04:00
Mai	780b48620b	Merge pull request #3420 from Sonicadvance1/preserve_all_3419 Fix #3419	2024-02-10 23:24:38 -05:00
Ryan Houdek	4a0878fa92	Fix #3419	2024-02-10 19:55:51 -08:00
Ryan Houdek	df3d6938ae	Merge pull request #3410 from alyssarosenzweig/opt/nzcv-pass-2 Add NZCV+PF/AF optimization pass	2024-02-10 05:03:12 -08:00
Ryan Houdek	ba41da7da0	Merge pull request #3414 from Sonicadvance1/fix_one_mutex_hang Fixes one mutex hang	2024-02-09 05:54:40 -08:00
Ryan Houdek	2480bab409	Fixes one mutex hang When code invalidation is happening we currently have the issue that a thread can acquire the code invalidation mutex in the middle of invalidation. This is due to us acquiring and releasing the mutex between each thread's code invalidation. We need to hold the mutex for the entire duration of all threads' code invalidation. This fixes a rare hang on Proton startup and resolves a consistent hang on Proton application shutdown. This now puts us on par with FEX-2312.1 with hanging. This does not fix a relatively rare hang on fork (which also existed with FEX-2312.1). This also does not fix the issue that the intersection of our mutexes between frontend and backend is very convoluted. Part of the work that is going to fix the rare fork mutex hang will change more of this.	2024-02-08 18:18:00 -08:00
Alyssa Rosenzweig	ad7202e7d7	OpcodeDispatcher: optimize test -1 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-08 14:10:13 -04:00
Alyssa Rosenzweig	175a57dd27	OpcodeDispatcher: emit AndWithFlags directly for primary alu Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 14:22:28 -04:00
Alyssa Rosenzweig	e2ce60148c	OpcodeDispatcher: emit AndWithFlags directly for 2ndary alu rely on opt pass to drop the flags. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 14:22:28 -04:00
Alyssa Rosenzweig	99660129f3	IR: implement AndWithFlags for 8/16-bit easier to deal with in the JIT Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 14:22:28 -04:00
Alyssa Rosenzweig	308d9a751c	RedundantFlagCalculationElimination: optimize rmif Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:26 -04:00
Alyssa Rosenzweig	4bd28c0ed8	RedundantFlagCalculationElimination: optimize condaddnzcv Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:26 -04:00
Alyssa Rosenzweig	8397f3ac99	RedundantFlagCalculationElimination: refine AXFLAG Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:26 -04:00
Alyssa Rosenzweig	0452bc7212	RedundantFlagCalculationElimination: optimize condjump Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:26 -04:00
Alyssa Rosenzweig	3d7ed89ffb	RedundantFlagCalculationElimination: optimize NZCVSelect Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:25 -04:00
Alyssa Rosenzweig	23ab0a978e	RedundantFlagCalculationElimination: also handle InvalidateFlags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:25 -04:00
Alyssa Rosenzweig	7f47a9ef0e	IR: add local dead flag elimination pass RCLSE ignores NZCV and doesn't optimize stores which doesn't help us with PF/AF either. So, we add a new pass for dead flag elimination (cannibalizing the old and broken dead flag elimination pass). This is a simple local optimizer that walks each block backwards, converging in linear time & constant space in a single iteration. Right now, it doesn't do a ton (other than a nice reduction in silliness in the hot Sonic block), but it provides the framework to fuse comparisons. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-06 13:06:25 -04:00
Alyssa Rosenzweig	4331753ca0	Merge pull request #3408 from alyssarosenzweig/opt/tst Optimize TST	2024-02-06 11:28:02 -04:00
Paulo Matos	fa8bcfd67a	Clean up access to possible nullptr Patch suggested by @Sonicadvance1	2024-02-06 12:31:38 +00:00
Tony Wasserka	a1343e9296	Revert "Add cmake option DISABLE_CLANG_PRESERVE_ALL"	2024-02-05 22:31:45 +01:00
Alyssa Rosenzweig	235f32ce8c	Merge pull request #3401 from Sonicadvance1/runtime_preserve_all HostFeatures: Supports runtime disabling of preserve_all	2024-02-05 15:34:46 -04:00
Alyssa Rosenzweig	2e0cb2fbd4	OpcodeDispatcher: optimize TST it's just an AndWithFlags setting the PF. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-05 15:32:21 -04:00
Alyssa Rosenzweig	4790a7ba79	IR: add AndWithFlags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-05 15:32:21 -04:00
Tony Wasserka	df3e51fc8c	Library Forwarding: Allocate packed arguments on the guest stack if needed This is required for host-side calls to guest functions on 32-bit guests. Since the host stack is allocated before FEX blocks memory inaccessible to the guest, the guest would otherwise fail to read the packed argument data.	2024-02-05 18:10:34 +01:00
Ryan Houdek	0139498072	SpinWaitLock: Removes unused variable in spin-loop fallback Tmp was no longer being used, forgot to remove it.	2024-02-05 07:22:52 -08:00
Ryan Houdek	472a701e2b	Merge pull request #3403 from Sonicadvance1/fix_spinlock_contended_lock SpinLockWait: Fixes unexpected lock success	2024-02-05 06:51:42 -08:00
Ryan Houdek	cce6011205	SpinLockWait: Fixes unexpected lock success With a contended unique lock, we forgot to reset the `Expected` value to zero. This was causing a contended mutex to incorrectly succeed. Noticed this when converting some pthread mutexes over to spinloops to remove strace noise. The reference wfe_mutex library I wrote didn't have this problem since the implementation is slightly different.	2024-02-03 01:10:57 -08:00
Ryan Houdek	c437129ed8	Revert "Revert "FEXLoader: Moves thread management to the frontend"" This reverts commit `5358af7794`.	2024-02-03 00:57:36 -08:00
Alyssa Rosenzweig	8d3f0b6f02	OpcodeDispatcher: reassociate and sink W in sha1 We only need each part of W extracted in the corresponding round, so sink the extract into the round to reduce pressure. Further, W and E are added and then never used again. So, by reassociating we can do the add upfront, killing W and E at the start and further reducing pressure. Eliminates spilling in sha1rnds4. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	60f7b9bcc4	OpcodeDispatcher: optimize sha1's 2/3 expr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	a487557173	OpcodeDispatcher: extract BitwiseAtLeastTwo Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	394b4888bb	OpcodeDispatcher: reassociate and remat C0, G0 costs 2 moves and eliminates the rest of our spilling Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	142cbdd852	OpcodeDispatcher: expand, reassociate, and interleave sha256 calc Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	2f9102f78d	OpcodeDispatcher: expand & interleave sha256 calc Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	c9824d04cb	OpcodeDispatcher: sink sha256 extracts Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	9c2a569539	OpcodeDispatcher: reexpress Major in sha256 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	515aa4ce3e	OpcodeDispatcher: fuse eor+ror in sha256 This reduces instructions a ton. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	f616beb992	OpcodeDispatcher: CSE sha Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	0dcf1e12b8	OpcodeDispatcher: copyprop sha logic prepare for cleverness Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Alyssa Rosenzweig	2cbf544ef5	OpcodeDispatcher: expand sha logic no functional change, just preparing for cleverness. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-02-02 13:03:07 -04:00
Ryan Houdek	0eed73beeb	HostFeatures: Supports runtime disabling of preserve_all This is used for instcountci to ensure instruction counts don't change when a compiler supports this feature or not. Always runtime disable when running in instcountci. CMake option from #3394 can still be useful so leaving that in place.	2024-02-02 08:59:04 -08:00
Mai	6993f4fd8d	Merge pull request #3400 from Sonicadvance1/revert_runtime_longmode_switch Revert #3303	2024-02-02 11:53:44 -05:00
Mai	920a8db369	Merge pull request #3397 from pmatos/XCHGOp Improve XCHG operations	2024-02-02 11:53:22 -05:00
Paulo Matos	4623544f69	Improve XCHG operations Marking loads as allowing upper garbage simplifies some operations. Update InstCountCI as well.	2024-02-02 08:16:13 +00:00
Ryan Houdek	ccf1402fe6	Revert "FEXCore: Accurately store segment descriptors" This reverts commit `8648fb1485`.	2024-02-01 18:14:30 -08:00
Ryan Houdek	da0e1b515a	Revert "OpcodeDispatcher: Initial support for runtime long-mode switch" This reverts commit `9e5d7aa5fe`.	2024-02-01 18:14:24 -08:00
Ryan Houdek	690cb6fa48	Config: Fixes JSON parsing of "ArgumentHandler" types When `4d109c9ce0` fixed parsing strenum types in the json, it also added `ArgumentHandler` types to the json parsing. This was incorrect as those types are already stored in the json in their decoded numerical format. Without this change, all config options with `ArgumentHandler` will decode as "0" which is incorrect. The main killer here is that SMCChecks gets disabled (visible in both FEXConfig and when applications are running) which was causing spurious failures.	2024-02-01 16:20:57 -08:00
Ryan Houdek	cec1814a09	Merge pull request #3384 from pmatos/CDQOp-Opt Optimize CDQOp	2024-01-31 17:51:23 -08:00
Mai	4d49ac7c3d	Merge pull request #3387 from alyssarosenzweig/opt/rotates Optimize rotates	2024-01-31 18:20:40 -05:00
Alyssa Rosenzweig	6d13d9fb56	Merge pull request #3395 from pmatos/StaticAnalysis Code cleanup - mainly dead store removal; NFC	2024-01-31 17:24:48 -04:00
Mai	ae7dc250db	Merge pull request #3386 from alyssarosenzweig/opt/shift Optimize shifts a bit	2024-01-31 14:11:58 -05:00
Mai	f4086b25e6	Merge pull request #3385 from alyssarosenzweig/opt/bmi Optimize bit manipulation instructions	2024-01-31 14:07:23 -05:00
Paulo Matos	e4560ed0c8	Code cleanup - mainly dead store removal; NFC scan-build found a few dead stores that can be easily cleaned-up	2024-01-31 08:35:55 +00:00
Paulo Matos	6d58ea31b9	Add cmake option DISABLE_CLANG_PRESERVE_ALL Forces disabling use of __attribute__((preserve_all)). Until CI uses clang17, where this attribute was added, instcountci fails when FEX is compiled with clang>=17.	2024-01-31 08:29:20 +00:00
Alyssa Rosenzweig	f3eee8f305	OpcodeDispatcher: optimize bextr's length sanitize reordering the operations saves an immediate move. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	f66085f4a7	OpcodeDispatcher: optimize bextr's (1 << x) - 1 little algebraic trick I cribbed from llvm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	c9461d9997	OpcodeDispatcher: optimize BEXTR flag setting use native test. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	d5eb99fac8	OpcodeDispatcher: optimize popcount flags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	f3175848b1	OpcodeDispatcher: use lzcnt flag gen for tzcnt as far as flags go, they're identical: set ZF for zero output, set CF for output = DestSize, undef the rest. merge the impls, so we get the optimized lzcnt impl for tzcnt. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	3dd597a591	OpcodeDispatcher: optimize lzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	e8e05252f0	OpcodeDispatcher: optimize BLSI and explain why the suss thing we did before was actually right all along. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	93cef53ec0	OpcodeDispatcher: optimize blsr flags reorder to avoid nzcv clobber Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	3a19133267	OpcodeDispatcher: fix inverted BLSR carry Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	9b309b2102	OpcodeDispatcher: optimize blsmsk flags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	fe88b904c9	OpcodeDispatcher: fix missing SF set with blsmsk	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	2e63c6d547	OpcodeDispatcher: fix inverted CF with blsmsk CF set if SRC = 0 per https://www.felixcloutier.com/x86/blsmsk Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:28:06 -04:00
Alyssa Rosenzweig	0bc9e1a409	OpcodeDispatcher: clobber OF with shift immediate Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	338f12845d	OpcodeDispatcher: save a constant in shld one weird trick Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	b3ae81f75f	OpcodeDispatcher: allow garbage on shld shift Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	c1a1c37980	OpcodeDispatcher: mark ideas to improve SHLD a bit tricky right now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	fb6f850bb4	OpcodeDispatcher: remove rcl sub Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	b6d8749525	OpcodeDispatcher: remove select from rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	d3f1397325	OpcodeDispatcher: eliminate constants in RCR Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	0a164428fa	OpcodeDispatcher: eliminate select in RCR the nzcv clobber I actually came for Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	7496175100	OpcodeDispatcher: optimize 32-bit rcl/rcr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	0616a9cef1	OpcodeDispatcher: eliminate move in rcr 1-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	97f8775354	OpcodeDispatcher: optimize <32-bit rcr op1 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	c92099aa98	OpcodeDispatcher: fuse orlshl in rcr 1-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	7c288b09f1	OpcodeDispatcher: rmif mask rcl smaller OF Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	680af7b1b0	OpcodeDispatcher: rcr op 8x1 cleanup Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	349bc9efab	OpcodeDispatcher: unify rcr op 1bit codepaths get additional opt for <32-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	ad5c3cb268	OpcodeDispatcher: rmif mask for OF in rcr smaller Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	be8d37ef3d	OpcodeDispatcher: optimize 32-bit rol/ror imm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	6ad2514bfe	OpcodeDispatcher: rmif mask rcl smaller cf better on flagm. extra moves on non-flagm but, meh. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	3fa6129a14	OpcodeDispatcher: rmif mask rcr smaller cf and do some constant folding to do so more. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	a57cebaf58	OpcodeDispatcher: skip OF calc for constant rotate >= 2 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	34fdb14da1	OpcodeDispatcher: add and use AndConst this skips the constant folding, which saves the branching in the rotate immediate implementations. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	974baca09c	OpcodeDispatcher: allow upper garbage with rcl/rcr smaller we're masking immediately to something smaller Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	f22094a493	OpcodeDispatcher: use a branch for 8/16-bit rotate flags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	d979b3a1da	OpcodeDispatcher: note idea to further optimize rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	6d82c957fa	OpcodeDispatcher: fuse orlshl in rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Mai	fa3352004e	Merge pull request #3381 from alyssarosenzweig/opt/masking Allow upper garbage on a bunch of instructions	2024-01-30 10:07:53 -05:00
Ryan Houdek	ce2924731e	vixl/simulator: Enlarge simulator stack size Simulator stack size defaults to 8KB. This new unit test requires at least 15360 stack size. Just push it up to 8MB.	2024-01-29 19:48:38 -08:00
Ryan Houdek	bc67910ee4	Merge pull request #3382 from pmatos/TypoFix Fix typos; NFC	2024-01-29 16:10:14 -08:00
Mai	31a4158957	Merge pull request #3383 from alyssarosenzweig/opt/ptest Optimize PTEST and VTESTP	2024-01-29 13:30:53 -05:00
Mai	58f3d3caf5	Merge pull request #3380 from alyssarosenzweig/opt/pdep Optimize PDEP	2024-01-29 13:27:15 -05:00
Alyssa Rosenzweig	ae48228943	OpcodeDispatcher: optimize vtestps/vtestpd I don't really care about AVX but do the same thing we did for vptest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:24:11 -04:00
Alyssa Rosenzweig	e8e35e48c7	OpcodeDispatcher: optimize ptest with tst Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8b8f27a88f	OpcodeDispatcher: optimize ptest with umaxv to check if the vector is zero, umaxv its elements and check if the reduced scalar is zero. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8e7906a665	IR: add UMaxV will be used to accelerate ptest Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:19:22 -04:00
Paulo Matos	027fbbf051	Optimize CDQOp	2024-01-29 17:18:02 +00:00
Paulo Matos	ca31a0404c	ConstProp should generate 32bit constants when required	2024-01-29 17:15:47 +00:00
Paulo Matos	f644959c7c	Fixing some typos; NFC	2024-01-29 17:14:53 +00:00
Alyssa Rosenzweig	16a54742e6	OpcodeDispatcher: optimize 32-bit tzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad9aa0bc87	OpcodeDispatcher: optimize 32-bit lzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	bd2b3f35a3	OpcodeDispatcher: optimize 32-bit popcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	50169ce640	OpcodeDispatcher: optimize 32-bit pext Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	baae2d68f9	OpcodeDispatcher: optimize 32-bit bextr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad8d038b8a	OpcodeDispatcher: optimize 32-bit blsi Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	820932e3c7	OpcodeDispatcher: optimize 32-bit blsmsk Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	6f11f2e6f4	OpcodeDispatcher: optimize 32-bit blsr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	f5ad7682c3	OpcodeDispatcher: optimize 32-bit pdep Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Alyssa Rosenzweig	04805f351b	JIT: rewrite pdep implementation - use better algorithm that is O(# set bits) instead of O(# total bits) - eliminate spilling by careful management of our temporaries - fix nzcv clobber bug (whoops) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Mai	750b0b70bc	Merge pull request #3356 from Sonicadvance1/modify_code_lock Jitarm64: Implements spin-loop futex for JIT blocks	2024-01-23 13:46:59 -05:00
Ryan Houdek	56d8080ec9	Merge pull request #3345 from Sonicadvance1/fix_syscall_registers OpcodeDispatcher: Fixes syscall rcx/r11 generation	2024-01-22 15:21:13 -08:00
Ryan Houdek	c0be974272	Merge pull request #3368 from bylaws/preprcr FEXCore: Fix RCL/RCR shift wraparound behaviour	2024-01-21 13:44:49 -08:00
Billy Laws	e323938173	FEXCore: Fix RCL/RCR shift wraparound behaviour This ends up being cleaner to handle outside of CalculateFlags_ShiftVariable as constant masking is only needed for RCL/RCR.	2024-01-21 18:15:50 +00:00
Billy Laws	407e26bfee	FEXCore: Use TMP1-4 for values that need preserving across spills The ARM64EC SRA layout will use x0-3 for x86_64 registers, as such any arguments passed to C ABI functions need to proxy their arguments through the temporaries and move as appropriate.	2024-01-21 16:21:13 +00:00
Ryan Houdek	a6c57f71e9	SpinWaitLock: Fixes potential extra wait that would occur on contended lock We had a chance of doing an additional bogus wfe if the expected value was hit in one iteration of a loop. Not the biggest problem on current hardware where WFE only ever sleeps for 1-4 system cycles, but on future hardware where WFE might actually sleep for longer this could have been an issue.	2024-01-17 10:41:16 -08:00
Ryan Houdek	2af7e997f4	Spinlocks: Fix assembly Need to have a source be +r so it doesn't get overwritten.	2024-01-17 10:19:38 -08:00
Ryan Houdek	ab6c00bbcf	FEXCore/Utils: Rename FutexSpinWait to SpinWaitLock	2024-01-17 10:19:38 -08:00
Ryan Houdek	e18453cb57	Jitarm64: Implements spin-loop futex for JIT blocks This will ensure that multiple concurrent SIGBUS handlers in the same code block don't modify the same code.	2024-01-17 10:19:38 -08:00
Ryan Houdek	39f49782da	Arm64: Move ParanoidTSO checks up out of the non-paranoid code path	2024-01-17 10:19:38 -08:00
Ryan Houdek	2c5dd20f3c	FutexSpinWait: Implement spin-loop Unique mutex.	2024-01-17 10:19:38 -08:00
Ryan Houdek	136fa78825	FEXCore: Implements an efficient spin-loop API This will only be used internally inside of FEXCore for efficient shared codecache backpatch spin-loops.	2024-01-17 10:19:38 -08:00
Ryan Houdek	f956f008ea	Merge pull request #3372 from alyssarosenzweig/opt/cmpxchg-review Optimize GPR cmpxchg	2024-01-15 05:11:12 -08:00
Ryan Houdek	1f7a619c79	OpcodeDispatcher: Fixes syscall rcx/r11 generation Noticed this while writing #3342. Fixes #3343. The syscall instruction is documented to set RCX to the next instruction's RIP and R11 to RFLAGS. We entirely skipped this, which I noticed while writing unit tests. Adds unittests to test both 32-bit and 64-bit behaviour because our helper shares code with both. I don't know if anything actually relied on this behaviour but we should definitely support it.	2024-01-12 19:14:30 -08:00
Alyssa Rosenzweig	58127bd0e8	OpcodeDispatcher: optimize trivial cmpxchgs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-12 12:23:34 -04:00
Alyssa Rosenzweig	e8945dfb6d	OpcodeDispatcher: optimize gpr cmpxchg NZCV stuff. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-12 12:03:28 -04:00
Ryan Houdek	8c3163096b	Merge pull request #3363 from Sonicadvance1/fix_label_allocations ArmEmitter: Support single use forward labels	2024-01-12 00:26:31 -08:00
Ryan Houdek	615cfe0246	Merge pull request #3361 from Sonicadvance1/decompose_std_function FEXCore: Decompose some std::function usage to regular pointers	2024-01-10 16:55:29 -08:00
Ryan Houdek	3d5f876585	Fixes some new glibc allocations that cropped up I guess this was handled by brk things before.	2024-01-09 13:55:04 -08:00
Ryan Houdek	37102400b5	Arm64: Switches uses of forward label over to SingleUse if possible Primary goal for this is to ensure that the delinker doesn't need to allocate any memory. This delinker can end up getting hit heavily with JIT code so we don't want it to be allocating memory.	2024-01-08 22:18:20 -08:00
Ryan Houdek	c01e6283ae	CodeEmitter: Support a single use forward label Currently all uses of the forward label calls in to jemalloc to allocate memory. This allows a forward label that doesn't require any memory allocation, which is the common case in FEX.	2024-01-08 22:18:20 -08:00
Ryan Houdek	248dc97993	FEXCore: Decompose some std::function usage to regular pointers The delinker step of the JIT was using std::function with capture lambdas that required memory allocation when unnecessary. Because the compiler can't see through our std::function usage it could never decompose these by itself. By passing the Thread's frame and record to the function as arguments then we can have the signature be a raw function pointer. This fixes an area of concern from: https://github.com/FEX-Emu/FEX/blob/main/docs/ProgrammingConcerns.md#stdfunction-and-lambdas	2024-01-06 19:39:54 -08:00
Ryan Houdek	d488592eda	Merge pull request #3339 from Sonicadvance1/pass_thread_unaligned_fault_handler FEXCore: Pass thread object to HandleUnalignedAccess	2024-01-04 18:20:37 -08:00
Ryan Houdek	743df8dfae	Merge pull request #3327 from Sonicadvance1/remove_syscall_indirection Arm64: Removes a vtable indirection in syscalls	2024-01-04 18:19:40 -08:00
Ryan Houdek	4b3792196f	Merge pull request #3303 from Sonicadvance1/initial_runtime_longmode_switch OpcodeDispatcher: Initial support for runtime long-mode switch	2024-01-04 18:17:54 -08:00
Ryan Houdek	db7d7a6bd7	Merge pull request #3349 from Sonicadvance1/revert_frontend_ownership Revert "FEXLoader: Moves thread management to the frontend"	2024-01-03 14:25:04 -08:00
Alyssa Rosenzweig	04a88ed3ab	Merge pull request #3353 from Sonicadvance1/public_interface_cleaning FEXCore interface cleaning	2024-01-03 15:14:54 -04:00
Alyssa Rosenzweig	9da08b40bd	Merge pull request #3344 from Sonicadvance1/xbyak_upstream Externals: Update xbyak to v7.02 and switch away from fork	2024-01-03 15:13:58 -04:00
Alyssa Rosenzweig	5467c3e478	Merge pull request #3357 from Sonicadvance1/remove_non_sra FEXCore: Removes SRA option, it's now permanently enabled	2024-01-03 15:10:04 -04:00
wannacu	4e7bab849c	JIT: Fixes broken register in VTBX1 If the Dst register is allocated as VectorIndices or VectorTable, using Dst as an operand to perform the tbx operation will result in an error. For example: %131(FPR0) i128 = LoadNamedVectorIndexedConstant u8:Tmp:RegisterSize, #0x6, #0xaa0 %132(FPR0) i128 = VTBX1 u8:Tmp:RegisterSize, %129(FPRFixed6) i32v4, %126(FPRFixed10) i16v8, %131(FPR0) i128 Since the tbx instruction's destination register is also the original operand, this is consistent with the semantics of VTBX1. Therefore, directly using VectorSrcDst as the destination operand for the tbx instruction is safe.	2023-12-29 16:18:40 +08:00
Ryan Houdek	d098545c20	FEXCore: Removes SRA option, it's now permanently enabled	2023-12-28 18:28:02 -08:00
Ryan Houdek	5358af7794	Revert "FEXLoader: Moves thread management to the frontend" This reverts commit `58f2693954`.	2023-12-27 04:33:50 -08:00
Ryan Houdek	25bcddf3a5	FEXCore: Removes context wide and map lookup While locking a shared_lock and doing an empty table lookup is fairly fast, just remove them from the hot path entirely if no custom IR handlers are installed. This is only used for our IRLoader, which is losing its importance significantly and should probably be removed anyway.	2023-12-26 11:11:44 -08:00
Ryan Houdek	f785b38e4d	Merge pull request #3352 from Sonicadvance1/remove_irloader Removes IRLoader, unittests, and public interface	2023-12-26 11:08:26 -08:00
Ryan Houdek	b115c144fb	FEXCore: Removes NetStream from public API Only used by GDBServer. NFC.	2023-12-25 07:07:17 -08:00
Ryan Houdek	d8f20751fe	FEXCore: Moves IREmitter from the public API to backend No functional change	2023-12-25 07:00:29 -08:00
Ryan Houdek	1977747fc2	Removes IRLoader, unittests, and public interface This unit test hasn't really served any purpose for a while now and mostly just causes pain when reworking things in the IR. Just remove the IRLoader, its unit tests, the github action steps and the public FEXCore interface to it. Since it isn't used by anything other than Thunks. Also moves some IR definitions from the public API to the backend.	2023-12-25 07:00:29 -08:00
Ryan Houdek	257016bf12	FEXCore: Moves BucketList out of public API NFC	2023-12-25 06:58:22 -08:00
Ryan Houdek	69d65fba4a	FEXCore: Removes unused SyscallVisitor This was expected to be part of the syscall optimizations we did but ended up getting manifested in a different way. Remove it.	2023-12-25 06:42:11 -08:00
Ryan Houdek	bce694ebb5	FEXCore: Moves BitUtils to FHU No functional change	2023-12-25 06:38:51 -08:00
Ryan Houdek	5d37d5db1a	FEXCore: Optimize HostFeatures and CPUID feature calculation Need #3348 merged first. As I was casually thinking, this code made me realize that it was quite branch-heavy and could likely be optimized to logic. The previous code generated some fairly nasty branch-heavy code. This can be optimized to be branchless and take roughly five instructions per flag. Using a bitfield for each feature would turn each calculation into 3-4 instructions but that seems overkill. Very minor thing.	2023-12-25 04:58:15 -08:00
Ryan Houdek	4d109c9ce0	Config: Fixes parsing strenum inside of json files This wasn't wired up before.	2023-12-23 22:32:59 -08:00
Ryan Houdek	db9b326534	FEXCore: Support disabling CPUID features based on config Need to be able to disable sha by config.	2023-12-23 22:32:29 -08:00
Ryan Houdek	1c34b25538	FEX: Removes legacy kernel 32-bit allocator We only used this so that our Xavier CI systems, which were running old kernels, could run unit tests. We have now removed the Xaviers from CI and this is no longer necessary. Stop pretending that we support kernels older than 5.0 and allowing this fallback. The 32-bit allocator is still used for the MAP_32BIT mmap flag, so the load-bearing code can't be fully removed. Just remove the config and the frontend things using it.	2023-12-21 06:21:01 -08:00
Ryan Houdek	38ad3f0e05	FEXCore: Pass thread object to HandleUnalignedAccess Currently no functional change but public API breaks should come early. The thread state object will be used for looking up thread specific codebuffers in the future when we support MDWE with code mirrors.	2023-12-21 01:55:25 -08:00
Ryan Houdek	266f7feecb	Arm64: Removes a vtable indirection in syscalls We can safely call virtual functions through the JIT with a little bit of work. FEX's JIT has quite a few steps before it gets to a syscall handler. Before this commit: JIT->static HandleSyscall->SyscallHandler::HandleSyscall->SyscallHandler After this commit: JIT->SyscallHandler::HandleSyscall->SyscallHandler A bit hard to notice this when this interface can spin at 67-million calls per second though.	2023-12-21 01:55:02 -08:00
Ryan Houdek	f9902142f7	Utils: Add ability to get VTable entries to PMF helper This will be useful to remove an indirection.	2023-12-21 01:55:02 -08:00
Ryan Houdek	9e5d7aa5fe	OpcodeDispatcher: Initial support for runtime long-mode switch This has the Frontend and OpcodeDispatcher select their operating mode depending on the incoming code segment long-mode flag. Adds some asserts since currently it is unexpected if the configuration changes at runtime. This is fairly straightforward for an initial setup but isn't fully fleshed out. Right now FEX's x86 tables aren't setup in a way to support choosing a different instruction decoding depending on runtime operating mode change, so that would break in interesting ways. Primarily this just gets FEX setup to start piping the operating mode through from the frontend to the backend. This is a long term task, so it is going to take a long time to iron out all the issues.	2023-12-21 01:54:19 -08:00
Ryan Houdek	8648fb1485	FEXCore: Accurately store segment descriptors Previously we were only storing the 32-bit base address which isn't actually how segment descriptors work. In reality segment descriptors are 64-bit descriptors that are laid out in a particular layout depending on the 4-bit type value. In reality we only care about code and data segment layouts since the rest are bonkers. Describe these descriptors correctly and setup a default code descriptor for the operating mode that FEX is starting in.	2023-12-21 01:54:18 -08:00
Ryan Houdek	8b24f7fc26	Externals: Update xbyak to v7.02 and switch away from fork The last few patches we need have been upstreamed so we shouldn't need our downstream fork anymore.	2023-12-21 01:52:05 -08:00
Ryan Houdek	00669a1c89	Merge pull request #3336 from Sonicadvance1/warn_on_mdwe FEXCore: Warn if MDWE is set	2023-12-20 13:27:43 -08:00
Ryan Houdek	1cedc3d85a	FEXCore: Warn if MDWE is set This will result in FEX not being able to allocate executable memory. We can use shared memory in the future to work around this but for now we don't support that as a fix.	2023-12-20 13:19:28 -08:00
Ryan Houdek	58f2693954	FEXLoader: Moves thread management to the frontend Lots going on here. This moves OS thread object lifetime management and internal thread state lifetime management to the frontend. This causes a bunch of thread handling to move from the FEXCore Context to the frontend. Looking at `FEXCore/include/FEXCore/Core/Context.h` really shows how much of the API has moved to the frontend that FEXCore no longer needs to manage. Primarily this makes FEXCore itself no longer need to care about most of the management of the emulation state. A large amount of the behaviour moved wholesale from Core.cpp to LinuxEmulation's ThreadManager.cpp. This manages the lifetimes of both the OS threads and the FEXCore thread state objects. One feature lost was the instruction capability, but this was already buggy and is going to be rewritten/fixed when gdbserver work continues. Now that all of this management is moved to the frontend, the gdbserver can start improving since it can start managing all thread state directly.	2023-12-19 17:43:04 -08:00
Mai	b4b8e81f24	Merge pull request #3321 from Sonicadvance1/thread_frontend_ownership_take2 FEXCore: Changes ParentThread ownership from the CTX to the frontend, take 2	2023-12-19 20:37:59 -05:00
Ryan Houdek	a8f797d36b	Dispatcher: Convert GetCompileBlockPtr to using PMF helper This was older code that was written before the PMF helper was available. Switch it over.	2023-12-19 17:20:37 -08:00
Ryan Houdek	93ec676ce8	Merge pull request #3340 from Sonicadvance1/exitfunctionlink_data FEXCore: Describe exit function linking object with a structure	2023-12-19 16:17:21 -08:00
Mai	3d2cbc5d08	Merge pull request #3317 from Sonicadvance1/fix_imul_flags2 OpcodeDispatcher: Fixes flags generation in imul	2023-12-19 11:43:21 -05:00
Mai	81c85d73b2	Merge pull request #3330 from Sonicadvance1/optimize_sib_addr_calc OpcodeDispatcher: Optimize SIB addr calculation	2023-12-19 11:41:42 -05:00
Mai	5b4e9c6907	Merge pull request #3323 from Sonicadvance1/remove_unused_check Dispatcher: Removes unused asserting CompileBlock function	2023-12-19 11:38:57 -05:00
Ryan Houdek	aa2e8704bc	FEXCore: Changes ParentThread ownership from the CTX to the frontend, take 2 Similar to #3284 but works around some of the bugs that one introduced. This is the minimal amount of changes to move the ownership from FEXCore to the frontend. Since the frontends don't yet have a full thread state tracking, there is an opaque pointer that needs to be managed. In the followup commits this will be changed to have the syscall handler to be the thread object manager.	2023-12-18 14:54:07 -08:00
Ryan Houdek	cf86ae6b65	FEXCore: Describe exit function linking object with a structure Instead of just poking raw uint64_t data values, describe it with a struct. This will be read-only in the future.	2023-12-18 13:31:53 -08:00
Ryan Houdek	86654907bf	Merge pull request #3334 from Sonicadvance1/remove_old_x86jit_references FEXCore: Removes stale references to x86 JIT	2023-12-18 04:15:29 -08:00
Ryan Houdek	12b72f908b	Merge pull request #3335 from Sonicadvance1/remove_internalthreadstate_header FEXCore: Removes old InternalThreadState header	2023-12-18 04:14:57 -08:00
Ryan Houdek	6c8a54ff84	FEXCore: Removes old InternalThreadState header This was a temporary header to help with when this header was migrated to our public API headers. Its temporary nature is no longer necessary, just get rid of it.	2023-12-15 18:51:25 -08:00
Ryan Houdek	bcc2901d7f	FEXCore: Removes stale references to x86 JIT It doesn't exist anymore.	2023-12-15 18:46:57 -08:00
Ryan Houdek	358bbb51ff	CPUID: Removes Init and just uses constructor No need to wait on initialization for this anymore. Ever since Init was refactored to do basically no work, this hasn't been necessary. CPUID does need to still be initialized after HostFeatures though, so need to ensure correct member ordering there.	2023-12-15 18:43:23 -08:00
Ryan Houdek	1a2f41922c	OpcodeDispatcher: Optimize SIB addr calculation When the address calculation for SIB has both index and base then we can optimize this to an add with a shifted register. This will convert a three-instruction sequence into one instruction in most cases.	2023-12-15 13:08:46 -08:00
Ryan Houdek	0ede707e0b	IR: Adds support for AddShift IR op This matches x86 SIB's operation of `scale * index + base`	2023-12-15 13:08:06 -08:00
Ryan Houdek	12923ba1b7	Merge pull request #3322 from Sonicadvance1/remove_unused_exithandler PassManager: Removes unused exit handler	2023-12-14 01:55:06 -08:00
Ryan Houdek	e657a27607	Dispatcher: Removes unused asserting CompileBlock function While we were calling this function, its asserting nature hasn't been used for a long time. This used to trigger more frequently when CompileBlock would fail to compile code, either due to not being able to decode an instruction or hitting an instruction that FEX doesn't understand. When these cases are hit today we still generate code blocks which generate SIGILL. This means that this code was actually never hit. Completely remove this function and have the JIT's dispatcher call the CompileBlock function directly. Signature is slightly different since we need to set x3 to be 0.	2023-12-11 17:20:51 -08:00
Ryan Houdek	8bb5462554	PassManager: Removes unused exit handler git blame shows that `718b3e6b4c` added this handler. It doesn't explain why this was desired but it was never wired up to anything. Just remove it.	2023-12-11 17:14:41 -08:00
Ryan Houdek	98f21a2a28	X86Tables: Converts tables to be mostly consteval Reduces the ELF's VM size from 9.8MB down to 9.37MB and should reduce initialization time a smidge. Slammed this out while waiting for other PRs to get reviewed.	2023-12-11 10:03:52 -08:00
Ryan Houdek	5660065eea	FEXCore: Moves OS thread creation to the frontend Fairly lightweight since it is almost 1:1 transplanting the code from FEXCore into the SyscallHandler's thread creation code. Minor changes: - ExecutionThreadHandler gets freed before executing the thread - Saves 16 bytes of memory per thread - Start all threads paused by default - Since I moved the code to the frontend, I noticed we needed to do some post thread-creation setup. - Without the pause we were racing code execution with TLS setup and a few other things.	2023-12-11 06:22:50 -08:00
Ryan Houdek	7524029a06	Merge pull request #3294 from Sonicadvance1/mov_xid_check FEXCore: Moves XID check to the frontend	2023-12-07 01:17:45 -08:00
Ryan Houdek	5c6f229e76	OpcodeDispatcher: Fixes flags generation in imul On overflow with 32-bit we weren't setting the flags correctly.	2023-12-07 01:08:02 -08:00
Ryan Houdek	acdb4c7061	IR: Adds support for {S,U}Mull Lets us do a 32-bit multiply returning a 64-bit result, signed and unsigned.	2023-12-07 01:06:58 -08:00
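The last two entries above (the {S,U}Mull IR ops and the imul flags fix) describe a 32-bit multiply that returns a 64-bit result, and 32-bit overflow detection. The following is only a rough sketch of that arithmetic in plain C++, not FEX's actual IR or OpcodeDispatcher code. The helper names `smull`, `umull`, and `imul32_overflows` are hypothetical and chosen for illustration. The overflow rule assumed here is the documented x86 behaviour that imul sets CF and OF when the full product does not fit in the destination size.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; these are not FEX functions.

// Signed 32x32 -> 64 multiply: widen both operands first so the
// full product is kept instead of being truncated to 32 bits.
constexpr int64_t smull(int32_t a, int32_t b) {
  return static_cast<int64_t>(a) * static_cast<int64_t>(b);
}

// Unsigned 32x32 -> 64 multiply, same idea with zero-extension.
constexpr uint64_t umull(uint32_t a, uint32_t b) {
  return static_cast<uint64_t>(a) * static_cast<uint64_t>(b);
}

// Assumed x86 rule: a 32-bit imul sets CF and OF when the full signed
// product cannot be represented in 32 bits.
constexpr bool imul32_overflows(int32_t a, int32_t b) {
  const int64_t full = smull(a, b);
  return full < std::numeric_limits<int32_t>::min() ||
         full > std::numeric_limits<int32_t>::max();
}
```

With the widened product available, the overflow check reduces to a range comparison, which is presumably why having a widening multiply in the IR helps with flag generation; the commit messages themselves do not spell out the connection.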

1082 Commits