FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2025-02-21 23:31:24 +00:00

Author	SHA1	Message	Date
Ryan Houdek	cb56728e57	FEXCore/X86Tables: Removes unused supports REP flag This flag was being set in the tables but was actually unused.	2023-11-25 16:54:51 -08:00
Ryan Houdek	b89c3a4573	FEXCore: Removes x86 DebugInfo table This has long since been unused. Originally implemented for some fuzzing tests but has been abandoned and that should likely be implemented some other way.	2023-11-25 16:50:24 -08:00
Ryan Houdek	98f9a65202	Merge pull request #3275 from bylaws/x87fix FEXCore: Work around broken preserve_all support in Windows clang	2023-11-22 05:53:14 -08:00
Ryan Houdek	8726c8fb73	Merge pull request #2691 from neobrain/refactor_scoped_signal_mask ScopedSignalMask: Clean up API and use std::unique_lock/shared_lock	2023-11-19 04:53:05 -08:00
Tony Wasserka	d33b0cb9e3	SignalScopeGuards: Improve code gen for GuardSignalDeferringSectionWithFallback	2023-11-18 12:10:02 +01:00
Ryan Houdek	11993daec4	FEXCore: Hides eflags reconstruction information in the core The frontend shouldn't need to know any information about how to reconstruct eflags. Just give us the information we need and it'll work out. There are still some inherent limitations of this and some edge cases that might give invalid data, but it is roughly as close as it was before. Just provide if the PC was in the JIT, the host GPRs, and the PState object from the signal information and FEXCore does the rest. We don't need to change the signature for `SetFlagsFromCompactedEFLAGS` because reloading the register state does this for us automatically.	2023-11-17 20:38:42 -04:00
Billy Laws	05b78339f6	FEXCore: Work around broken preserve_all support in Windows clang While clang side code to support preserve_all on Windows is in place (and thus there are no errors for using it), there are still some parts missing on the LLVM side. [1] [1] `2402b14046/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp (L88)`	2023-11-17 22:13:39 +00:00
Alyssa Rosenzweig	f60608a9c0	OpcodeDispatcher: allow garbage for 32bit inc/dec Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	153d871be2	OpcodeDispatcher: allow garbage for 32-bit cmp Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	3dfb94b524	OpcodeDispatcher: use axflag for x87-f64 fcmp only 1 instr saved but meh. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	d1e43d94e9	OpcodeDispatcher: use axflag for fcmp faster Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	1b490e0e53	CodeEmitter: add ax/xaflag Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	094146d630	IR: add axflag Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	23c2a53683	OpcodeDispatcher: move fcmp flag fixup to dispatcher simpler and much faster Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	82b7689ca4	OpcodeDispatcher: remove fcmp deferral no longer load bearing, delete the abstraction. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	149f3e6f6d	OpcodeDispatcher: rm flagsOp unused since select rework Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Alyssa Rosenzweig	2dcae23776	Arm64Emitter: Dedicate registers for PF/AF Many flag-generating instructions like cmp need to save calculations for deferred PF and AF flag calculation. Currently, they require a store per flag, which is prohibitively expensive for hot instructions like cmp. By instead pinning PF/AF temporary results to registers (x26/x27 by convention here), we eliminate many stores altogether and turn the rest into zero-cycle moves (on 64-bit at least, this isn't optimal for 32-bit emulation due to CTX->GetGPRSize shenanigans, need to check if this requirement can be lifted..). To implement, we model as SRA and then the existing SRA code is able to generate good code with little manual tuning. (Future work will get us to excellent code with more tuning ;) ). The tradeoff is reducing the working dynamic GPR set by 2 registers, which might increase spilling in some cases. I think it's worth it in practice, though. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-17 17:37:24 -04:00
Tony Wasserka	92e4e75217	Merge DeferredSignalMutex.h and ScopedSignalMask.h into a single file Using a single file makes sense now that the individual files are much shorter and share common utility classes.	2023-11-17 10:56:34 +01:00
Tony Wasserka	5ca35bf77c	ForkableMutex: Simplify WIN32 implementation	2023-11-17 10:56:34 +01:00
Tony Wasserka	c956b82d27	ScopedSignalMask/DeferredSignalMutex: Clean up API and use std::unique_lock/shared_lock	2023-11-17 10:56:34 +01:00
Ryan Houdek	c1d5fae018	Merge pull request #3273 from alyssarosenzweig/opt/shifts Optimize shifts/rotates	2023-11-14 14:13:56 -08:00
Alyssa Rosenzweig	56841f0e50	OpcodeDispatcher: avoid moves with 64bit imul Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-14 08:40:41 -04:00
Alyssa Rosenzweig	723146050b	OpcodeDispatcher: allow garbage with multiplies Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-14 08:40:41 -04:00
Alyssa Rosenzweig	85b1aa4c2d	OpcodeDispatcher: optimize mul flags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-14 08:40:41 -04:00
Ryan Houdek	c69082b1a4	OpcodeDispatcher: Optimize three sha instructions - sha1nexte - Takes advantage of sha1h if supported - Does the operation in a vector otherwise - sha1msg2 - Instead of dumping everything to GPRs, we can do this with vectors - Mostly matches ARM's sha1su1 instruction, but it is /just/ different enough to be annoying. - sha256msg1 - Directly matches sha256u0 - Leaves the previous implementation alone	2023-11-13 18:38:02 -08:00
Ryan Houdek	74b2548982	IR: Implement support for VUSHRAI IR op This matches Arm64 usra semantics. This instruction is useful for implementing vector element rotate.	2023-11-13 18:22:45 -08:00
Ryan Houdek	f31656ec65	IR: Support sha1h and sha256u0 These match our needs so wire them up	2023-11-13 18:22:45 -08:00
Ryan Houdek	e91420c405	HostFeatures: fixup SHA checks - Simulator doesn't support SHA - Use DisableCrypto option to disable sha as well - Only enable SHA if ARM cpu supports both SHA1 and SHA2	2023-11-13 18:22:45 -08:00
Ryan Houdek	25df59a65d	ArmEmitter: Fixes sha256u1 emitter Noticed this was actually emitting sha256h2	2023-11-13 18:22:45 -08:00
Alyssa Rosenzweig	83fdd5720f	IR: Add ccmn Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 22:05:02 -04:00
Alyssa Rosenzweig	89b00c89aa	OpcodeDispatcher: optimize rcl 1-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	d38917b5f0	OpcodeDispatcher: rm pointless constant Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	910e0242c1	OpcodeDispatcher: optimize rcr 1-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	651b7bb75d	OpcodeDispatcher: optimize RCL Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	c9f13ae1dd	OpcodeDispatcher: optimize RCR the usual ways Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	862e575100	OpcodeDispatcher: avoid some ubfx for rcr with flagm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	4669c4541c	OpcodeDispatcher: don't zero for flagm ror missed earlier in the PR, would be annoying to rebase in. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	769a8c41c4	OpcodeDispatcher: Branch over shift=0 flags Not supposed to touch flags at all, so don't! Instead of making a terrible mess of csels. A lot fewer instructions, and probably faster because the branch should be predicted correctly in practice in hot loops. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	bec9dba2b1	OpcodeDispatcher: use shifted xor + rmif for rotates eliminates lots of Bfe on flagm. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	205ba2ea13	IR: add shifted xor for rotates. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	2073f6d287	OpcodeDispatcher: don't zero nzcv for flagm shifts Faster for flagm. would be slower for !flagm because bfi slowness... Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	0f25a960ee	OpcodeDispatcher: remove bfe for small shl imm We allow the garbage in flags calculation, it's ignored. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	282ed3e309	OpcodeDispatcher: optimize bsf/bsr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	cd031a7d38	OpcodeDispatcher: avoid some ubfx for flagm Do the masking as part of the rmif, for free. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	e1885ed0bd	OpcodeDispatcher: Use 64-bit ubfx for larger shifts. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-13 21:21:01 -04:00
Alyssa Rosenzweig	238e52f74a	OpcodeDispatcher: Don't mask 32-bit bzhi either Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-12 17:36:46 -04:00
Alyssa Rosenzweig	9398b931fb	Arm64Emitter: Handle 32-bit negatives Noticed in the area. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-12 17:36:42 -04:00
Alyssa Rosenzweig	b2a9785959	OpcodeDispatcher: optimize bzhi Trickery to save an instruction :') Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-12 17:32:24 -04:00
Alyssa Rosenzweig	224a1f19a3	OpcodeDispatcher: improve bzhi flag gen Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-12 17:32:24 -04:00
Alyssa Rosenzweig	e25849b2cb	OpcodeDispatcher: fix BZHI flag calculation needs SF. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-11-12 17:32:24 -04:00

682 Commits