FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2025-01-31 19:42:54 +00:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	ad5c3cb268	OpcodeDispatcher: rmif mask for OF in rcr smaller Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	be8d37ef3d	OpcodeDispatcher: optimize 32-bit rol/ror imm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	6ad2514bfe	OpcodeDispatcher: rmif mask rcl smaller cf better on flagm. extra moves on non-flagm but, meh. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	3fa6129a14	OpcodeDispatcher: rmif mask rcr smaller cf and do some constant folding to do so more. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	a57cebaf58	OpcodeDispatcher: skip OF calc for constant rotate >= 2 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	34fdb14da1	OpcodeDispatcher: add and use AndConst this skips the constant folding, which saves the branching in the rotate immediate implementations. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	974baca09c	OpcodeDispatcher: allow upper garbage with rcl/rcr smaller we're masking immediately to something smaller Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	f22094a493	OpcodeDispatcher: use a branch for 8/16-bit rotate flags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	d979b3a1da	OpcodeDispatcher: note idea to further optimize rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	6d82c957fa	OpcodeDispatcher: fuse orlshl in rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Mai	fa3352004e	Merge pull request #3381 from alyssarosenzweig/opt/masking Allow upper garbage on a bunch of instructions	2024-01-30 10:07:53 -05:00
Ryan Houdek	ce2924731e	vixl/simulator: Enlarge simulator stack size Simulator stack size defaults to 8KB. This new unit test requires at least 15360 stack size. Just push it up to 8MB.	2024-01-29 19:48:38 -08:00
Ryan Houdek	bc67910ee4	Merge pull request #3382 from pmatos/TypoFix Fix typos; NFC	2024-01-29 16:10:14 -08:00
Mai	31a4158957	Merge pull request #3383 from alyssarosenzweig/opt/ptest Optimize PTEST and VTESTP	2024-01-29 13:30:53 -05:00
Mai	58f3d3caf5	Merge pull request #3380 from alyssarosenzweig/opt/pdep Optimize PDEP	2024-01-29 13:27:15 -05:00
Alyssa Rosenzweig	ae48228943	OpcodeDispatcher: optimize vtestps/vtestpd I don't really care about AVX but do the same thing we did for vptest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:24:11 -04:00
Alyssa Rosenzweig	e8e35e48c7	OpcodeDispatcher: optimize ptest with tst Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8b8f27a88f	OpcodeDispatcher: optimize ptest with umaxv to check if the vector is zero, umaxv its elements and check if the reduced scalar is zero. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8e7906a665	IR: add UMaxV will be used to accelerate ptest Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:19:22 -04:00
Paulo Matos	f644959c7c	Fixing some typos; NFC	2024-01-29 17:14:53 +00:00
Alyssa Rosenzweig	16a54742e6	OpcodeDispatcher: optimize 32-bit tzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad9aa0bc87	OpcodeDispatcher: optimize 32-bit lzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	bd2b3f35a3	OpcodeDispatcher: optimize 32-bit popcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	50169ce640	OpcodeDispatcher: optimize 32-bit pext Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	baae2d68f9	OpcodeDispatcher: optimize 32-bit bextr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad8d038b8a	OpcodeDispatcher: optimize 32-bit blsi Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	820932e3c7	OpcodeDispatcher: optimize 32-bit blsmsk Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	6f11f2e6f4	OpcodeDispatcher: optimize 32-bit blsr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	f5ad7682c3	OpcodeDispatcher: optimize 32-bit pdep Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Alyssa Rosenzweig	04805f351b	JIT: rewrite pdep implementation - use better algorithm that is O(# set bits) instead of O(# total bits) - eliminate spilling by careful management of our temporaries - fix nzcv clobber bug (whoops) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Mai	750b0b70bc	Merge pull request #3356 from Sonicadvance1/modify_code_lock Jitarm64: Implements spin-loop futex for JIT blocks	2024-01-23 13:46:59 -05:00
Ryan Houdek	56d8080ec9	Merge pull request #3345 from Sonicadvance1/fix_syscall_registers OpcodeDispatcher: Fixes syscall rcx/r11 generation	2024-01-22 15:21:13 -08:00
Ryan Houdek	c0be974272	Merge pull request #3368 from bylaws/preprcr FEXCore: Fix RCL/RCR shift wraparound behaviour	2024-01-21 13:44:49 -08:00
Billy Laws	e323938173	FEXCore: Fix RCL/RCR shift wraparound behaviour This ends up being cleaner to handle outside of CalculateFlags_ShiftVariable as constant masking is only needed for RCL/RCR.	2024-01-21 18:15:50 +00:00
Billy Laws	407e26bfee	FEXCore: Use TMP1-4 for values that need preserving across spills The ARM64EC SRA layout will use x0-3 for x86_64 registers, as such any arguments passed to C ABI functions need to proxy their arguments through the temporaries and move as appropriate.	2024-01-21 16:21:13 +00:00
Ryan Houdek	a6c57f71e9	SpinWaitLock: Fixes potential extra wait that would occur on contended lock We had a chance of doing an additional bogus wfe if the expected value was hit in one iteration of a loop. Not the biggest problem on current hardware where WFE only ever sleeps for 1-4 system cycles, but on future hardware where WFE might actually sleep for longer, this could have been an issue.	2024-01-17 10:41:16 -08:00
Ryan Houdek	2af7e997f4	Spinlocks: Fix assembly Need to have a source be +r so it doesn't get overwritten.	2024-01-17 10:19:38 -08:00
Ryan Houdek	ab6c00bbcf	FEXCore/Utils: Rename FutexSpinWait to SpinWaitLock	2024-01-17 10:19:38 -08:00
Ryan Houdek	e18453cb57	Jitarm64: Implements spin-loop futex for JIT blocks This will ensure that multiple concurrent SIGBUS handlers in the same code block don't modify the same code.	2024-01-17 10:19:38 -08:00
Ryan Houdek	39f49782da	Arm64: Move ParanoidTSO checks up out of the non-paranoid code path	2024-01-17 10:19:38 -08:00
Ryan Houdek	2c5dd20f3c	FutexSpinWait: Implement spin-loop Unique mutex.	2024-01-17 10:19:38 -08:00
Ryan Houdek	136fa78825	FEXCore: Implements an efficient spin-loop API This will only be used internally inside of FEXCore for efficient shared code cache backpatch spin-loops.	2024-01-17 10:19:38 -08:00
Ryan Houdek	f956f008ea	Merge pull request #3372 from alyssarosenzweig/opt/cmpxchg-review Optimize GPR cmpxchg	2024-01-15 05:11:12 -08:00
Ryan Houdek	1f7a619c79	OpcodeDispatcher: Fixes syscall rcx/r11 generation Noticed this while writing #3342. Fixes #3343 The syscall instruction is defined in the documentation that it will set RCX to the next instruction's RIP and R11 to be RFLAGS. We entirely skipped this which I noticed while writing unit tests. Adds unittests to test both 32-bit and 64-bit behaviour because our helper shares code with both. I don't know if anything actually relied on this behaviour but we should definitely support it.	2024-01-12 19:14:30 -08:00
Alyssa Rosenzweig	58127bd0e8	OpcodeDispatcher: optimize trivial cmpxchgs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-12 12:23:34 -04:00
Alyssa Rosenzweig	e8945dfb6d	OpcodeDispatcher: optimize gpr cmpxchg NZCV stuff. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-12 12:03:28 -04:00
Ryan Houdek	8c3163096b	Merge pull request #3363 from Sonicadvance1/fix_label_allocations ArmEmitter: Support single use forward labels	2024-01-12 00:26:31 -08:00
Ryan Houdek	615cfe0246	Merge pull request #3361 from Sonicadvance1/decompose_std_function FEXCore: Decompose some std::function usage to regular pointers	2024-01-10 16:55:29 -08:00
Ryan Houdek	3d5f876585	Fixes some new glibc allocations that cropped up I guess this was handled by brk things before.	2024-01-09 13:55:04 -08:00
Ryan Houdek	37102400b5	Arm64: Switches uses of forward label over to SingleUse if possible Primary goal for this is to ensure that the delinker doesn't need to allocate any memory. This delinker can end up getting hit heavily with JIT code so we don't want it to be allocating memory.	2024-01-08 22:18:20 -08:00
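
Commit 04805f351b describes the rewritten pdep as O(# set bits) rather than O(# total bits). The snippet below is a minimal scalar sketch of that idea, not the JIT code from the commit: it visits only the set bits of the mask, and the function name `pdep_sketch` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical reference sketch of PDEP (parallel bit deposit).
// Deposits the low-order bits of `src` into the positions of the set bits
// of `mask`, lowest first. The loop runs once per set bit in `mask`.
uint64_t pdep_sketch(uint64_t src, uint64_t mask) {
  uint64_t result = 0;
  uint64_t src_bit = 1;  // selects the next low-order bit of src
  while (mask != 0) {
    uint64_t lowest = mask & (~mask + 1);  // isolate the lowest set mask bit
    if (src & src_bit) {
      result |= lowest;  // deposit a 1 at that mask position
    }
    mask &= mask - 1;  // clear the lowest set mask bit
    src_bit <<= 1;     // advance to the next source bit
  }
  return result;
}
```

For example, with a mask of 0b11010 the three low bits of the source land on bit positions 1, 3 and 4, so a source of 0b101 yields 0b10010.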

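Commit 8b8f27a88f checks whether a vector is zero by reducing its elements with umaxv and testing the resulting scalar. The sketch below illustrates only that reduction in portable C++, assuming a 128-bit vector viewed as four 32-bit lanes; the helper names `umaxv_sketch` and `is_zero_vector` are hypothetical and this is not the code from the commit.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical stand-in for an unsigned-maximum-across-lanes reduction.
uint32_t umaxv_sketch(const std::array<uint32_t, 4>& v) {
  return *std::max_element(v.begin(), v.end());
}

// A vector of unsigned lanes is all zero exactly when its largest lane is zero.
bool is_zero_vector(const std::array<uint32_t, 4>& v) {
  return umaxv_sketch(v) == 0;
}
```

The reduction turns a full-vector comparison into a single scalar test, which is the property the commit message relies on.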
807 Commits