FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2025-01-31 11:32:07 +00:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	b3ae81f75f	OpcodeDispatcher: allow garbage on shld shift Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	c1a1c37980	OpcodeDispatcher: mark ideas to improve SHLD a bit tricky right now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Mai	fa3352004e	Merge pull request #3381 from alyssarosenzweig/opt/masking Allow upper garbage on a bunch of instructions	2024-01-30 10:07:53 -05:00
Ryan Houdek	ce2924731e	vixl/simulator: Enlarge simulator stack size Simulator stack size defaults to 8KB. This new unit test requires at least 15360 stack size. Just push it up to 8MB.	2024-01-29 19:48:38 -08:00
Ryan Houdek	bc67910ee4	Merge pull request #3382 from pmatos/TypoFix Fix typos; NFC	2024-01-29 16:10:14 -08:00
Mai	31a4158957	Merge pull request #3383 from alyssarosenzweig/opt/ptest Optimize PTEST and VTESTP	2024-01-29 13:30:53 -05:00
Mai	58f3d3caf5	Merge pull request #3380 from alyssarosenzweig/opt/pdep Optimize PDEP	2024-01-29 13:27:15 -05:00
Alyssa Rosenzweig	ae48228943	OpcodeDispatcher: optimize vtestps/vtestpd I don't really care about AVX but do the same thing we did for vptest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:24:11 -04:00
Alyssa Rosenzweig	e8e35e48c7	OpcodeDispatcher: optimize ptest with tst Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8b8f27a88f	OpcodeDispatcher: optimize ptest with umaxv. To check if the vector is zero, umaxv its elements and check if the reduced scalar is zero. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8e7906a665	IR: add UMaxV. Will be used to accelerate ptest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:19:22 -04:00
Paulo Matos	f644959c7c	Fixing some typos; NFC	2024-01-29 17:14:53 +00:00
Alyssa Rosenzweig	16a54742e6	OpcodeDispatcher: optimize 32-bit tzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad9aa0bc87	OpcodeDispatcher: optimize 32-bit lzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	bd2b3f35a3	OpcodeDispatcher: optimize 32-bit popcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	50169ce640	OpcodeDispatcher: optimize 32-bit pext Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	baae2d68f9	OpcodeDispatcher: optimize 32-bit bextr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad8d038b8a	OpcodeDispatcher: optimize 32-bit blsi Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	820932e3c7	OpcodeDispatcher: optimize 32-bit blsmsk Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	6f11f2e6f4	OpcodeDispatcher: optimize 32-bit blsr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	f5ad7682c3	OpcodeDispatcher: optimize 32-bit pdep Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Alyssa Rosenzweig	04805f351b	JIT: rewrite pdep implementation: use better algorithm that is O(# set bits) instead of O(# total bits); eliminate spilling by careful management of our temporaries; fix nzcv clobber bug (whoops). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Mai	750b0b70bc	Merge pull request #3356 from Sonicadvance1/modify_code_lock Jitarm64: Implements spin-loop futex for JIT blocks	2024-01-23 13:46:59 -05:00
Ryan Houdek	56d8080ec9	Merge pull request #3345 from Sonicadvance1/fix_syscall_registers OpcodeDispatcher: Fixes syscall rcx/r11 generation	2024-01-22 15:21:13 -08:00
Ryan Houdek	c0be974272	Merge pull request #3368 from bylaws/preprcr FEXCore: Fix RCL/RCR shift wraparound behaviour	2024-01-21 13:44:49 -08:00
Billy Laws	e323938173	FEXCore: Fix RCL/RCR shift wraparound behaviour This ends up being cleaner to handle outside of CalculateFlags_ShiftVariable as constant masking is only needed for RCL/RCR.	2024-01-21 18:15:50 +00:00
Billy Laws	407e26bfee	FEXCore: Use TMP1-4 for values that need preserving across spills The ARM64EC SRA layout will use x0-3 for x86_64 registers, as such any arguments passed to C ABI functions need to proxy their arguments through the temporaries and move as appropriate.	2024-01-21 16:21:13 +00:00
Ryan Houdek	a6c57f71e9	SpinWaitLock: Fixes potential extra wait that would occur on contended lock. We had a chance of doing an additional bogus wfe if the expected value was hit in one iteration of a loop. Not the biggest problem on current hardware where WFE only ever sleeps for 1-4 system cycles, but on future hardware where WFE might actually sleep for longer, this could have been an issue.	2024-01-17 10:41:16 -08:00
Ryan Houdek	2af7e997f4	Spinlocks: Fix assembly Need to have a source be +r so it doesn't get overwritten.	2024-01-17 10:19:38 -08:00
Ryan Houdek	ab6c00bbcf	FEXCore/Utils: Rename FutexSpinWait to SpinWaitLock	2024-01-17 10:19:38 -08:00
Ryan Houdek	e18453cb57	Jitarm64: Implements spin-loop futex for JIT blocks. This will ensure that multiple concurrent SIGBUS handlers in the same code block don't modify the same code.	2024-01-17 10:19:38 -08:00
Ryan Houdek	39f49782da	Arm64: Move ParanoidTSO checks up out of the non-paranoid code path	2024-01-17 10:19:38 -08:00
Ryan Houdek	2c5dd20f3c	FutexSpinWait: Implement spin-loop Unique mutex.	2024-01-17 10:19:38 -08:00
Ryan Houdek	136fa78825	FEXCore: Implements an efficient spin-loop API. This will only be used internally inside of FEXCore for efficient shared codecache backpatch spin-loops.	2024-01-17 10:19:38 -08:00
Ryan Houdek	f956f008ea	Merge pull request #3372 from alyssarosenzweig/opt/cmpxchg-review Optimize GPR cmpxchg	2024-01-15 05:11:12 -08:00
Ryan Houdek	1f7a619c79	OpcodeDispatcher: Fixes syscall rcx/r11 generation. Noticed this while writing #3342. Fixes #3343. The documentation defines the syscall instruction as setting RCX to the next instruction's RIP and R11 to RFLAGS. We entirely skipped this, which I noticed while writing unit tests. Adds unittests to test both 32-bit and 64-bit behaviour because our helper shares code with both. I don't know if anything actually relied on this behaviour but we should definitely support it.	2024-01-12 19:14:30 -08:00
Alyssa Rosenzweig	58127bd0e8	OpcodeDispatcher: optimize trivial cmpxchgs Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-12 12:23:34 -04:00
Alyssa Rosenzweig	e8945dfb6d	OpcodeDispatcher: optimize gpr cmpxchg NZCV stuff. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-12 12:03:28 -04:00
Ryan Houdek	8c3163096b	Merge pull request #3363 from Sonicadvance1/fix_label_allocations ArmEmitter: Support single use forward labels	2024-01-12 00:26:31 -08:00
Ryan Houdek	615cfe0246	Merge pull request #3361 from Sonicadvance1/decompose_std_function FEXCore: Decompose some std::function usage to regular pointers	2024-01-10 16:55:29 -08:00
Ryan Houdek	3d5f876585	Fixes some new glibc allocations that cropped up I guess this was handled by brk things before.	2024-01-09 13:55:04 -08:00
Ryan Houdek	37102400b5	Arm64: Switches uses of forward label over to SingleUse if possible Primary goal for this is to ensure that the delinker doesn't need to allocate any memory. This delinker can end up getting hit heavily with JIT code so we don't want it to be allocating memory.	2024-01-08 22:18:20 -08:00
Ryan Houdek	c01e6283ae	CodeEmitter: Support a single use forward label. Currently every use of the forward label calls into jemalloc to allocate memory. This allows a forward label that doesn't require any memory allocation, which is the common case in FEX.	2024-01-08 22:18:20 -08:00
Ryan Houdek	248dc97993	FEXCore: Decompose some std::function usage to regular pointers The delinker step of the JIT was using std::function with capture lambdas that required memory allocation when unnecessary. Because the compiler can't see through our std::function usage it could never decompose these by itself. By passing the Thread's frame and record to the function as arguments then we can have the signature be a raw function pointer. This fixes an area of concern from: https://github.com/FEX-Emu/FEX/blob/main/docs/ProgrammingConcerns.md#stdfunction-and-lambdas	2024-01-06 19:39:54 -08:00
Ryan Houdek	d488592eda	Merge pull request #3339 from Sonicadvance1/pass_thread_unaligned_fault_handler FEXCore: Pass thread object to HandleUnalignedAccess	2024-01-04 18:20:37 -08:00
Ryan Houdek	743df8dfae	Merge pull request #3327 from Sonicadvance1/remove_syscall_indirection Arm64: Removes a vtable indirection in syscalls	2024-01-04 18:19:40 -08:00
Ryan Houdek	4b3792196f	Merge pull request #3303 from Sonicadvance1/initial_runtime_longmode_switch OpcodeDispatcher: Initial support for runtime long-mode switch	2024-01-04 18:17:54 -08:00
Ryan Houdek	db7d7a6bd7	Merge pull request #3349 from Sonicadvance1/revert_frontend_ownership Revert "FEXLoader: Moves thread management to the frontend"	2024-01-03 14:25:04 -08:00
Alyssa Rosenzweig	04a88ed3ab	Merge pull request #3353 from Sonicadvance1/public_interface_cleaning FEXCore interface cleaning	2024-01-03 15:14:54 -04:00
Alyssa Rosenzweig	9da08b40bd	Merge pull request #3344 from Sonicadvance1/xbyak_upstream Externals: Update xbyak to v7.02 and switch away from fork	2024-01-03 15:13:58 -04:00
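
Commit 04805f351b above describes moving pdep to an algorithm that is O(# set bits) rather than O(# total bits). The log does not show the JIT code itself, so the following is only a minimal scalar sketch of one common way to get that bound, not FEX's actual implementation: walk the set bits of the mask from lowest to highest and consume one source bit per step. The names `pdep_sketch` and `pdep_reference` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not taken from the FEX sources.
// Deposits the low bits of `src` into the positions of the set bits of `mask`.
// The loop runs once per set bit in `mask`, not once per bit position.
constexpr std::uint64_t pdep_sketch(std::uint64_t src, std::uint64_t mask) {
  std::uint64_t result = 0;
  for (std::uint64_t src_bit = 1; mask != 0; src_bit <<= 1) {
    const std::uint64_t lowest = mask & (~mask + 1); // isolate lowest set mask bit
    if (src & src_bit) {
      result |= lowest;                              // deposit the current source bit
    }
    mask &= mask - 1;                                // clear that mask bit
  }
  return result;
}

// Straightforward reference that visits all 64 bit positions,
// used only to cross-check the sketch above.
constexpr std::uint64_t pdep_reference(std::uint64_t src, std::uint64_t mask) {
  std::uint64_t result = 0;
  unsigned next_src_bit = 0;
  for (unsigned i = 0; i < 64; ++i) {
    if ((mask >> i) & 1) {
      if ((src >> next_src_bit) & 1) {
        result |= std::uint64_t{1} << i;
      }
      ++next_src_bit;
    }
  }
  return result;
}

// Compile-time spot checks.
static_assert(pdep_sketch(0b1011, 0b11010100) == 0b10010100);
static_assert(pdep_sketch(0x5, 0xF0) == 0x50);

// Runtime cross-check against the reference for one arbitrary input.
inline void check_pdep_sketch() {
  assert(pdep_sketch(0xDEADBEEFu, 0x0F0F0F0F0F0F0F0Full) ==
         pdep_reference(0xDEADBEEFu, 0x0F0F0F0F0F0F0F0Full));
}
```

For a sparse mask the loop exits after a handful of iterations, which is where the gain over a fixed 64-step loop comes from. The spilling and NZCV fixes mentioned in the same commit are specific to the generated Arm64 code and are not modeled here.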

799 Commits