FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2025-01-19 04:42:27 +00:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	b3ae81f75f	OpcodeDispatcher: allow garbage on shld shift Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	c1a1c37980	OpcodeDispatcher: mark ideas to improve SHLD; a bit tricky right now. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:26:59 -04:00
Alyssa Rosenzweig	fb6f850bb4	OpcodeDispatcher: remove rcl sub Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	b6d8749525	OpcodeDispatcher: remove select from rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	d3f1397325	OpcodeDispatcher: eliminate constants in RCR Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	0a164428fa	OpcodeDispatcher: eliminate select in RCR the nzcv clobber I actually came for Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	7496175100	OpcodeDispatcher: optimize 32-bit rcl/rcr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	0616a9cef1	OpcodeDispatcher: eliminate move in rcr 1-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	97f8775354	OpcodeDispatcher: optimize <32-bit rcr op1 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	c92099aa98	OpcodeDispatcher: fuse orlshl in rcr 1-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	7c288b09f1	OpcodeDispatcher: rmif mask rcl smaller OF Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	680af7b1b0	OpcodeDispatcher: rcr op 8x1 cleanup Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	349bc9efab	OpcodeDispatcher: unify rcr op 1bit codepaths get additional opt for <32-bit Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	ad5c3cb268	OpcodeDispatcher: rmif mask for OF in rcr smaller Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	be8d37ef3d	OpcodeDispatcher: optimize 32-bit rol/ror imm Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	6ad2514bfe	OpcodeDispatcher: rmif mask rcl smaller cf better on flagm. extra moves on non-flagm but, meh. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	3fa6129a14	OpcodeDispatcher: rmif mask rcr smaller cf and do some constant folding to do so more. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	a57cebaf58	OpcodeDispatcher: skip OF calc for constant rotate >= 2 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	34fdb14da1	OpcodeDispatcher: add and use AndConst this skips the constant folding, which saves the branching in the rotate immediate implementations. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	974baca09c	OpcodeDispatcher: allow upper garbage with rcl/rcr smaller we're masking immediately to something smaller Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	f22094a493	OpcodeDispatcher: use a branch for 8/16-bit rotate flags Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	d979b3a1da	OpcodeDispatcher: note idea to further optimize rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Alyssa Rosenzweig	6d82c957fa	OpcodeDispatcher: fuse orlshl in rcl Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-30 22:22:57 -04:00
Mai	fa3352004e	Merge pull request #3381 from alyssarosenzweig/opt/masking Allow upper garbage on a bunch of instructions	2024-01-30 10:07:53 -05:00
Ryan Houdek	ce2924731e	vixl/simulator: Enlarge simulator stack size Simulator stack size defaults to 8KB. This new unit test requires at least 15360 stack size. Just push it up to 8MB.	2024-01-29 19:48:38 -08:00
Ryan Houdek	bc67910ee4	Merge pull request #3382 from pmatos/TypoFix Fix typos; NFC	2024-01-29 16:10:14 -08:00
Mai	31a4158957	Merge pull request #3383 from alyssarosenzweig/opt/ptest Optimize PTEST and VTESTP	2024-01-29 13:30:53 -05:00
Mai	58f3d3caf5	Merge pull request #3380 from alyssarosenzweig/opt/pdep Optimize PDEP	2024-01-29 13:27:15 -05:00
Alyssa Rosenzweig	ae48228943	OpcodeDispatcher: optimize vtestps/vtestpd I don't really care about AVX but do the same thing we did for vptest. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:24:11 -04:00
Alyssa Rosenzweig	e8e35e48c7	OpcodeDispatcher: optimize ptest with tst Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8b8f27a88f	OpcodeDispatcher: optimize ptest with umaxv to check if the vector is zero, umaxv its elements and check if the reduced scalar is zero. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:20:17 -04:00
Alyssa Rosenzweig	8e7906a665	IR: add UMaxV will be used to accelerate ptest Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:19:22 -04:00
Paulo Matos	027fbbf051	Optimize CDQOp	2024-01-29 17:18:02 +00:00
Paulo Matos	ca31a0404c	ConstProp should generate 32bit constants when required	2024-01-29 17:15:47 +00:00
Paulo Matos	f644959c7c	Fixing some typos; NFC	2024-01-29 17:14:53 +00:00
Alyssa Rosenzweig	16a54742e6	OpcodeDispatcher: optimize 32-bit tzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad9aa0bc87	OpcodeDispatcher: optimize 32-bit lzcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	bd2b3f35a3	OpcodeDispatcher: optimize 32-bit popcnt Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	50169ce640	OpcodeDispatcher: optimize 32-bit pext Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	baae2d68f9	OpcodeDispatcher: optimize 32-bit bextr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	ad8d038b8a	OpcodeDispatcher: optimize 32-bit blsi Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	820932e3c7	OpcodeDispatcher: optimize 32-bit blsmsk Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	6f11f2e6f4	OpcodeDispatcher: optimize 32-bit blsr Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:11:25 -04:00
Alyssa Rosenzweig	f5ad7682c3	OpcodeDispatcher: optimize 32-bit pdep Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Alyssa Rosenzweig	04805f351b	JIT: rewrite pdep implementation - use better algorithm that is O(# set bits) instead of O(# total bits) - eliminate spilling by careful management of our temporaries - fix nzcv clobber bug (whoops) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2024-01-29 13:06:56 -04:00
Mai	750b0b70bc	Merge pull request #3356 from Sonicadvance1/modify_code_lock Jitarm64: Implements spin-loop futex for JIT blocks	2024-01-23 13:46:59 -05:00
Ryan Houdek	56d8080ec9	Merge pull request #3345 from Sonicadvance1/fix_syscall_registers OpcodeDispatcher: Fixes syscall rcx/r11 generation	2024-01-22 15:21:13 -08:00
Ryan Houdek	c0be974272	Merge pull request #3368 from bylaws/preprcr FEXCore: Fix RCL/RCR shift wraparound behaviour	2024-01-21 13:44:49 -08:00
Billy Laws	e323938173	FEXCore: Fix RCL/RCR shift wraparound behaviour This ends up being cleaner to handle outside of CalculateFlags_ShiftVariable as constant masking is only needed for RCL/RCR.	2024-01-21 18:15:50 +00:00
Billy Laws	407e26bfee	FEXCore: Use TMP1-4 for values that need preserving across spills The ARM64EC SRA layout will use x0-3 for x86_64 registers, as such any arguments passed to C ABI functions need to proxy their arguments through the temporaries and move as appropriate.	2024-01-21 16:21:13 +00:00
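
Commit 04805f351b describes replacing the pdep implementation with one that is O(# set bits) instead of O(# total bits). The actual JIT code is not shown in this log, so the following is only a minimal Python sketch of that general idea, not FEX's implementation. The names `pdep`, `pdep_naive`, and the `width` parameter are hypothetical and chosen for illustration.

```python
def pdep_naive(src, mask, width=64):
    """Reference version: walks every bit position, O(total bits)."""
    result = 0
    src_index = 0
    for pos in range(width):
        if (mask >> pos) & 1:
            # Deposit the next low-order source bit at this mask position.
            if (src >> src_index) & 1:
                result |= 1 << pos
            src_index += 1
    return result


def pdep(src, mask, width=64):
    """Sketch of the set-bit approach: one iteration per set mask bit."""
    mask &= (1 << width) - 1
    result = 0
    src_bit = 1  # selects the next low-order bit of src
    while mask:
        lowest = mask & -mask   # isolate the lowest set bit of the mask
        if src & src_bit:
            result |= lowest    # deposit the source bit at that position
        mask &= mask - 1        # clear the lowest set bit
        src_bit <<= 1
    return result
```

The loop in `pdep` runs once per set bit of the mask, so a sparse mask finishes in a handful of iterations, while `pdep_naive` always runs `width` times.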

953 Commits
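
Commit 8b8f27a88f notes that ptest can check whether a vector is zero by taking the unsigned maximum across its elements (umaxv) and testing the reduced scalar. A minimal Python sketch of that reduction follows, assuming a vector is modeled as a list of unsigned lane values. The helper names `vector_is_zero` and `ptest_zf` are hypothetical, and only the zero check is modeled, not the full flag behaviour.

```python
def vector_is_zero(lanes):
    """Unsigned max across lanes, then a single scalar compare.

    The maximum of unsigned lanes is zero only if every lane is zero,
    so one reduction replaces a lane-by-lane comparison.
    """
    return max(lanes) == 0


def ptest_zf(a, b):
    """Model only the zero test of ptest: is (a AND b) zero in every lane?"""
    return vector_is_zero([x & y for x, y in zip(a, b)])
```

For example, `ptest_zf([0xF0, 0x0F], [0x0F, 0xF0])` is true because no lane has overlapping bits.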