FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2025-02-20 23:02:12 +00:00

Author	SHA1	Message	Date
lioncash	f57debeb29	OpcodeDispatcher: Handle VPACKSSWB	2022-12-17 02:42:09 +00:00
lioncash	4ac031df59	OpcodeDispatcher: Move PACKSSOp impl to a regular function. We can reuse it with AVX versions.	2022-12-17 02:13:26 +00:00
Ryan Houdek	78b53bfa49	Merge pull request #2266 from lioncash/arith OpcodeDispatcher: Handle vector versions of VPSRA{D, W}	2022-12-16 18:05:05 -08:00
lioncash	c53fb7d697	OpcodeDispatcher: Handle VPSRAD (vector)	2022-12-17 01:51:42 +00:00
lioncash	a1a52450cb	OpcodeDispatcher: Handle VPSRAW (vector)	2022-12-17 01:40:25 +00:00
Ryan Houdek	fabf453046	Merge pull request #2265 from lioncash/pextrw OpcodeDispatcher: Handle remaining PEXTRW opcode	2022-12-16 17:35:33 -08:00
lioncash	68916ae2d9	OpcodeDispatcher: Move PSRAOp implementation to regular function. We can reuse this with the AVX variant.	2022-12-17 01:23:02 +00:00
lioncash	bf56b7b2da	OpcodeDispatcher: Handle remaining PEXTRW opcode	2022-12-17 01:14:22 +00:00
Ryan Houdek	905eb015c0	Merge pull request #2264 from lioncash/addsub OpcodeHandler: Handle VADDSUBP{D, S}	2022-12-16 16:54:35 -08:00
lioncash	858f13e76a	OpcodeDispatcher: Handle VADDSUBPD	2022-12-17 00:41:25 +00:00
lioncash	169d7bbf50	OpcodeDispatcher: Handle VADDSUBPS	2022-12-17 00:29:29 +00:00
lioncash	31c8d4acac	OpcodeDispatcher: Factor out ADDSUB impl into regular function. We can reuse this with the AVX versions.	2022-12-17 00:16:38 +00:00
lioncash	8291e600fa	OpcodeDispatcher: Simplify ADDSUBPOp. Rather than looping vectors, we can interleave them together directly with IR ops.	2022-12-17 00:11:51 +00:00
Ryan Houdek	b26e4109fa	Merge pull request #2263 from lioncash/mull OpcodeDispatcher: Handle VPMULL{D, W}	2022-12-16 15:40:53 -08:00
lioncash	dcc218a168	OpcodeDispatcher: Handle VPMULLD	2022-12-16 23:20:25 +00:00
lioncash	49b9b18b4a	OpcodeDispatcher: Handle VPMULLW	2022-12-16 23:07:06 +00:00
Ryan Houdek	ad3bf189c0	Merge pull request #2262 from lioncash/rlog OpcodeDispatcher: Handle vector variants of VPSRL{D, Q, W}	2022-12-16 14:56:56 -08:00
lioncash	47b21fa758	OpcodeDispatcher: Handle VPSRLQ (vector). Also mark VPMOVMSKB as UNDEC, since it's not implemented yet.	2022-12-16 22:18:40 +00:00
lioncash	b6e82965df	OpcodeDispatcher: Handle VPSRLD (vector)	2022-12-16 22:09:52 +00:00
lioncash	8dc8785340	OpcodeDispatcher: Handle VPSRLW (vector)	2022-12-16 22:00:53 +00:00
lioncash	c710ab60b0	OpcodeDispatcher: Factor out PSRLDOp implementation to regular function. This will also be used with the AVX variants of the shifts.	2022-12-16 21:43:11 +00:00
Ryan Houdek	c86ba7646c	Merge pull request #2259 from lioncash/pextr OpcodeDispatcher: Handle VPEXTR{B, D, Q, W}/VEXTRACTPS	2022-12-16 11:25:38 -08:00
Ryan Houdek	c1e301a5ed	Merge pull request #2257 from lioncash/limm OpcodeDispatcher: Handle immediate variants of VPSLL{D, Q, W}	2022-12-16 11:23:16 -08:00
Mai	0ebb15c732	Merge pull request #2258 from Sonicadvance1/fixed_syscall_spill Arm64: Inline Syscall spill optimization	2022-12-16 18:48:08 +00:00
lioncash	37c743b616	OpcodeDispatcher: Handle VPSLLQ (immediate)	2022-12-16 18:37:27 +00:00
lioncash	c810ae4018	OpcodeDispatcher: Handle VPSLLD (immediate)	2022-12-16 18:37:27 +00:00
lioncash	d3481c8271	OpcodeDispatcher: Handle VPSLLW (immediate)	2022-12-16 18:37:27 +00:00
lioncash	7c1e152441	OpcodeDispatcher: Extract PSLLI impl to regular function. This will be reused for the AVX variants.	2022-12-16 18:37:20 +00:00
lioncash	f11ac8674d	OpcodeDispatcher: Handle VEXTRACTPS	2022-12-16 18:13:55 +00:00
Ryan Houdek	1fecf89bfc	Arm64: Inline Syscall spill optimization. This was likely an issue with signals racing to the spill handler, which we have fixed bugs with over the past few months. This means we don't need to spill all SRA GPR registers anymore; at most we need to spill three registers that intersect with syscall arguments.	2022-12-16 10:04:16 -08:00
lioncash	21ad0fa334	OpcodeDispatcher: Handle VPEXTRQ. VPEXTRQ uses VEX.W to handle size differencing, since it shares an encoding spot with VPEXTRD, so we need to handle that a little differently.	2022-12-16 18:02:24 +00:00
lioncash	3429815103	OpcodeDispatcher: Handle VPEXTRD	2022-12-16 17:33:01 +00:00
lioncash	559ff1582e	OpcodeDispatcher: Handle VPEXTRW	2022-12-16 17:29:16 +00:00
lioncash	2e973ae079	OpcodeDispatcher: Handle VPEXTRB	2022-12-16 14:37:47 +00:00
Mai	1ab4471ef9	Merge pull request #2255 from Sonicadvance1/optimize_sve_spillfill Arm64: Optimize SVE register spilling and filling	2022-12-16 13:19:05 +00:00
Ryan Houdek	40e073c8b2	Arm64: Optimize SVE register spilling and filling. Causes the dispatcher to drop from 4476 bytes down to 3900 for SVE-256bit supporting targets. This is done by significantly reducing SVE loadstore ops, going from 8 instructions per 4 registers down to 2 instructions, by switching from 1-register loadstore instructions to 4-register loadstore instructions. This should significantly improve performance on future SVE platforms. Filling and spilling to the context still uses the old code path because SVE doesn't offer non-interleaving loadstores. Spilling and filling on the stack is fine because we don't need to match context state.	2022-12-16 00:25:50 -08:00
Ryan Houdek	58fab721b3	Merge pull request #2254 from lioncash/logical OpcodeDispatcher: Handle vector variants of VPSLL{D, Q, W}	2022-12-15 22:52:05 -08:00
lioncash	8fac21e43f	OpcodeDispatcher: Handle VPSLLQ (vector)	2022-12-16 06:34:00 +00:00
lioncash	d9a1e97bc1	OpcodeDispatcher: Handle VPSLLD (vector)	2022-12-16 06:34:00 +00:00
lioncash	848f1a2f78	OpcodeDispatcher: Handle VPSLLW (vector)	2022-12-16 06:34:00 +00:00
lioncash	7b8a46d934	OpcodeDispatcher: Move PSLL impl into a regular function	2022-12-16 06:33:58 +00:00
Mai	9a8852f9b6	Merge pull request #2250 from Sonicadvance1/optimize_spilling_filling Arm64: Optimizing spilling and filling	2022-12-16 04:47:22 +00:00
Mai	65e8bf9d72	Merge pull request #2253 from Sonicadvance1/single_page_dispatcher Arm64: Reduce dispatcher to 1 page	2022-12-16 04:44:55 +00:00
Ryan Houdek	344ec33ba5	Merge pull request #2252 from lioncash/fadd Arm64/VectorOps: Simplify FADDP result merging	2022-12-15 20:37:11 -08:00
Ryan Houdek	5dc7dfacb3	Arm64: Reduce dispatcher to 1 page. We currently only use 2236 bytes, so there is no need for two pages. Once #2250 is merged we will use 1716 bytes.	2022-12-15 20:33:28 -08:00
lioncash	122aa8a69a	Arm64/VectorOps: Simplify FADDP result merging. Keeps the implementation similarly in sync with VAddP.	2022-12-16 04:19:46 +00:00
Ryan Houdek	8ce6c08152	Merge pull request #2251 from lioncash/hadd OpcodeDispatcher: Handle VPHADDW/VPHADDD	2022-12-15 20:11:07 -08:00
Ryan Houdek	1beb791d52	Arm64: Optimizing spilling and filling. Just makes these a little more optimal when jumping out of the JIT. Noticed these while working on the new emitter.	2022-12-15 20:04:16 -08:00
lioncash	27c0d4a9f5	OpcodeDispatcher: Handle VPHADDD	2022-12-16 03:28:57 +00:00
lioncash	dd4ba7562f	OpcodeDispatcher: Handle VPHADDW	2022-12-16 03:28:57 +00:00

5456 Commits