In this mode, rather than the branch comparing its arguments and then jumping
based on the result, the branch jumps directly on a native condition code
evaluated from the NZCV value. This lets us map x86 branches to arm64 branches 1:1.
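As an illustrative sketch (not FEX's actual tables), the x86-to-arm64 condition
correspondence this relies on, valid once NZCV holds the result of a native cmp:

```python
# Sketch: x86 Jcc condition -> arm64 b.<cond>, valid when NZCV was
# produced by a native cmp. Arm's carry is the inverse of x86's
# borrow, so below/above-or-equal land on lo/hs rather than a
# literal CF test.
X86_TO_A64_COND = {
    "e":  "eq", "ne": "ne",
    "l":  "lt", "ge": "ge", "le": "le", "g":  "gt",
    "b":  "lo", "ae": "hs", "be": "ls", "a":  "hi",
    "s":  "mi", "ns": "pl", "o":  "vs", "no": "vc",
}
```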
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Usually better in practice. Some rotates regress slightly with this, but
they were already terrible.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
via the zero register. We really need a more generalized approach to taking
advantage of wzr, but this optimizes the special case I care about for seta
(saving a move to make the impl optimal).
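A toy model of the seta condition in terms of the NZCV bits (bit positions are
architectural; the helper name is hypothetical, not FEX's IR):

```python
# NZCV bit positions within the 32-bit flags value (architectural).
NZCV_N, NZCV_Z, NZCV_C, NZCV_V = 31, 30, 29, 28

def seta_from_nzcv(nzcv):
    # x86 "above" (CF=0 and ZF=0) corresponds to arm64 "hi"
    # (C set and Z clear) once the flags come from a native cmp.
    c = (nzcv >> NZCV_C) & 1
    z = (nzcv >> NZCV_Z) & 1
    return 1 if (c == 1 and z == 0) else 0
```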
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This will replace Select soon, as it lets us take advantage of
NZCV-generating instructions and it doesn't clobber NZCV.
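A sketch of the semantics, assuming a csel-style select (helper names are
illustrative, not FEX's IR):

```python
def cond_holds(cond, n, z, c, v):
    # arm64 condition predicates over the NZCV bits (architectural
    # definitions from the condition-code table).
    return {
        "eq": z == 1,            "ne": z == 0,
        "hs": c == 1,            "lo": c == 0,
        "mi": n == 1,            "pl": n == 0,
        "vs": v == 1,            "vc": v == 0,
        "hi": c == 1 and z == 0, "ls": c == 0 or z == 1,
        "ge": n == v,            "lt": n != v,
        "gt": z == 0 and n == v, "le": z == 1 or n != v,
    }[cond]

def nzcv_select(cond, nzcv, if_true, if_false):
    # Models a csel-style select: it reads NZCV but never writes it,
    # so the flags stay live across the select.
    return if_true if cond_holds(cond, *nzcv) else if_false
```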
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Do it explicitly for sve-256 and punt on optimizing, so we avoid regressing code
gen otherwise.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Rather than the context. Effectively a static register allocation scheme for
flags. This will let us optimize out a LOT of flag handling code, keeping things
in NZCV rather than needing to copy between NZCV and memory all the time.
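A toy before/after of what static allocation buys, with a hypothetical context
slot name (FEX's actual codegen differs):

```python
# Before: flags live in the context, so a flag-consuming branch
# round-trips memory on every use.
before = [
    "ldr w0, [ctx, #FLAGS_OFFSET]",  # reload the saved flags
    "msr nzcv, x0",                  # restore them into host NZCV
    "b.eq target",
]

# After: flags are statically allocated to host NZCV, so the branch
# consumes them directly with no memory traffic.
after = [
    "b.eq target",
]
```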
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Some opcodes only clobber NZCV under certain circumstances, and we don't yet
have a good way of encoding that. In the meantime, this hotfixes some would-be
instcountci regressions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Again we need to handle this one specially because the dispatcher can't insert
restore code after the branch. It should be optimized in the near future, don't
worry.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Semantics differ markedly from the non-NZCV flags; splitting this out makes it a
lot easier to do things correctly imho. It gets the dest/src size correct
(important for spilling) and makes our existing opt passes skip this, which is
needed for correctness at the moment anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This fixes an issue where CPU tunables were ending up in the thunk
generator, which meant that if the *builder's* CPU didn't support all the
targeted features, it would crash with SIGILL. This was happening with
Canonical's runners because they typically only support ARMv8.2, while we
are compiling packages to run on ARMv8.4 devices.
cc: FEX-2311.1
SHA instructions are very large right now and cause register spilling
due to their codegen. Ender Lilies has a really large block in a
function called `sha1_block_data_order` that was causing FEX to spill
NZCV flags incorrectly. The assumption, which held true before NZCV
optimizations existed, was that every flag was either 1-bit in an 8-bit
container or plain 8-bit (the x87 TOP flag).
NZCV host flags broke this assumption by being 32-bit, which ended up breaking
when spilling situations were encountered.
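A minimal model of the fix, with hypothetical names: spill/fill must key off
the flag's container size instead of assuming a byte:

```python
# Container sizes in bytes; the pre-NZCV code effectively assumed
# every entry here was 1.
FLAG_CONTAINER_BYTES = {
    "cf": 1, "zf": 1, "sf": 1, "of": 1,  # 1-bit flags in 8-bit containers
    "x87_top": 1,                        # plain 8-bit flag
    "nzcv": 4,                           # host NZCV copy is 32-bit
}

def spill_size(flag):
    # A 32-bit NZCV spill done with a 1-byte store would truncate
    # the flags -- the class of bug hit in sha1_block_data_order.
    return FLAG_CONTAINER_BYTES[flag]
```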
Replace every instance of the Op overwrite pattern, and ban that anti-pattern
from the codebase in the future. This will prevent piles of NZCV related
regressions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>