FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2024-12-14 09:28:34 +00:00

Author	SHA1	Message	Date
Alyssa Rosenzweig	af21b8f3c7	Move External/FEXCore/ to FEXCore/ It is not an external component, and it makes paths needlessly long. Ryan seemed amenable to this when we discussed on IRC earlier. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-08-17 16:32:16 -04:00
Lioncache	e14e71aaff	IR: Allow 128-bit broadcasts in VBroadcastFromMem Now all vbroadcast implementations go down the more optimal path. For non-SVE 128-bit cases where we only have 128-bit wide registers, we behave like ld1rqb and just act as a normal 128-bit load for interface convenience.	2023-08-17 14:20:12 -04:00
Ryan Houdek	a9dea29f03	Merge pull request #2917 from lioncash/quad ARMEmitter: Handle SVE load and broadcast quadword groups	2023-08-17 10:59:27 -07:00
Lioncache	f97df2a40f	ARMEmitter: Handle SVE load and broadcast quadword (scalar plus scalar) group	2023-08-17 13:32:20 -04:00
Lioncache	6f9bc1e2fe	ARMEmitter: Handle SVE load and broadcast quadword (scalar plus imm) category	2023-08-17 13:32:17 -04:00
Ryan Houdek	49b8b7cd2c	Merge pull request #2916 from Sonicadvance1/128bit_predicate Arm64Emitter: Ensure that 128-bit predicate is generated with SVE	2023-08-17 09:56:56 -07:00
Ryan Houdek	059c022255	Arm64Emitter: Ensure that 128-bit predicate is generated with SVE In the case of running on a 128-bit SVE system this predicate wasn't set up. Since we never had any predicate usage before, this wasn't an issue. Now that #2914 is using the 128-bit predicate we need to make sure that we are generating it.	2023-08-17 09:37:55 -07:00
Lioncache	25708be807	IR: Add TSO handling to VBroadcastFromMem	2023-08-17 12:30:57 -04:00
Lioncache	c8e3ca481f	OpcodeDispatcher: Remove explicit zero-extending in VBROADCASTOp Since the implementations zero the upper lanes when appropriate, we can remove the unnecessary explicit move.	2023-08-17 12:17:24 -04:00
Lioncache	879bc5176e	IR: Add VBroadcastFromMem opcode Allows the implementations of the vbroadcast instructions to perform the load and broadcast in one operation as opposed to doing the load and then broadcast separately. Notably, the broadcasting loads can also be used on systems that have SVE 128-bit support as well, not only 256-bit. On non-SVE systems, we use the equivalent AdvSIMD instructions.	2023-08-17 12:17:24 -04:00
Ryan Houdek	6d562f8b3b	Merge pull request #2911 from Sonicadvance1/stop_abusing_orr Arm64: Stop abusing orr in LoadConstant	2023-08-17 09:13:36 -07:00
Ryan Houdek	1029bb1fae	Merge pull request #2910 from Sonicadvance1/minor_bfi_opt Arm64: Optimize non-optimal BFI move case	2023-08-17 09:12:37 -07:00
Ryan Houdek	1343c14db0	Merge pull request #2909 from Sonicadvance1/optimize_clzero_clear Arm64: Optimize CacheLine{Clear,Clean}	2023-08-17 09:10:45 -07:00
Lioncache	77c64285cb	OpcodeDispatcher: Remove unused variable in AVXVectorUnaryOpImpl Forgot to remove this when getting rid of the unnecessary explicit zero-extending behavior	2023-08-17 11:26:41 -04:00
Ryan Houdek	fe37c89109	FEXCore/Config: Stop making temporary string copies For config values that were string objects we were unnecessarily creating copies each time the string was accessed. Convert the () operator over to returning a reference.	2023-08-16 21:35:13 -07:00
Ryan Houdek	23fd79a3b3	Arm64: Stop abusing orr in LoadConstant The current implementation uses orr excessively. This has FEX missing hardware optimization opportunities where some CPU cores will zero-cycle move constants that fit in to the 16-bits of movz/movk. First evaluate up front if the number of 16-bit segments is > 1, in those cases we should check if it is a bitfield that can be moved in one instruction with orr. After that point we will use movz for 16-bit constant moves. Additionally this optimizes the case where a constant of zero is loaded to be a `mov <reg>, zr` which gets renamed in most hardware.	2023-08-16 19:35:15 -07:00
Ryan Houdek	a3b40c37c2	Arm64: Optimize non-optimal BFI move case Commonly we are doing a BFI into a 32-bit register, which is hitting the ubfx (lsr alias) path. In the case of 32-bit destination we can also do a regular move, which will take advantage of CPU's rename functionality and give a minor speed boost.	2023-08-16 14:35:41 -07:00
Ryan Houdek	4522a766e0	Arm64: Optimize CacheLine{Clear,Clean} When the cacheline size matches the expected x86 cacheline size then we can remove the spurious move + add.	2023-08-16 14:20:22 -07:00
Ryan Houdek	fc12958095	FEXCore/IR: Fixes bug in IRDumper without specification Didn't notice this in the previous PR. When DUMPIR=stderr is set without any selection of where to place it in PASSMANAGERDUMPIR, it was supposed to put the dumper at the end of the passes. We need to make sure that it is placed at the end of the passes rather than at the current `it`.	2023-08-16 13:51:03 -07:00
Mai	df3d4efc80	Merge pull request #2904 from Sonicadvance1/instcountci_only_arm GitHub: Only enable InstCountCI on an ARM platform	2023-08-15 17:33:10 -04:00
Ryan Houdek	1441cb76b9	HostFeatures: Adds support for overriding ARMv8.1 LSE atomics Always enable it on the InstCountCI.	2023-08-15 14:12:27 -07:00
Lioncache	17956eac5f	OpcodeDispatcher: Eliminate unnecessary moves in AVXVectorUnaryOpImpl We no longer need to do any manual zero-extending here, since this will occur automatically on hardware with SVE when 128-bit AdvSIMD is used.	2023-08-15 15:43:34 -04:00
Lioncache	2708374d95	Arm64/VectorOps: Remove redundant move in VFRSqrt SVE path We can perform the SQRT first and then broadcast 1.0 into the destination since all the intermediary work is done, meaning we don't have to worry about Dst and Vector aliasing one another.	2023-08-15 15:22:21 -04:00
Lioncache	6acce60855	ARMEmitter: Handle SVE load and broadcast element group These can be used to improve vbroadcast implementations from doing a mem load+dup in the non-GPR case into just directly loading into the destination.	2023-08-15 13:47:12 -04:00
Lioncache	81115f64f6	ARMEmitter: Handle SVE Store Multiple Structures (scalar plus scalar)	2023-08-15 10:18:37 -04:00
Lioncache	0176efa3bb	ARMEmitter: Handle SVE Load Multiple Structures (scalar plus scalar) group	2023-08-15 10:01:15 -04:00
Ryan Houdek	398e76be89	X86Tables: Fixes typo	2023-08-14 16:04:05 -07:00
Ryan Houdek	f248e7f3e7	Config: If DumpIR is enabled, default enable a passmanager option If DumpIR is enabled but the PassManagerDumpIR option isn't enabled then this currently does nothing. As a convenience, enable dumping the final optimized IR if an option hasn't been specified.	2023-08-14 12:29:56 -07:00
Ryan Houdek	e51606c669	Config: Fixes mixup in PassManagerDumpIR The opt and pass options were inverted in PassManager. Renames the enum to make this more clear.	2023-08-14 12:28:37 -07:00
Ryan Houdek	648d8aeb65	Config: Adds missing server option to DumpIR description This was accepted but I failed to describe it when added.	2023-08-14 12:22:35 -07:00
Ryan Houdek	112c463655	Config: Ensure OutputLog to server doesn't try to expand path "server" isn't a path, this was missed when it was added.	2023-08-14 12:20:58 -07:00
Alyssa Rosenzweig	7ecbbd6c04	ConstProp: Fix set-but-not-used mask variable I think this was the intended logic? Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2023-08-14 11:59:59 -04:00
Ryan Houdek	ef3887ca4f	X86Tables: Fixes typo in VEX table	2023-08-13 12:32:14 -07:00
Lioncache	8fd810c3c0	ARMEmitter: Migrate adr off SVEMemOperand We need to move the modifier enum out of the SVEMemOperand class since it's also used with adr. Plus, this can also be convenient not being tied down to the class itself. This also makes accessing modifiers less noisy, since the class	2023-08-11 22:17:46 -04:00
Lioncache	068db933bf	ARMEmitter: Handle SVE ADR	2023-08-11 19:50:21 -04:00
Lioncache	0bf74a1f3e	ARMEmitter: Use signed imm8 handler with dup_imm Lets us deduplicate the behavior used for dup and cpy.	2023-08-11 18:30:45 -04:00
Lioncache	78f06c7fcb	ARMEmitter: Handle SVE CPY (immediate) Also adds the relevant aliases.	2023-08-11 18:24:26 -04:00
Ryan Houdek	6d1fcfce09	Merge pull request #2877 from Sonicadvance1/classification_adds InstructionCountCI: Adds three more instruction tables	2023-08-11 15:09:29 -07:00
Ryan Houdek	5a0a6dd0ca	Merge pull request #2880 from lioncash/vfp ARMEmitter: Migrate off vixl float utils	2023-08-11 14:35:23 -07:00
Mai	da17e24996	Merge pull request #2879 from Sonicadvance1/fix_adcx FEXCore: Fixes bug with 32-bit adcx	2023-08-11 17:27:00 -04:00
Lioncache	aef0795dc8	ARMEmitter: Make FloatToEquivalentUInt a little more robust Rather than compare sizes, we should be comparing the types directly, prevents any shenanigans from happening if interface changes occur to Float16.	2023-08-11 17:16:14 -04:00
Lioncache	f63498a558	ARMEmitter: Migrate off vixl float utils	2023-08-11 17:12:34 -04:00
Ryan Houdek	9aa3fde174	FEXCore: Fixes bug with 32-bit adcx When a 32-bit adcx instruction was encountered, it was getting treated as a 16-bit adcx instruction instead. This is because of the 0x66 prefix required to handle this instruction. Adds a unit test to ensure it doesn't break again.	2023-08-11 14:09:31 -07:00
Ryan Houdek	8fce13386a	Merge pull request #2878 from lioncash/fcpy ARMEmitter: Handle SVE FCPY (predicated)	2023-08-11 13:58:31 -07:00
Lioncache	247c7ce784	ARMEmitter: Handle SVE FCPY (predicated) While we're at it, we can reduce our dependence on vixl's utils by implementing our own based off the pseudocode of VFPExpandImm.	2023-08-11 16:18:20 -04:00
Ryan Houdek	e8e52af2e8	CodeSizeValidation: Adds support for overriding CLZero support Vixl simulator by default doesn't support this.	2023-08-11 11:12:46 -07:00
Ryan Houdek	b8b4dd8008	FEXCore/Utils: Add the ability to write a fextl::string	2023-08-11 09:08:54 -07:00
Ryan Houdek	ec8855f8fb	Arm64: Consolidate simulator and disassembler code into Arm64Emitter This was confusingly split between Arm64Emitter, Arm64Dispatcher, and Arm64JIT. - Arm64JIT objects were unnecessary and free to be deleted. - Arm64Dispatcher simulator and decoder moved to Arm64Emitter - Arm64Emitter disassembler and decoder renamed - Dropped usage of the PrintDisassembler since it is hardcoded to go through a FILE* type - We instead want its output to go through LogMan, which means using a split Decoder+Disassembler object pair. - Can't reuse the object from the vixl simulator since the simulator registers the decoder as a visitor, causing the simulator to execute while disassembling instructions if reused. - Disassembly output for blocks and dispatcher now output through LogMan - Blocks wrapped in Begin/End text for tracking purposes for CI.	2023-08-11 08:05:10 -07:00
Ryan Houdek	969ad9b3b0	Merge pull request #2869 from Sonicadvance1/sve_128bit_ci Github: Adds a CI runner for 128-bit SVE testing	2023-08-11 07:37:47 -07:00
Ryan Houdek	5de7eeea20	Merge pull request #2876 from lioncash/comment ARMEmitter: Remove resolved TODO comment	2023-08-11 07:22:19 -07:00
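
The constant-loading strategy described in commit 23fd79a3b3 ("Arm64: Stop abusing orr in LoadConstant") can be sketched roughly as follows. This is an illustrative Python model only, not FEX's actual C++ code: the names `is_logical_imm` and `plan_load_constant`, the fixed destination register `x0`, and the textual instruction output are all assumptions made for the example. It handles 64-bit constants only and ignores other strategies a real emitter might use.

```python
MASK64 = (1 << 64) - 1


def is_logical_imm(value: int) -> bool:
    """Return True if value is encodable as an AArch64 64-bit logical immediate.

    A logical immediate is an element of 2, 4, 8, 16, 32 or 64 bits that is a
    rotated run of ones (neither all zeros nor all ones), replicated across
    the whole register.
    """
    value &= MASK64
    if value == 0 or value == MASK64:
        return False
    size = 2
    while size <= 64:
        emask = (1 << size) - 1
        elem = value & emask
        # The element must repeat across the full 64 bits.
        if all(((value >> i) & emask) == elem for i in range(0, 64, size)):
            for r in range(size):
                rot = ((elem >> r) | (elem << (size - r))) & emask
                # rot & (rot + 1) == 0 means rot is a contiguous run of low ones.
                if (rot & (rot + 1)) == 0:
                    return True
        size *= 2
    return False


def plan_load_constant(value: int) -> list[str]:
    """Pick an instruction sequence for loading a 64-bit constant into x0.

    Hypothetical model of the order of checks in the commit message:
    zero -> register move, multi-segment bitfield -> single orr,
    otherwise movz followed by movk for each remaining non-zero segment.
    """
    value &= MASK64
    if value == 0:
        # A move from the zero register, which most hardware renames away.
        return ["mov x0, xzr"]

    # Split into four 16-bit segments and keep the non-zero ones.
    nonzero = [
        (i, (value >> (16 * i)) & 0xFFFF)
        for i in range(4)
        if (value >> (16 * i)) & 0xFFFF
    ]

    # Only consider orr when more than one 16-bit segment is needed.
    if len(nonzero) > 1 and is_logical_imm(value):
        return [f"orr x0, xzr, #0x{value:x}"]

    insts = []
    for n, (i, seg) in enumerate(nonzero):
        op = "movz" if n == 0 else "movk"
        insts.append(f"{op} x0, #0x{seg:x}, lsl #{16 * i}")
    return insts
```

In this model a constant that fits in a single 16-bit segment always takes the `movz` path, which is the case the commit message says some CPU cores can handle as a zero-cycle move; `orr` is reserved for multi-segment bitfield patterns where it saves instructions.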

4056 Commits