We need to move the modifier enum out of the SVEMemOperand class
since it's also used with adr. It's also convenient for the enum not
to be tied to the class itself.
This also makes accessing modifiers less noisy, since the class name
no longer needs to prefix every use.
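A minimal sketch of the shape of the change (the names here are
illustrative, not the exact vixl declarations):

  // Before: every caller must spell out the owning class.
  class SVEMemOperand {
   public:
    enum Modifier { kNoModifier, kLSL, kMulVL };
    // ...
  };
  // Usage: SVEMemOperand::kLSL

  // After: the enum lives at namespace scope and can be shared
  // with adr without dragging in SVEMemOperand.
  enum SVEOffsetModifier { kNoModifier, kLSL, kMulVL };
  class SVEMemOperand { /* now takes an SVEOffsetModifier */ };
  // Usage: kLSL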
When a 32-bit adcx instruction was encountered, it was getting treated
as a 16-bit adcx instruction instead. This is because the 0x66 prefix
is a mandatory part of this instruction's encoding rather than an
operand-size override.
Adds a unit test to ensure it doesn't break again.
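For reference, the encodings involved (from the x86 manual; shown as
comments only):

  // ADCX r32, r/m32:  66 0F 38 F6 /r        (0x66 is mandatory and
  //                                           selects the opcode)
  // ADCX r64, r/m64:  66 REX.W 0F 38 F6 /r
  // Contrast ADD ax, imm16: 66 05 iw        (here 0x66 really is an
  //                                           operand-size override)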
This was confusingly split between Arm64Emitter, Arm64Dispatcher, and
Arm64JIT.
- Arm64JIT objects were unnecessary and free to be deleted.
- Arm64Dispatcher simulator and decoder moved to Arm64Emitter
- Arm64Emitter disassembler and decoder renamed
- Dropped usage of the PrintDisassembler since it is hardcoded to write
  through a FILE* handle.
- We instead want its output to go through LogMan, which means using a
  split Decoder+Disassembler object pair (see the sketch below).
- Can't reuse the decoder from the vixl simulator since the simulator
  registers itself as a visitor on it, which would cause the simulator
  to execute instructions while they are being disassembled.
- Disassembly output for blocks and the dispatcher now goes through
  LogMan.
- Blocks are wrapped in Begin/End text so CI can track them.
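A minimal sketch of the split pair using vixl's public API (exact
method names can vary by vixl version, and the LogMan call is
illustrative):

  #include "aarch64/decoder-aarch64.h"
  #include "aarch64/disasm-aarch64.h"

  void DisassembleBlock(const vixl::aarch64::Instruction* Begin,
                        const vixl::aarch64::Instruction* End) {
    vixl::aarch64::Decoder Decoder;
    vixl::aarch64::Disassembler Disasm;
    // Only the disassembler visits instructions; no simulator is
    // attached, so nothing executes as a side effect of decoding.
    Decoder.AppendVisitor(&Disasm);

    for (const auto* Instr = Begin; Instr < End;
         Instr = Instr->GetNextInstruction()) {
      Decoder.Decode(Instr);
      // GetOutput() holds the text of the last decoded instruction;
      // forward it to LogMan instead of a FILE*.
      LogMan::Msg::DFmt("{}", Disasm.GetOutput());
    }
  }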
This was causing test failures locally where some values were set to
uninitialized data. Ensure that the gregs, YMM, and MMX registers are
all zero-initialized.
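A sketch of the idea (the state struct and field names are
illustrative, not FEX's actual layout):

  #include <cstdint>
  #include <cstring>

  struct HarnessState {
    uint64_t GRegs[16];     // general purpose registers
    uint8_t YMM[16][32];    // 256-bit AVX registers
    uint64_t MMX[8];        // MMX registers
  };

  HarnessState State{};     // value-initialization zeroes every member
  // Or, for an already-allocated block:
  // memset(&State, 0, sizeof(State));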
Requires the IR headerop to house the number of host instructions this
code translates to, for the stats.
Fixes compiling with disassembly enabled; this will be used with the
instruction count CI.
This is incredibly useful and I find myself hacking this feature in
every time I am optimizing IR. Adds a new configuration option which
allows dumping IR at various times:
- Before any optimization passes have run
- After all optimization passes have run
- Before and after each IRPass, to see which pass is breaking something
Needs #2864 merged first
This is a /very/ simple optimization purely because of a choice that ARM
made with SVE in their latest Cortex cores.
Cortex-A715:
- sxtl/sxtl2/uxtl/uxtl2 can execute 1 instruction per cycle.
- sunpklo/sunpkhi/uunpklo/uunpkhi can execute 2 instructions per cycle.
Cortex-X3:
- sxtl/sxtl2/uxtl/uxtl2 can execute 2 instructions per cycle.
- sunpklo/sunpkhi/uunpklo/uunpkhi can execute 4 instructions per cycle.
This is fairly quirky since the optimization only works on SVE systems
with a 128-bit vector length. Since that covers all of the current
consumer platforms, it works out in practice.
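A sketch of the substitution at a 128-bit vector length (vixl-style
emitter calls; register choices are illustrative):

  // NEON widen of the low half:   sxtl  v0.8h, v1.8b
  // SVE equivalent at VL=128:     sunpklo z0.h, z1.b
  sunpklo(z0.VnH(), z1.VnB());

  // NEON widen of the high half:  sxtl2 v0.8h, v1.16b
  // SVE equivalent at VL=128:     sunpkhi z0.h, z1.b
  sunpkhi(z0.VnH(), z1.VnB());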
We need to know the difference between the host supporting SVE with
128-bit registers versus 256-bit registers. Ensure we track that
difference.
No functional change here.
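One way to probe this on Linux (a sketch; FEX's actual detection may
differ):

  #include <sys/prctl.h>

  // PR_SVE_GET_VL returns the current SVE vector length in bytes,
  // ORed with flag bits that must be masked off.
  bool HostHasSVE256() {
    int Ret = prctl(PR_SVE_GET_VL);
    if (Ret < 0) {
      return false;  // SVE not supported
    }
    int VLBytes = Ret & PR_SVE_VL_LEN_MASK;
    return VLBytes >= 32;  // 32 bytes == 256-bit registers
  }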
This allows us to both enable and disable features regardless of what
the host supports. This replaces the old `EnableAVX` option.
Unlike the old EnableAVX option, which was a binary option that could
only disable, each of these options is technically a trinary state.
Not setting an option gives you the default host detection, while
explicitly enabling or disabling toggles the feature regardless of what
the host supports.
This will be used by the instruction count CI in the future.
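A sketch of the trinary behaviour (the option plumbing is
illustrative):

  #include <optional>

  // Unset -> follow host detection; set -> force on or off.
  bool ResolveFeature(std::optional<bool> Override, bool HostSupports) {
    return Override.value_or(HostSupports);
  }

  // ResolveFeature(std::nullopt, true) -> true  (host detection)
  // ResolveFeature(false, true)        -> false (forced off)
  // ResolveFeature(true, false)        -> true  (forced on)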
This was a debug LoadConstant that would load the entry into a temporary
register to make it easier to see which RIP a block belonged to.
This was implemented when FEX stopped storing the RIP in the CPU state
for every block. It is no longer necessary since FEX stores the RIP in
the tail data of the block.
This was affecting the instruction count CI when in a debug build.
I use this locally when looking for optimization opportunities in the
JIT.
The instruction count CI will use this as well in the future.
Just get it upstreamed right away.
`eor <reg>, <reg>, <reg>` is not the optimal way to zero a vector
register on ARM CPUs. Instead we should move from a constant or the
zero register to take advantage of zero-latency moves.
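A sketch of the replacement (vixl-style emitter calls; register
choices are illustrative):

  // Before: data-processing idiom, not handled as a move.
  //   eor v0.16b, v0.16b, v0.16b
  eor(v0.V16B(), v0.V16B(), v0.V16B());

  // After: an immediate move, eligible for zero-latency handling in
  // the rename stage on recent cores.
  //   movi v0.2d, #0
  movi(v0.V2D(), 0);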
This needs to default to 64-bit addresses. It was previously defaulting
to 32-bit, which meant the destination address was getting truncated.
In a 32-bit process the address is still 32-bit.
I'm actually surprised this hasn't caused spurious SIGSEGVs before this
point.
Adds a 32-bit test to ensure that side is tested as well.
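A sketch of the failure mode (types and names are illustrative):

  #include <cstdint>

  uint64_t ResolveAddress(uint64_t Address, bool Is64BitProcess) {
    if (!Is64BitProcess) {
      // 32-bit processes legitimately truncate to 32 bits.
      return static_cast<uint32_t>(Address);
    }
    // Bug: defaulting the operand size to 32-bit here would also
    // truncate 64-bit addresses, dropping the high 32 bits.
    return Address;
  }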
This is more obvious. llvm-mca says TST is half the cycle count of CMN
for whatever target it defaults to. dougallj's reference shows both as
the same performance.
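A sketch of the kind of substitution, assuming a compare against zero
where the N and Z flags are what's consumed (vixl-style calls):

  // Before: compare-negative against zero; flags come from x0 + 0.
  cmn(x0, 0);
  // After: more obviously a test of x0 itself; N and Z match the
  // cmn form for a zero operand.
  tst(x0, x0);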
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>