Implements CI for tracking instruction counts for generated blocks of
code when translating x86 to ARM64 assembly.
This will eventually encompass every instruction in our instruction
tables, similar to how our assembly tests try to cover everything in
those tables.
Incidentally, the data for this CI is generated using our assembly
tests. Executing a suite of instructions with disassembly and
instruction stats enabled produces the stats that can then be added to a
JSON file.
The current implementation only covers the SecondGroup table of
instructions because it is a relatively small table and has known
inefficiencies in its instruction implementations. Once this is merged I
will be adding more tables of instructions to additional JSON files for
testing.
These JSON files will support adjusting CPU features regardless of the
host's features, so the CI can test implementations that depend on
different CPU features. This will let us test things like one
instruction having different "optimal" implementations depending on
whether the host supports SVE128, SVE256, SVEI8MM, etc.
This initial instruction auditing is what found the bug in our vector
shift instructions when the shift size is zero. Inspecting the result of
the CI run shows that these instructions still aren't "optimal" because
they are doing loads and stores that can be eliminated.
The "Optimal" in the JSON is purely for human readable and grepping
ability to see what is optimal versus not. Same with the "Comment"
section.
According to my auditing spreadsheet, the total number of instructions
that will end up in these JSON files is about 1000, but we will likely
end up with more since there will be edge cases that can be more optimal
depending on their arguments.
This was confusingly split between Arm64Emitter, Arm64Dispatcher, and
Arm64JIT.
- Arm64JIT objects were unnecessary and free to be deleted.
- Arm64Dispatcher simulator and decoder moved to Arm64Emitter
- Arm64Emitter disassembler and decoder renamed
- Dropped usage of the PrintDisassembler since it is hardcoded to go
through a FILE* type
- We instead want its output to go through LogMan, which means using a
split Decoder+Disassembler object pair (see the sketch after this list).
- Can't reuse the decoder owned by the vixl simulator, since the
simulator registers itself as a visitor on that decoder; reusing it
would execute instructions while disassembling them.
- Disassembly output for blocks and the dispatcher now goes through LogMan.
- Blocks are wrapped in Begin/End text for tracking purposes in CI.
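As a rough sketch of what that Decoder+Disassembler pair looks like
(this uses vixl's aarch64 Decoder and Disassembler; the LogMan helper
names and the header path are from memory and may differ from what FEX
actually uses):

```cpp
#include <cstddef>
#include <cstdint>

#include "aarch64/decoder-aarch64.h"
#include "aarch64/disasm-aarch64.h"
#include <FEXCore/Utils/LogManager.h>  // FEX's LogMan (path may differ)

// Walk a generated block and push each disassembled instruction through
// the logger instead of a FILE*-bound PrintDisassembler.
void DisassembleBlock(const uint8_t* Code, size_t Size) {
  vixl::aarch64::Decoder Decoder;
  vixl::aarch64::Disassembler Disasm;
  // The Disassembler only formats text, so nothing executes while
  // decoding, unlike reusing the simulator's decoder (the simulator
  // registers itself as a visitor on its own decoder).
  Decoder.AppendVisitor(&Disasm);

  LogMan::Msg::IFmt("Begin Block");  // Begin/End markers for the CI to grep.
  for (size_t i = 0; i < Size; i += vixl::aarch64::kInstructionSize) {
    auto* Instr = reinterpret_cast<const vixl::aarch64::Instruction*>(Code + i);
    Decoder.Decode(Instr);
    LogMan::Msg::IFmt("{}", Disasm.GetOutput());
  }
  LogMan::Msg::IFmt("End Block");
}
```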
We don't currently have a device in CI that can run SVE with 128-bit
wide registers. Until we have such a device, make sure the vixl
simulator also runs the ASM tests at this width.
This was causing test failures locally where some values were set to
uninitialized data. Ensure that gregs, YMM, and MMX registers are all
zero-initialized.
Requires the IR headerop to house the number of host instructions this
code translates to, which is needed for the stats.
Fixes compiling with disassembly enabled; this will be used with the
instruction count CI.
This is incredibly useful and I find myself hacking this feature in
every time I am optimizing IR. Adds a new configuration option which
allows dumping IR at various times:
- Before any optimization passes have run
- After all optimization passes have run
- Before and after each IRPass, to see what is breaking something
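Roughly, the dump points sit around the pass pipeline like this (the
names and option values here are purely illustrative, not the actual
configuration plumbing):

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of where the dump points sit in the pass pipeline.
enum class DumpIRPhase { Off, BeforeOpt, AfterOpt, AroundEachPass };

struct IRPass {
  std::string Name;
  std::function<void()> Run;
};

void RunPasses(std::vector<IRPass>& Passes, DumpIRPhase Phase,
               const std::function<void(const std::string&)>& DumpIR) {
  if (Phase == DumpIRPhase::BeforeOpt) DumpIR("before any optimization");
  for (auto& Pass : Passes) {
    // AroundEachPass brackets every pass to bisect which one breaks the IR.
    if (Phase == DumpIRPhase::AroundEachPass) DumpIR("before " + Pass.Name);
    Pass.Run();
    if (Phase == DumpIRPhase::AroundEachPass) DumpIR("after " + Pass.Name);
  }
  if (Phase == DumpIRPhase::AfterOpt) DumpIR("after all optimization passes");
}
```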
Needs #2864 merged first
This is a /very/ simple optimization, purely because of a choice that ARM
made with SVE in the latest Cortex cores.
Cortex-A715:
- sxtl/sxtl2/uxtl/uxtl2 can execute 1 instruction per cycle.
- sunpklo/sunpkhi/uunpklo/uunpkhi can execute 2 instructions per cycle.
Cortex-X3:
- sxtl/sxtl2/uxtl/uxtl2 can execute 2 instructions per cycle.
- sunpklo/sunpkhi/uunpklo/uunpkhi can execute 4 instructions per cycle.
This is fairly quirky since the optimization only works on SVE systems
with a 128-bit vector length, but since that covers all of the current
consumer platforms, it will work.
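As a rough sketch of the substitution using vixl MacroAssembler
mnemonics (illustrative only, not FEX's actual emitter code; it assumes
the assembler has SVE enabled in its CPU features). At a 128-bit vector
length the unpack instructions see exactly the same register contents as
the NEON widening moves:

```cpp
#include "aarch64/macro-assembler-aarch64.h"

using namespace vixl::aarch64;

// Widen 16 signed bytes in v0/z0 to 16 signed halfwords in v2/v3 (z2/z3).
void WidenSignedInt8ToInt16(MacroAssembler& masm, bool HasSVE128) {
  if (HasSVE128) {
    // Higher-throughput path on Cortex-A715/Cortex-X3.
    masm.Sunpklo(z2.VnH(), z0.VnB());  // stands in for: sxtl  v2.8h, v0.8b
    masm.Sunpkhi(z3.VnH(), z0.VnB());  // stands in for: sxtl2 v3.8h, v0.16b
  } else {
    masm.Sxtl(v2.V8H(), v0.V8B());
    masm.Sxtl2(v3.V8H(), v0.V16B());
  }
}
```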
We need to know the difference between a host supporting SVE with
128-bit registers versus 256-bit registers. Ensure we can detect the
difference.
No functional change here.
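For illustration, one way to tell the two apart on Linux is to query the
thread's SVE vector length through prctl; this is just a sketch and not
necessarily how FEX performs the detection:

```cpp
#include <sys/prctl.h>

// Returns the current SVE vector length in bits, or 0 if SVE is
// unavailable. PR_SVE_GET_VL reports the length in bytes in the low bits.
static int SVEVectorLengthBits() {
  int Ret = prctl(PR_SVE_GET_VL);
  if (Ret < 0) {
    return 0;  // SVE not supported (or too old a kernel).
  }
  return (Ret & PR_SVE_VL_LEN_MASK) * 8;
}

// Usage sketch:
//   bool SupportsSVE128 = SVEVectorLengthBits() >= 128;
//   bool SupportsSVE256 = SVEVectorLengthBits() >= 256;
```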
This allows us to both enable and disable features regardless of what
the host supports. This replaces the old `EnableAVX` option.
Unlike the old EnableAVX option, which was a binary option that could
only disable the feature, each of these options is technically a trinary
state.
Not setting an option gives you the default host detection, while
explicitly enabling or disabling will toggle the option regardless of
what the host supports.
This will be used by the instruction count CI in the future.
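Conceptually the trinary behaves like this (hypothetical names, not the
actual configuration plumbing):

```cpp
#include <optional>

// std::nullopt = use host detection, true/false = force on/off.
struct FeatureOverrides {
  std::optional<bool> EnableSVE;
  std::optional<bool> EnableAVX;
};

// Explicit enable/disable wins regardless of what the host supports;
// otherwise fall back to host detection.
bool ResolveFeature(std::optional<bool> Override, bool HostSupports) {
  return Override.value_or(HostSupports);
}

// Usage sketch: Features.AVX = ResolveFeature(Overrides.EnableAVX, HostHasAVX);
```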
Moves the dummy handlers over to this library. This will end up getting
used for more than the mingw test harness runner once the instruction
count CI is operational.
This was a debug LoadConstant that would load the entry into a temporary
register to make it easier to see what RIP a block was in.
This was implemented when FEX stopped storing the RIP in the CPU state
for every block. It is now no longer necessary since FEX stores the RIP
in the tail data of the block.
This was affecting the instruction count CI when in a debug build.
I use this locally when looking for optimization opportunities in the
JIT.
The instruction count CI in the future will use this as well.
Just get it upstreamed right away.
`eor <reg>, <reg>, <reg>` is not the optimal way to zero a vector
register on ARM CPUs. Instead we should move from a constant or the zero
register to take advantage of zero-latency moves.
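For example (vixl MacroAssembler mnemonics, illustrative only, not FEX's
actual emitter code):

```cpp
#include "aarch64/macro-assembler-aarch64.h"

using namespace vixl::aarch64;

// Preferred zeroing idioms instead of: eor v0.16b, v0.16b, v0.16b
void ZeroRegisters(MacroAssembler& masm) {
  masm.Movi(v0.V2D(), 0);  // Zero a vector register with a constant move.
  masm.Mov(x0, xzr);       // Zero a GPR by moving from the zero register.
}
```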