Also fixes a bug where the AVX128 implementation was failing to zero the
upper bits of the destination register, which the updated unit tests now
check against.
Fixes a minor precision issue that was reported in #2995. We still don't
return correct values on overflow: x86 always returns the maximum negative
int32_t on overflow, while ARM returns the maximum negative or positive
value depending on the sign of the double.
The codepath from #3826 was only targeting 256-bit sized operations.
This missed the vpgatherdq/vgatherdpd 128-bit operations. By extending
the codepath to understand 128-bit operations, we now hit these
instruction variants.
With this PR, we now have SVE128 codepaths that handle ALL variants of
x86 gather instructions! There are zero ASIMD fallbacks used in this
case!
Of course, depending on the instruction, the performance still leaves a
lot to be desired, and there is no way to emulate x86 TSO behaviour
without an ASIMD fallback, which we will likely need to add at some
point.
Based on #3836 until that is merged.
With the introduction of the wide gathers in #3828, new avenues have
opened for optimizing cases that would typically fall back to ASIMD. In
the cases where 32-bit SVE scaling doesn't fit, we can instead sign-extend
the elements into double-width address registers.
This then feeds naturally into the SVE path, even though we end up
needing to allocate 512 bits worth of address registers. This still ends
up being significantly better than the ASIMD path.
Relies on #3828 being merged first.
Fixes #3829
SVE has a special version of its gather instruction that gets similar
behaviour to x86's VGATHERQPS/VPGATHERQD instructions.
The quirk of these instructions, which the previous SVE implementation
didn't handle and which required an ASIMD fallback, is that most gather
instructions require the data element size and the address element size
to match. These x86 instructions use a 64-bit address size while loading
32-bit elements. This matches this specific variant of the SVE
instruction, but the data is zero-extended once loaded, requiring us to
shuffle the data after it is loaded.
This isn't the worst, but the implementation is different enough that
stuffing it into the other gather load path would cause headaches.
Basically gets 32 instruction variants to use the SVE version!
Fixes #3827
If the destination isn't any of the incoming sources, then we can avoid
one of the moves at the end. This works around half of the problem
described in #3794, but doesn't solve the entire problem.
Solving the other half of the moving problem means solving the SRA
allocation problem for the temporary register used by addsub/subadd, so
that it gets allocated for both the FMA operation and the XOR operation.