FEX-Emu/FEX

mirror of https://github.com/FEX-Emu/FEX.git synced 2025-01-08 22:52:51 +00:00

Author	SHA1	Message	Date
Mai	af6a0be832	Merge pull request #3842 from Sonicadvance1/fix_f64_to_i32 VCVT{T,}PD2DQ fixes and optimization	2024-07-09 03:49:31 -04:00
Ryan Houdek	c9c163cd7b	unittests: Update vcv{t,tt}pd2dq tests to ensure upper bits of destination are cleared	2024-07-08 03:30:10 -07:00
Ryan Houdek	fa587398bd	unittests: Extends vinsert{i,f}128 tests for garbage data Just to ensure we don't hit an issue with masking the immediate bits. Fixes #3753	2024-07-07 02:16:21 -07:00
Ryan Houdek	8b9b1a90e4	unittests: Fixes typo in vpcmpgtw test	2024-07-01 14:42:23 -07:00
Ryan Houdek	babde31bf0	AVX128: Fixes vmovlhps We didn't have a unit test for this and we weren't implementing it at all. We treated it as vmovhps/vmovhpd accidentally. Once again caught by the libaom Intrinsics unit tests.	2024-07-01 13:54:11 -07:00
Ryan Houdek	aba7a3a830	AVX128: Fixes vblendps lower and upper selector	2024-06-27 17:20:39 -07:00
Ryan Houdek	9027d1eee7	AVX128: Fixes bug in vector immediate shift	2024-06-27 16:22:14 -07:00
Ryan Houdek	c6c147daf6	unittests: Updates vcvtps2ph test for failure case of writing too much memory.	2024-06-26 16:49:00 -07:00
Ryan Houdek	52e541d453	Unittests: Stop using AVX2 flag	2024-06-26 14:56:01 -07:00
Ryan Houdek	ba28e6f82e	unittests: Adds vcvtps2ph tests that use mxcsr	2024-06-26 14:08:20 -07:00
Ryan Houdek	94fd100fc7	Merge pull request #3719 from lioncash/f16c OpcodeDispatcher: Handle F16C operations	2024-06-26 12:12:13 -07:00
Lioncache	cd5a809ec9	OpcodeDispatcher: Handle VCVTPS2PH	2024-06-26 15:05:03 -04:00
Lioncache	045a8efbeb	OpcodeDispatcher: Handle VCVTPH2PS Fairly straightforward, since we already have handling for half-float conversions.	2024-06-26 15:05:00 -04:00
Ryan Houdek	54a1f7d833	Merge pull request #3764 from Sonicadvance1/rorx_masking BMI2: Ensure rorx immediate masks by operation size correctly.	2024-06-26 11:52:47 -07:00
Ryan Houdek	a515061465	BMI2: Ensure rorx immediate masks by operation size correctly.	2024-06-26 11:11:37 -07:00
Ryan Houdek	122ae5b710	unittests: Adds FMA3 unittests	2024-06-25 11:37:18 -07:00
Ryan Houdek	41923bac99	OpcodeDispatcher: Fixes PCMUL with weird selectors and zero-extend We had a bug where we weren't correctly ignoring the non-used bits in the selector. This was causing an assert in the ARM backend.	2024-06-25 12:54:03 -04:00
Alyssa Rosenzweig	77aaa9af4d	Merge pull request #3748 from Sonicadvance1/avx_15 AVX128: More instructions Part 4	2024-06-25 12:39:48 -04:00
Ryan Houdek	48e7aae38f	unittests: Adds support for 256-bit vpclmulqdq It's easy because the test was already written for this in mind.	2024-06-25 10:03:33 -04:00
Ryan Houdek	3a310b8815	Merge pull request #3756 from Sonicadvance1/fix_vmovhlps Fix VMOVLHPS instruction	2024-06-24 19:14:56 -07:00
Ryan Houdek	bd24ebc96a	unittests: Adds VMOVHLPS unit test A bit confusing because the instruction encoding is the same between VMOVHLPS and VMOVLPS so this unittest was missed. Implement the test to ensure it stays working	2024-06-24 17:22:55 -07:00
Ryan Houdek	99b2018d0e	unittests: Extend vmovntpd test	2024-06-24 16:32:13 -07:00
Ryan Houdek	f0d9c8c10a	AVX128: Fix vmovntdqa failing to zero upper 128-bits	2024-06-24 16:32:09 -07:00
Ryan Houdek	6941a59223	unittests: Split up vtestps unittest to accumulate flags in independent registers. Makes it easier to see what is failing on the 128-bit side versus 256-bit side.	2024-06-21 00:45:30 -07:00
Ryan Houdek	8fb801069f	unittests: Adds new VAES tests	2024-06-19 05:51:47 -07:00
Mikhail Nitenko	99a43283be	unittests/bextr: add SrcSize tests dougallj mentioned that adding these tests might expose a bug in bextr. Since bextr implementation was changed apparently it now works correctly, that's good.	2024-06-10 05:45:12 +00:00
Ryan Houdek	0f26bc20a3	unittests/ASM: Adds palignr tests for zero immediate These effectively turn in to moves.	2023-10-27 15:07:52 -07:00
Lioncache	24f2796141	VectorOps: Handle SVE VFCADD a little better If no registers alias, then we can move the first source directly into the destination and then perform the FCADD operation as opposed to using a temporary.	2023-10-19 14:48:46 +02:00
Lioncache	1f6c6345d9	VectorOps: Handle SVE VURAvg a little better We can perform fewer moves by checking for scenarios where aliasing occurs. Since addition is commutative (usually, general-case anyway), order of inputs doesn't strictly matter here.	2023-10-19 12:14:12 +02:00
Lioncache	3d23cd5765	VectorOps: Handle SVE VFDiv a little better In the event no source vectors alias the destination, we can just move the first source vector into it and then perform the divide without needing to move afterward.	2023-10-19 11:45:35 +02:00
Lioncache	39e658f02a	VectorOps: Handle more VUMin SVE cases better We can avoid needing to use movprfx here by moving directly into the destination when possible and just doing the UMIN directly	2023-10-18 18:48:13 +02:00
Lioncache	e89dd27f2a	VectorOps: Handle more VSMin SVE cases better We can avoid needing to use movprfx here by moving directly into the destination when possible and just doing the SMIN directly.	2023-10-18 18:48:13 +02:00
Lioncache	f85fae0041	VectorOps: Handle more VUMax SVE cases better We can avoid needing to use movprfx here by moving directly into the destination when possible and just doing the UMAX directly. Also expands the unsigned max tests to test values with the sign bit set to ensure all behavior is caught.	2023-10-18 18:48:12 +02:00
Lioncache	65eec673fc	VectorOps: Handle more VSMax SVE cases better Since SMAX performs a comparison and returns the max value regardless of how the operands are provided, we can check for when the second input aliases the destination.	2023-10-18 18:48:03 +02:00
Mai	ab4642af38	Merge pull request #3167 from Sonicadvance1/gatherqdps unittests/ASM: Implements tests for vpgatherqd/vgatherqps	2023-09-29 12:16:43 -04:00
Mai	d94e5ce7f4	Merge pull request #3168 from Sonicadvance1/gatherqqpd unittests/ASM: Implements tests for vpgatherqq/vgatherqpd	2023-09-29 12:16:12 -04:00
Ryan Houdek	a21def7d74	unittests/ASM: Implements tests for vpgatherqq/vgatherqpd Similar to previous tests, vpgatherqq and vgatherqpd are equivalent instructions. So the tests are the same with the mnemonic changed. This adds tests for an additional two sets of instructions, getting us full coverage of all eight instructions if we include the tests from PR #3167 and #3166. Tests the same things as described in #3165. In addition, since these tests use 64-bit indices for address calculation, we can easily generate an index vector that tests overflow. So every test at every displacement ALSO gains an additional overflow test to ensure correct behaviour around pointer overflow calculation.	2023-09-29 08:04:47 -07:00
Ryan Houdek	0d8d5444a4	unittests/ASM: Implements tests for vpgatherqd/vgatherqps Similar to previous tests, vpgatherqd and vgatherqps are equivalent instructions. So the tests are the same with the mnemonic changed. This adds tests for an additional two sets of instructions, getting us up to six total of the eight if we include the tests from #3166. Tests the same things as described in #3165. In addition, since these tests use 64-bit indices for address calculation, we can easily generate an index vector that tests overflow. So every test at every displacement ALSO gains an additional overflow test to ensure correct behaviour around pointer overflow calculation.	2023-09-29 07:20:07 -07:00
Ryan Houdek	eedfad5036	unittests/ASM: Implements tests for vpgatherdq/vgatherpq Just like the previous tests, vpgatherdq and vgatherpq are equivalent instructions. So the tests are the same except for the instruction mnemonic again. This adds unittests for two more of the eight gather instructions. Getting us up to testing four in total. Specifically this adds tests for 32-bit indices while loading 64-bit element instructions. Same thing as PR #3165 for what it tests versus doesn't.	2023-09-28 22:49:03 -07:00
Ryan Houdek	9a01b440e3	unittests/ASM: Implements tests for vpgatherdd/vgatherps vpgatherdd and vgatherps are effectively the same instructions, so the tests are the same except for the instruction mnemonic. This adds unit tests for two of the eight gather instructions. Specifically this adds tests for the 32-bit indices loading 32-bit elements instructions. What it tests: - Tests all displacement scales - Tests multiple mask arrangements - Ensures the mask register is zero'd after the instruction What it doesn't test: - Doesn't test address size calculation overflow - Only would happen on 32-bit with 32-bit indices, or /really/ high base addresses - The instruction should behave as a mask to the address size - Effectively behaves like `(uint64_t)(base + index << ilog2(scale))` - Better idea is to just not expose AVX to 32-bit applications - Doesn't test VSIB immediate displacement - This just ends up being base_addr + imm so it isn't too interesting - We can add more tests in the future if we think we messed that up - Doesn't test partial fault behaviour - Because that's a nightmare. Specifically keeps each instruction test small and isolated so if a single register fails it is very easy to nail down which operation did it. I know some of our ASM tests do a chunk of work and spit out a result at the end which can be difficult to debug in some cases. Didn't want to do that which is why the tests are spread out across 16 files for this single class of instructions.	2023-09-28 19:58:34 -07:00
Lioncache	b0c8ff0ea6	Arm64/ALUOps: Remove spills in PEXT Reduces the number of emitted instructions for a corresponding PEXT instruction. We no longer spill for this IR op.	2023-09-15 19:39:51 -04:00
Lioncache	4a37ea4819	OpcodeDispatcher: Handle RORX corner cases better There are a few cases where we were emitting code when we didn't really need to, or could emit less.	2023-09-15 17:36:36 -04:00
Lioncache	bbed4d73ed	OpcodeDispatcher: Improve VPERMQ/VPERMPD broadcast cases For a bunch of cases that act as broadcasts (where all indices in the imm8 specify the same element), we can use VDupElement here rather than iterating through.	2023-08-19 01:08:30 -04:00
Lioncache	9e54ec2724	OpcodeDispatcher: Improve {V}PSRLDQ shift by 0 While it would be bizarre if this actually occurred frequently in practice, we can still tune it so there's no subpar assembly output in the cases it actually does happen.	2023-08-17 19:33:09 -04:00
Lioncache	f7c663240e	OpcodeDispatcher: Handle PCMPESTRM/VPCMPESTRM ...and with that all of the SSE4.2 string instructions are implemented now	2023-05-17 00:21:55 -04:00
Lioncache	82b4aef30d	OpcodeDispatcher: Handle PCMPISTRM/VPCMPISTRM	2023-05-16 22:59:54 -04:00
Ryan Houdek	88247141d7	Merge pull request #2649 from lioncash/istri OpcodeDispatcher: Handle PCMPISTRI/VPCMPISTRI	2023-05-02 14:35:47 -07:00
Lioncache	f502154f96	OpcodeDispatcher: Handle VPCMPISTRI	2023-05-02 14:00:05 -04:00
Lioncache	8369f9c25b	unittests: Add missing VPMASKMOVQ store test Realized I forgot to add this in the commit that added VPMASKMOVD/VPMASKMOVQ support.	2023-05-02 11:13:44 -04:00
Lioncache	c94721a04b	OpcodeDispatcher: Handle VPMASKMOVD/VPMASKMOVQ We can reuse the same helper we have for handling VMASKMOVPD and VMASKMOVPS, though we need to move some handling around to account for the fact that VPMASKMOVD and VPMASKMOVQ 'hijack' the REX.W bit to signify the element size of the operation.	2023-04-24 10:50:11 -04:00
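
Note on the gather address calculation mentioned in commit 9a01b440e3: the message describes the per-element address as `(uint64_t)(base + index << ilog2(scale))`, wrapped to the address size. The following is a minimal Python sketch of that arithmetic, not FEX code. The function name `gather_address` and the `address_bits` parameter are hypothetical, and the shift is parenthesized on the assumption that the intended meaning is base + (index << ilog2(scale)).

```python
def gather_address(base, index, scale, address_bits=64):
    """Effective address of one gather element, wrapped to the address size.

    Hypothetical helper for illustration only; it is not part of FEX.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("VSIB scale must be 1, 2, 4, or 8")
    shift = scale.bit_length() - 1        # ilog2(scale) for a power of two
    mask = (1 << address_bits) - 1        # wrap like a fixed-width pointer
    # Python integers are unbounded, so the final mask models the overflow
    # wrap-around that the 64-bit index tests exercise.
    return (base + (index << shift)) & mask


if __name__ == "__main__":
    print(hex(gather_address(0x1000, 2, 4)))
    print(hex(gather_address(0xFFFF_FFFF_FFFF_FFF8, 1, 8)))
```

A negative index is treated as a signed offset here, so `gather_address(0x1000, -1, 8)` lands 8 bytes below the base.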

358 Commits
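
Note on the VPERMQ/VPERMPD broadcast case from commit bbed4d73ed: the message says that when all indices in the imm8 select the same element, the permute acts as a broadcast. The following Python sketch shows how such an immediate could be recognized. It models only the instruction's selector layout (four 2-bit fields, lowest field for the lowest destination element); the function names are hypothetical and this is not how FEX's OpcodeDispatcher is written.

```python
def vpermq_selectors(imm8):
    """Split an imm8 into the four 2-bit source selectors, lowest element first."""
    imm8 &= 0xFF
    return [(imm8 >> (2 * i)) & 0b11 for i in range(4)]


def broadcast_element(imm8):
    """Return the selected element if the imm8 is a broadcast, else None."""
    selectors = vpermq_selectors(imm8)
    first = selectors[0]
    return first if all(s == first for s in selectors) else None


def vpermq(src, imm8):
    """Reference permute over a list of four 64-bit elements."""
    return [src[s] for s in vpermq_selectors(imm8)]


if __name__ == "__main__":
    print(broadcast_element(0x55))
    print(vpermq([10, 20, 30, 40], 0x1B))
```

Only four immediates (0x00, 0x55, 0xAA, 0xFF) satisfy the broadcast check, one for each source element.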