Similar to the previous tests, vpgatherqq and vgatherqpd are equivalent
instructions, so the tests are identical apart from the mnemonic.
This adds tests for an additional two sets of instructions, getting us
full coverage of all eight instructions when combined with the tests from
PRs #3167 and #3166.
Tests the same things as described in #3165.
In addition, since these tests use 64-bit indices for address
calculation, we can easily generate an index vector that tests
overflow. So every test at every displacement ALSO gains an additional
overflow test to ensure correct behaviour around pointer overflow.
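For illustration, this is the kind of wraparound those overflow tests exercise; the values here are made up, not the actual test constants:

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
  // Per-element gather address: base + index * scale, computed modulo 2^64.
  uint64_t base  = 0x10000;                        // hypothetical buffer address
  uint64_t index = UINT64_C(0xFFFFFFFFFFFFFFF8);   // i.e. -8 as a 64-bit index
  uint64_t scale = 1;
  uint64_t addr  = base + index * scale;           // wraps around to 0xFFF8
  printf("0x%" PRIx64 "\n", addr);                 // lands BELOW base
  return 0;
}
```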
vpgatherdd and vgatherps are effectively the same instructions, so the
tests are the same except for the instruction mnemonic.
This adds unit tests for two of the eight gather instructions.
Specifically, it adds tests for the instructions that use 32-bit indices
to load 32-bit elements.
What it tests:
- Tests all displacement scales
- Tests multiple mask arrangements
- Ensures the mask register is zeroed after the instruction
What it doesn't test:
- Doesn't test address size calculation overflow
- Would only happen on 32-bit with 32-bit indices, or /really/ high
base addresses
- The instruction should behave as a mask to the address size
- Effectively behaves like `(uint64_t)(base + (index << ilog2(scale)))`; see the sketch after this list
- Better idea is to just not expose AVX to 32-bit applications
- Doesn't test VSIB immediate displacement
- This just ends up being base_addr + imm so it isn't too interesting
- We can add more tests in the future if we think we messed that up
- Doesn't test partial fault behaviour
- Because that's a nightmare.
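For the untested 32-bit address size case, a minimal sketch of the masking behaviour described above (the function name is made up, and ilog2 is spelled as count-trailing-zeros since scale is always a power of two):

```cpp
#include <cstdint>

// With a 32-bit address size, the per-element address calculation would
// effectively truncate to 32 bits rather than 64:
uint64_t GatherElementAddress32(uint32_t base, uint32_t index, uint32_t scale) {
  // scale is 1/2/4/8, so the multiply is a shift by ilog2(scale)
  return static_cast<uint32_t>(base + (index << __builtin_ctz(scale)));
}
```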
Specifically keeps each instruction test small and isolated, so if a
single register fails it is very easy to nail down which operation did
it.
I know some of our ASM tests do a chunk of work and spit out a result at
the end, which can be difficult to debug in some cases. Didn't want to do
that, which is why the tests are spread out across 16 files for this
single class of instructions.
If we ever get around to fusing ops with shifts in the ConstProp optimizer (may
or may not be worthwhile), this will delete an instruction from things like "or
al, bh".
Even though lsr is the same speed as bfe on Firestorm, I feel if you ask for
garbage you should get garbage C:
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Pointless, upper bits ignored anyway. Deletes piles of uxt and even some 32-bit
instruction moves.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
To load 8-bit sources without bfe'ing for al/bl/cl if the caller knows it
doesn't need masking behaviour, but without lying about the size so the extract
for ah/bh/ch will still work properly.
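A minimal sketch of the idea; the function and parameter names are illustrative, not FEX's actual API:

```cpp
#include <cstdint>

uint64_t LoadReg8(uint64_t gpr, bool highByte, bool allowUpperGarbage) {
  if (highByte)
    return (gpr >> 8) & 0xff;  // ah/bh/ch/dh: the extract is still required
  if (allowUpperGarbage)
    return gpr;                // al/bl/cl/...: skip the bfe, size stays 8-bit
  return gpr & 0xff;           // caller actually needs masking behaviour
}
```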
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
For the GPR result, the masking already happens as part of the bfi. So the only
point of masking is for the flag calculation. But actually, every flag except
carry will ignore the upper bits anyway. And the carry calculation actually
WANTS the upper bit as a faster impl.
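A small worked example of why carry wants the upper bit, assuming the 8-bit operands are zero-extended into a wider register:

```cpp
#include <cstdint>

bool CarryFromAdd8(uint8_t a, uint8_t b) {
  // The unmasked sum keeps the carry-out in bit 8, so masking the result
  // for the flag calculation would throw away exactly the bit we want.
  uint32_t sum = static_cast<uint32_t>(a) + b;
  return (sum >> 8) & 1;
}
```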
Deletes a pile of code both in FEX and the output :-)
ADC/SBC could probably get similar treatment later.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Now unused, its former users all prefer LoadPFRaw since they can fold
some of this math into the use.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Use the raw popcount rather than the final PF and use some sneaky bit math to
come out 1 instruction ahead.
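One possible shape of that derivation (the exact bit math isn't spelled out here, so treat this as an assumption about the idea rather than the actual code):

```cpp
#include <cstdint>

// x86 PF is set when the low byte of the result has an even number of
// set bits. Keeping the raw popcount around lets each user fold the
// final step into its own computation.
uint32_t RawPF(uint8_t result) {
  return __builtin_popcount(result);  // stored raw, no final flip
}

bool PFFromRaw(uint32_t rawPopcount) {
  return (~rawPopcount) & 1;          // even popcount => PF set
}
```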
Closes #3117
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Mostly copypaste of Orlshl... we really should deduplicate this mess somehow.
Maybe a shift enum on the core Or op?
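Something like the following, purely as a hypothetical sketch of that suggestion (none of these names exist in FEX's IR today):

```cpp
#include <cstdint>

// A shift kind on the core Or op would mirror ARM64's
// orr-with-shifted-register forms and deduplicate Orlshl/Orlshr.
enum class ShiftKind : uint8_t { None, LSL, LSR, ASR, ROR };

struct IROp_Or {
  uint32_t  Dest;
  uint32_t  Src1;
  uint32_t  Src2;
  ShiftKind Shift;   // applied to Src2 before the OR
  uint8_t   Amount;  // shift amount in bits
};
```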
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This logic is unused since 8adfaa9aa ("OpcodeDispatcher: Use SelectCC for x87"),
which addressed the underlying issue.
This reverts commit df3833edbe3d34da4df28269f31340076238e420.
If we const-prop the required functions and leaves then we can directly
encode the CPUID information rather than jumping out of the JIT.
In testing, almost all CPUID executions const-prop which function is
getting called. The worst case that I found was only an 85% const-prop
rate.
This isn't quite 100% optimal, since we would need to run the RCLSE and
ConstProp passes again after optimizing these, which would remove some
redundant moves.
Sadly there seems to be a bug in the ConstProp pass that starts crashing
applications if that is done.
Easy enough to reproduce by running Half-Life 2, which immediately hits
SIGILL.
Even without this optimization, this is still a significant saving since
we aren't jumping out of the JIT anymore for these optimized CPUIDs.
Most CPUID routines return constant data; there are four that don't.
Some CPUID functions also need the leaf descriptor, so we need to
describe that as well.
Functions that don't return constant data:
- function 1Ah - Returns different data depending on current CPU core
- function 8000_000{2,3,4} - Different data based on CPU core
Functions that need leaf constprop:
- 4h, 7h, Dh, 4000_0001h, 8000_001Dh
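A rough sketch of the optimization as described above; the helper names are hypothetical stand-ins for the real pass plumbing:

```cpp
#include <cstdint>
#include <optional>

struct CPUIDResult { uint32_t eax, ebx, ecx, edx; };

// Stand-ins for the tables described above:
bool NeedsLeaf(uint64_t Function);            // 4h, 7h, Dh, 4000_0001h, 8000_001Dh
bool ReturnsConstantData(uint64_t Function);  // false for 1Ah and 8000_000{2,3,4}h
CPUIDResult RunCPUIDFunction(uint64_t Function, uint64_t Leaf);

// If the function (and, when required, the leaf) is a known constant and
// the function returns constant data, the JIT can encode the result
// directly instead of jumping out to the CPUID handler.
std::optional<CPUIDResult> TryConstPropCPUID(std::optional<uint64_t> Function,
                                             std::optional<uint64_t> Leaf) {
  if (!Function || !ReturnsConstantData(*Function)) {
    return std::nullopt;  // must stay a runtime call
  }
  if (NeedsLeaf(*Function) && !Leaf) {
    return std::nullopt;  // leaf didn't const-prop
  }
  return RunCPUIDFunction(*Function, Leaf.value_or(0));
}
```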