687 Commits

Author SHA1 Message Date
Ryan Houdek
0d8d5444a4 unittests/ASM: Implements tests for vpgatherqd/vgatherqps
Similar to previous tests, vpgatherqd and vgatherqps are equivalent
instructions. So the tests are the same with the mnemonic changed.

This adds tests for an additional two sets of instructions, getting us
up to six of the eight in total if we include the tests from #3166.

Tests the same things as described in #3165

In addition, since these tests use 64-bit indices for address
calculation, we can easily generate an index vector that tests
overflow. So every test at every displacement ALSO gains an additional
overflow test to ensure correct behaviour around pointer overflow
calculation.
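A minimal sketch of why 64-bit indices make overflow easy to reach, assuming the address is simply truncated to 64 bits. The helper name `effective_address` and the sample values are hypothetical and are not taken from the tests themselves.

```python
MASK64 = (1 << 64) - 1

def effective_address(base, index, scale):
    """base + index * scale, truncated to a 64-bit pointer."""
    return (base + index * scale) & MASK64

# A high base address plus a small offset wraps past 2**64.
wrapped_sum = effective_address(0xFFFF_FFFF_FFFF_FFF0, 4, 8)

# A huge index overflows in the multiply: ((1 << 61) + 3) * 8 == 2**64 + 24.
wrapped_mul = effective_address(0x1000, (1 << 61) + 3, 8)
```

Both cases resolve to an ordinary low address once the carry out of bit 63 is discarded, which is the behaviour an overflow test would want to confirm.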
2023-09-29 07:20:07 -07:00
Ryan Houdek
9a01b440e3 unittests/ASM: Implements tests for vpgatherdd/vgatherps
vpgatherdd and vgatherps are effectively the same instructions, so the
tests are the same except for the instruction mnemonic.

This adds unit tests for two of the eight gather instructions.
Specifically this adds tests for the 32-bit indices loading 32-bit
elements instructions.

What it tests:
- Tests all displacement scales
- Tests multiple mask arrangements
- Ensures the mask register is zeroed after the instruction
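
The behaviour being tested can be modelled roughly as below. This is a simplified sketch, not FEX code: `memory` is a hypothetical dict standing in for guest memory, and faults are ignored.

```python
def gather_dwords(memory, base, indices, scale, dest, mask):
    """Model of a gather of 32-bit elements using per-lane indices."""
    out = list(dest)
    for lane, (index, lane_mask) in enumerate(zip(indices, mask)):
        # Only the most significant bit of each mask element selects the lane.
        if lane_mask & 0x8000_0000:
            out[lane] = memory[base + index * scale]
    # On completion the whole mask register is cleared.
    return out, [0] * len(mask)
```

Lanes whose mask bit is clear keep the previous destination value, which is why varying the mask arrangement is worth testing.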

What it doesn't test:
- Doesn't test address size calculation overflow
   - Only would happen on 32-bit with 32-bit indices, or /really/ high
     base addresses
   - The instruction should behave as a mask to the address size
   - Effectively behaves like `(uint64_t)(base + (index << ilog2(scale)))`
   - Better idea is to just not expose AVX to 32-bit applications
- Doesn't test VSIB immediate displacement
   - This just ends up being base_addr + imm so it isn't too interesting
   - We can add more tests in the future if we think we messed that up
- Doesn't test partial fault behaviour
   - Because that's a nightmare.
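
The address-size masking mentioned in the list above can be sketched as follows. The function names and the `address_bits` parameter are assumptions made for illustration only.

```python
def ilog2(scale):
    """log2 of a power-of-two scale (1, 2, 4 or 8)."""
    return scale.bit_length() - 1

def vsib_address(base, index, scale, address_bits=64):
    """Per-lane address, truncated to the address size."""
    mask = (1 << address_bits) - 1
    return (base + (index << ilog2(scale))) & mask
```

With `address_bits=32`, a base near the top of the 32-bit space wraps back to a low address instead of producing a 33-bit pointer.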

Specifically keeps each instruction test small and isolated so if a
single register fails it is very easy to nail down which operation did
it.
I know some of our ASM tests do a chunk of work and spit out a result at
the end which can be difficult to debug in some cases. Didn't want to do
that, which is why the tests are spread out across 16 files for this
single class of instructions.
2023-09-28 19:58:34 -07:00
Ryan Houdek
e32601f49d
Merge pull request #3161 from neobrain/fix_ctest_silent_failures
unittests: Instruct CTest to print output from tests on failure
2023-09-26 08:26:15 -07:00
Tony Wasserka
f4dd456c80 unittests: Instruct CTest to print output from tests on failure 2023-09-26 17:16:28 +02:00
Ryan Houdek
d8366c04dc unittests/ASM: Adds unit test caught by #3153 2023-09-26 00:28:45 -07:00
Ryan Houdek
0fbf403787 Adds back in host testharnessrunner CI
Necessary for asm tests to still run in the host "core".
Useful for ensuring correct behaviour of our assembly tests.
2023-09-22 14:46:03 -07:00
Alyssa Rosenzweig
c52741c813 FEXCore: Gut interpreter
It is scarcely used today, and like the x86 jit, it is a significant
maintenance burden complicating work on FEXCore and arm64 optimization. Remove
it, bringing us down to 2 backends.

1 down, 1 to go.

Some interpreter scaffolding remains for x87 fallbacks. That is not a problem
here.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 12:48:12 -04:00
Lioncache
b0c8ff0ea6 Arm64/ALUOps: Remove spills in PEXT
Reduces the number of emitted instructions for a
corresponding PEXT instruction.
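
For reference, the operation PEXT performs can be written as a bit-by-bit reference model. This is only a sketch of the instruction's semantics, not the Arm64 sequence FEX emits.

```python
def pext(src, mask, width=64):
    """Gather the bits of src selected by mask into the low bits of the result."""
    result = 0
    out_bit = 0
    for bit in range(width):
        if (mask >> bit) & 1:
            result |= ((src >> bit) & 1) << out_bit
            out_bit += 1
    return result
```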

We no longer spill for this IR op.
2023-09-15 19:39:51 -04:00
Lioncache
4a37ea4819 OpcodeDispatcher: Handle RORX corner cases better
There are a few cases where we were emitting code when we
didn't really need to, or could emit less.
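
One kind of corner case, sketched under the assumption that the interesting cases are rotate amounts that reduce to zero. The model below is illustrative and does not claim to list the cases the commit actually handles.

```python
def rorx(src, imm8, width=64):
    """Rotate right without touching flags; width is 32 or 64."""
    mask = (1 << width) - 1
    amount = imm8 & (width - 1)   # the count is masked to the operand size
    src &= mask
    if amount == 0:
        return src                # nothing to rotate, a plain move suffices
    return ((src >> amount) | (src << (width - amount))) & mask
```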
2023-09-15 17:36:36 -04:00
Ryan Houdek
c008671509 unittests/asm: Add test with inverted sources
To ensure this is tested with non sequential source registers.
2023-09-13 11:31:20 -07:00
Ryan Houdek
e37cef8283 unittests: Implement shufps optimization test
Tests all current forms of shufps optimizations.
2023-09-12 19:58:07 -07:00
Ryan Houdek
636f8aa4a7 Arm64: Fix undefined behaviour in Push operation
Arm64 store with writeback when source register is the same register as
the address is undefined behaviour.
Depending on hardware details this can do a whole bunch of things.

This situation happens when the x86 code does `push rsp` which is quite
common for applications to do. We would then convert this to a `str x8, [x8, #-8]!`,
which results in undefined behaviour.

Now that redundant loads are optimized this showed up as an issue. Adds
a unit test to ensure we don't hit this again.
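
The x86 behaviour that has to be preserved is that `push rsp` stores the value rsp held before the decrement. A minimal model follows, with hypothetical `regs` and `memory` dicts. It shows the guest-visible semantics only, not the Arm64 code FEX generates.

```python
MASK64 = (1 << 64) - 1

def push(regs, memory, reg):
    """64-bit push: read the source first, then move the stack pointer."""
    value = regs[reg]                         # captured before rsp changes
    regs["rsp"] = (regs["rsp"] - 8) & MASK64
    memory[regs["rsp"]] = value
```

Reading the source into a separate value before updating the stack pointer is one way to avoid depending on how a store with writeback orders the two.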
2023-09-07 17:38:39 -07:00
Alyssa Rosenzweig
80ac824dd9 Unittests: Fix bogus lahf tests
Logical ops leave AF undefined so we can't expect it to be zero after. Mask the
result of lahf to avoid testing UB. These unit tests would regress from the work
in this MR otherwise.
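
The masking amounts to dropping bit 4 of the value that lahf places in AH. A small sketch, with hypothetical helper names:

```python
AF_BIT = 1 << 4

def lahf(sf, zf, af, pf, cf):
    """AH layout after lahf: SF:ZF:0:AF:0:PF:1:CF."""
    return (sf << 7) | (zf << 6) | (af << 4) | (pf << 2) | (1 << 1) | cf

def defined_bits_after_logical_op(ah):
    """Drop AF, which logical ops leave undefined."""
    return ah & ~AF_BIT & 0xFF
```

Two implementations that disagree only on AF then compare equal, so the test no longer depends on undefined behaviour.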

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-05 14:21:18 -04:00
Ryan Houdek
3a6d25f56a unittests/ASM: Update movmskpd test to include an edge of garbage but no sign bit 2023-08-24 17:27:54 -07:00
Lioncache
bbed4d73ed OpcodeDispatcher: Improve VPERMQ/VPERMPD broadcast cases
For a bunch of cases that act as broadcasts (where all
indices in the imm8 specify the same element), we
can use VDupElement here rather than iterating through.
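
Detecting the broadcast case comes down to checking that the four 2-bit selectors in the imm8 agree. A sketch, where the function name is hypothetical and VDupElement itself is not modelled:

```python
def broadcast_element(imm8):
    """Return the selected element if all four selectors match, else None."""
    selectors = [(imm8 >> (2 * lane)) & 0b11 for lane in range(4)]
    return selectors[0] if len(set(selectors)) == 1 else None
```

Only four imm8 values qualify: 0x00, 0x55, 0xAA and 0xFF.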
2023-08-19 01:08:30 -04:00
Lioncache
9e54ec2724 OpcodeDispatcher: Improve {V}PSRLDQ shift by 0
While it would be bizarre if this actually occurred frequently
in practice, we can still tune it so there's no subpar assembly
output in the cases it actually does happen.
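
A reference model of the 128-bit byte shift makes the special cases visible. This is a sketch of the instruction's semantics only, not of the generated code.

```python
MASK128 = (1 << 128) - 1

def psrldq(value, imm8):
    """Shift a 128-bit value right by imm8 bytes."""
    if imm8 == 0:
        return value & MASK128    # shift by zero is a plain copy
    if imm8 > 15:
        return 0                  # everything is shifted out
    return (value & MASK128) >> (8 * imm8)
```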
2023-08-17 19:33:09 -04:00
Ryan Houdek
9aa3fde174 FEXCore: Fixes bug with 32-bit adcx
When a 32-bit adcx instruction was encountered, it was getting treated
as a 16-bit adcx instruction instead. This is because of the 0x66 prefix
required to handle this instruction.
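
The visible effect of picking the wrong width can be shown with a small model of adcx. The `width` parameter is an illustration device, not a description of FEX's decoder.

```python
def adcx(dest, src, cf, width=32):
    """Unsigned add with carry; returns (result, carry_out)."""
    mask = (1 << width) - 1
    total = (dest & mask) + (src & mask) + cf
    return total & mask, total >> width

correct = adcx(0xFFFF, 1, 0, width=32)       # no carry out of bit 31
truncated = adcx(0xFFFF, 1, 0, width=16)     # what a 16-bit treatment yields
```

For these inputs the 32-bit form produces 0x10000 with CF clear, while the 16-bit treatment produces 0 with CF set.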

Adds a unit test to ensure it doesn't break again.
2023-08-11 14:09:31 -07:00
Ryan Houdek
e8fb322025 unittests: Adds tests for vector shifts with zero immediate
To ensure FEX doesn't encounter the encoding bug again.
2023-08-09 02:16:17 -07:00
Ryan Houdek
1d1bdfb96d OpcodeDispatcher: Fixes SHRD by immediate OF flag calculation
We were calculating this like the regular SHR instruction which isn't
correct.
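
For a 1-bit shift, where OF is architecturally defined, the two instructions differ as sketched below. The helper names are hypothetical, and larger shift counts (where OF is undefined) are deliberately left out.

```python
def shrd_by_1(dest, src, width=64):
    """SHRD with a count of 1; returns (result, cf, of)."""
    mask = (1 << width) - 1
    dest &= mask
    result = ((dest >> 1) | ((src & 1) << (width - 1))) & mask
    cf = dest & 1
    of = ((result ^ dest) >> (width - 1)) & 1   # set when the sign bit changed
    return result, cf, of

def shr_by_1_of(dest, width=64):
    """SHR with a count of 1 sets OF to the original most significant bit."""
    return (dest >> (width - 1)) & 1
```

With `dest = 0` and `src = 1` the SHRD result gains a sign bit, so OF is set, whereas the SHR rule would report it clear.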

With this resolved, Denuvo games get slightly farther.
2023-07-28 18:00:50 -07:00
Alyssa Rosenzweig
8d288300c4 unittests: Add test for preserving PF across zero shift
Tests for the regression from 7e6bb04db ("OpcodeDispatcher: Extract
CalculatePF"). This fails on main but passes with this PR.
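
The property under test is that a shift whose masked count is zero leaves the flags alone. A sketch with hypothetical helper names, modelling only PF and only 32-bit and 64-bit operands:

```python
def parity_flag(result):
    """PF is set when the low byte has an even number of set bits."""
    return 1 if bin(result & 0xFF).count("1") % 2 == 0 else 0

def shl_with_pf(value, count, old_pf, width=64):
    """Returns (result, pf) for a left shift."""
    count &= 0x3F if width == 64 else 0x1F
    if count == 0:
        return value, old_pf      # flags are unaffected by a zero shift
    result = (value << count) & ((1 << width) - 1)
    return result, parity_flag(result)
```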

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-07-25 11:21:52 -04:00
Alyssa Rosenzweig
68555546bc unittests: Add more PF coverage
FEX bugs folder of shame. This test fails on main, but passes with the
bug fix.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-07-13 08:39:10 -04:00
Ryan Houdek
92d0344d6a OpcodeDispatcher: Fixes bug with pcmpestri
When this instruction returns the index into the ecx register, this is
defined as a 32-bit result. This means it actually gets zero-extended to
the full 64-bit GPR size on 64-bit processes.
Previously FEX was doing a 32-bit insert which leaves garbage data in
the upper 32-bits of the RCX register.
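
The difference between the two behaviours, as a sketch with hypothetical function names:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def write32_zero_extend(old_rcx, value):
    """Correct: a 32-bit register write clears the upper 32 bits."""
    return value & MASK32

def write32_insert(old_rcx, value):
    """Buggy: inserting the low half keeps stale upper bits."""
    return (old_rcx & ~MASK32 & MASK64) | (value & MASK32)
```

Any code that later uses the full rcx as an index sees the stale upper half in the insert case.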

Adds a unit test to ensure the result is zero extended.
Fixes running Java games under FEX now that SSE4.2 is exposed.
2023-07-08 18:08:47 -07:00
Lioncache
ae536e44d7 OpcodeDispatcher: Handle XRSTOR 2023-06-13 17:47:45 -04:00
Lioncache
a69c457715 unittests: Add include search path for includes
Allows us to have a place to put helper includes and files that contain
macro utilities. This will be nice for making macro files that cut down
on verbosity across tests (e.g. Making tests for XSAVE would be way less
copy-pastey).
2023-06-13 14:40:43 -04:00
Lioncache
bec8b70e5d VectorFallbacks: Fix PCMPSTR fallback ZF/SF flag setting
So, uh, this was a little silly to track down. So, having the upper limit
as unsigned was a mistake, since this would cause negative valid lengths to
convert into an unsigned value within the first two flag comparison cases.

A -1 valid length can occur if one of the strings starts with a null character
in a vector's first element. (It will be zero and we then subtract it to
make the length zero-based).
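
The conversion problem can be reproduced in isolation. The comparison below is a hypothetical stand-in for the flag checks, since the actual fallback code is not shown here; `as_u32` mimics the implicit conversion to a 32-bit unsigned type.

```python
def as_u32(value):
    """What a signed value becomes when converted to 32-bit unsigned."""
    return value & 0xFFFF_FFFF

def below_limit_signed(valid_length, upper_limit):
    return valid_length < upper_limit

def below_limit_unsigned(valid_length, upper_limit):
    # -1 turns into 0xFFFFFFFF, so it no longer compares as "below".
    return as_u32(valid_length) < as_u32(upper_limit)
```

A valid length of -1 is below an upper limit of 15 in the signed comparison but not in the unsigned one.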

Fixes this edge-case up and expands a test to check for this in the future.
2023-06-12 13:13:24 -04:00
Ryan Houdek
5646428640 FEXCore: Implements support for xgetbv
This returns the `XFEATURE_ENABLED_MASK` register which reports what
features are enabled on the CPU.
This behaves similarly to CPUID where it uses an index register in ecx.
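
A rough model of the instruction's interface: the 64-bit register selected by ecx is returned split across edx:eax. The `xcrs` dict is a hypothetical stand-in for the emulated register file; the bit names follow the architectural XCR0 layout.

```python
XCR0_X87 = 1 << 0
XCR0_SSE = 1 << 1
XCR0_AVX = 1 << 2

def xgetbv(xcrs, ecx):
    """Return (eax, edx) for the extended control register selected by ecx."""
    value = xcrs[ecx]
    return value & 0xFFFF_FFFF, value >> 32

eax, edx = xgetbv({0: XCR0_X87 | XCR0_SSE | XCR0_AVX}, 0)
avx_state_enabled = bool(eax & XCR0_AVX)
```

Applications typically check the SSE and AVX bits in eax before using AVX.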

This is a prerequisite to enabling XSAVE/XRSTOR and AVX since
applications will expect this to exist.

xsetbv is a privileged instruction and doesn't need to be implemented.
2023-05-22 16:48:07 -07:00
Ryan Houdek
8b90caad95 unittests: Adds a Linux HostFeatures flag
Disables two tests that don't work under Wine
2023-05-17 21:09:31 -07:00
Ryan Houdek
b89dc56ae1 unittests: Update test so it can work on wine.
We don't necessarily care where this memory is, just that it can be
allocated. Move it to a memory location that works on both Linux and
Wine.
2023-05-17 21:07:40 -07:00
Ryan Houdek
cd0a340d29 unittests/ASM: Ensure wine harness runner works
Needs to execute the correct runner through wine, and needs to reserve
the low DOS region so something doesn't get loaded there.
2023-05-17 21:05:55 -07:00
Lioncache
f7c663240e OpcodeDispatcher: Handle PCMPESTRM/VPCMPESTRM
...and with that all of the SSE4.2 string instructions are implemented now
2023-05-17 00:21:55 -04:00
Lioncache
82b4aef30d OpcodeDispatcher: Handle PCMPISTRM/VPCMPISTRM 2023-05-16 22:59:54 -04:00
Ryan Houdek
88247141d7
Merge pull request #2649 from lioncash/istri
OpcodeDispatcher: Handle PCMPISTRI/VPCMPISTRI
2023-05-02 14:35:47 -07:00
Lioncache
f502154f96 OpcodeDispatcher: Handle VPCMPISTRI 2023-05-02 14:00:05 -04:00
Lioncache
8369f9c25b unittests: Add missing VPMASKMOVQ store test
Realized I forgot to add this in the commit that added
VPMASKMOVD/VPMASKMOVQ support.
2023-05-02 11:13:44 -04:00
Lioncache
c94721a04b OpcodeDispatcher: Handle VPMASKMOVD/VPMASKMOVQ
We can reuse the same helper we have for handling VMASKMOVPD and VMASKMOVPS,
though we need to move some handling around to account for the fact that
VPMASKMOVD and VPMASKMOVQ 'hijack' the REX.W bit to signify the element
size of the operation.
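
The role of the W bit can be sketched with a model of the load form. The list-based representation and the function name are assumptions for illustration.

```python
def vpmaskmov_load(memory_elems, mask_elems, vex_w):
    """Masked load: lanes whose mask sign bit is clear read as zero."""
    element_bits = 64 if vex_w else 32      # W selects VPMASKMOVQ vs VPMASKMOVD
    sign_bit = 1 << (element_bits - 1)
    return [elem if (mask & sign_bit) else 0
            for elem, mask in zip(memory_elems, mask_elems)]
```

The same mask value can select a lane at one element size and not at the other, which is why the element size has to be known before the shared helper runs.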
2023-04-24 10:50:11 -04:00
Lioncache
651c6f8ddf OpcodeDispatcher: Handle VCVTPS2PD/VCVTPD2PS 2023-04-18 10:29:57 -04:00
Lioncache
d1116456fc OpcodeDispatcher: Handle VCVTSD2SS/VCVTSS2SD 2023-04-18 08:13:23 -04:00
Lioncache
39c73d975b OpcodeDispatcher: Handle PCMPESTRI/VPCMPESTRI 2023-04-17 21:42:58 -04:00
Lioncache
830c1884d1 OpcodeDispatcher: Handle store variants of VMASKMOVPD/VMASKMOVPS
And with that, we support all of the AVX1-only instructions.

The remaining instructions for full AVX1 support are now just the SSE4.2
string instructions.
2023-03-29 14:03:23 -04:00
Lioncache
25960fe6b1 OpcodeDispatcher: Handle load variants of VMASKMOVP{D, S} 2023-03-28 10:35:23 -04:00
Lioncache
ef31e0c7c7 OpcodeDispatcher: Handle VMPSADBW 2023-03-27 16:00:24 -04:00
Lioncache
fea0162096 OpcodeDispatcher: Handle VPSLLVD/VPSLLVQ 2023-03-21 17:08:03 -04:00
Lioncache
233aef5289 OpcodeDispatcher: Handle VPSRLVD/VPSRLVQ 2023-03-21 15:03:59 -04:00
Lioncache
02e245d61f OpcodeDispatcher: Handle VCVTSI2SD 2023-03-14 15:36:54 -04:00
Lioncache
03724c8486 OpcodeDispatcher: Handle VCVTSI2SS 2023-03-14 15:28:55 -04:00
Lioncache
ce961a3ef4 OpcodeDispatcher: Handle VPINSRD/VPINSRQ
These are handled from the same encoding via the VEX.W bit.
2023-03-14 14:17:59 -04:00
Lioncache
8ce87d3b27 OpcodeDispatcher: Handle VPINSRW 2023-03-14 14:17:55 -04:00
Lioncache
24f03dd740 OpcodeDispatcher: Handle VPINSRB 2023-03-14 14:16:44 -04:00
Lioncache
b1a00b05c4 unittests: Change alignment directive in 256-bit VPSADBW test to 32
Meant to change this over when writing the test, but forgot.
2023-03-08 21:26:20 -05:00
Lioncache
d9da63d492 OpcodeDispatcher: Handle VPSADBW 2023-03-08 21:02:44 -05:00