Ryan Houdek
ecaca0fe15
unittests: Adds x87 integer indefinite test
...
Tests 16-bit, 32-bit, and 64-bit integer conversions
2024-07-04 17:53:28 -07:00
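For readers unfamiliar with the "integer indefinite" value the ecaca0fe15 test refers to: when an x87 integer store (fist/fistp) is given a NaN, an infinity, or a value that does not fit the destination, and the invalid-operation exception is masked, the stored result is the most negative integer of that width. The following is a minimal sketch of that rule, not FEX's implementation. The function name `fist_store` is hypothetical, and it assumes round-to-nearest-even and uses a Python double in place of the 80-bit x87 format.

```python
import math

def fist_store(value: float, bits: int) -> int:
    """Return the bit pattern an x87 integer store would write (sketch).

    Assumes the invalid-operation exception is masked and the rounding
    mode is round-to-nearest-even. `bits` is 16, 32, or 64.
    """
    # Integer indefinite: only the sign bit set (0x8000, 0x80000000, ...).
    indefinite = 1 << (bits - 1)
    if math.isnan(value) or math.isinf(value):
        return indefinite
    rounded = round(value)  # Python rounds halves to even, like the default RC
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if rounded < lowest or rounded > highest:
        return indefinite
    # Two's-complement encoding of an in-range result.
    return rounded & ((1 << bits) - 1)
```

Note that the most negative in-range value and the indefinite value share the same bit pattern, so a test has to use clearly out-of-range inputs to exercise the indefinite path.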
Ryan Houdek
90a6647fa4
Merge pull request #3811 from alyssarosenzweig/ra/fix-lsp
...
RA: fix interaction between SRA & shuffles
2024-07-04 14:20:46 -07:00
Alyssa Rosenzweig
2d75801024
unittests: add tricky RA test
...
This fails on current main with blocksize=500 due to the mentioned RA bug, and
passes with blocksize=1.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 13:37:13 -04:00
Ryan Houdek
95dd6ceba8
unittests: Adds MMX and x87 conflating unit test
...
This failed with prior RCLSE deletion caching.
2024-07-03 13:54:07 -07:00
Ryan Houdek
f453e1523e
Merge pull request #3803 from pmatos/NinjaCore
...
Use number of jobs as defined by TEST_JOB_COUNT
2024-07-03 12:42:14 -07:00
Paulo Matos
ad52514b97
Use number of jobs as defined by TEST_JOB_COUNT
...
At the moment we always run ctest with the maximum number of CPUs. If
TEST_JOB_COUNT is undefined, the current behaviour is kept; otherwise
ctest honours TEST_JOB_COUNT.
Therefore, to run ctest one test at a time, use
`cmake ... -DTEST_JOB_COUNT=1`
2024-07-03 14:09:39 +02:00
Ryan Houdek
2d617ad173
InstcountCI: Update
2024-07-02 20:24:58 -07:00
Ryan Houdek
ba04da87e5
Merge pull request #3780 from Sonicadvance1/optimize_gathers
...
Optimize gathers slightly
2024-07-02 10:58:38 -07:00
Ryan Houdek
2e6b08cbcb
Merge pull request #3798 from Sonicadvance1/minor_128bit_vbsl_opt
...
Arm64: Minor VBSL optimization with SVE128
2024-07-01 18:57:46 -07:00
Ryan Houdek
472a373861
Merge pull request #3786 from Sonicadvance1/non_temporal_stores
...
OpcodeDispatcher: Implement support for non-temporal vector stores
2024-07-01 18:57:38 -07:00
Ryan Houdek
a451420911
Merge pull request #3783 from Sonicadvance1/optimize_vector_zeroregister
...
OpcodeDispatcher: Optimize x86 canonical vector zero register
2024-07-01 18:57:31 -07:00
Ryan Houdek
fb7167c2d2
CodeEmitter: Fixes vector {ldr,str}{b,h} with reg-reg source
...
We had failed to enable these implementations for the
`ExtendedMemOperand` helpers. We had already implemented the non-helper
forms, which are already tested in CI. These helpers just weren't
updated?
Noticed this when running libaom's SSE4.1 tests, where it managed to
execute a pmovzxbq instruction with reg+reg memory source and was
breaking the test results.
There are /very/ few vector register operations that access only 8 bits
or 16 bits of a vector, so this flew under the radar for quite a while.
Fixes their unit tests.
Also adds a unittest using sse4.1 pmovzxbq to ensure we support the
reg+reg case, and also a few other instructions to test 8-bit and 16-bit
vector loads and stores.
2024-07-01 17:03:47 -07:00
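To illustrate why pmovzxbq in fb7167c2d2 needs a 16-bit vector load: the memory form reads exactly two bytes and zero-extends each into a 64-bit lane. The sketch below models only the architectural behaviour with a reg+reg address. The function name `pmovzxbq_mem` and the flat `bytes` memory model are assumptions made for illustration, not FEX code.

```python
def pmovzxbq_mem(memory: bytes, base: int, index: int) -> int:
    """Model pmovzxbq xmm, [base + index] and return the 128-bit result.

    `memory` is a hypothetical flat byte array standing in for guest memory.
    """
    addr = base + index          # reg+reg effective address
    low_byte = memory[addr]      # only two bytes (16 bits) are read
    high_byte = memory[addr + 1]
    # Each byte is zero-extended into its own 64-bit lane.
    return (high_byte << 64) | low_byte
```

Because only two bytes are touched, an emitter that silently widens or drops the 16-bit load produces wrong lanes, which matches the kind of breakage the commit message describes.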
Ryan Houdek
8b9b1a90e4
unittests: Fixes typo in vpcmpgtw test
2024-07-01 14:42:23 -07:00
Ryan Houdek
babde31bf0
AVX128: Fixes vmovlhps
...
We didn't have a unit test for this and we weren't implementing it at
all.
We treated it as vmovhps/vmovhpd accidentally. Once again caught by the
libaom Intrinsics unit tests.
2024-07-01 13:54:11 -07:00
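As background for babde31bf0: vmovlhps and vmovhps share an opcode and differ only in whether the last operand is a register or memory, which makes them easy to conflate. A rough sketch of the two 128-bit results follows, with registers modelled as Python integers. The function names are hypothetical and this is not how FEX expresses the operations.

```python
MASK64 = (1 << 64) - 1

def vmovlhps(src1: int, src2: int) -> int:
    """Register form: the low 64 bits of src2 become the high half."""
    return ((src2 & MASK64) << 64) | (src1 & MASK64)

def vmovhps_load(src1: int, mem64: int) -> int:
    """Memory form: a 64-bit value loaded from memory becomes the high half."""
    return ((mem64 & MASK64) << 64) | (src1 & MASK64)
```

In both cases the low half comes from the first source and bits above 127 of the destination are zeroed. The difference is that vmovlhps must take the low quadword of a second register rather than perform a load.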
Ryan Houdek
c282239077
InstcountCI: Add SVE128 VEX_map3
2024-06-30 16:27:58 -07:00
Ryan Houdek
5821054d91
Merge pull request #3789 from Sonicadvance1/avx128_minor_pshufb_opt
...
AVX128: Minor optimization to 256-bit vpshufb
2024-06-30 15:45:11 -07:00
Ryan Houdek
4626145374
Merge pull request #3792 from Sonicadvance1/avx128_fix_scalar_fma
...
AVX128: Fixes scalar FMA accidentally using vector wide
2024-06-30 15:36:09 -07:00
Ryan Houdek
a786d3621d
InstcountCI: Update for Scalar FMA
2024-06-30 14:36:56 -07:00
Ryan Houdek
c4604465ba
InstcountCI: Update
2024-06-30 13:41:14 -07:00
Ryan Houdek
672e885e40
InstcountCI: Adds canonical zero register tests
2024-06-29 22:21:53 -07:00
Ryan Houdek
a843ecf4c8
InstcountCI: Update for non-temporal stores
2024-06-29 22:05:56 -07:00
Ryan Houdek
cc0509c0f3
InstcountCI: Update
2024-06-29 19:27:39 -07:00
Ryan Houdek
a34ae24b3f
InstcountCI: Update for SVE non-base address reg
2024-06-29 13:16:02 -07:00
Ryan Houdek
e9a17b19c5
InstcountCI: Add SVE gathers without base addr
2024-06-29 13:07:32 -07:00
Ryan Houdek
ce8d111453
InstcountCI: Update
2024-06-29 13:04:21 -07:00
Ryan Houdek
500ad34769
Merge pull request #3778 from pmatos/LargeX87Blocks
...
Largest x87 blocks of code from games
2024-06-28 09:40:19 -07:00
Paulo Matos
70d8a10484
Largest x87 blocks of code from games
2024-06-28 16:50:58 +02:00
Ryan Houdek
9e94784e26
unittests: Adds test for xmm4 VSIB bug
2024-06-27 20:55:30 -07:00
Ryan Houdek
98d62a7eb1
InstcountCI: Update
2024-06-27 17:21:12 -07:00
Ryan Houdek
aba7a3a830
AVX128: Fixes vblendps lower and upper selector
2024-06-27 17:20:39 -07:00
Ryan Houdek
9027d1eee7
AVX128: Fixes bug in vector immediate shift
2024-06-27 16:22:14 -07:00
Ryan Houdek
b0eb63ab9a
FEXCore: Fixes address size override on GPR sources and destinations
...
When the source or destination is a register, the address size override
doesn't apply. We were accidentally applying it on all sources
regardless of type which was causing us to zero extend on operations
that aren't affected by address size override.
This fixes the OpenSSL cert error seen in every application, most
importantly Steam.
2024-06-27 14:12:01 -07:00
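The rule behind b0eb63ab9a can be sketched as follows: in 64-bit mode the address size override prefix (0x67) truncates the computed effective address to 32 bits, but it has no effect on the value read from or written to a register operand. The helper names below are hypothetical and the register file is a plain dictionary. This is an illustrative model, not the FEXCore code.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def effective_address(base: int, index: int, scale: int, disp: int,
                      addr32_override: bool) -> int:
    """Compute a memory operand's address in 64-bit mode."""
    address = (base + index * scale + disp) & MASK64
    # With the 0x67 prefix the address wraps at 32 bits and is zero-extended.
    return address & MASK32 if addr32_override else address

def read_gpr_operand(regs: dict, name: str, addr32_override: bool) -> int:
    """Read a register operand; the override is deliberately ignored here."""
    return regs[name] & MASK64
```

Applying the 32-bit mask in `read_gpr_operand` as well would reproduce the described bug: register sources would be zero-extended even though the prefix only governs address calculation.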
Ryan Houdek
2e3242682d
Merge pull request #3771 from alyssarosenzweig/opt/asimd-masked
...
OpcodeDispatcher: optimize nzcv with asimd masked load/store
2024-06-27 10:27:10 -07:00
Alyssa Rosenzweig
3250d4e405
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-27 10:37:11 -04:00
Alyssa Rosenzweig
e61cb5b2c3
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-27 10:30:45 -04:00
Ryan Houdek
fc50e52157
InstCountCI: Adds AVX128 tests
2024-06-26 16:49:00 -07:00
Ryan Houdek
7669df0e16
InstCountCI: SVE256: Fixes behaviour change
2024-06-26 16:49:00 -07:00
Ryan Houdek
c6c147daf6
unittests: Updates vcvtps2ph test for failure case of writing too much memory.
2024-06-26 16:49:00 -07:00
Ryan Houdek
5133f480d1
InstcountCI: Update for xsave/xrstor behaviour changes with AVX
2024-06-26 16:49:00 -07:00
Ryan Houdek
3cdaf6736b
InstcountCI: Update for SVE256 FMA implementation
2024-06-26 14:56:01 -07:00
Ryan Houdek
52e541d453
Unittests: Stop using AVX2 flag
2024-06-26 14:56:01 -07:00
Ryan Houdek
ba28e6f82e
unittests: Adds vcvtps2ph tests that use mxcsr
2024-06-26 14:08:20 -07:00
Ryan Houdek
756fa2ecc5
Merge pull request #3766 from alyssarosenzweig/opt/f16c-round
...
Optimize vcvtps2ph
2024-06-26 14:03:24 -07:00
Alyssa Rosenzweig
cf834aa6da
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-26 16:46:21 -04:00
Ryan Houdek
991ecd558e
InstcountCI: Update for SVE256 gathers!
2024-06-26 16:00:53 -04:00
Alyssa Rosenzweig
d1d41f5645
Merge pull request #3763 from alyssarosenzweig/rclse/less-aggressive
...
Remove RCLSE
2024-06-26 15:14:14 -04:00
Ryan Houdek
94fd100fc7
Merge pull request #3719 from lioncash/f16c
...
OpcodeDispatcher: Handle F16C operations
2024-06-26 12:12:13 -07:00
Lioncache
cd5a809ec9
OpcodeDispatcher: Handle VCVTPS2PH
2024-06-26 15:05:03 -04:00
Lioncache
045a8efbeb
OpcodeDispatcher: Handle VCVTPH2PS
...
Fairly straightforward, since we already have handling for half-float conversions.
2024-06-26 15:05:00 -04:00
Ryan Houdek
54a1f7d833
Merge pull request #3764 from Sonicadvance1/rorx_masking
...
BMI2: Ensure rorx immediate masks by operation size correctly.
2024-06-26 11:52:47 -07:00
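For reference on the rorx masking fix in #3764: the rotate count comes from an 8-bit immediate that is masked to 5 bits for a 32-bit operation and to 6 bits for a 64-bit operation. A small sketch of that behaviour is below. The function name `rorx` and its signature are chosen for illustration only.

```python
def rorx(value: int, imm8: int, op_bits: int) -> int:
    """Rotate `value` right by an immediate, masked by operation size."""
    if op_bits not in (32, 64):
        raise ValueError("rorx operates on 32-bit or 64-bit operands")
    width_mask = (1 << op_bits) - 1
    count = imm8 & (op_bits - 1)   # 0x1F for 32-bit, 0x3F for 64-bit
    value &= width_mask
    if count == 0:
        return value
    return ((value >> count) | (value << (op_bits - count))) & width_mask
```

The same immediate therefore gives different results per operand size: an immediate of 33 rotates a 32-bit operand by 1 but a 64-bit operand by 33.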