Commit Graph

1607 Commits

Author SHA1 Message Date
Ryan Houdek
ecaca0fe15
unittests: Adds x87 integer indefinite test
Tests 16-bit, 32-bit, and 64-bit integer conversions
2024-07-04 17:53:28 -07:00
Ryan Houdek
90a6647fa4
Merge pull request #3811 from alyssarosenzweig/ra/fix-lsp
RA: fix interaction between SRA & shuffles
2024-07-04 14:20:46 -07:00
Alyssa Rosenzweig
2d75801024
unittests: add tricky RA test
this fails on current main with blocksize=500 due to mentioned RA bug. passes
with blocksize=1.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 13:37:13 -04:00
Ryan Houdek
95dd6ceba8
unittests: Adds MMX and x87 conflating unit test
This failed with prior RCLSE deletion caching.
2024-07-03 13:54:07 -07:00
Ryan Houdek
f453e1523e
Merge pull request #3803 from pmatos/NinjaCore
Use number of jobs as defined by TEST_JOB_COUNT
2024-07-03 12:42:14 -07:00
Paulo Matos
ad52514b97
Use number of jobs as defined by TEST_JOB_COUNT
At the moment we always run ctest with max number of cpus. If
undefined, it will keep current behaviour, otherwise it will
honour TEST_JOB_COUNT.

Therefore to run ctest one test at a time, use
`cmake ... -DTEST_JOB_COUNT=1`
2024-07-03 14:09:39 +02:00
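The job-count rule described in the commit above can be sketched roughly as follows. This is a hypothetical Python illustration only, not the project's CMake code: the function name `ctest_job_count` and the use of `os.cpu_count()` as "max number of cpus" are assumptions.

```python
import os


def ctest_job_count(test_job_count=None):
    """Return how many parallel ctest jobs to run.

    Hypothetical helper that mirrors the rule in the commit message;
    it is not taken from the project's build scripts.
    """
    if test_job_count is None:
        # TEST_JOB_COUNT undefined: keep current behaviour, use every CPU.
        return os.cpu_count() or 1
    # TEST_JOB_COUNT defined: honour the requested value.
    return int(test_job_count)
```

Under these assumptions, `ctest_job_count(1)` corresponds to the `-DTEST_JOB_COUNT=1` case and runs one test at a time.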
Ryan Houdek
2d617ad173
InstcountCI: Update
2024-07-02 20:24:58 -07:00
Ryan Houdek
ba04da87e5
Merge pull request #3780 from Sonicadvance1/optimize_gathers
Optimize gathers slightly
2024-07-02 10:58:38 -07:00
Ryan Houdek
2e6b08cbcb
Merge pull request #3798 from Sonicadvance1/minor_128bit_vbsl_opt
Arm64: Minor VBSL optimization with SVE128
2024-07-01 18:57:46 -07:00
Ryan Houdek
472a373861
Merge pull request #3786 from Sonicadvance1/non_temporal_stores
OpcodeDispatcher: Implement support for non-temporal vector stores
2024-07-01 18:57:38 -07:00
Ryan Houdek
a451420911
Merge pull request #3783 from Sonicadvance1/optimize_vector_zeroregister
OpcodeDispatcher: Optimize x86 canonical vector zero register
2024-07-01 18:57:31 -07:00
Ryan Houdek
fb7167c2d2
CodeEmitter: Fixes vector {ldr,str}{b,h} with reg-reg source
We had failed to enable these implementations for the
`ExtendedMemOperand` helpers. We had already implemented the non-helper
forms, which are already tested in CI. These helpers just weren't
updated?

Noticed this when running libaom's SSE4.1 tests, where it managed to
execute a pmovzxbq instruction with reg+reg memory source and was
breaking the test results.

There are /very/ few vector register operations that access only 8-bit
or 16-bit in vectors so this flew under the radar for quite a while.

Fixes their unit tests.

Also adds a unittest using sse4.1 pmovzxbq to ensure we support the
reg+reg case, and also a few other instructions to test 8-bit and 16-bit
vector loads and stores.
2024-07-01 17:03:47 -07:00
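For readers unfamiliar with the instruction named in the commit above, here is a minimal sketch of what `pmovzxbq` with a reg+reg memory source does architecturally: it reads only 16 bits from memory and zero-extends each byte into its own 64-bit lane. The function name and the `bytes`-based memory model are assumptions for illustration; this is not the emitter code that the commit fixes.

```python
def pmovzxbq_load(memory, base, index):
    """Model PMOVZXBQ xmm, [base + index] and return the two 64-bit lanes.

    Hypothetical model: `memory` is a bytes-like object standing in for
    guest memory, `base` and `index` are the two register values.
    """
    addr = base + index          # reg+reg effective address
    src = memory[addr:addr + 2]  # only 16 bits are read from memory
    if len(src) != 2:
        raise IndexError("load runs past the end of memory")
    # Each source byte is zero-extended into its own 64-bit lane.
    return (src[0], src[1])
```

A load of this shape is what exercises the byte and halfword vector load paths with a register offset.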
Ryan Houdek
8b9b1a90e4
unittests: Fixes typo in vpcmpgtw test
2024-07-01 14:42:23 -07:00
Ryan Houdek
babde31bf0
AVX128: Fixes vmovlhps
We didn't have a unit test for this and we weren't implementing it at
all.
We treated it as vmovhps/vmovhpd accidentally. Once again caught by the
libaom Intrinsics unit tests.
2024-07-01 13:54:11 -07:00
Ryan Houdek
c282239077
InstcountCI: Add SVE128 VEX_map3
2024-06-30 16:27:58 -07:00
Ryan Houdek
5821054d91
Merge pull request #3789 from Sonicadvance1/avx128_minor_pshufb_opt
AVX128: Minor optimization to 256-bit vpshufb
2024-06-30 15:45:11 -07:00
Ryan Houdek
4626145374
Merge pull request #3792 from Sonicadvance1/avx128_fix_scalar_fma
AVX128: Fixes scalar FMA accidentally using vector wide
2024-06-30 15:36:09 -07:00
Ryan Houdek
a786d3621d
InstcountCI: Update for Scalar FMA
2024-06-30 14:36:56 -07:00
Ryan Houdek
c4604465ba
InstcountCI: Update
2024-06-30 13:41:14 -07:00
Ryan Houdek
672e885e40
InstcountCI: Adds canonical zero register tests
2024-06-29 22:21:53 -07:00
Ryan Houdek
a843ecf4c8
InstcountCI: Update for non-temporal stores
2024-06-29 22:05:56 -07:00
Ryan Houdek
cc0509c0f3
InstcountCI: Update
2024-06-29 19:27:39 -07:00
Ryan Houdek
a34ae24b3f
InstcountCI: Update for SVE non-base address reg
2024-06-29 13:16:02 -07:00
Ryan Houdek
e9a17b19c5
InstcountCI: Add SVE gathers without base addr
2024-06-29 13:07:32 -07:00
Ryan Houdek
ce8d111453
InstcountCI: Update
2024-06-29 13:04:21 -07:00
Ryan Houdek
500ad34769
Merge pull request #3778 from pmatos/LargeX87Blocks
Largest x87 blocks of code from games
2024-06-28 09:40:19 -07:00
Paulo Matos
70d8a10484
Largest x87 blocks of code from games
2024-06-28 16:50:58 +02:00
Ryan Houdek
9e94784e26
unittests: Adds test for xmm4 VSIB bug
2024-06-27 20:55:30 -07:00
Ryan Houdek
98d62a7eb1
InstcountCI: Update
2024-06-27 17:21:12 -07:00
Ryan Houdek
aba7a3a830
AVX128: Fixes vblendps lower and upper selector
2024-06-27 17:20:39 -07:00
Ryan Houdek
9027d1eee7
AVX128: Fixes bug in vector immediate shift
2024-06-27 16:22:14 -07:00
Ryan Houdek
b0eb63ab9a
FEXCore: Fixes address size override on GPR sources and destinations
When the source or destination is a register, the address size override
doesn't apply. We were accidentally applying it on all sources
regardless of type which was causing us to zero extend on operations
that aren't affected by address size override.

This fixes the OpenSSL cert error in every application, but most
importantly Steam.
2024-06-27 14:12:01 -07:00
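The rule stated in the commit above (the address-size override affects memory address calculation, not register operands) can be illustrated with a small model of 64-bit mode. This is a sketch under stated assumptions: the function names and the `addr32` flag are hypothetical and do not come from FEXCore.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def effective_address(base, index, scale, disp, addr32):
    """Compute a memory operand's address in 64-bit mode.

    `addr32` is a hypothetical flag meaning the address-size override
    prefix is present, which truncates the address to 32 bits.
    """
    ea = (base + index * scale + disp) & MASK64
    return ea & MASK32 if addr32 else ea


def register_operand(value, addr32):
    """Read a GPR source or destination value.

    The override is deliberately ignored here: register operands are
    not addresses, so no 32-bit zero extension is applied.
    """
    return value & MASK64
```

In this model, only `effective_address` changes its result when the override is present.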
Ryan Houdek
2e3242682d
Merge pull request #3771 from alyssarosenzweig/opt/asimd-masked
OpcodeDispatcher: optimize nzcv with asimd masked load/store
2024-06-27 10:27:10 -07:00
Alyssa Rosenzweig
3250d4e405
InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-27 10:37:11 -04:00
Alyssa Rosenzweig
e61cb5b2c3
InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-27 10:30:45 -04:00
Ryan Houdek
fc50e52157
InstCountCI: Adds AVX128 tests
2024-06-26 16:49:00 -07:00
Ryan Houdek
7669df0e16
InstCountCI: SVE256: Fixes behaviour change
2024-06-26 16:49:00 -07:00
Ryan Houdek
c6c147daf6
unittests: Updates vcvtps2ph test for failure case of writing too much memory.
2024-06-26 16:49:00 -07:00
Ryan Houdek
5133f480d1
InstcountCI: Update for xsave/xrstor behaviour changes with AVX
2024-06-26 16:49:00 -07:00
Ryan Houdek
3cdaf6736b
InstcountCI: Update for SVE256 FMA implementation
2024-06-26 14:56:01 -07:00
Ryan Houdek
52e541d453
Unittests: Stop using AVX2 flag
2024-06-26 14:56:01 -07:00
Ryan Houdek
ba28e6f82e
unittests: Adds vcvtps2ph tests that use mxcsr
2024-06-26 14:08:20 -07:00
Ryan Houdek
756fa2ecc5
Merge pull request #3766 from alyssarosenzweig/opt/f16c-round
Optimize vcvtps2ph
2024-06-26 14:03:24 -07:00
Alyssa Rosenzweig
cf834aa6da
InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-26 16:46:21 -04:00
Ryan Houdek
991ecd558e
InstcountCI: Update for SVE256 gathers!
2024-06-26 16:00:53 -04:00
Alyssa Rosenzweig
d1d41f5645
Merge pull request #3763 from alyssarosenzweig/rclse/less-aggressive
Remove RCLSE
2024-06-26 15:14:14 -04:00
Ryan Houdek
94fd100fc7
Merge pull request #3719 from lioncash/f16c
OpcodeDispatcher: Handle F16C operations
2024-06-26 12:12:13 -07:00
Lioncache
cd5a809ec9
OpcodeDispatcher: Handle VCVTPS2PH
2024-06-26 15:05:03 -04:00
Lioncache
045a8efbeb
OpcodeDispatcher: Handle VCVTPH2PS
Fairly straightforward, since we already have handling for half-float conversions.
2024-06-26 15:05:00 -04:00
Ryan Houdek
54a1f7d833
Merge pull request #3764 from Sonicadvance1/rorx_masking
BMI2: Ensure rorx immediate masks by operation size correctly.
2024-06-26 11:52:47 -07:00
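As background for the rorx masking fix above: the BMI2 `rorx` instruction rotates right by an 8-bit immediate, and the rotate count is masked by operand size (5 bits for 32-bit operands, 6 bits for 64-bit operands). The sketch below models that architectural behaviour in Python. The function name and signature are assumptions for illustration, and this is not the project's implementation.

```python
def rorx(value, imm8, op_size_bits):
    """Rotate `value` right by `imm8`, masking the count by operand size.

    Hypothetical reference model: `op_size_bits` must be 32 or 64.
    """
    if op_size_bits not in (32, 64):
        raise ValueError("rorx only has 32-bit and 64-bit forms")
    mask = (1 << op_size_bits) - 1
    # Count uses the low 5 bits for 32-bit operands, low 6 bits for 64-bit.
    count = imm8 & (op_size_bits - 1)
    value &= mask
    return ((value >> count) | (value << (op_size_bits - count))) & mask
```

With this model, an immediate of 33 rotates a 32-bit operand by 1 but a 64-bit operand by 33, which is the kind of size-dependent difference the commit title refers to.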