
9940 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
926b6c3117 OpcodeDispatcher: don't defer mul flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 16:24:54 -04:00
Alyssa Rosenzweig
c9f9304ba5 OpcodeDispatcher: stop deferring obscure bitwise
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 16:24:54 -04:00
Alyssa Rosenzweig
fabd6be5af OpcodeDispatcher: drop SUB defer
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 16:24:54 -04:00
Alyssa Rosenzweig
1bf31d20b6 OpcodeDispatcher: switch to CalculateFlags_SUB
most of these are deferred only to be calculated immediately anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 16:24:53 -04:00
Ryan Houdek
653bf04db0
Merge pull request #3819 from alyssarosenzweig/bug/rcr-smol
Fix 8/16-bit RCR
2024-07-05 12:49:23 -07:00
Ryan Houdek
b77a25b21a
Merge pull request #3818 from alyssarosenzweig/jit/shiftbymaskstozero
JIT: fix ShiftFlags masking
2024-07-05 12:49:16 -07:00
Alyssa Rosenzweig
9db6931cea InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 10:49:12 -04:00
Ryan Houdek
bad5cef52b unittests: Adds rotate with carry test for large rotates
FEX-Emu currently doesn't do large rotates for small data sources
correctly. This will fail CI until fixed in OpcodeDispatcher
2024-07-05 10:49:02 -04:00
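For reference, a minimal sketch of the architectural behaviour this test targets, not FEX's implementation. On x86 the RCR count is masked to 5 bits and, for an 8-bit operand, then reduced modulo 9, because the rotate runs through the 8 data bits plus CF. `Rcr8` and `Rcr8Result` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical reference model of 8-bit RCR (rotate right through carry).
struct Rcr8Result {
  uint8_t value;
  bool carry;
};

constexpr Rcr8Result Rcr8(uint8_t value, bool carry, uint8_t count) {
  // x86 masks the count to 5 bits, then an 8-bit RCR rotates a 9-bit
  // quantity (8 data bits + CF), so the effective count is taken mod 9.
  const unsigned n = (count & 0x1Fu) % 9u;
  for (unsigned i = 0; i < n; ++i) {
    const bool new_carry = (value & 1u) != 0;
    // The old carry enters at bit 7, the old bit 0 becomes the new carry.
    value = static_cast<uint8_t>((value >> 1) | (carry ? 0x80 : 0));
    carry = new_carry;
  }
  return {value, carry};
}
```

With this model a count of 9 leaves an 8-bit value and CF unchanged, and a count of 10 behaves like a count of 1, which is the kind of "large rotate" case the test exercises.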
Alyssa Rosenzweig
94bd79b2bf OpcodeDispatcher: fix 8/16-bit RCR
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 10:49:02 -04:00
Alyssa Rosenzweig
b746146f4e InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 09:57:42 -04:00
Ryan Houdek
8ac9bb5c72 unittests: Adds test for flags when shifting by zero
2024-07-05 09:57:42 -04:00
Alyssa Rosenzweig
1b552a6f62 JIT: fix ShiftFlags masking
we don't update flags for a nonzero shift that masks to zero.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-05 09:57:42 -04:00
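A small sketch of the rule this fix relies on, assuming plain x86 shift semantics rather than FEX's actual IR: the count is masked to 5 bits (6 bits for a 64-bit operand), and a masked count of zero leaves the flags untouched even if the raw count was nonzero. `ShiftUpdatesFlags` is a hypothetical helper name.

```cpp
#include <cstdint>

// Hypothetical helper: does an x86 SHL/SHR/SAR with this count write flags?
constexpr bool ShiftUpdatesFlags(uint8_t count, unsigned operand_bits) {
  // 64-bit operands use a 6-bit count mask; 8/16/32-bit operands use 5 bits.
  const uint8_t mask = (operand_bits == 64) ? 0x3F : 0x1F;
  // Only the masked count matters: 32 on a 32-bit operand masks to zero.
  return (count & mask) != 0;
}
```

For example, a raw count of 32 on a 32-bit operand masks to zero and must not touch flags, while the same count on a 64-bit operand is a real shift.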
Alyssa Rosenzweig
97329ccc7a
Merge pull request #3812 from Sonicadvance1/fix_rotates_with_zero
OpcodeDispatcher: Fixes rotates with zero not zero extending 32-bit result
2024-07-05 09:48:01 -04:00
Mai
f2d1f2de56
Merge pull request #3817 from Sonicadvance1/fix_x87_integer_indefinite
Softfloat: Fixes Integer indefinite return for 16-bit signed values
2024-07-04 23:11:44 -04:00
Ryan Houdek
692c2fae96
Merge pull request #3813 from alyssarosenzweig/bug/fix-sbb
Fix 16-bit SBB
2024-07-04 19:52:37 -07:00
Mai
3d65b701a2
Merge pull request #3816 from Sonicadvance1/fix_long_signed_divide
Arm64: Fixes long signed divide
2024-07-04 21:43:11 -04:00
Ryan Houdek
ecaca0fe15
unittests: Adds x87 integer indefinite test
Tests 16-bit, 32-bit, and 64-bit integer conversions
2024-07-04 17:53:28 -07:00
Ryan Houdek
8955f83ef6
Softfloat: Fixes Integer indefinite return for 16-bit signed values
Regardless of whether the value is positive or negative, if the converted
integer doesn't fit into an int16_t then the conversion returns INT16_MIN.
2024-07-04 17:43:28 -07:00
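A minimal sketch of the rule described in this commit, not the Softfloat code itself. It assumes the input has already been rounded to an integral value according to the active rounding mode; `ToInt16OrIndefinite` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical model of an x87-style store to a 16-bit signed integer.
// Assumes `rounded` is already an integral value (rounding done by caller).
constexpr int16_t ToInt16OrIndefinite(double rounded) {
  // Out-of-range values of either sign (and NaN, which fails both
  // comparisons) produce the "integer indefinite" value INT16_MIN.
  if (!(rounded >= -32768.0 && rounded <= 32767.0)) {
    return INT16_MIN;
  }
  return static_cast<int16_t>(rounded);
}
```

Under this model both 40000.0 and -40000.0 map to INT16_MIN, while in-range values convert normally.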
Ryan Houdek
1a8aaebd79
unittests: Adds long signed divide test
2024-07-04 16:43:21 -07:00
Ryan Houdek
38a823cc54
Arm64: Fixes long signed divide
The two halves are provided as two uint64_t values that shouldn't be
sign extended between them. Treat them as uint64_t until combined into
a single int128_t. Fixes long signed divide.
2024-07-04 16:42:23 -07:00
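The idea in this commit can be sketched as follows, assuming the GCC/Clang `__int128` extension. This is an illustration, not the Arm64 backend code: `CombineHalves` and `LongSignedDivide` are hypothetical names, and divide-by-zero and quotient overflow are deliberately not handled.

```cpp
#include <cstdint>

// Hypothetical helper: join two unsigned 64-bit halves into a signed 128-bit
// value. Both halves stay unsigned until the very end so that a set top bit
// in `lower` cannot sign-extend into the upper half.
constexpr __int128 CombineHalves(uint64_t upper, uint64_t lower) {
  const unsigned __int128 wide =
      (static_cast<unsigned __int128>(upper) << 64) | lower;
  return static_cast<__int128>(wide);
}

// Hypothetical model of a 128-bit by 64-bit signed divide (quotient only).
// Assumes divisor != 0 and that the quotient fits in 64 bits.
constexpr int64_t LongSignedDivide(uint64_t upper, uint64_t lower,
                                   int64_t divisor) {
  return static_cast<int64_t>(CombineHalves(upper, lower) / divisor);
}
```

The interesting case is a lower half with its top bit set and an upper half of zero: the dividend is a positive 128-bit value, which it would not be if the lower half were sign extended first.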
Ryan Houdek
25306cb373
InstcountCI: Update
2024-07-04 14:35:43 -07:00
Ryan Houdek
1084a031e7
unittests: Adds test for previous fix
All of these results would have failed except for the rorx result.
2024-07-04 14:35:43 -07:00
Ryan Houdek
f6ec99bede
OpcodeDispatcher: Fixes rotates with zero not zero extending 32-bit result
For all the 32-bit rotates (except for RORX) we were failing to zero
extend the 32-bit result to the destination register when the rotate was
masked to zero.

Ensure we do this.
2024-07-04 14:35:42 -07:00
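A sketch of the behaviour this commit describes, written as a standalone model rather than OpcodeDispatcher code. The 64-bit register argument and the name `Rol32IntoGpr` are assumptions for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit ROL writing a 64-bit general-purpose
// register. `old_reg` is the full 64-bit register holding the source operand.
constexpr uint64_t Rol32IntoGpr(uint64_t old_reg, uint8_t count) {
  const uint32_t src = static_cast<uint32_t>(old_reg);
  const unsigned n = count & 0x1Fu;  // 32-bit rotates mask the count to 5 bits
  // Avoid an undefined shift by 32 when the masked count is zero.
  const uint32_t rotated = (n == 0) ? src : ((src << n) | (src >> (32 - n)));
  // The 32-bit result is zero extended into the destination, including the
  // case this commit fixes, where the rotate count masked to zero.
  return static_cast<uint64_t>(rotated);
}
```

In this model a count of 32 leaves the low 32 bits unchanged but still clears the upper 32 bits of the destination.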
Ryan Houdek
90a6647fa4
Merge pull request #3811 from alyssarosenzweig/ra/fix-lsp
RA: fix interaction between SRA & shuffles
2024-07-04 14:20:46 -07:00
Alyssa Rosenzweig
a926bb81a9 InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 16:58:45 -04:00
Alyssa Rosenzweig
fbf41e3149 unittests: add test for small sbc flags
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 16:58:45 -04:00
Alyssa Rosenzweig
a38205069b OpcodeDispatcher: fix SBB carry flag
do it the naive way, just applying the x86 definitions of SBB.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 16:58:45 -04:00
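A minimal sketch of the "naive" x86 definition of SBB for a 16-bit operand: the destination becomes dst - (src + CF), and CF is set when a borrow is needed. The commit does not say which inputs were broken; the case highlighted below, where src + CF no longer fits in 16 bits, is only one plausible edge case. `Sbb16` and `SbbResult` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical reference model of 16-bit SBB (value and carry flag only).
struct SbbResult {
  uint16_t value;
  bool carry;
};

constexpr SbbResult Sbb16(uint16_t dst, uint16_t src, bool carry_in) {
  // Compute src + CF in a wider type so that 0xFFFF + 1 does not wrap.
  const uint32_t subtrahend = uint32_t{src} + (carry_in ? 1u : 0u);
  const uint32_t wide = uint32_t{dst} - subtrahend;
  // CF is the borrow out: set when the full subtrahend exceeds dst.
  return {static_cast<uint16_t>(wide), uint32_t{dst} < subtrahend};
}
```

For dst = 0, src = 0xFFFF and CF = 1 the subtrahend is 0x10000, so the 16-bit result is 0 and the carry flag is set.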
Alyssa Rosenzweig
2d75801024 unittests: add tricky RA test
this fails on current main with blocksize=500 due to mentioned RA bug. passes
with blocksize=1.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 13:37:13 -04:00
Alyssa Rosenzweig
504511fe7e RA: fix interaction between SRA & shuffles
missed a Map. tricky case hit by the unit test added in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-04 13:37:13 -04:00
Ryan Houdek
d3399a261b
Docs: Update for release FEX-2407
2024-07-03 17:59:42 -07:00
Ryan Houdek
d2437e6a21
Merge pull request #3810 from Sonicadvance1/x87_mmx_unittest
unittests: Adds MMX and x87 conflating unit test
2024-07-03 14:39:05 -07:00
Ryan Houdek
95dd6ceba8
unittests: Adds MMX and x87 conflating unit test
This failed with prior RCLSE deletion caching.
2024-07-03 13:54:07 -07:00
Alyssa Rosenzweig
1a0d135201
Merge pull request #3809 from alyssarosenzweig/rm/old-md
FEXCore: remove very out-of-date optimizer docs
2024-07-03 15:46:27 -04:00
Ryan Houdek
f453e1523e
Merge pull request #3803 from pmatos/NinjaCore
Use number of jobs as defined by TEST_JOB_COUNT
2024-07-03 12:42:14 -07:00
Alyssa Rosenzweig
622b0bfbc9 FEXCore: remove very out-of-date optimizer docs
most of this doesn't exist and won't exist. nothing lost here but hopes &
dreams.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-07-03 11:36:48 -04:00
Paulo Matos
ad52514b97 Use number of jobs as defined by TEST_JOB_COUNT
At the moment we always run ctest with max number of cpus. If
undefined, it will keep current behaviour, otherwise it will
honour TEST_JOB_COUNT.

Therefore to run ctest one test at a time, use
`cmake ... -DTEST_JOB_COUNT=1`
2024-07-03 14:09:39 +02:00
Alyssa Rosenzweig
02a218c6e3
Merge pull request #3804 from Sonicadvance1/revert_rclse_drop
Revert removing RCLSE
2024-07-03 07:37:02 -04:00
Ryan Houdek
2d617ad173
InstcountCI: Update
2024-07-02 20:24:58 -07:00
Ryan Houdek
0d06e3e47d
Revert "OpcodeDispatcher: add cache"
This reverts commit 46676ca376.
2024-07-02 20:24:57 -07:00
Ryan Houdek
78aee4d96e
Revert "IR: drop RCLSE"
This reverts commit a5b24bfe4c.
2024-07-02 20:21:59 -07:00
Ryan Houdek
ba04da87e5
Merge pull request #3780 from Sonicadvance1/optimize_gathers
Optimize gathers slightly
2024-07-02 10:58:38 -07:00
Ryan Houdek
2e6b08cbcb
Merge pull request #3798 from Sonicadvance1/minor_128bit_vbsl_opt
Arm64: Minor VBSL optimization with SVE128
2024-07-01 18:57:46 -07:00
Ryan Houdek
472a373861
Merge pull request #3786 from Sonicadvance1/non_temporal_stores
OpcodeDispatcher: Implement support for non-temporal vector stores
2024-07-01 18:57:38 -07:00
Ryan Houdek
a451420911
Merge pull request #3783 from Sonicadvance1/optimize_vector_zeroregister
OpcodeDispatcher: Optimize x86 canonical vector zero register
2024-07-01 18:57:31 -07:00
Mai
2e84f21c18
Merge pull request #3802 from Sonicadvance1/fix_sse41_helper
CodeEmitter: Fixes vector {ldr,str}{b,h} with reg-reg source
2024-07-01 20:42:49 -04:00
Ryan Houdek
fb7167c2d2
CodeEmitter: Fixes vector {ldr,str}{b,h} with reg-reg source
We had failed to enable these implementations for the
`ExtendedMemOperand` helpers. We had already implemented the non-helper
forms, which are already tested in CI. These helpers just weren't
updated?

Noticed this when running libaom's SSE4.1 tests, where it managed to
execute a pmovzxbq instruction with reg+reg memory source and was
breaking the test results.

There are /very/ few vector register operations that access only 8-bit
or 16-bit in vectors so this flew under the radar for quite a while.

Fixes their unit tests.

Also adds a unittest using sse4.1 pmovzxbq to ensure we support the
reg+reg case, and also a few other instructions to test 8-bit and 16-bit
vector loads and stores.
2024-07-01 17:03:47 -07:00
Mai
d884eb9287
Merge pull request #3801 from Sonicadvance1/fix_vpcmpgtw_typo
unittests: Fixes typo in vpcmpgtw test
2024-07-01 18:16:54 -04:00
Ryan Houdek
8b9b1a90e4
unittests: Fixes typo in vpcmpgtw test
2024-07-01 14:42:23 -07:00
Ryan Houdek
e2d4010b59
Merge pull request #3800 from Sonicadvance1/fix_vmovlhps
AVX128: Fixes vmovlhps
2024-07-01 14:41:43 -07:00
Ryan Houdek
babde31bf0
AVX128: Fixes vmovlhps
We didn't have a unit test for this and we weren't implementing it at
all.
We treated it as vmovhps/vmovhpd accidentally. Once again caught by the
libaom Intrinsics unit tests.
2024-07-01 13:54:11 -07:00