Ryan Houdek
702ecf7637
AVX128: Implement support for round{ss,sd}
2024-06-24 15:19:08 -04:00
Ryan Houdek
0595f1e044
AVX128: Implement support for vround{ps,pd}
2024-06-24 15:19:08 -04:00
Ryan Houdek
cebb032bd3
AVX128: Implement support for vphminposuw
...
Reuses the non-AVX implementation since it only operates on 128 bits.
2024-06-24 15:19:08 -04:00
Ryan Houdek
8e32763ada
AVX128: Implements support for AVX string ops
...
Reuses the SSE4.2 implementation, but explicitly zeroes the upper 128 bits
of the hardcoded YMM0.
2024-06-24 15:19:08 -04:00
Ryan Houdek
7532337231
AVX128: Implements support for vector AES instructions
2024-06-24 15:19:08 -04:00
Ryan Houdek
4a66d4570e
AVX128: Implement support for a trinary operation with a passed in vector
...
Will be used for AES operations
2024-06-24 15:19:08 -04:00
Alyssa Rosenzweig
6f5e99d47d
OpcodeDispatcher: factor out TranslateRoundType
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 15:19:08 -04:00
Alyssa Rosenzweig
9ee9f5bddd
OpcodeDispatcher: tweak VectorRoundImpl signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 15:14:55 -04:00
Ryan Houdek
ddb9f6d3ad
Merge pull request #3746 from Sonicadvance1/avx_13
...
AVX128: More instructions
2024-06-24 11:40:52 -07:00
Ryan Houdek
d29139d88a
AVX128: Implement support for vextract{i,f}128
2024-06-24 14:27:19 -04:00
Ryan Houdek
317575ba99
AVX128: Implement support for cvtdq2{ps,pd}
2024-06-24 14:27:19 -04:00
Ryan Houdek
d4f2638a2e
AVX128: Implement support for cvt{t,}pd2dq
2024-06-24 14:27:19 -04:00
Ryan Houdek
b67d9be227
AVX128: Implement support for vcvt{pd2ps,ps2pd}
...
Fairly complex set of instructions due to the edge cases.
2024-06-24 14:27:19 -04:00
Ryan Houdek
d52add8fad
AVX128: Implement support for vcvt{ss2sd,sd2ss}
2024-06-24 14:27:19 -04:00
Ryan Houdek
aa9159d25c
AVX128: Implement support for vpmulh{u,}w
2024-06-24 14:27:19 -04:00
Ryan Houdek
94c777259e
AVX128: Implements support for vpmulhrsw
2024-06-24 14:27:19 -04:00
Ryan Houdek
c9f8fa5662
AVX128: Implement support for vpmul{u,}dq
2024-06-24 14:27:19 -04:00
Ryan Houdek
64ee6b119e
AVX128: Implement support for vaddsubp{s,d}
2024-06-24 14:27:19 -04:00
Ryan Houdek
d2ec9a8936
AVX128: Implement support for vpsubsw
2024-06-24 14:27:19 -04:00
Ryan Houdek
2a927453f7
AVX128: Implement support for vphsub{w,d}
2024-06-24 14:27:19 -04:00
Ryan Houdek
c19d489c9a
AVX128: Implement support for vinsertps
...
This one actually reuses the core base implementation, which is nice.
2024-06-24 14:27:19 -04:00
Ryan Houdek
6012eb051b
AVX128: Implement support for vinsert{f128,i128}
2024-06-24 14:27:19 -04:00
Alyssa Rosenzweig
3974746473
OpcodeDispatcher: tweak PHSUBOpImpl
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
e1bcdcf387
OpcodeDispatcher: tweak PHSUBSOpImpl signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
fd5fbddae9
OpcodeDispatcher: tweak PMULLOpImpl for avx128
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
8ff72beddb
OpcodeDispatcher: tweak PMULHRSWOpImpl signature for avx128
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
cba5f7877b
OpcodeDispatcher: tweak ADDSUBPOpImpl signature for AVX128
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Alyssa Rosenzweig
9d7e9fd9fc
OpcodeDispatcher: add AVX128_Zext helper
...
Should let us clean up a lot.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-24 14:14:23 -04:00
Ryan Houdek
082a0baff3
JIT: Implement missing Vector_FToF2
2024-06-24 14:14:23 -04:00
Ryan Houdek
3a4914315b
Arm64: Remove contiguous masked element optimization
...
This was a premature optimization and currently breaks. Just remove it
for now.
2024-06-24 07:49:00 -07:00
Ryan Houdek
448b5a338a
ARM64: Adds new FMA vector instructions
2024-06-24 07:48:05 -07:00
Ryan Houdek
9b68617fa8
InstCountCI: Update for pshuf fixes
2024-06-24 07:44:21 -07:00
Ryan Houdek
4c9890d7f8
OpcodeDispatcher: Fixes bug in pshuf{lw,hw}
...
This optimization was incorrect. Updates the unit tests to ensure the
instructions keep working.
2024-06-24 07:43:48 -07:00
Ryan Houdek
b2db04f5d7
Merge pull request #3745 from Sonicadvance1/add_x86_cmake_assert_back
...
Adds back cmake error on x86-64 hosts
2024-06-24 06:44:33 -07:00
Alyssa Rosenzweig
be8ff9ccb9
Merge pull request #3740 from Sonicadvance1/avx_12
...
AVX128: More various instructions
2024-06-24 09:28:40 -04:00
Ryan Houdek
9c531d97b0
AVX128: Implements the various vector shift instructions
...
These are very closely related to each other so it makes sense to
implement the roughly three different families in one commit.
2024-06-24 09:20:19 -04:00
Ryan Houdek
055d8d75a2
Revert "CI: Drop use of obsolete ENABLE_X86_HOST_DEBUG setting"
...
This reverts commit a054b998c5.
2024-06-24 06:05:19 -07:00
Ryan Houdek
9fcf79ce0e
Adds back cmake error on x86-64 hosts
2024-06-24 06:05:19 -07:00
Ryan Houdek
6edf4619d4
Merge pull request #3742 from Sonicadvance1/export_avx_reg_helpers
...
FEXCore: Implement AVX reconstruction helpers
2024-06-24 05:57:44 -07:00
Ryan Houdek
8f769ce5a3
Merge pull request #3743 from alyssarosenzweig/cleanup/literal
...
X86Tables: add Literal() helper
2024-06-23 13:45:42 -07:00
Ryan Houdek
96ac71750a
Wow64: Use SSE register reconstruction helpers
...
It doesn't support AVX today, but it should in the future.
2024-06-21 17:13:56 -04:00
Ryan Houdek
d0852cf1bb
TestHarnessRunner: Reconverge YMM registers if AVX is supported
...
The TestHarness infrastructure doesn't understand the difference between
the converged and split views.
So fetch the split view immediately and reconverge it manually inside the
state object so the harness continues working with the split YMM view.
2024-06-21 17:13:56 -04:00
Ryan Houdek
f5fea8af96
SignalDelegator: Use new YMM register reconstruction helpers
...
Otherwise we would be setting up signal handlers with incorrect register
state.
2024-06-21 17:13:56 -04:00
Ryan Houdek
d52a1da501
FEXCore: Implement support for fetching/setting YMM registers
...
Because we have two views of the YMM registers depending on whether the host
supports SVE256, add helper functions to fetch them correctly.
We fetch them in the way that Linux expects them in signal handlers. If
we want to return the converged view directly, that is easy to add
support for, but it's unnecessary for now.
2024-06-21 17:13:56 -04:00
Ryan Houdek
abdcaa7c86
AVX128: Implement support for vpinsr{b,w,d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
ad122cf463
AVX128: Implement support for vpmovmskb
2024-06-21 15:53:52 -04:00
Ryan Houdek
b58a57d225
AVX128: Implement support for vmovmskp{s,d}
2024-06-21 15:53:52 -04:00
Ryan Houdek
28d679de98
AVX128: Implement support for vpmov{s,z}{b,w,d}{w,d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
d1dd055e6a
AVX128: Implement support for vpextr{b,w,d,q}
2024-06-21 15:53:52 -04:00
Ryan Houdek
3045578da4
AVX128: Implement vmov{d,q}
2024-06-21 15:53:52 -04:00