Ryan Houdek
00cf8d530c
Merge pull request #3752 from Sonicadvance1/fma_ir_operations
...
ARM64: Adds new FMA vector instructions
2024-06-25 09:07:06 -07:00
Alyssa Rosenzweig
98aa58e9f5
InstCountCI: Update
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 10:03:33 -04:00
Ryan Houdek
6911917819
Disable vpclmulqdq_256 on simulator
2024-06-25 10:03:33 -04:00
Ryan Houdek
a8255aa475
CPUID: Expose support for VPCLMULQDQ
...
Wasn't exposed before since we couldn't unit test the SVE256
implementation.
2024-06-25 10:03:33 -04:00
Ryan Houdek
48e7aae38f
unittests: Adds support for 256-bit vpclmulqdq
...
It's easy because the test was already written with this in mind.
2024-06-25 10:03:33 -04:00
Ryan Houdek
7069643ae6
AVX128: Implement support for VPCLMULQDQ
...
This is just the 128-bit version twice.
2024-06-25 10:03:33 -04:00
Ryan Houdek
34272fc134
AVX128: Implement support for vperm{d,ps}!
2024-06-25 10:03:33 -04:00
Ryan Houdek
1d41002dfe
AVX128: Implement support for variable vpermil{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
563bf342d5
AVX128: Implement support for vptest
2024-06-25 10:03:33 -04:00
Ryan Houdek
efd5fabb95
AVX128: Implement support for vtest{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
c1da525110
AVX128: Implement support for vperm2{f128,i128}
2024-06-25 10:03:33 -04:00
Ryan Houdek
5ce6c88a88
AVX128: Reenable {ldm,stm}mxcsr. Can use the regular implementation.
2024-06-25 10:03:33 -04:00
Ryan Houdek
64cce7c6fa
AVX128: Implement support for xsave/xrstor
2024-06-25 10:03:33 -04:00
Ryan Houdek
4544e5b51f
AVX128: Implement support for vblend{ps,pd}/vpblendvb
2024-06-25 10:03:33 -04:00
Ryan Houdek
eb3e314946
AVX128: Implement support for vmaskmovdqu
2024-06-25 10:03:33 -04:00
Ryan Houdek
8b65c3de10
AVX128: Implement vmaskmov{ps,pd}, vpmaskmov{d,q} using SVE2 gather loadstores.
2024-06-25 10:03:33 -04:00
Ryan Houdek
c2beb27a9d
AVX128: Implement support for vpalignr
2024-06-25 10:03:33 -04:00
Ryan Houdek
05fdec9e72
AVX128: Implement support for vmpsadbw
2024-06-25 10:03:33 -04:00
Ryan Houdek
e8e3c95349
AVX128: Implement support for vpsadbw
2024-06-25 10:03:33 -04:00
Ryan Houdek
a87fa3f246
AVX128: Implement support for vpshufb
2024-06-25 10:03:33 -04:00
Ryan Houdek
b31ad523f5
AVX128: Implement support for hsub{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
34bce540ff
AVX128: Implement support for vpblendw/vpblendd/vblendps/vblendpd
2024-06-25 10:03:33 -04:00
Ryan Houdek
8ea38e1d80
AVX128: Implement support for vpmaddwd
2024-06-25 10:03:33 -04:00
Ryan Houdek
ce591a9541
AVX128: Implement support for vpmaddubsw
2024-06-25 10:03:33 -04:00
Ryan Houdek
a48c65cd65
AVX128: Implement support for vphaddsw
2024-06-25 10:03:33 -04:00
Ryan Houdek
c283f80f48
AVX128: Implement support for vhaddpd/vphadd{w,d}
2024-06-25 10:03:33 -04:00
Ryan Houdek
d6bf276b5a
AVX128: Implement support for imm vpermil{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
96a51650b1
AVX128: Implement support for vshuf{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
f35a9c74a2
AVX128: Implement support for vpshuf{lw,hw,d}
2024-06-25 10:03:33 -04:00
Ryan Houdek
e2457943f5
AVX128: Implement support for vperm{q,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
a05644172a
AVX128: Implement support for vdp{ps,pd}
2024-06-25 10:03:33 -04:00
Ryan Houdek
cc168ce0fb
VectorOps: Restructure DPPOpImpl. This will get reused by AVX128
2024-06-25 10:03:33 -04:00
Alyssa Rosenzweig
76bd22d279
OpcodeDispatcher: rm gratuitous lambda
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 10:03:33 -04:00
Alyssa Rosenzweig
18574f3cf1
OpcodeDispatcher: extract VPERMILRegOpImpl
...
for avx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 10:03:33 -04:00
Alyssa Rosenzweig
665215ab47
OpcodeDispatcher: extract PTestOpImpl
...
for avx128
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 09:52:48 -04:00
Alyssa Rosenzweig
6009f36403
OpcodeDispatcher: extract VPERMDIndices
...
and rename things accordingly.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 09:46:17 -04:00
Alyssa Rosenzweig
2580efda0d
OpcodeDispatcher: tweak VTestOp signature
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-25 09:46:17 -04:00
Ryan Houdek
3a310b8815
Merge pull request #3756 from Sonicadvance1/fix_vmovhlps
...
Fix VMOVLHPS instruction
2024-06-24 19:14:56 -07:00
Ryan Houdek
7ff96227c0
Merge pull request #3755 from Sonicadvance1/fix_avx128_vmovntdqa
...
AVX128: Fix vmovntdqa failing to zero upper 128-bits
2024-06-24 19:14:48 -07:00
Ryan Houdek
3e8d78051c
InstCountCI: Update
2024-06-24 17:26:18 -07:00
Ryan Houdek
bd24ebc96a
unittests: Adds VMOVHLPS unit test
...
A bit confusing because VMOVHLPS and VMOVLPS share the same instruction
encoding, so this unit test was missed.
Implement the test to ensure it keeps working.
2024-06-24 17:22:55 -07:00
Ryan Houdek
6d3745b8f1
AVX128: Fixes VMOVLHPS instruction
...
We didn't have unit tests for this
2024-06-24 17:22:51 -07:00
Ryan Houdek
d0f0b975be
SVE256: Fixes VMOVLHPS instruction
...
We didn't have unit tests for this
2024-06-24 17:22:47 -07:00
Ryan Houdek
ff2e6ed59f
X86Tables: Fixes instruction encoding for VMOVLP{S,D}
...
These can have both register and memory modrm encoding
2024-06-24 17:22:43 -07:00
Ryan Houdek
99b2018d0e
unittests: Extend vmovntpd test
2024-06-24 16:32:13 -07:00
Ryan Houdek
f0d9c8c10a
AVX128: Fix vmovntdqa failing to zero upper 128-bits
2024-06-24 16:32:09 -07:00
Ryan Houdek
dce1b24c00
Merge pull request #3754 from Sonicadvance1/fix_avx128_stringops
...
AVX128: Fixes SSE4.2 string compare instructions
2024-06-24 16:30:42 -07:00
Ryan Houdek
b47e981932
AVX128: Fixes SSE4.2 string compare instructions
2024-06-24 15:54:06 -07:00
Ryan Houdek
dc44eb4caf
Merge pull request #3749 from Sonicadvance1/contigous_mask_optimization_removal
...
Arm64: Remove contiguous masked element optimization
2024-06-24 15:22:23 -07:00
Ryan Houdek
dfda6733f0
Merge pull request #3750 from Sonicadvance1/pshuf_bug
...
OpcodeDispatcher: Fixes bug in pshuf{lw,hw}
2024-06-24 15:22:07 -07:00