5406 Commits

Author SHA1 Message Date
lioncash
27c0d4a9f5 OpcodeDispatcher: Handle VPHADDD 2022-12-16 03:28:57 +00:00
lioncash
dd4ba7562f OpcodeDispatcher: Handle VPHADDW 2022-12-16 03:28:57 +00:00
lioncash
bd9d8e8fe5 x86_64: Correct handling for 128-bit/256-bit VAddP
Makes the behavior consistent with the ARM JIT.
2022-12-16 03:28:57 +00:00
lioncash
c7ac204322 Arm64/VectorOps: Simplify VAddP merging operation
We can just merge the two results together instead of shifting to the
left and then ORing together.
2022-12-16 03:28:52 +00:00
Ryan Houdek
91c00d2cb6
Merge pull request #2248 from lioncash/acc
X86Tables: Restrict CVTDQ2PD and CVTTSD2SI to 64-bit memory accesses
2022-12-15 16:48:42 -08:00
lioncash
e985dcdb22 X86Tables: Restrict CVTTSD2SI src to 64 bit
When accessing memory, this should only be doing a 64-bit access, rather
than a 128-bit one.
2022-12-15 23:59:15 +00:00
lioncash
ee9778480d X86Tables: Restrict CVTDQ2PD src to 64 bit
When accessing memory, this should only be doing a 64-bit access, rather
than a 128-bit one.
2022-12-15 23:46:48 +00:00
Mai
048daa4579
Merge pull request #2244 from Sonicadvance1/move_to_header
ARM64: Moves RA functions to header
2022-12-15 23:13:32 +00:00
Ryan Houdek
6ae8a1e55f ARM64: Moves RA functions to header
These are just some basic address calculations and a load, so we want
them to be inlined as much as possible.
2022-12-15 15:00:33 -08:00
Ryan Houdek
dc2eaf6511
Merge pull request #2246 from lioncash/extend
OpcodeDispatcher: Handle VPMOVSXB{D, W, Q}/VPMOVSXW{D, Q}/VPMOVSXDQ/VPMOVZXB{D, W, Q}/VPMOVZXW{D, Q}/VPMOVZXDQ
2022-12-15 14:19:28 -08:00
Ryan Houdek
0e233a96f0
Merge pull request #2247 from lioncash/roundacc
OpcodeDispatcher: Narrow memory access with scalar rounding operations
2022-12-15 14:17:49 -08:00
lioncash
ba5fafcd7f OpcodeDispatcher: Narrow memory access with scalar rounding operations
These should only be accessing a 32-bit or 64-bit portion of memory,
depending on whether the single or double precision variant is used.
Previously we'd be doing a full 128-bit load.
2022-12-15 19:42:37 +00:00
lioncash
b12503fe32 OpcodeDispatcher: Handle VPMOVSXDQ 2022-12-15 18:10:38 +00:00
lioncash
aa63c7b94d OpcodeDispatcher: Handle VPMOVSXWQ 2022-12-15 18:08:00 +00:00
lioncash
cccbb7f595 OpcodeDispatcher: Handle VPMOVSXWD 2022-12-15 18:01:43 +00:00
lioncash
ce12ed60ae OpcodeDispatcher: Handle VPMOVSXBQ 2022-12-15 17:58:42 +00:00
lioncash
d7eab5f787 OpcodeDispatcher: Handle VPMOVSXBD 2022-12-15 17:54:51 +00:00
lioncash
21537a3636 OpcodeDispatcher: Handle VPMOVSXBW 2022-12-15 17:50:58 +00:00
lioncash
588a2611a7 OpcodeDispatcher: Handle VPMOVZXDQ 2022-12-15 17:45:14 +00:00
lioncash
2895a09101 OpcodeDispatcher: Handle VPMOVZXWQ 2022-12-15 17:41:15 +00:00
lioncash
5c8d40d9be OpcodeDispatcher: Handle VPMOVZXWD 2022-12-15 17:37:51 +00:00
lioncash
b4079cfea3 OpcodeDispatcher: Handle VPMOVZXBQ 2022-12-15 17:32:18 +00:00
lioncash
2b5570a910 OpcodeDispatcher: Handle VPMOVZXBD 2022-12-15 17:28:35 +00:00
lioncash
6bb0c5b24c OpcodeDispatcher: Handle VPMOVZXBW 2022-12-15 17:18:49 +00:00
lioncash
bc31f98f16 OpcodeDispatcher: Move ExtendVectorElements impl to regular function
This can be reused for the AVX versions.
2022-12-15 17:11:02 +00:00
Ryan Houdek
4b891d6147
Merge pull request #2245 from lioncash/split
OpcodeDispatcher: Move template impl to regular function where applicable
2022-12-14 18:18:43 -08:00
lioncash
58c3e20bd1 OpcodeDispatcher: Move template impl to regular function where applicable
Reduces the amount of code generated by the specializations.

Only targets ones that are heavily templated like the generic op helper
functions.
2022-12-15 01:54:12 +00:00
Ryan Houdek
d5f3a091d0
Merge pull request #2216 from Sonicadvance1/32bit_host_thunk_support
Initial 32-bit host thunk feature support
2022-12-14 12:05:37 -08:00
Ryan Houdek
a14e03f35d Update guest thunk lib register usage comment 2022-12-14 11:40:33 -08:00
Ryan Houdek
5c1789952e GuestThunks: Disable stack protector on 32-bit 2022-12-14 11:29:19 -08:00
Ryan Houdek
f5809f24f7 GuestLibs: Fixes accidental guest lib setting 2022-12-14 11:29:19 -08:00
Ryan Houdek
122a9114a3 Thunks: 32-bit host library support 2022-12-14 11:29:19 -08:00
Ryan Houdek
d8f226b460 Support 32-bit thunks ABI 2022-12-14 11:29:19 -08:00
Ryan Houdek
7171c5ae39 Support 32-bit thunksdb 2022-12-14 11:29:19 -08:00
Ryan Houdek
798a78534a Support Indirect thunk callback with mm0 as custom ABI 2022-12-14 11:24:18 -08:00
Ryan Houdek
ae4a04b560 Fix incorrect THUNK_ABI prefix 2022-12-14 11:24:18 -08:00
Ryan Houdek
1971c8d505 32bit host thunk lib config path support 2022-12-14 11:24:18 -08:00
Ryan Houdek
1ca356371d
Merge pull request #2242 from lioncash/round
OpcodeDispatcher: Handle VROUNDS{D, S}/VROUNDP{D, S}
2022-12-13 23:00:51 -08:00
lioncash
27ea6096a2 OpcodeDispatcher: Handle VROUNDSD 2022-12-14 06:41:36 +00:00
lioncash
2244dd9847 OpcodeDispatcher: Handle VROUNDSS 2022-12-14 06:34:58 +00:00
lioncash
ca2f4bd468 OpcodeDispatcher: Handle VROUNDPD 2022-12-14 06:28:17 +00:00
lioncash
6b5c94be23 OpcodeDispatcher: Handle VROUNDPS 2022-12-14 06:27:59 +00:00
lioncash
779dc48d8d OpcodeDispatcher: Factor out VectorRound into VectorRoundImpl
This will be used in the following commits for the AVX versions.
2022-12-14 05:52:16 +00:00
Ryan Houdek
4b2164768f
Merge pull request #2241 from lioncash/ins
OpcodeDispatcher: Handle VINSERTF128/VINSERTI128
2022-12-13 20:45:44 -08:00
lioncash
90828aeb11 OpcodeDispatcher: Handle VINSERTI128 2022-12-14 04:26:42 +00:00
lioncash
fe7c6da1e2 OpcodeDispatcher: Handle VINSERTF128 2022-12-14 04:24:04 +00:00
Ryan Houdek
f3d0fa6f60
Merge pull request #2240 from lioncash/perm2
OpcodeDispatcher: Handle VPERM2F128/VPERM2I128
2022-12-13 19:57:31 -08:00
lioncash
a9ad0d081c OpcodeDispatcher: Handle VPERM2I128 2022-12-14 03:41:29 +00:00
lioncash
54885bec32 OpcodeDispatcher: Handle VPERM2F128 2022-12-14 03:41:22 +00:00
Ryan Houdek
e8aa79bea9
Merge pull request #2239 from lioncash/dec
Frontend: Handle 256-bit destination sizes directly
2022-12-13 19:01:18 -08:00