1357 Commits

Author SHA1 Message Date
Lioncache
8a622a3c1a OpcodeDispatcher: Remove redundant moves from VAESEnc
Zero-extension will occur automatically upon storing if necessary.
2023-08-23 21:30:06 -04:00
Lioncache
f4848fd1a7 OpcodeDispatcher: Remove redundant moves from VAESEncLast
Zero-extension will occur automatically upon storing if necessary.
2023-08-23 21:28:14 -04:00
Lioncache
d37ce08ae9 OpcodeDispatcher: Remove redundant move from VAESDec
Zero-extension will occur automatically upon storing if necessary.
2023-08-23 21:26:49 -04:00
Lioncache
a6f1a9f8e8 OpcodeDispatcher: Remove redundant moves from VAESDecLast
Zero-extension will occur upon storing if necessary.
2023-08-23 21:25:15 -04:00
Lioncache
52ab3f6a1e OpcodeDispatcher: Remove redundant moves from VAESKeyGenAssist
Zero-extension will occur upon storing if necessary.

We can also join the AVX implementation with the SSE one.
2023-08-23 21:22:00 -04:00
Lioncache
410e99ba09 OpcodeDispatcher: Remove redundant moves from VPCLMULQDQOp
Zero-extension will occur if necessary upon storing.
2023-08-23 21:18:39 -04:00
Ryan Houdek
6e4765d48b
Merge pull request #2989 from lioncash/ins
Arm64/ConversionOps: Remove redundant moves in AdvSIMD VInsGPR
2023-08-23 18:09:15 -07:00
Ryan Houdek
172c8f3ba6
Merge pull request #2988 from lioncash/half
Arm64/ConversionOps: Add missing half-precision conversions to scalar functions
2023-08-23 17:56:23 -07:00
Lioncache
203a2b1105 Arm64/ConversionOps: Remove redundant moves in AdvSIMD VInsGPR
If Dst and DestVector alias one another, then we don't need to
move the vector unnecessarily.
2023-08-23 20:50:21 -04:00
Ryan Houdek
4297e13fcf
Merge pull request #2986 from lioncash/ext
Arm64/VectorOps: Remove redundant moves in SVE VExtr when possible
2023-08-23 17:40:50 -07:00
Lioncache
5ad56ad52e Arm64/ConversionOps: Add missing half-precision operations to Float_FromGPR_S
Provides parity with vector operations.
2023-08-23 20:36:34 -04:00
Lioncache
24e7baf28f Arm64/ConversionOps: Add missing half-precision conversions to Float_FToF
Provides parity with the vector conversion operations.
2023-08-23 20:31:49 -04:00
Lioncache
b248ae4c04 Arm64/VectorOps: Remove redundant moves from SVE SQSHL
We don't need to emit a move if the destination and source alias.
2023-08-23 20:12:09 -04:00
Lioncache
95bea864cf Arm64/VectorOps: Remove redundant moves from SVE SRSHR
We don't need to perform a move if the destination aliases
the source vector to be shifted.
2023-08-23 20:10:51 -04:00
Lioncache
47c4507bb6 Arm64/VectorOps: Remove redundant moves from SVE VSQXTUN2
We don't need to perform a move if the destination aliases the lower vector.
2023-08-23 20:10:29 -04:00
Lioncache
5ea0b6db28 Arm64/VectorOps: Remove redundant moves from SVE VSQXTN2
We don't need to perform a move if the destination aliases the
lower vector.
2023-08-23 20:02:28 -04:00
Mai
ee10153d14
Merge pull request #2984 from Sonicadvance1/optimize_pack
OpcodeDispatcher: Use new IR ops for pack instructions
2023-08-23 20:02:16 -04:00
Lioncache
d0d94adabe Arm64/VectorOps: Remove redundant moves in SVE VExtr when possible
We don't need to do any moves here if the destination aliases the
lower bits.
2023-08-23 19:56:10 -04:00
Ryan Houdek
926b8c2c97
Merge pull request #2985 from lioncash/shift
Arm64/VectorOps: Remove redundant moves from SVE variable/immediate/vector shifts when possible
2023-08-23 16:41:39 -07:00
Lioncache
18ebcdc9de Arm64/VectorOps: Remove redundant moves in VUshrNI2
If the destination and VectorLower alias, then we don't need
to emit a movprfx.
2023-08-23 18:52:32 -04:00
Lioncache
f31a9a52e6 Arm64/VectorOps: Remove redundant moves from SVE immediate vector shifts when possible
If the destination and source vector alias one another, then the
operation can largely be done in place.
2023-08-23 18:36:21 -04:00
Lioncache
03504a5f8c Arm64/VectorOps: Remove redundant moves from SVE vector shifts when possible
If the destination and the vector to be shifted alias, then we can
avoid needing to move some data around.
2023-08-23 18:24:58 -04:00
Lioncache
d29b4de1ee Arm64/VectorOps: Remove redundant moves from SVE variable vector register shifts when possible
In the event that the destination and the vector to be shifted
alias one another, then we can skip the movprfx, since it's not
necessary.
2023-08-23 18:24:53 -04:00
Ryan Houdek
fc4559d3c4 OpcodeDispatcher: Use new IR ops for pack instructions
The MMX and SSE versions of these instructions are now optimal.
2023-08-23 15:14:38 -07:00
Ryan Houdek
c508570da0 IR: Implements VSQXT{U,}NPair operations
This takes the two independent VSQXT{U,}N{2,} operations and merges them
into a single IR operation.
In some cases this can result in a more optimal implementation since
there is no need for moves in between.
2023-08-23 15:13:07 -07:00
Lioncache
d5e145c4b0 Arm64/VectorOps: Remove redundant moves from SVE BSL when possible
If the destination and true vector alias one another, then we can
perform the operation in place instead of moving data around.
2023-08-23 17:54:10 -04:00
Ryan Houdek
350bca97c6
Merge pull request #2982 from lioncash/imin
Arm64/VectorOps: Remove redundant moves from SVE V{S,U}Min/V{S,U}Max when possible
2023-08-23 14:53:15 -07:00
Ryan Houdek
226405880f
Merge pull request #2981 from lioncash/fmin
Arm64/VectorOps: Remove redundant moves from SVE VFMin/VFMax when possible
2023-08-23 14:46:03 -07:00
Lioncache
37a8cb6821 Arm64/VectorOps: Remove redundant moves from SVE VSMax when possible
When the destination and first source alias one another, then we
can perform the operation in place instead of moving data around.
2023-08-23 17:34:52 -04:00
Lioncache
fe2c7dbf97 Arm64/VectorOps: Remove redundant moves from SVE VUMax when possible
When the destination and source alias one another, then we
can perform the operation in place without needing to move
data around.
2023-08-23 17:32:17 -04:00
Lioncache
787b4f37fb Arm64/VectorOps: Remove redundant moves from SVE VSMin when possible
When the destination and first source alias one another, then we can
perform the operation in place without moving any data.
2023-08-23 17:30:08 -04:00
Lioncache
c3faa019f5 Arm64/VectorOps: Remove redundant moves from SVE VUMin when possible
If the destination and first source alias, then we can perform the operation
in place.
2023-08-23 17:27:52 -04:00
Ryan Houdek
da098d8204
Merge pull request #2979 from lioncash/div
Arm64/VectorOps: Remove moves from SVE VFDiv if possible
2023-08-23 14:22:46 -07:00
Lioncache
149852b122 Arm64/VectorOps: Remove redundant moves from SVE VFMax if possible
If Dst and Vector1 alias one another, then the operation can be
performed in place instead of moving data around.
2023-08-23 17:18:41 -04:00
Lioncache
ecf02846e6 Arm64/VectorOps: Remove redundant moves from SVE VFMin if possible
If Dst and Vector1 alias one another, then we can do the merging move
in place instead of shuffling data around.
2023-08-23 17:16:25 -04:00
Ryan Houdek
2501ebc1cd
Merge pull request #2980 from lioncash/avg
Arm64/VectorOps: Remove redundant moves from SVE VURAvg if possible
2023-08-23 14:05:51 -07:00
Lioncache
a8f7529847 Arm64/VectorOps: Remove moves from SVE VFDiv if possible
Given the operation is:

Dst = Vector1 / Vector2

If Dst and Vector1 alias one another, then we can just perform
the division as is without any moving of data around.
2023-08-23 16:53:56 -04:00
Lioncache
8431ab43a0 Arm64/VectorOps: Remove redundant moves from SVE VURAvg if possible
If Dst and Vector1 alias one another, then we can perform the operation
without needing to move any data around.
2023-08-23 16:51:17 -04:00
Mai
4b06069c0d
Merge pull request #2972 from Sonicadvance1/optimize_scalar_mov
OpcodeDispatcher: Optimizes scalar movd/movq
2023-08-23 16:30:45 -04:00
Ryan Houdek
b646f4b781
Merge pull request #2978 from lioncash/misc
OpcodeDispatcher: Remove redundant moves from remaining AVX ops
2023-08-23 13:18:25 -07:00
Ryan Houdek
8836ab8988 OpcodeDispatcher: Optimizes scalar movd/movq
MMX and SSE versions are now optimal.
2023-08-23 12:56:11 -07:00
Lioncache
ea9747289a OpcodeDispatcher: Remove redundant moves from remaining AVX ops
Zero-extension will occur automatically if necessary upon storing.
2023-08-23 15:31:59 -04:00
Lioncache
735e2060a3 OpcodeDispatcher: Remove redundant moves from VPACKUSOp/VPACKSSOp
Zero-extension will occur automatically if necessary.
2023-08-23 15:09:57 -04:00
Ryan Houdek
86ef6fe48d
Merge pull request #2976 from lioncash/mov
OpcodeDispatcher: Remove unnecessary moves from AVX move ops where applicable
2023-08-23 12:02:05 -07:00
Lioncache
8e7e91d61f OpcodeDispatcher: Remove redundant moves in VMOVLPOp
Zero-extension will automatically occur upon storing if necessary.
2023-08-23 14:23:57 -04:00
Lioncache
bcba3700c8 OpcodeDispatcher: Remove redundant moves from VMOVVectorNTOp
Zero-extension will automatically occur if necessary upon storing.

We can also join the SSE and AVX implementations.
2023-08-23 14:17:27 -04:00
Lioncache
7d05797e82 OpcodeDispatcher: Remove redundant moves from VMOVHPOp
Zero-extension will automatically occur if necessary upon storing.
2023-08-23 14:14:35 -04:00
Lioncache
c409ea78bc OpcodeDispatcher: Remove unnecessary moves from VMOV{A,U}PS/VMOV{A,U}PD
Zero-extension will occur automatically upon storing if necessary.

We can also join the SSE and AVX implementations together.
2023-08-23 14:10:49 -04:00
Ryan Houdek
a62ba75ede
Merge pull request #2975 from lioncash/scalar
Arm64/ConversionOps: Add scalar support to Vector_FToI
2023-08-23 10:57:38 -07:00
Lioncache
4a7ef3da13 OpcodeDispatcher: Remove unnecessary moves in AVXVectorRound
Zero-extension will occur automatically if necessary upon storing.
2023-08-23 13:41:56 -04:00