Lioncache
8a622a3c1a
OpcodeDispatcher: Remove redundant moves from VAESEnc
...
Zero-extension will occur automatically upon storing if necessary.
2023-08-23 21:30:06 -04:00
Lioncache
f4848fd1a7
OpcodeDispatcher: Remove redundant moves from VAESEncLast
...
Zero-extension will occur automatically upon storing if necessary.
2023-08-23 21:28:14 -04:00
Lioncache
d37ce08ae9
OpcodeDispatcher: Remove redundant move from VAESDec
...
Zero-extension will occur automatically upon storing if necessary.
2023-08-23 21:26:49 -04:00
Lioncache
a6f1a9f8e8
OpcodeDispatcher: Remove redundant moves from VAESDecLast
...
Zero-extension will occur upon storing if necessary.
2023-08-23 21:25:15 -04:00
Lioncache
52ab3f6a1e
OpcodeDispatcher: Remove redundant moves from VAESKeyGenAssist
...
Zero-extension will occur upon storing if necessary.
We can also join the AVX implementation with the SSE one.
2023-08-23 21:22:00 -04:00
Lioncache
410e99ba09
OpcodeDispatcher: Remove redundant moves from VPCLMULQDQOp
...
Zero-extension will occur if necessary upon storing.
2023-08-23 21:18:39 -04:00
Ryan Houdek
6e4765d48b
Merge pull request #2989 from lioncash/ins
...
Arm64/ConversionOps: Remove redundant moves in AdvSIMD VInsGPR
2023-08-23 18:09:15 -07:00
Ryan Houdek
172c8f3ba6
Merge pull request #2988 from lioncash/half
...
Arm64/ConversionOps: Add missing half-precision conversions to scalar functions
2023-08-23 17:56:23 -07:00
Lioncache
203a2b1105
Arm64/ConversionOps: Remove redundant moves in AdvSIMD VInsGPR
...
If Dst and DestVector alias one another, then we don't need to
move the vector.
2023-08-23 20:50:21 -04:00
Ryan Houdek
4297e13fcf
Merge pull request #2986 from lioncash/ext
...
Arm64/VectorOps: Remove redundant moves in SVE VExtr when possible
2023-08-23 17:40:50 -07:00
Lioncache
5ad56ad52e
Arm64/ConversionOps: Add missing half-precision operations to Float_FromGPR_S
...
Provides parity with vector operations.
2023-08-23 20:36:34 -04:00
Lioncache
24e7baf28f
Arm64/ConversionOps: Add missing half-precision conversions to Float_FToF
...
Provides parity with the vector conversion operations.
2023-08-23 20:31:49 -04:00
Lioncache
b248ae4c04
Arm64/VectorOps: Remove redundant moves from SVE SQSHL
...
We don't need to emit a move if the destination and source alias.
2023-08-23 20:12:09 -04:00
Lioncache
95bea864cf
Arm64/VectorOps: Remove redundant moves from SVE SRSHR
...
We don't need to perform a move if the destination aliases
the source vector to be shifted.
2023-08-23 20:10:51 -04:00
Lioncache
47c4507bb6
Arm64/VectorOps: Remove redundant moves from SVE VSQXTUN2
...
We don't need to perform a move if the destination aliases the lower vector.
2023-08-23 20:10:29 -04:00
Lioncache
5ea0b6db28
Arm64/VectorOps: Remove redundant moves from SVE VSQXTN2
...
We don't need to perform a move if the destination aliases the
lower vector.
2023-08-23 20:02:28 -04:00
Mai
ee10153d14
Merge pull request #2984 from Sonicadvance1/optimize_pack
...
OpcodeDispatcher: Use new IR ops for pack instructions
2023-08-23 20:02:16 -04:00
Lioncache
d0d94adabe
Arm64/VectorOps: Remove redundant moves in SVE VExtr when possible
...
We don't need to do any moves here if the destination aliases the
lower bits.
2023-08-23 19:56:10 -04:00
Ryan Houdek
926b8c2c97
Merge pull request #2985 from lioncash/shift
...
Arm64/VectorOps: Remove redundant moves from SVE variable/immediate/vector shifts when possible
2023-08-23 16:41:39 -07:00
Lioncache
18ebcdc9de
Arm64/VectorOps: Remove redundant moves in VUshrNI2
...
If the destination and VectorLower alias, then we don't need
to emit a movprfx.
2023-08-23 18:52:32 -04:00
Lioncache
f31a9a52e6
Arm64/VectorOps: Remove redundant moves from SVE immediate vector shifts when possible
...
If the destination and source vector alias one another, then the
operation can largely be done in place.
2023-08-23 18:36:21 -04:00
Lioncache
03504a5f8c
Arm64/VectorOps: Remove redundant moves from SVE vector shifts when possible
...
If the destination and the vector to be shifted alias, then we can
avoid needing to move some data around.
2023-08-23 18:24:58 -04:00
Lioncache
d29b4de1ee
Arm64/VectorOps: Remove redundant moves from SVE variable vector register shifts when possible
...
In the event that the destination and the vector to be shifted
alias one another, then we can skip the movprfx, since it's not
necessary.
2023-08-23 18:24:53 -04:00
Ryan Houdek
fc4559d3c4
OpcodeDispatcher: Use new IR ops for pack instructions
...
The MMX and SSE versions of these instructions are now optimal.
2023-08-23 15:14:38 -07:00
Ryan Houdek
c508570da0
IR: Implements VSQXT{U,}NPair operations
...
This takes the two independent VSQXT{U,}N{2,} operations and merges them
into a single IR operation.
In some cases this can result in a more optimal implementation since
there is no need for moves in between.
2023-08-23 15:13:07 -07:00
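
To make the merged operation concrete, here is a minimal reference sketch of what a saturating narrowing "pair" computes, shown only for the signed 16-bit to 8-bit case. The function names and the use of std::vector are assumptions made for illustration; this models the lane semantics only and is not the IR op's actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Saturate a signed 16-bit lane to the signed 8-bit range.
inline std::int8_t SaturateToS8(std::int16_t Value) {
  return static_cast<std::int8_t>(std::clamp<int>(Value, -128, 127));
}

// Hypothetical reference model of a narrowing pair operation: the narrowed
// lanes of Lower fill the low half of the result and the narrowed lanes of
// Upper fill the high half, all in one step.
std::vector<std::int8_t> NarrowPairS16ToS8(const std::vector<std::int16_t>& Lower,
                                           const std::vector<std::int16_t>& Upper) {
  assert(Lower.size() == Upper.size());
  std::vector<std::int8_t> Result;
  Result.reserve(Lower.size() * 2);
  for (std::int16_t Value : Lower) {
    Result.push_back(SaturateToS8(Value));
  }
  for (std::int16_t Value : Upper) {
    Result.push_back(SaturateToS8(Value));
  }
  return Result;
}
```

Expressing both halves as one operation means a backend sees both sources at once, so it has no intermediate value that must be kept alive between two separate ops.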
Lioncache
d5e145c4b0
Arm64/VectorOps: Remove redundant moves from SVE BSL when possible
...
If the destination and true vector alias one another, then we can
perform the operation in place instead of moving data around.
2023-08-23 17:54:10 -04:00
Ryan Houdek
350bca97c6
Merge pull request #2982 from lioncash/imin
...
Arm64/VectorOps: Remove redundant moves from SVE V{S,U}Min/V{S,U}Max when possible
2023-08-23 14:53:15 -07:00
Ryan Houdek
226405880f
Merge pull request #2981 from lioncash/fmin
...
Arm64/VectorOps: Remove redundant moves from SVE VFMin/VFMax when possible
2023-08-23 14:46:03 -07:00
Lioncache
37a8cb6821
Arm64/VectorOps: Remove redundant moves from SVE VSMax when possible
...
When the destination and first source alias one another, then we
can perform the operation in place instead of moving data around.
2023-08-23 17:34:52 -04:00
Lioncache
fe2c7dbf97
Arm64/VectorOps: Remove redundant moves from SVE VUMax when possible
...
When the destination and source alias one another, then we
can perform the operation in place without needing to move
data around.
2023-08-23 17:32:17 -04:00
Lioncache
787b4f37fb
Arm64/VectorOps: Remove redundant moves from SVE VSMin when possible
...
When the destination and first source alias one another, then we can
perform the operation in place without moving any data.
2023-08-23 17:30:08 -04:00
Lioncache
c3faa019f5
Arm64/VectorOps: Remove redundant moves from SVE VUMin when possible
...
If the destination and first source alias, then we can perform the operation
in place.
2023-08-23 17:27:52 -04:00
Ryan Houdek
da098d8204
Merge pull request #2979 from lioncash/div
...
Arm64/VectorOps: Remove moves from SVE VFDiv if possible
2023-08-23 14:22:46 -07:00
Lioncache
149852b122
Arm64/VectorOps: Remove redundant moves from SVE VFMax if possible
...
If Dst and Vector1 alias one another, then the operation can be
performed in place instead of moving data around.
2023-08-23 17:18:41 -04:00
Lioncache
ecf02846e6
Arm64/VectorOps: Remove redundant moves from SVE VFMin if possible
...
If Dst and Vector1 alias one another, then we can do the merging move
in place instead of shuffling data around.
2023-08-23 17:16:25 -04:00
Ryan Houdek
2501ebc1cd
Merge pull request #2980 from lioncash/avg
...
Arm64/VectorOps: Remove redundant moves from SVE VURAvg if possible
2023-08-23 14:05:51 -07:00
Lioncache
a8f7529847
Arm64/VectorOps: Remove moves from SVE VFDiv if possible
...
Given the operation is:
Dst = Vector1 / Vector2
If Dst and Vector1 alias one another, then we can just perform
the division as is without any moving of data around.
2023-08-23 16:53:56 -04:00
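
The aliasing check described in this and the neighbouring commits can be sketched roughly as follows. This is a toy model under stated assumptions: ToyEmitter, TempReg, and EmitVFDiv are hypothetical names that only record mnemonics as text, the governing predicate is omitted, and the non-aliased path is one possible fallback rather than the project's actual code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a code emitter: it only records mnemonics as text.
struct ToyEmitter {
  std::vector<std::string> Insts;

  static std::string Z(int Reg) { return "z" + std::to_string(Reg); }

  // Plain vector register copy.
  void mov(int Dst, int Src) { Insts.push_back("mov " + Z(Dst) + ", " + Z(Src)); }

  // Destructive divide in the SVE style: Dst = Dst / Src.
  void fdiv(int Dst, int Src) {
    Insts.push_back("fdiv " + Z(Dst) + ", " + Z(Dst) + ", " + Z(Src));
  }
};

// Hypothetical scratch register index, assumed never to be used as an operand.
constexpr int TempReg = 31;

// Emits Dst = Vector1 / Vector2 on top of the destructive fdiv above.
void EmitVFDiv(ToyEmitter& E, int Dst, int Vector1, int Vector2) {
  assert(Dst != TempReg && Vector1 != TempReg && Vector2 != TempReg);
  if (Dst == Vector1) {
    // Dst already holds the dividend, so divide in place with no moves.
    E.fdiv(Dst, Vector2);
    return;
  }
  // General case: build the result in the scratch register so that neither
  // source is clobbered (Dst may alias Vector2), then copy it out.
  E.mov(TempReg, Vector1);
  E.fdiv(TempReg, Vector2);
  E.mov(Dst, TempReg);
}
```

In this model, calling EmitVFDiv with Dst equal to Vector1 records a single fdiv, while any other combination records a move, the fdiv, and a second move.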
Lioncache
8431ab43a0
Arm64/VectorOps: Remove redundant moves from SVE VURAvg if possible
...
If Dst and Vector1 alias one another, then we can perform the operation
without needing to move any data around.
2023-08-23 16:51:17 -04:00
Mai
4b06069c0d
Merge pull request #2972 from Sonicadvance1/optimize_scalar_mov
...
OpcodeDispatcher: Optimizes scalar movd/movq
2023-08-23 16:30:45 -04:00
Ryan Houdek
b646f4b781
Merge pull request #2978 from lioncash/misc
...
OpcodeDispatcher: Remove redundant moves from remaining AVX ops
2023-08-23 13:18:25 -07:00
Ryan Houdek
8836ab8988
OpcodeDispatcher: Optimizes scalar movd/movq
...
MMX and SSE versions are now optimal.
2023-08-23 12:56:11 -07:00
Lioncache
ea9747289a
OpcodeDispatcher: Remove redundant moves from remaining AVX ops
...
Zero-extension will occur automatically if necessary upon storing.
2023-08-23 15:31:59 -04:00
Lioncache
735e2060a3
OpcodeDispatcher: Remove redundant moves from VPACKUSOp/VPACKSSOp
...
Zero-extension will occur automatically if necessary.
2023-08-23 15:09:57 -04:00
Ryan Houdek
86ef6fe48d
Merge pull request #2976 from lioncash/mov
...
OpcodeDispatcher: Remove unnecessary moves from AVX move ops where applicable
2023-08-23 12:02:05 -07:00
Lioncache
8e7e91d61f
OpcodeDispatcher: Remove redundant moves in VMOVLPOp
...
Zero-extension will automatically occur upon storing if necessary.
2023-08-23 14:23:57 -04:00
Lioncache
bcba3700c8
OpcodeDispatcher: Remove redundant moves from VMOVVectorNTOp
...
Zero-extension will automatically occur if necessary upon storing.
We can also join the SSE and AVX implementations.
2023-08-23 14:17:27 -04:00
Lioncache
7d05797e82
OpcodeDispatcher: Remove redundant moves from VMOVHPOp
...
Zero-extension will automatically occur if necessary upon storing.
2023-08-23 14:14:35 -04:00
Lioncache
c409ea78bc
OpcodeDispatcher: Remove unnecessary moves from VMOV{A,U}PS/VMOV{A,U}PD
...
Zero-extension will occur automatically upon storing if necessary.
We can also join the SSE and AVX implementations together.
2023-08-23 14:10:49 -04:00
Ryan Houdek
a62ba75ede
Merge pull request #2975 from lioncash/scalar
...
Arm64/ConversionOps: Add scalar support to Vector_FToI
2023-08-23 10:57:38 -07:00
Lioncache
4a7ef3da13
OpcodeDispatcher: Remove unnecessary moves in AVXVectorRound
...
Zero-extension will occur automatically if necessary upon storing.
2023-08-23 13:41:56 -04:00