lioncash
ad1d65e91a
IR: Handle 256-bit LoadRegister
...
Extends LoadRegister to handle 256-bit vectors.
2022-11-09 03:19:38 +00:00
lioncash
3a6c7803e8
IR: Handle 256-bit StoreRegister
...
Extends StoreRegister to handle 256-bit vectors.
2022-11-09 03:19:34 +00:00
lioncash
1eae07f1b8
IR: Handle 256-bit VAddV
...
Extends VAddV to handle 256-bit vectors.
2022-11-08 18:21:30 +00:00
Ryan Houdek
1ac7cd5835
OpcodeDispatcher: Disable PCLMUL if not supported on host
...
Might fix Steam on Pi4
2022-11-05 12:11:45 -07:00
Ryan Houdek
b1d98f4e58
Merge pull request #2134 from lioncash/temp
...
Arm64/ConversionOps: Eliminate use of temporary in Vector_FToF
2022-11-02 19:18:40 -07:00
Ryan Houdek
9e7daf61d0
Merge pull request #2133 from lioncash/inselem
...
IR: Handle 256-bit VInsElement
2022-11-02 18:59:18 -07:00
lioncash
6fbe25753b
IR: Handle 256-bit VInsElement
...
Extends VInsElement to handle 256-bit vectors.
2022-11-03 01:43:42 +00:00
lioncash
5536f1e835
Arm64/ConversionOps: Eliminate use of temporary in Vector_FToF
...
We can just use the destination register in this case.
2022-11-02 23:50:28 +00:00
lioncash
0de36706da
IR.json: Expand allowed size in LoadContext and StoreContext IR ops
...
These can now handle 256-bit destinations.
2022-11-02 16:19:05 +00:00
lioncash
17722dad6d
IR: Handle 256-bit LoadContextIndexed
...
Extends LoadContextIndexed to handle 256-bit vectors.
2022-11-02 16:16:58 +00:00
lioncash
0371599996
IR: Handle 256-bit StoreContextIndexed
...
Extends StoreContextIndexed to handle 256-bit vectors.
2022-11-02 16:07:02 +00:00
lioncash
4ef35488db
Arm64/MemoryOps: Merge if statement into switch in ParanoidLoadMemTSO
...
There's nothing preventing the OpSize == 1 case from being merged into
the switch, so we can do that to make things a little more consistent.
2022-11-02 03:45:17 +00:00
lioncash
418a27e47e
IR: Handle 256-bit ParanoidStoreMemTSO
...
Extends ParanoidStoreMemTSO to handle 256-bit vectors.
2022-11-01 23:04:30 +00:00
lioncash
61c76d02cc
IR: Handle 256-bit StoreMemTSO
...
Extends StoreMemTSO to handle 256-bit vectors.
2022-11-01 22:59:29 +00:00
lioncash
d98641221d
IR: Handle 256-bit StoreMem
...
Extends StoreMem to handle 256-bit vectors.
2022-11-01 21:39:52 +00:00
Ryan Houdek
8a14f87a44
Merge pull request #2129 from lioncash/memory
...
IR: Handle 256-bit LoadMem/LoadMemTSO/ParanoidLoadMemTSO
2022-11-01 14:23:19 -07:00
lioncash
02ce71734c
IR: Handle 256-bit ParanoidLoadMemTSO
...
Extends ParanoidLoadMemTSO to handle 256-bit vectors.
2022-11-01 21:02:42 +00:00
lioncash
96c2743280
IR: Handle 256-bit LoadMemTSO
...
Extends LoadMemTSO to handle 256-bit vectors.
2022-11-01 21:02:42 +00:00
lioncash
7bfc34b51c
IR: Handle 256-bit LoadMem
...
Extends LoadMem to handle 256-bit vectors.
2022-11-01 21:02:38 +00:00
Ryan Houdek
40d820fd05
Merge pull request #2127 from lioncash/spill
...
Arm64/MemoryOps: Remove lingering unnecessary ptrue instances
2022-11-01 10:47:15 -07:00
lioncash
d69287aaf7
Arm64/MemoryOps: Remove lingering unnecessary ptrue instances
...
Gets rid of some leftover bits from when we didn't have statically
allocated predicate registers.
2022-11-01 17:25:24 +00:00
Ryan Houdek
d475b0ba9e
Merge pull request #2126 from lioncash/ctx
...
IR: Handle 256-bit LoadContext/StoreContext
2022-11-01 10:22:26 -07:00
lioncash
1638b744b7
x86_64/MemoryOps: Ensure upper lane is cleared properly in FillRegister
...
Ensures that loaded values don't potentially have junk in the upper
lane. Will prevent potential wonky situations when implementing AVX
instructions.
2022-11-01 16:39:18 +00:00
lioncash
8b19894a06
IR: Handle 256-bit StoreContext
...
Extends StoreContext to handle 256-bit vectors.
2022-11-01 16:20:24 +00:00
lioncash
d04e40b5fd
Interpreter/MiscOps: Remove unused StopThread() function
...
This has been unused since ff1d51c7bda3a4151ffb13fc0f2caec5d8c5e3a5.
Removing it silences a compiler warning.
2022-11-01 15:59:17 +00:00
lioncash
75d797b5cd
IR: Handle 256-bit LoadContext
...
Extends LoadContext to handle 256-bit vectors.
2022-11-01 14:54:26 +00:00
Ryan Houdek
0e1a418678
WIP: Segment register index optimization
...
Segment registers are indexed significantly more often than they are changed.
Pay the cost of indexing when a segment register is set, and store the
result, rather than paying it on every use of the register.
This should be a fairly significant performance improvement for 32-bit
applications, at least on hardware that doesn't have a data-dependent
prefetcher.
Breaks Steam at the moment and isn't clean.
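A minimal sketch of the idea described above, under loud assumptions: the struct, its fields, and both function names are hypothetical illustrations and are not taken from FEX's source. The only x86 fact relied on is that a selector's table index lives in bits 3 and up. The descriptor lookup happens once when the selector is written, and each later memory access reuses the stored base.

```cpp
#include <array>
#include <cstdint>

// Hypothetical, simplified model of guest segment state (not FEX's real layout).
struct SegmentState {
  std::array<uint32_t, 8> DescriptorBases{}; // Stand-in for a descriptor table: one base per index.
  uint16_t Selector = 0;                     // Raw selector value as written by the guest.
  uint32_t CachedBase = 0;                   // Base address resolved once, at write time.
};

// Rare path: the guest writes a segment register.
// The table lookup is paid here and its result is stored.
inline void SetSegment(SegmentState &State, uint16_t Selector) {
  State.Selector = Selector;
  // Bits 0-1 are the RPL and bit 2 is the table indicator; the index starts at bit 3.
  const auto Index = static_cast<std::size_t>(Selector >> 3) % State.DescriptorBases.size();
  State.CachedBase = State.DescriptorBases[Index];
}

// Hot path: every segment-relative access only adds the cached base.
inline uint32_t EffectiveAddress(const SegmentState &State, uint32_t Offset) {
  return State.CachedBase + Offset;
}
```

The trade-off is that any change to the table contents would need to refresh the cached base, which this sketch does not attempt.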
2022-10-31 19:42:30 -07:00
Ryan Houdek
000677abb6
Merge pull request #2078 from Sonicadvance1/fix_48bit_va_stack
...
Allocator: Expand stack space when stealing virtual address space
2022-10-31 13:11:20 -07:00
Ryan Houdek
d6f8923f86
X87: Claim incoming float was in the range for transcendental ops
...
We don't detect the range of the long F80, so we need to report that the
source was in range to fix sin/cos/tan calculations.
If we don't set this flag to zero, glibc performs additional operations
that cause the value to be incorrect.
Fixes the output of the test application in #2021, and probably fixes some
camera orientation problems in games as well.
2022-10-31 12:36:49 -07:00
Ryan Houdek
2e93d10eba
OpcodeDispatcher: Fixes ROR imm OF calculation
...
Turns out this was calculating OF incorrectly, breaking Denuvo early in
its execution.
Changes the ROL imm OF calculation code as well to be more consistent
and not keep src1 alive longer than it needs to be.
Also adds two new unit tests to ensure this stays correct.
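For reference, a small sketch of the architectural OF rules for rotates, written as standalone C++ rather than as the dispatcher's actual IR. The function names are hypothetical. On x86, OF is only defined for 1-bit rotates: after ROR it is the XOR of the two most significant bits of the result, and after ROL it is the most significant bit of the result XOR CF, where CF is the result's least significant bit.

```cpp
#include <cstdint>

// Rotate helpers for 32-bit operands. The count is masked to 5 bits,
// as x86 does for 32-bit rotates.
inline uint32_t RotateRight32(uint32_t Value, unsigned Count) {
  Count &= 31;
  return Count == 0 ? Value : (Value >> Count) | (Value << (32 - Count));
}

inline uint32_t RotateLeft32(uint32_t Value, unsigned Count) {
  Count &= 31;
  return Count == 0 ? Value : (Value << Count) | (Value >> (32 - Count));
}

// OF after ROR by 1: XOR of the two most significant bits of the result.
inline bool RorOverflow32(uint32_t Result) {
  return (((Result >> 31) ^ (Result >> 30)) & 1) != 0;
}

// OF after ROL by 1: MSB of the result XOR CF, where CF is the result's LSB.
inline bool RolOverflow32(uint32_t Result) {
  return (((Result >> 31) ^ Result) & 1) != 0;
}
```

Both helpers take the rotated result, so no source operand has to be kept around to compute the flag.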
2022-10-31 10:28:47 -07:00
Mai
70a3ceb64e
Merge pull request #2096 from Sonicadvance1/cleanup_64allocator
...
Utils/64BitAllocator: Minor cleanups and optimization for munmap
2022-10-31 16:47:53 +00:00
Ryan Houdek
2332c41510
Merge pull request #2119 from lioncash/tbl
...
IR: Handle 256-bit VTBL1
2022-10-26 20:44:23 -07:00
lioncash
ec3039c5a2
IR: Handle 256-bit VTBL1
...
Extends VTBL1 to handle 256-bit vectors.
2022-10-27 00:35:45 +00:00
lioncash
cd518d4726
Arm64/VectorOps: Make use of MOVPRFX where applicable
...
Allows hardware to pack the move and following destructive operation
together into one constructive operation if possible.
e.g.
movprfx VTMP1.D, VectorLower.D
addp VTMP1.B, Pred, VTMP1.B, VectorUpper.B
is allowed to be merged as if it executed constructively like:
addp VTMP1.B, Pred, VectorLower.B, VectorUpper.B
if the hardware supports it. If it doesn't, then the instructions will
behave like a regular move and destructive addp operation separately.
2022-10-26 21:38:48 +00:00
Ryan Houdek
b7d9c00dff
Merge pull request #2117 from lioncash/ins
...
IR: Handle 256-bit VInsGPR
2022-10-26 13:22:35 -07:00
lioncash
4b17575f5a
IR: Handle 256-bit VInsGPR
...
Extends VInsGPR to handle 256-bit vectors.
2022-10-26 19:45:27 +00:00
lioncash
d87ff5afa9
IR: Handle 256-bit VExtractToGPR
...
Extends VExtractToGPR to handle 256-bit vectors.
2022-10-26 17:43:31 +00:00
Ryan Houdek
62a24bd38f
Merge pull request #2075 from Sonicadvance1/gpuvis_profiler
...
FEXCore: Adds support for a timeline profiler interface
2022-10-26 08:44:54 -07:00
Ryan Houdek
7e810233d9
Merge pull request #2112 from lioncash/ftoi
...
IR: Handle 256-bit Vector_FToI
2022-10-26 00:07:04 -07:00
Ryan Houdek
74e18f4317
Merge pull request #2114 from lioncash/vec
...
Arm64/BranchOps: Remove unused std::vector
2022-10-25 22:02:09 -07:00
lioncash
1eea95cf18
Arm64/BranchOps: Remove unused std::vector
...
Removes a heap allocation for inline syscalls.
2022-10-26 04:34:57 +00:00
lioncash
6804916697
IR: Check for invalid conversion masks in Float_FromGPR_S
...
Previously this would silently ignore unhandled masks.
2022-10-26 03:59:25 +00:00
lioncash
819e61bf14
IR: Handle 256-bit Vector_FToI
...
Expands Vector_FToI to handle 256-bit vectors.
2022-10-26 03:42:23 +00:00
Ryan Houdek
b8f7e4c8ec
Merge pull request #2111 from lioncash/ftof
...
IR: Handle 256-bit Vector_FToF
2022-10-25 20:17:26 -07:00
lioncash
17bcc0eed4
IR: Handle 256-bit Vector_FToF
...
Extends Vector_FToF to handle 256-bit vectors.
2022-10-26 02:58:06 +00:00
lioncash
9273538955
IR: Handle 256-bit Vector_FToS
...
Extends Vector_FToS to handle 256-bit vectors.
2022-10-26 00:59:11 +00:00
lioncash
9750189def
IR: Handle 256-bit Vector_FToZS
...
Extends Vector_FToZS to handle 256-bit vectors.
2022-10-26 00:53:53 +00:00
Ryan Houdek
cb17ee9871
Merge pull request #2109 from lioncash/stof
...
IR: Handle 256-bit Vector_SToF
2022-10-25 17:14:16 -07:00
lioncash
4c3b78ba9a
IR: Handle 256-bit Vector_SToF
...
Extends Vector_SToF to handle 256-bit vectors.
2022-10-25 23:54:41 +00:00
Ryan Houdek
27b022d4d9
Merge pull request #2106 from lioncash/dup
...
IR: Handle 256-bit VDupElement
2022-10-25 13:49:29 -07:00