2732 Commits

Author SHA1 Message Date
lioncash
ad1d65e91a IR: Handle 256-bit LoadRegister
Extends LoadRegister to handle 256-bit vectors.
2022-11-09 03:19:38 +00:00
lioncash
3a6c7803e8 IR: Handle 256-bit StoreRegister
Extends StoreRegister to handle 256-bit vectors.
2022-11-09 03:19:34 +00:00
lioncash
1eae07f1b8 IR: Handle 256-bit VAddV
Extends VAddV to handle 256-bit vectors.
2022-11-08 18:21:30 +00:00
Ryan Houdek
1ac7cd5835 OpcodeDispatcher: Disable PCLMUL if not supported on host
Might fix Steam on Pi4
2022-11-05 12:11:45 -07:00
Ryan Houdek
b1d98f4e58
Merge pull request #2134 from lioncash/temp
Arm64/ConversionOps: Eliminate use of temporary in Vector_FToF
2022-11-02 19:18:40 -07:00
Ryan Houdek
9e7daf61d0
Merge pull request #2133 from lioncash/inselem
IR: Handle 256-bit VInsElement
2022-11-02 18:59:18 -07:00
lioncash
6fbe25753b IR: Handle 256-bit VInsElement
Extends VInsElem to handle 256-bit vectors.
2022-11-03 01:43:42 +00:00
lioncash
5536f1e835 Arm64/ConversionOps: Eliminate use of temporary in Vector_FToF
We can just use the destination register in this case.
2022-11-02 23:50:28 +00:00
lioncash
0de36706da IR.json: Expand allowed size in LoadContext and StoreContext IR ops
These can now handle 256-bit destinations.
2022-11-02 16:19:05 +00:00
lioncash
17722dad6d IR: Handle 256-bit LoadContextIndexed
Extends LoadContextIndexed to handle 256-bit vectors.
2022-11-02 16:16:58 +00:00
lioncash
0371599996 IR: Handle 256-bit StoreContextIndexed
Extends StoreContextIndexed to handle 256-bit vectors.
2022-11-02 16:07:02 +00:00
lioncash
4ef35488db Arm64/MemoryOps: Merge if statement into switch in ParanoidLoadMemTSO
There's nothing preventing the OpSize == 1 case from being merged into
the switch, so we can do that to make things a little more consistent.
2022-11-02 03:45:17 +00:00
lioncash
418a27e47e IR: Handle 256-bit ParanoidStoreMemTSO
Extends ParanoidStoreMemTSO to handle 256-bit vectors.
2022-11-01 23:04:30 +00:00
lioncash
61c76d02cc IR: Handle 256-bit StoreMemTSO
Extends StoreMemTSO to handle 256-bit vectors.
2022-11-01 22:59:29 +00:00
lioncash
d98641221d IR: Handle 256-bit StoreMem
Extends StoreMem to handle 256-bit vectors.
2022-11-01 21:39:52 +00:00
Ryan Houdek
8a14f87a44
Merge pull request #2129 from lioncash/memory
IR: Handle 256-bit LoadMem/LoadMemTSO/ParanoidLoadMemTSO
2022-11-01 14:23:19 -07:00
lioncash
02ce71734c IR: Handle 256-bit ParanoidLoadTSO
Extends ParanoidLoadTSO to handle 256-bit vectors.
2022-11-01 21:02:42 +00:00
lioncash
96c2743280 IR: Handle 256-bit LoadMemTSO
Extends LoadMemTSO to handle 256-bit vectors.
2022-11-01 21:02:42 +00:00
lioncash
7bfc34b51c IR: Handle 256-bit LoadMem
Extends LoadMem to handle 256-bit vectors.
2022-11-01 21:02:38 +00:00
Ryan Houdek
40d820fd05
Merge pull request #2127 from lioncash/spill
Arm64/MemoryOps: Remove lingering unnecessary ptrue instances
2022-11-01 10:47:15 -07:00
lioncash
d69287aaf7 Arm64/MemoryOps: Remove lingering unnecessary ptrue instances
Gets rid of some leftover bits from when we didn't have statically
allocated predicate registers.
2022-11-01 17:25:24 +00:00
Ryan Houdek
d475b0ba9e
Merge pull request #2126 from lioncash/ctx
IR: Handle 256-bit LoadContext/StoreContext
2022-11-01 10:22:26 -07:00
lioncash
1638b744b7 x86_64/MemoryOps: Ensure upper lane is cleared properly in FillRegister
Ensures that loaded values don't potentially have junk in the upper
lane. Will prevent potential wonky situations when implementing AVX
instructions.
2022-11-01 16:39:18 +00:00
lioncash
8b19894a06 IR: Handle 256-bit StoreContext
Extends StoreContext to handle 256-bit vectors.
2022-11-01 16:20:24 +00:00
lioncash
d04e40b5fd Interpreter/MiscOps: Remove unused StopThread() function
This has been unused since ff1d51c7bda3a4151ffb13fc0f2caec5d8c5e3a5

Silences a compiler warning.
2022-11-01 15:59:17 +00:00
lioncash
75d797b5cd IR: Handle 256-bit LoadContext
Extends LoadContext to handle 256-bit vectors.
2022-11-01 14:54:26 +00:00
Ryan Houdek
0e1a418678 WIP: Segment register index optimization
Segment registers are indexed significantly more than they are changed.
Pay the cost of indexing during the set and store rather than on every
register index.

Should be a fairly significant performance improvement for 32-bit
applications, at least on hardware that doesn't have a data-dependent
prefetcher.

Breaks Steam atm and isn't clean.
2022-10-31 19:42:30 -07:00
Ryan Houdek
000677abb6
Merge pull request #2078 from Sonicadvance1/fix_48bit_va_stack
Allocator: Expand stack space when stealing virtual address space
2022-10-31 13:11:20 -07:00
Ryan Houdek
d6f8923f86 X87: Claim incoming float was in the range for transcendental ops
We don't detect the range of the long F80, so we need to set that the
source was in range to fix sin/cos/tan calculations.

If we don't set this flag to zero then glibc will do some additional
operations that cause the value to be incorrect.

Fixes the output of the test application in #2021, probably fixes some
camera orientation problems in games as well.
2022-10-31 12:36:49 -07:00
Ryan Houdek
2e93d10eba OpcodeDispatcher: Fixes ROR imm OF calculation
Turns out this was calculating OF incorrectly, breaking Denuvo early in
its execution.

Changes the ROL imm OF calculation code as well to be more consistent
and not keep src1 alive longer than it needs to be.

Also adds two new unit tests to ensure this stays correct.
2022-10-31 10:28:47 -07:00
Mai
70a3ceb64e
Merge pull request #2096 from Sonicadvance1/cleanup_64allocator
Utils/64BitAllocator: Minor cleanups and optimization for munmap
2022-10-31 16:47:53 +00:00
Ryan Houdek
2332c41510
Merge pull request #2119 from lioncash/tbl
IR: Handle 256-bit VTBL1
2022-10-26 20:44:23 -07:00
lioncash
ec3039c5a2 IR: Handle 256-bit VTBL1
Extends VTBL1 to handle 256-bit vectors.
2022-10-27 00:35:45 +00:00
lioncash
cd518d4726 Arm64/VectorOps: Make use of MOVPRFX where applicable
Allows hardware to pack the move and following destructive operation
together into one constructive operation if possible.

e.g.

movprfx VTMP1.D, VectorLower.D
addp VTMP1.B, Pred, VTMP1.B, VectorUpper.B

is allowed to be merged as if it executed constructively like:

addp VTMP1.B, Pred, VectorLower.B, VectorUpper.B

if the hardware supports it. If it doesn't, then the instructions will
behave like a regular move and destructive addp operation separately.
2022-10-26 21:38:48 +00:00
Ryan Houdek
b7d9c00dff
Merge pull request #2117 from lioncash/ins
IR: Handle 256-bit VInsGPR
2022-10-26 13:22:35 -07:00
lioncash
4b17575f5a IR: Handle 256-bit VInsGPR
Extends VInsGPR to handle 256-bit vectors.
2022-10-26 19:45:27 +00:00
lioncash
d87ff5afa9 IR: Handle 256-bit VExtractToGPR
Extends VExtractToGPR to handle 256-bit vectors.
2022-10-26 17:43:31 +00:00
Ryan Houdek
62a24bd38f
Merge pull request #2075 from Sonicadvance1/gpuvis_profiler
FEXCore: Adds support for a timeline profiler interface
2022-10-26 08:44:54 -07:00
Ryan Houdek
7e810233d9
Merge pull request #2112 from lioncash/ftoi
IR: Handle 256-bit Vector_FToI
2022-10-26 00:07:04 -07:00
Ryan Houdek
74e18f4317
Merge pull request #2114 from lioncash/vec
Arm64/BranchOps: Remove unused std::vector
2022-10-25 22:02:09 -07:00
lioncash
1eea95cf18 Arm64/BranchOps: Remove unused std::vector
Removes a heap allocation for inline syscalls.
2022-10-26 04:34:57 +00:00
lioncash
6804916697 IR: Check for invalid conversion masks in Float_FromGPR_S
Previously this would silently ignore unhandled masks.
2022-10-26 03:59:25 +00:00
lioncash
819e61bf14 IR: Handle 256-bit Vector_FToI
Expands Vector_FToI to handle 256-bit vectors.
2022-10-26 03:42:23 +00:00
Ryan Houdek
b8f7e4c8ec
Merge pull request #2111 from lioncash/ftof
IR: Handle 256-bit Vector_FtoF
2022-10-25 20:17:26 -07:00
lioncash
17bcc0eed4 IR: Handle 256-bit Vector_FtoF
Extends Vector_FtoF to handle 256-bit vectors.
2022-10-26 02:58:06 +00:00
lioncash
9273538955 IR: Handle 256-bit Vector_FToS
Extends Vector_FToS to handle 256-bit vectors.
2022-10-26 00:59:11 +00:00
lioncash
9750189def IR: Handle 256-bit Vector_FToZS
Extends Vector_FToZS to handle 256-bit vectors.
2022-10-26 00:53:53 +00:00
Ryan Houdek
cb17ee9871
Merge pull request #2109 from lioncash/stof
IR: Handle 256-bit Vector_SToF
2022-10-25 17:14:16 -07:00
lioncash
4c3b78ba9a IR: Handle 256-bit Vector_SToF
Extends Vector_SToF to handle 256-bit vectors.
2022-10-25 23:54:41 +00:00
Ryan Houdek
27b022d4d9
Merge pull request #2106 from lioncash/dup
IR: Handle 256-bit VDupElement
2022-10-25 13:49:29 -07:00