8967 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
a8b59c16d6
Merge pull request #3513 from Sonicadvance1/panic_spilling_rip
RA: Adds RIP when a block panic spills
2024-03-25 13:13:15 -04:00
Ryan Houdek
3034edb0aa
RA: Adds RIP when a block panic spills
I find myself adding this every time I find a game that panic spills.
Let's just print it out.
2024-03-24 17:11:29 -07:00
Mai
002ca360f8
Merge pull request #3506 from Sonicadvance1/telemetry_rename
Telemetry: Rename old file instead of copying
2024-03-23 00:24:02 -04:00
Ryan Houdek
4952b2e16c
Telemetry: Rename old file instead of copying
Since we immediately overwrite the file we are copying, we can rename it
instead. Failure on rename is fine: it either means the telemetry file
didn't exist initially, or there was some other permission error, so the
telemetry would get lost regardless.
2024-03-21 22:51:20 -07:00
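The rename-instead-of-copy idea from the commit above can be sketched as follows. This is only an illustration in Python, not the FEX code: the function name `rotate_telemetry` and both path parameters are hypothetical.

```python
import os


def rotate_telemetry(current_path: str, old_path: str) -> bool:
    """Move the existing telemetry file aside before a new one is written.

    Hypothetical helper: returns True if the rename happened, False otherwise.
    """
    try:
        # os.replace renames the file and overwrites old_path if it exists.
        os.replace(current_path, old_path)
        return True
    except OSError:
        # Either the file did not exist or permissions got in the way.
        # In both cases the old telemetry is lost, which is accepted.
        return False
```

Because the caller overwrites `current_path` right afterwards, nothing needs to be done on failure; the return value is only there to make the behaviour observable.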
Ryan Houdek
8852d94416
Merge pull request #3503 from alyssarosenzweig/opt/loop
OpcodeDispatcher: optimize LOOP/N/E
2024-03-21 20:05:50 -07:00
Mai
0c24aea27e
Merge pull request #3502 from Sonicadvance1/remove_termux
Removes false termux support
2024-03-21 12:45:01 -04:00
Alyssa Rosenzweig
82ba16c6ed OpcodeDispatcher: optimize LOOP/N/E
Don't clobber NZCV.

Before/after assembly from the Primary_E1 unit test:

< 4340: [INFO] cset w20, ne
< 4340: [INFO] mrs x21, nzcv
< 4340: [INFO] cmp x5, #0x0 (0)
< 4340: [INFO] cset x22, ne
< 4340: [INFO] and x20, x22, x20
< 4340: [INFO] msr nzcv, x21
< 4340: [INFO] cbnz x20, #+0x8 (addr 0xffff896f8084)
< 4340: [INFO] b #+0x1c (addr 0xffff896f809c)
< 4340: [INFO] ldr x0, pc+8 (addr 0xffff896f808c)
---
> 4340: [INFO] csel x20, x5, xzr, ne
> 4340: [INFO] cbnz x20, #+0x8 (addr 0xfffed7308070)
> 4340: [INFO] b #+0x1c (addr 0xfffed7308088)
> 4340: [INFO] ldr x0, pc+8 (addr 0xfffed7308078)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-21 12:08:40 -04:00
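The before/after assembly above relies on a simple equivalence: "condition holds AND counter is non-zero" is the same as "select the counter if the condition holds, else zero, then test for non-zero". A minimal Python model of the two sequences, assuming `x5` holds the loop counter and `ne` is the loop's flag condition (the function names are made up for illustration):

```python
def taken_old(counter: int, cond: bool) -> bool:
    """Model of the old sequence: two csets, an and, then cbnz."""
    cond_bit = 1 if cond else 0              # cset w20, ne
    nonzero_bit = 1 if counter != 0 else 0   # cmp x5, #0 ; cset x22, ne
    return (cond_bit & nonzero_bit) != 0     # and x20, x22, x20 ; cbnz x20


def taken_new(counter: int, cond: bool) -> bool:
    """Model of the new sequence: one csel, then cbnz."""
    selected = counter if cond else 0        # csel x20, x5, xzr, ne
    return selected != 0                     # cbnz x20
```

The new form never executes `cmp`, so there is no need to save and restore NZCV with `mrs`/`msr` around it.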
Ryan Houdek
45ea0cd782
Removes false termux support
It was a funny joke to have this here, but it is fundamentally
incompatible with what we're doing. All those users are running proot
anyway because of how broken running under termux directly is.

Just remove this from here.
2024-03-20 22:04:32 -07:00
Ryan Houdek
6ce366ef35
Merge pull request #3499 from Sonicadvance1/overlapping_memcpy_unittest
unittests/ASM: Adds a test for overlapping memcpy using rep movs
2024-03-20 19:52:19 -07:00
Ryan Houdek
862d63adf2
unittests/ASM: Adds a test for overlapping memcpy using rep movs
Caused by #3478

The review missed that this could cause issues. bylaws already
has a fix incoming that will get this unit test working.
2024-03-20 18:44:36 -07:00
Ryan Houdek
167896dc9d
Merge pull request #3501 from bylaws/memcpy
FEXCore: Fallback to the memcpy slow path for overlaps within 32 bytes
2024-03-20 18:42:56 -07:00
Billy Laws
12fb26f9c0 Update InstCountCI 2024-03-20 21:08:59 +00:00
Billy Laws
d490cb1b79 FEXCore: Fallback to the memcpy slow path for overlaps within 32 bytes
Take, e.g., a forward rep movsb copy from addr 0 to 1. Since this is a
bytewise copy, the expected behaviour is:
before: aaabbbb...
after: aaaaaaa...
but by copying in 32-byte chunks we end up with:
after: aaaabbbb...
due to the self overwrites not occurring within a single 32-byte copy.
2024-03-20 20:54:19 +00:00
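The overlap problem described in the commit above can be reproduced with a small Python model. This is a sketch, not the FEX implementation: the helper names are hypothetical, and the `abs(dst - src) < chunk` check is an assumed, conservative form of "overlaps within 32 bytes".

```python
def rep_movsb_forward(mem: bytearray, src: int, dst: int, count: int) -> None:
    """Reference semantics: copy one byte at a time, ascending addresses."""
    for i in range(count):
        mem[dst + i] = mem[src + i]


def chunked_copy_forward(mem: bytearray, src: int, dst: int, count: int,
                         chunk: int = 32) -> None:
    """Fast path model: load a whole chunk, then store the whole chunk."""
    i = 0
    while i < count:
        n = min(chunk, count - i)
        block = bytes(mem[src + i:src + i + n])  # load happens before store
        mem[dst + i:dst + i + n] = block
        i += n


def needs_slow_path(src: int, dst: int, chunk: int = 32) -> bool:
    """Assumed check: source and destination start within one chunk."""
    return abs(dst - src) < chunk


def copy_forward(mem: bytearray, src: int, dst: int, count: int) -> None:
    """Pick the bytewise path when the chunked path would be wrong."""
    if needs_slow_path(src, dst):
        rep_movsb_forward(mem, src, dst, count)
    else:
        chunked_copy_forward(mem, src, dst, count)
```

With `bytearray(b"aaabbbbb")` and a 7-byte copy from offset 0 to offset 1, the bytewise copy yields `aaaaaaaa` while the chunked copy yields `aaaabbbb`, which matches the behaviour described in the commit message.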
Billy Laws
94fecb9dad FEXCore: Remove needless alignment checks for the mem{cpy,set} fastpath 2024-03-20 20:54:09 +00:00
Ryan Houdek
7dcacfe990
Merge pull request #3478 from bylaws/memcpy
FEXCore: Add non-atomic Memcpy and Memset IR fast paths
2024-03-18 18:56:44 -07:00
Billy Laws
29b05f6b90 Update InstCountCI 2024-03-18 23:30:19 +00:00
Billy Laws
8d4d8fe3e5 FEXCore: Add non-atomic Memcpy and Memset IR fast paths
When TSO is disabled, vector LDP/STP can be used for a two-instruction
32-byte memory copy, which is significantly faster than the
current byte-by-byte copy. Performing two such copies directly after
one another also marginally increases copy speed for all sizes >=64.
2024-03-18 23:28:50 +00:00
Ryan Houdek
ab8ee64352
Merge pull request #3497 from Sonicadvance1/movmaskb_constant
JIT: Optimize pmovmaskb with a named vector constant
2024-03-18 16:08:40 -07:00
Alyssa Rosenzweig
2a9fcc6a66
Merge pull request #3492 from Sonicadvance1/implement_prefetch
OpcodeDispatcher: Implement support for the various prefetch instructions
2024-03-18 07:49:47 -04:00
Ryan Houdek
20da1e4244
InstcountCI: Update for pmovmaskb 2024-03-17 18:52:21 -07:00
Ryan Houdek
fd391b1b18
JIT: Optimize pmovmaskb with a named vector constant
I was looking at some other JIT overheads and this cropped up as some
overhead. Instead of materializing a constant using mov+movk+movk+movk,
load it from the named vector constant array.

In a micro-benchmark this improved performance by 34%.
In bytemark this improved one subbench by 0.82%.
2024-03-17 18:40:46 -07:00
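For context on the `mov+movk+movk+movk` cost mentioned above: a 64-bit constant with four non-zero 16-bit pieces takes one `mov` plus three `movk` instructions to build in a general-purpose register. The Python sketch below only illustrates that split. The constant used in the usage note is a placeholder chosen for illustration, not necessarily the value FEX loads, and the function names are hypothetical.

```python
from typing import List


def split_halfwords(value: int) -> List[int]:
    """Split a 64-bit value into four 16-bit pieces, lowest first."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in 64 bits")
    return [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]


def mov_movk_sequence(value: int, reg: str = "x0") -> List[str]:
    """Worst-case materialization: one mov followed by three movk."""
    parts = split_halfwords(value)
    insts = [f"mov {reg}, #0x{parts[0]:x}"]
    for index, part in enumerate(parts[1:], start=1):
        insts.append(f"movk {reg}, #0x{part:x}, lsl #{16 * index}")
    return insts
```

For example, `mov_movk_sequence(0x8040201008040201)` produces four instructions, whereas loading the value from a constant table needs a single load.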
Mai
ba3029b1f6
Merge pull request #3495 from Sonicadvance1/implement_rdpid
OpcodeDispatcher: Implement rdpid
2024-03-15 17:29:12 -04:00
Ryan Houdek
6757a80365
InstcountCI: Update for prefetch changes 2024-03-15 13:20:28 -07:00
Ryan Houdek
f79991a9d8
OpcodeDispatcher: Implement rdpid
Missed this instruction when implementing rdtscp. Returns the same ID
result in a register just like rdtscp, but without the cycle counter
results. Doesn't touch any flags just like rdtscp.
2024-03-14 20:07:58 -07:00
Ryan Houdek
ca6b2e43e6
Merge pull request #3491 from alyssarosenzweig/rclse/waw
RCLSE: Optimize store-after-store
2024-03-14 03:23:05 -07:00
Ryan Houdek
8a3d08e1d8
Merge pull request #3483 from neobrain/refactor_stealmemoryregion
Allocator: Cleanup StealMemoryRegions implementation
2024-03-14 03:21:09 -07:00
Ryan Houdek
cd2a6ce820
Merge pull request #3469 from alyssarosenzweig/opt/df
Optimize DF representation
2024-03-14 03:18:41 -07:00
Ryan Houdek
4e269d8b80
Merge pull request #3494 from neobrain/fix_libfwd_float_as_int
Library Forwarding: Don't map float/double to fixed-size integers
2024-03-14 03:11:11 -07:00
Tony Wasserka
552e76c001 Library Forwarding: Don't map float/double to fixed-size integers
Fixes #3455.
2024-03-14 10:14:57 +01:00
Mai
caff3cb799
Merge pull request #3493 from Sonicadvance1/bug_for_3478
unittests/ASM: Implements a unit test for #3478
2024-03-13 22:57:39 -04:00
Ryan Houdek
0d33dacc37
unittests/ASM: Implements a unit test for #3478
This unit test recreates the error condition that #3478 causes.
With a string operation that is a backwards copy, the optimization
reads past the end of the page and crashes.

This seemingly only happens with backwards string operations, but the
test covers both forward and backward copies.
2024-03-13 18:36:19 -07:00
Ryan Houdek
ba7b69eea2
InstCountCI: Adds prefetch addressing limits 2024-03-12 21:38:28 -07:00
Ryan Houdek
8056bee82b
OpcodeDispatcher: Implement support for the various prefetch instructions
x86 has a few prefetch instructions.
- prefetch - One of two classic 3DNow! instructions
   - Prefetch into L1 data cache
- prefetchw - One of two classic 3DNow! instructions
   - Implies prefetch into L1 data cache
   - Prefetch cacheline with intent to write and exclusive ownership

- prefetchnta
   - Prefetch non-temporal data with respect to /all/ cache levels
   - Assumes inclusive caches?
- prefetch{t0,t1,t2}
   - Prefetch data with respect to each cache level
   - T0 = L1 and higher
   - T1 = L2 and higher
   - T2 = L3 and higher

**Some silly duplicates**
- prefetchwt1
   - Duplicate of prefetchw but explicitly L1 data cache
- prefetch_exclusive
   - Duplicate of prefetch

God Of War 2018 uses prefetchw as a hint for exclusive ownership of the
cacheline in some very aggressive spin-loops. Let's implement the
operations to help it along.
2024-03-12 21:37:31 -07:00
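The instruction list above can be summarised as a small lookup table. This Python sketch only restates what the commit message says about each mnemonic; the `PrefetchHint` type and the table name are hypothetical, and nothing here describes how FEX lowers these to ARM64.

```python
from typing import NamedTuple, Optional


class PrefetchHint(NamedTuple):
    level: Optional[int]   # closest cache level targeted; None if non-temporal
    write_intent: bool     # prefetch with intent to write / exclusive ownership
    non_temporal: bool     # data should not be kept in the cache hierarchy


# Derived from the commit message above, one entry per mnemonic.
PREFETCH_HINTS = {
    "prefetch":           PrefetchHint(1, False, False),
    "prefetchw":          PrefetchHint(1, True, False),
    "prefetchnta":        PrefetchHint(None, False, True),
    "prefetcht0":         PrefetchHint(1, False, False),
    "prefetcht1":         PrefetchHint(2, False, False),
    "prefetcht2":         PrefetchHint(3, False, False),
    "prefetchwt1":        PrefetchHint(1, True, False),   # duplicate of prefetchw
    "prefetch_exclusive": PrefetchHint(1, False, False),  # duplicate of prefetch
}
```

The two "silly duplicates" collapse onto existing entries, which is why only a handful of distinct hints need to be represented.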
Ryan Houdek
cc635a54f8
IR: Implements support for prefetch operation 2024-03-12 21:19:50 -07:00
Ryan Houdek
217d9d8c50
ARMEmitter: Fixes prfm with negative or unaligned offsets 2024-03-12 21:18:23 -07:00
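As background for the fix above: the AArch64 `prfm` unsigned-offset form only encodes byte offsets that are multiples of 8 in the range 0 to 32760, while `prfum` takes a signed, unscaled offset from -256 to 255. The commit does not show how the emitter handles this, so the selection logic below is an assumption, and the function name is hypothetical.

```python
def select_prefetch_encoding(offset: int) -> str:
    """Pick an encoding for a prefetch with an immediate byte offset.

    Assumed policy, for illustration only.
    """
    if 0 <= offset <= 32760 and offset % 8 == 0:
        return "prfm"          # scaled, unsigned 12-bit immediate
    if -256 <= offset <= 255:
        return "prfum"         # unscaled, signed 9-bit immediate
    return "materialize"       # offset must be computed into a register first
```

Under this policy a negative offset such as -8 or an unaligned one such as 3 falls through to `prfum` instead of being mis-encoded as a scaled immediate.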
Tony Wasserka
a047ac1699 Allocator: Test CollectMemoryGaps instead of StealMemoryRegions and restore the original interfaces 2024-03-12 10:49:31 +01:00
Tony Wasserka
bb0b114fc8 Allocator: Miscellaneous cleanups 2024-03-12 10:49:30 +01:00
Tony Wasserka
ccd6c15316 Allocator: Use std::from_chars instead of parsing digits manually 2024-03-12 10:49:30 +01:00
Tony Wasserka
0a1fe1c8c2 Allocator: Parse process mappings per-line instead of per-character 2024-03-12 10:49:30 +01:00
Tony Wasserka
f43fe5fd63 Allocator: Stop parsing more eagerly
This is a soft-revert of eaf83aa. That change is no longer needed, since the
stack special case is handled externally now.
2024-03-12 10:49:30 +01:00
Tony Wasserka
dce9f651fd Allocator: Split off memory gap collection to a separate function
This function can be unit-tested more easily, and the stack special case is
more cleanly handled as a post-collection step.

There is a minor functional change: The stack special case didn't trigger
previously if the range end was within the stack mapping. This is now fixed.
2024-03-12 10:49:30 +01:00
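The Allocator commits above describe parsing process mappings line by line and collecting the gaps between them in a separate, testable function. A rough Python sketch of that idea follows. It is not the FEX code: the function names are hypothetical, the input is assumed to be in the address-sorted format of `/proc/self/maps`, and the stack special case is left out.

```python
from typing import List, Tuple


def parse_mapping_line(line: str) -> Tuple[int, int]:
    """Return (begin, end) from a line such as '1000-2000 r--p ...'."""
    addr_range = line.split(maxsplit=1)[0]
    begin, end = addr_range.split("-")
    return int(begin, 16), int(end, 16)


def collect_memory_gaps(maps_text: str, range_begin: int,
                        range_end: int) -> List[Tuple[int, int]]:
    """Collect unmapped [begin, end) gaps inside [range_begin, range_end)."""
    gaps = []
    cursor = range_begin
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        begin, end = parse_mapping_line(line)
        if end <= cursor:
            continue            # mapping lies entirely below the cursor
        if begin >= range_end:
            break               # mappings are sorted, nothing left in range
        if begin > cursor:
            gaps.append((cursor, begin))
        cursor = max(cursor, end)
    if cursor < range_end:
        gaps.append((cursor, range_end))
    return gaps
```

Taking the mappings as a plain string rather than reading the file inside the function is what makes a routine like this easy to unit-test with hand-written input.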
Tony Wasserka
0d71f169d0 Allocator: Adopt a more testable interface for StealMemoryRegions 2024-03-12 10:49:30 +01:00
Tony Wasserka
430ac0f70a Allocator: Fix format strings 2024-03-12 10:49:30 +01:00
Alyssa Rosenzweig
063b81da1d InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00
Alyssa Rosenzweig
7629007cfa OpcodeDispatcher: allow upper garbage on STOS
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00
Alyssa Rosenzweig
03c6abdad4 OpcodeDispatcher: optimize DF add
fuse the shift the right way

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00
Alyssa Rosenzweig
c99cbe6d0a JIT: switch DF representation
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00
Alyssa Rosenzweig
e3ee65e491 OpcodeDispatcher: use transformed DF for memset/memcpy
Use the 1/-1 representation instead of 0/1. This will be better by the end of
the series.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00
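The 1/-1 representation mentioned above pays off when computing the pointer step of a string operation. On x86, DF=0 means addresses increase and DF=1 means they decrease. The Python sketch below compares the two representations; the function names are hypothetical and this is not the FEX IR.

```python
def to_direction(df_flag: int) -> int:
    """Convert the architectural 0/1 flag to the 1/-1 representation."""
    return 1 - 2 * df_flag


def step_from_flag(df_flag: int, size: int) -> int:
    """0/1 representation: convert to a direction, then scale by the size."""
    return size * (1 - 2 * df_flag)


def step_from_direction(df_dir: int, size: int) -> int:
    """1/-1 representation: the step is a single multiply."""
    return size * df_dir
```

Storing the direction already transformed moves the `1 - 2 * flag` conversion to the rare instructions that write DF, instead of repeating it in every string operation that reads it.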
Alyssa Rosenzweig
aee00f524c OpcodeDispatcher: use DF retrieval helpers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00
Alyssa Rosenzweig
a76321c6c1 OpcodeDispatcher: add DF retrieval helpers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-11 18:50:31 -04:00