9098 Commits

Author SHA1 Message Date
Mai
542f454630
Merge pull request #3517 from Sonicadvance1/remove_mman
FEXCore: Removes vestigial mman SMC checking
2024-03-26 23:28:05 -04:00
Ryan Houdek
4ea6305940
Merge pull request #3521 from neobrain/fix_libfwd_vkcreateinstance
Library Forwarding/vulkan: Fix query of vkCreateInstance function pointer
2024-03-26 08:59:23 -07:00
Tony Wasserka
4d8ffa2abb Library Forwarding/vulkan: Fix query of vkCreateInstance function pointer
The Vulkan specification states that querying "global commands" like
vkCreateInstance with a non-NULL instance is undefined behavior. Indeed, some
implementations will return null pointers in such cases.

Instead, we can drop the query from DoSetupWithInstance altogether, since
the library initializer will load the function pointer using dlsym instead.

Fixes #3519.
2024-03-26 16:25:51 +01:00
Ryan Houdek
ade0c46845
FEXLoader: Add a way to sleep a process on startup
I find myself reimplementing this nearly monthly. Actually codify it so
I can stop reimplementing it.
2024-03-26 07:48:09 -07:00
Ryan Houdek
1450c92b60
Merge pull request #3518 from pmatos/20MFix
Put <20M in double quotes to avoid truncate error
2024-03-26 05:10:45 -07:00
Paulo Matos
53f02ee869 Put <20M in double quotes to avoid truncate error
Avoids bash assuming < is redirection.
See error in https://github.com/FEX-Emu/FEX/actions/runs/8434045431/job/23096421079#step:18:16
2024-03-26 11:47:57 +00:00
Ryan Houdek
6f29e75f67
FEXCore: Removes vestigial mman SMC checking
This hasn't actually been wired up to anything since some refactoring
occurred two years ago.
2024-03-26 02:56:26 -07:00
Ryan Houdek
c1c797bcba
FEXConfig: Add new TSO levers
Nice and convenient when testing applications.
2024-03-26 02:50:54 -07:00
Alyssa Rosenzweig
bbc232741b InstCountCI: Update
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:20 -04:00
Alyssa Rosenzweig
dfe0bdd7f2 OpcodeDispatcher: rewrite DAS
exhaustively checked against the Intel pseudocode since this is tricky:

  def intel(AL, CF, AF):
      old_AL = AL
      old_CF = CF
      CF = False

      if (AL & 0x0F) > 9 or AF:
          Borrow = AL < 6
          AL = (AL - 6) & 0xff
          CF = old_CF or Borrow
          AF = True
      else:
          AF = False

      if (old_AL > 0x99) or old_CF:
          AL = (AL - 0x60) & 0xff
          CF = True

      return (AL & 0xff, CF, AF)

  def fex(AL, CF, AF):
      AF = AF | ((AL & 0xf) > 9)
      CF = CF | (AL > 0x99)
      NewCF = CF | (AF if (AL < 6) else CF)
      AL = (AL - 6) if AF else AL
      AL = (AL - 0x60) if CF else AL
      return (AL & 0xff, NewCF, AF)

  for AL in range(256):
      for CF in [False, True]:
          for AF in [False, True]:
              ref = intel(AL, CF, AF)
              test = fex(AL, CF, AF)
              print(AL, "CF" if CF else "", "AF" if AF else "", ref, test)
              assert(ref == test)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
e26481e3cc OpcodeDispatcher: simplify AAM
in the area.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
86b5a2f352 OpcodeDispatcher: simplify AAD
noticed in the area.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
2bf880c43a OpcodeDispatcher: rewrite AAS
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
583d4f8f94 OpcodeDispatcher: factor out CalculateAFForDecimal
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
3ca2c4377f OpcodeDispatcher: rewrite AAA
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Ryan Houdek
32ec4a3c81
Merge pull request #3508 from Sonicadvance1/stop_using_vla
ELFParser: Stop using a VLA
2024-03-25 12:18:37 -07:00
Ryan Houdek
76983476b9
Merge pull request #3504 from Sonicadvance1/fix_loop_a16
OpcodeDispatcher: Fixes 32-bit mode LOOP RCX register usage
2024-03-25 12:18:14 -07:00
Alyssa Rosenzweig
150af80f3f
Merge pull request #3512 from Sonicadvance1/panic_spilling_block
InstcountCI: Adds a block that is causing panic spilling
2024-03-25 13:13:48 -04:00
Alyssa Rosenzweig
a8b59c16d6
Merge pull request #3513 from Sonicadvance1/panic_spilling_rip
RA: Adds RIP when a block panic spills
2024-03-25 13:13:15 -04:00
Alyssa Rosenzweig
949717a95f OpcodeDispatcher: rewrite DAA implementation
Based on https://www.righto.com/2023/01/

New implementation is branchless, which is theoretically easier to RA. It's also
massively simpler which is good for a demon opcode.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 13:00:59 -04:00
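To illustrate why a branchless DAA is plausible (commit 949717a95f above), here is a minimal sketch in the same style as the DAS check in dfe0bdd7f2. It is not FEX's actual implementation: `daa_intel` transcribes the Intel pseudocode as I understand it, and `daa_branchless` is a hypothetical select-based formulation written for this example.

```python
def daa_intel(al, cf, af):
    """Reference model following the Intel DAA pseudocode (AL is 8-bit)."""
    old_al, old_cf = al, cf
    cf = False
    if (al & 0x0F) > 9 or af:
        carry = (al + 6) > 0xFF
        al = (al + 6) & 0xFF
        cf = old_cf or carry
        af = True
    else:
        af = False
    if old_al > 0x99 or old_cf:
        al = (al + 0x60) & 0xFF
        cf = True
    else:
        cf = False
    return (al, cf, af)


def daa_branchless(al, cf, af):
    """Hypothetical branchless form: two flag computations, two selects."""
    af = af or (al & 0x0F) > 9
    cf = cf or al > 0x99
    adjust = (6 if af else 0) + (0x60 if cf else 0)
    return ((al + adjust) & 0xFF, cf, af)


def check_all():
    """Exhaustively compare both models over all 256 * 2 * 2 inputs."""
    for al in range(256):
        for cf in (False, True):
            for af in (False, True):
                assert daa_intel(al, cf, af) == daa_branchless(al, cf, af)
    return True
```

Calling `check_all()` walks the full input space, which is small enough that exhaustive checking is cheaper than reasoning about corner cases.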
Alyssa Rosenzweig
693d86dd67 OpcodeDispatcher: add SetAFAndFixup helper
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 12:59:19 -04:00
Alyssa Rosenzweig
ea4fce7a43 InstcountCI: add flagm primary 32-bit
track the demon opcodes

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 12:49:00 -04:00
Ryan Houdek
3034edb0aa
RA: Adds RIP when a block panic spills
I find myself adding this every time I find a game that panic spills.
Let's just print it out.
2024-03-24 17:11:29 -07:00
Ryan Houdek
c025039651
InstcountCI: Adds a block that is causing panic spilling
2024-03-24 17:10:00 -07:00
Ryan Houdek
64f47d1ec2
FEXCore: Adds more TSO control levers
Lets us control TSO visibility for vector loads/stores and memcpy/memset.
This just gives us a bit more configuration than TSO off or on.
2024-03-24 16:34:18 -07:00
Ryan Houdek
ea31363221
Linux/Threads: Fixes a stack memory leak for pthreads
Same situation as the last stack memory leak fix; this is fairly tricky
since it is dealing with stack pivoting. Fixes the memory leak around
pthread stack allocations, lowering memory usage for applications
that constantly spin up and destroy threads (like Steam).

We need to let glibc allocate a minimum sized stack (128KB and we can't
control it) to work around a race condition with DTV/TLS regions. This
means we need to do a stack pivot once the thread starts executing.

We also need to be careful because the `PThread` object is deleted
inside of the execution thread, which was resulting in a use-after-free
bug.

There are definitely some more memory leaks that I'm still fighting, and I have
noticed in my abusive thread creation program that we might want to
change some jemalloc options to more aggressively cut down on residency.
This is just one out of many.
2024-03-24 05:22:22 -07:00
Ryan Houdek
70befc216f
Telemetry: Allow redirecting directory that data is written to
This will be necessary
2024-03-24 00:47:35 -07:00
Ryan Houdek
50f62663ac
ELFParser: Stop using a VLA
Clang-18 complains about this, use a vector instead.
2024-03-22 22:51:57 -07:00
Ryan Houdek
60755acef0
FEXLoader: Add some debug-only tracking for FEX owned FDs
I remember seeing some application last year where they closed a FEX
owned FD but now I don't remember what it was. This can really mess us
up so add some debug tracking so we can try and find it again.

Might be something specifically around Flatpak, AppImage, or Chrome's
sandbox. I have some ideas about how to work around these problems if
they crop up but need to find the problem applications again.
2024-03-22 22:49:26 -07:00
Mai
002ca360f8
Merge pull request #3506 from Sonicadvance1/telemetry_rename
Telemetry: Rename old file instead of copying
2024-03-23 00:24:02 -04:00
Ryan Houdek
4952b2e16c
Telemetry: Rename old file instead of copying
Since we do an immediate overwrite of the file we are copying, we can
instead do a rename. Failure on rename is fine: it either means the
telemetry file didn't exist initially or there was some other permission
error, so the telemetry would be lost regardless.
2024-03-21 22:51:20 -07:00
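The rename-instead-of-copy idea from 4952b2e16c can be sketched as follows. This is only an illustration of the pattern, not FEX's code (which is C++): the function name `rotate_telemetry` and the `.1` suffix are assumptions made up for this example.

```python
import os


def rotate_telemetry(path, old_suffix=".1"):
    """Move an existing telemetry file aside before a fresh one is written.

    Hypothetical helper: the name and suffix are illustrative only.
    Returns True if the old file was moved, False if the rename failed.
    """
    try:
        os.rename(path, path + old_suffix)
        return True
    except OSError:
        # Either the file did not exist yet or we lack permission.
        # In both cases the old data could not have been preserved anyway,
        # so the failure is deliberately ignored.
        return False
```

A rename avoids reading and rewriting the file contents, and because the original path is overwritten immediately afterwards, nothing is lost compared to the copy.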
Ryan Houdek
cccf263080
InstCountCI: Update for Telemetry offset changes
2024-03-21 21:10:03 -07:00
Ryan Houdek
5a35e119fe
Telemetry: Adds tracker for non-canonical memory access crash
This may be useful for tracking TSO faulting when it manages to fetch
stale data. While most TSO crashes are due to nullptr dereferences, this
can still check for the corruption case.
2024-03-21 20:47:36 -07:00
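For readers unfamiliar with the term in 5a35e119fe, a non-canonical x86-64 address is one whose upper bits are not a sign extension of the highest implemented virtual address bit. The sketch below assumes 48-bit virtual addressing; how FEX actually detects the crash is not stated in the commit message, and `is_canonical` is a name invented for this example.

```python
def is_canonical(addr, va_bits=48):
    """Return True if a 64-bit address is canonical for va_bits of VA space.

    Assumes 48-bit virtual addresses by default: bits 63..47 must be
    all zeros or all ones.
    """
    addr &= (1 << 64) - 1
    upper = addr >> (va_bits - 1)            # bits 63 .. va_bits-1
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones
```

Stale or corrupted pointer data will often fail this check, which is why such a fault can be a useful hint of corruption as opposed to a plain null dereference.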
Ryan Houdek
9ab930cb26
unittests/ASM: Adds tests for loop instruction address size overrides
32-bit test would fail if the 16-bit address size override wasn't
respected.
2024-03-21 20:18:43 -07:00
Ryan Houdek
824f122680
OpcodeDispatcher: Fixes 32-bit mode LOOP RCX register usage
In 64-bit mode, the LOOP instruction's RCX register usage is 64-bit or
32-bit.
In 32-bit mode, the LOOP instruction's RCX register usage is 32-bit or
16-bit.

FEX wasn't handling the 16-bit case at all which was causing the LOOP
instruction to effectively always operate at 32-bit size. Now this is
correctly supported, and it also stops treating the operation as 64-bit.
2024-03-21 20:13:15 -07:00
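The counter-width rule described in 824f122680 can be modelled with a short sketch. The function names are hypothetical, and only the counter portion of the register is modelled (what happens to the bits above the counter width is left out).

```python
def loop_counter_bits(is_64bit_mode, has_addr_size_override):
    """Width of the LOOP counter, selected by mode and address-size prefix."""
    if is_64bit_mode:
        return 32 if has_addr_size_override else 64
    return 16 if has_addr_size_override else 32


def loop_step(counter, bits):
    """Decrement the counter at the given width; return (new_count, taken)."""
    new_count = (counter - 1) & ((1 << bits) - 1)
    return new_count, new_count != 0
```

With a register value of `0x00010001` in 32-bit mode, a 16-bit counter reaches zero and the branch falls through, whereas treating the counter as 32-bit leaves `0x00010000` and takes the branch. That difference is the kind of misbehaviour the 16-bit case guards against.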
Ryan Houdek
8852d94416
Merge pull request #3503 from alyssarosenzweig/opt/loop
OpcodeDispatcher: optimize LOOP/N/E
2024-03-21 20:05:50 -07:00
Mai
0c24aea27e
Merge pull request #3502 from Sonicadvance1/remove_termux
Removes false termux support
2024-03-21 12:45:01 -04:00
Alyssa Rosenzweig
82ba16c6ed OpcodeDispatcher: optimize LOOP/N/E
Don't clobber NZCV.

Before/after assembly from the Primary_E1 unit test:

< 4340: [INFO] cset w20, ne
< 4340: [INFO] mrs x21, nzcv
< 4340: [INFO] cmp x5, #0x0 (0)
< 4340: [INFO] cset x22, ne
< 4340: [INFO] and x20, x22, x20
< 4340: [INFO] msr nzcv, x21
< 4340: [INFO] cbnz x20, #+0x8 (addr 0xffff896f8084)
< 4340: [INFO] b #+0x1c (addr 0xffff896f809c)
< 4340: [INFO] ldr x0, pc+8 (addr 0xffff896f808c)
---
> 4340: [INFO] csel x20, x5, xzr, ne
> 4340: [INFO] cbnz x20, #+0x8 (addr 0xfffed7308070)
> 4340: [INFO] b #+0x1c (addr 0xfffed7308088)
> 4340: [INFO] ldr x0, pc+8 (addr 0xfffed7308078)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-21 12:08:40 -04:00
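The before/after assembly in 82ba16c6ed can be read as two ways of computing the same branch decision. The model below is a sketch that assumes `x5` holds the already-decremented counter and that `ne` is the flag condition being tested; the function names are made up for this example.

```python
def taken_old(counter, cond):
    """Old sequence: cset both conditions to 0/1, AND them, then cbnz."""
    cond_bit = 1 if cond else 0              # cset w20, ne
    nonzero_bit = 1 if counter != 0 else 0   # cmp x5, #0 ; cset x22, ne
    return (cond_bit & nonzero_bit) != 0     # and ; cbnz


def taken_new(counter, cond):
    """New sequence: select the counter or zero, then cbnz."""
    selected = counter if cond else 0        # csel x20, x5, xzr, ne
    return selected != 0                     # cbnz
```

Because the new form never executes a `cmp`, the flags are left untouched and the `mrs`/`msr` pair that saved and restored NZCV is no longer needed.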
Ryan Houdek
45ea0cd782
Removes false termux support
It was a funny joke that this was here, but it is fundamentally
incompatible with what we're doing. All those users are running proot
anyway because of how broken running under termux directly is.

Just remove this from here.
2024-03-20 22:04:32 -07:00
Ryan Houdek
6ce366ef35
Merge pull request #3499 from Sonicadvance1/overlapping_memcpy_unittest
unittests/ASM: Adds a test for overlapping memcpy using rep movs
2024-03-20 19:52:19 -07:00
Ryan Houdek
862d63adf2
unittests/ASM: Adds a test for overlapping memcpy using rep movs
Caused by #3478

The review missed that it could cause issues. bylaws already
has a fix incoming that will get this unit test working.
2024-03-20 18:44:36 -07:00
Ryan Houdek
167896dc9d
Merge pull request #3501 from bylaws/memcpy
FEXCore: Fallback to the memcpy slow path for overlaps within 32 bytes
2024-03-20 18:42:56 -07:00
Billy Laws
12fb26f9c0 Update InstCountCI
2024-03-20 21:08:59 +00:00
Billy Laws
d490cb1b79 FEXCore: Fallback to the memcpy slow path for overlaps within 32 bytes
Take, e.g., a forward rep movsb copy from addr 0 to 1. Since this is a
bytewise copy, the expected behaviour is:
before: aaabbbb...
after: aaaaaaa...
but by copying in 32-byte chunks we end up with:
after: aaaabbbb...
due to the self-overwrites not occurring within a single 32-byte copy.
2024-03-20 20:54:19 +00:00
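The overlap problem described in d490cb1b79 can be reproduced with a small simulation. This is a sketch, not the FEX fast path: the chunked copy here simply loads a whole block before storing it, and `overlaps_within_chunk` is an assumed form of the fallback condition based only on the commit title.

```python
def copy_bytewise(buf, dst, src, n):
    """Forward byte-by-byte copy, as rep movsb behaves."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def copy_chunked(buf, dst, src, n, chunk=32):
    """Forward copy in blocks: each block is fully loaded before it is stored."""
    i = 0
    while i < n:
        size = min(chunk, n - i)
        block = bytes(buf[src + i:src + i + size])
        buf[dst + i:dst + i + size] = block
        i += size


def overlaps_within_chunk(dst, src, chunk=32):
    """Assumed fallback test: regions closer than one chunk need the slow path."""
    return abs(dst - src) < chunk


def demo():
    """Run both copies on the commit message's example (addr 0 to addr 1)."""
    a = bytearray(b"aaabbbbb")
    b = bytearray(b"aaabbbbb")
    copy_bytewise(a, 1, 0, 7)
    copy_chunked(b, 1, 0, 7)
    return bytes(a), bytes(b)
```

`demo()` yields `b"aaaaaaaa"` for the bytewise copy and `b"aaaabbbb"` for the chunked one, matching the two "after" lines in the commit message.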
Billy Laws
94fecb9dad FEXCore: Remove needless alignment checks for the mem{cpy,set} fastpath
2024-03-20 20:54:09 +00:00
Ryan Houdek
7dcacfe990
Merge pull request #3478 from bylaws/memcpy
FEXCore: Add non-atomic Memcpy and Memset IR fast paths
2024-03-18 18:56:44 -07:00
Billy Laws
29b05f6b90 Update InstCountCI
2024-03-18 23:30:19 +00:00
Billy Laws
8d4d8fe3e5 FEXCore: Add non-atomic Memcpy and Memset IR fast paths
When TSO is disabled, vector LDP/STP can be used for a two-instruction,
32-byte memory copy, which is significantly faster than the
current byte-by-byte copy. Performing two such copies directly after
one another also marginally increases copy speed for all sizes >=64.
2024-03-18 23:28:50 +00:00
Ryan Houdek
ab8ee64352
Merge pull request #3497 from Sonicadvance1/movmaskb_constant
JIT: Optimize pmovmaskb with a named vector constant
2024-03-18 16:08:40 -07:00
Alyssa Rosenzweig
2a9fcc6a66
Merge pull request #3492 from Sonicadvance1/implement_prefetch
OpcodeDispatcher: Implement support for the various prefetch instructions
2024-03-18 07:49:47 -04:00