Commit Graph

1082 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
3b052e826f OpcodeDispatcher: calculate PF with integer ops
based on clang's __builtin_parity

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-04-01 14:12:32 -04:00
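For reference, x86 sets PF when the low 8 bits of a result contain an even number of set bits. The integer-only approach can be sketched as a shift-and-XOR fold, in the spirit of clang's `__builtin_parity`. The function name `parity_flag` is hypothetical, and this is an illustrative sketch rather than the exact op sequence FEX emits.

```python
def parity_flag(result: int) -> int:
    """Return x86 PF: 1 when the low byte has an even number of set bits."""
    x = result & 0xFF   # PF only considers the low 8 bits
    x ^= x >> 4         # fold the high nibble into the low nibble
    x ^= x >> 2         # fold 4 bits down to 2
    x ^= x >> 1         # bit 0 now holds the XOR of all 8 bits
    return (x & 1) ^ 1  # PF is the inverse of that XOR
```

Only shifts, XORs and a final mask are needed, so no lookup table or population count is involved.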
Alyssa Rosenzweig
65ec191dc1 IR: add XornShift
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-04-01 14:12:32 -04:00
Alyssa Rosenzweig
b1ddd8cd3b
Merge pull request #3541 from alyssarosenzweig/opt/clc
optimize clc
2024-04-01 13:51:10 -04:00
Alyssa Rosenzweig
f2d001e721
Merge pull request #3543 from alyssarosenzweig/ra/dead-code
RA: drop dead block interference code
2024-04-01 13:51:00 -04:00
Ryan Houdek
e2a095372e
Merge pull request #3534 from Sonicadvance1/move_ir_defines
FEXCore: Move nearly all IR definitions to internal
2024-04-01 10:00:20 -07:00
Ryan Houdek
5c29c9d464
Merge pull request #3527 from Sonicadvance1/move_type_defines
Moves FHU TypeDefines to FEXCore includes
2024-04-01 08:57:22 -07:00
Ryan Houdek
3bed305660
Merge pull request #3526 from Sonicadvance1/move_codeloader
FEXCore: Moves CodeLoader to frontend
2024-04-01 07:52:02 -07:00
Ryan Houdek
f6639c3594
Merge pull request #3525 from Sonicadvance1/move_cpubackend
FEXCore: Moves CPUBackend definition internal
2024-04-01 06:47:34 -07:00
Alyssa Rosenzweig
ca1ec232c9 RA: drop dead block interference code
Unused, and new RA won't use it either. Torch it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-31 20:51:11 -04:00
Alyssa Rosenzweig
4452f0acba ConstProp: optimize rmif with 0 for clc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-31 20:01:44 -04:00
Alyssa Rosenzweig
67baff8a57
Merge pull request #3537 from Sonicadvance1/remove_vla_ra
RA: Removes VLA usage
2024-03-31 14:44:37 -04:00
Ryan Houdek
fedc24be1e
RA: Removes VLA usage
Just like #3508, clang-18 complains about VLA usage.

This vector is relatively small, only around 18 elements but is
semi-dynamic depending on arch and if FEXCore is targeting Linux or
Win32.
2024-03-30 16:50:04 -07:00
Alyssa Rosenzweig
706065b0e2 OpcodeDispatcher: accelerate cmpxchg with flagm
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-30 14:12:59 -04:00
Alyssa Rosenzweig
9fd32f07cb JIT: preserve nzcv for the slow atomic path
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-30 14:12:59 -04:00
Alyssa Rosenzweig
deba6a1b76 JIT: add comment about unaligned backpatching
save future me some grief.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-30 14:12:59 -04:00
Alyssa Rosenzweig
d25ace43aa
Merge pull request #3528 from alyssarosenzweig/ra/xsave-xrstor
Eliminate crossblock liveness in xsave/xrstor
2024-03-30 14:11:25 -04:00
Ryan Houdek
ed3af580c5
FEXCore: Move nearly all IR definitions to internal
It has been a long time coming that FEX no longer needed to leak IR
implementation details to the frontend; this was legacy due to IR CI and
various other problems.

Now that the last bits of IR leaking have been removed, move everything
that we can internally to the implementation.
We still have a couple of minor details in the exposed IR.h to the
frontend, but these are limited to a few enums and some thunking struct
information rather than all the implementation details.

No functional change with this, just moving headers around.
2024-03-29 17:20:18 -07:00
Ryan Houdek
8564290f76
FEXCore: Remove DebugStore map
This hasn't been used and is blocking refactoring more code.
2024-03-29 14:58:44 -07:00
Alyssa Rosenzweig
c513b9685d OpcodeDispatcher: eliminate crossblock liveness in xsave/xrstor
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-29 09:57:16 -04:00
Ryan Houdek
d11a36eaea
Moves FHU TypeDefines to FEXCore includes
FEXCore includes was including an FHU header which would result in
compilation failure for external projects trying to link to libFEXCore.

Moves it over to fix this; it was the only FHU usage in FEXCore/include
NFC
2024-03-29 02:54:54 -07:00
Ryan Houdek
f46e88ebdb
FEXCore: Moves CPUBackend definition internal
This is no longer necessary to be part of the public API. Moves the
header internally.

Needed to pass through `IsAddressInCodeBuffer` from CPUBackend through
the Context object, but otherwise no functional change.
2024-03-29 02:27:29 -07:00
Ryan Houdek
20eb338644
FEXCore: Moves CodeLoader to frontend
FEXCore no longer has a need for this since a bunch of related code was
already moved to the frontend. Move the CodeLoader now.
2024-03-29 02:24:53 -07:00
Ryan Houdek
aa26b6288e
Merge pull request #3522 from alyssarosenzweig/ra/cmpxchg8
OpcodeDispatcher: eliminate branch in cmpxchg pair
2024-03-27 21:56:19 -07:00
Ryan Houdek
624bc3fce5
Merge pull request #3520 from Sonicadvance1/sleep_process
FEXLoader: Add a way to sleep a process on startup
2024-03-27 18:35:06 -07:00
Alyssa Rosenzweig
61758ea47d OpcodeDispatcher: eliminate branch in cmpxchg pair
In the old case:

* if we take the branch, 1 instruction
* if we don't take the branch, 3 instructions
* branch predictor fun
* 3 instructions of icache pressure

In the new case:

* unconditionally 2 instructions
* no branch predictor dependence
* 2 instructions of icache pressure

This should not be more than negligibly worse, and it simplifies things for RA.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-27 12:40:06 -04:00
Ryan Houdek
7b74ca1931
Merge pull request #3514 from alyssarosenzweig/opt/demon
rewrite Demon Addition Adjust (DAA) and other demonic opcodes
2024-03-26 23:24:00 -07:00
Ryan Houdek
24fd28ed9e
Merge pull request #3511 from Sonicadvance1/more_tso_levers
FEXCore: Adds more TSO control levers
2024-03-26 23:23:41 -07:00
Ryan Houdek
970d5d5b13
Merge pull request #3509 from Sonicadvance1/allow_telemetry_redirect
Telemetry: Allow redirecting directory that data is written to
2024-03-26 23:23:05 -07:00
Ryan Houdek
7f90ca53f7
Merge pull request #3505 from Sonicadvance1/telemetry_noncanonical
Telemetry: Adds tracker for non-canonical memory access crash
2024-03-26 23:21:32 -07:00
Ryan Houdek
ade0c46845
FEXLoader: Add a way to sleep a process on startup
I find myself reimplementing this nearly monthly. Actually codify it so
I can stop reimplementing it.
2024-03-26 07:48:09 -07:00
Ryan Houdek
6f29e75f67
FEXCore: Removes vestigial mman SMC checking
This wasn't actually wired up to anything ever since some refactoring
occurred two years ago.
2024-03-26 02:56:26 -07:00
Alyssa Rosenzweig
dfe0bdd7f2 OpcodeDispatcher: rewrite DAS
exhaustively checked against the Intel pseudocode since this is tricky:

  def intel(AL, CF, AF):
      old_AL = AL
      old_CF = CF
      CF = False

      if (AL & 0x0F) > 9 or AF:
          Borrow = AL < 6
          AL = (AL - 6) & 0xff
          CF = old_CF or Borrow
          AF = True
      else:
          AF = False

      if (old_AL > 0x99) or old_CF:
          AL = (AL - 0x60) & 0xff
          CF = True

      return (AL & 0xff, CF, AF)

  def fex(AL, CF, AF):
      AF = AF | ((AL & 0xf) > 9)
      CF = CF | (AL > 0x99)
      NewCF = CF | (AF if (AL < 6) else CF)
      AL = (AL - 6) if AF else AL
      AL = (AL - 0x60) if CF else AL
      return (AL & 0xff, NewCF, AF)

  for AL in range(256):
      for CF in [False, True]:
          for AF in [False, True]:
              ref = intel(AL, CF, AF)
              test = fex(AL, CF, AF)
              print(AL, "CF" if CF else "", "AF" if AF else "", ref, test)
              assert(ref == test)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
e26481e3cc OpcodeDispatcher: simplify AAM
in the area.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
86b5a2f352 OpcodeDispatcher: simplify AAD
noticed in the area.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
2bf880c43a OpcodeDispatcher: rewrite AAS
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
583d4f8f94 OpcodeDispatcher: factor out CalculateAFForDecimal
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Alyssa Rosenzweig
3ca2c4377f OpcodeDispatcher: rewrite AAA
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 19:43:10 -04:00
Ryan Houdek
76983476b9
Merge pull request #3504 from Sonicadvance1/fix_loop_a16
OpcodeDispatcher: Fixes 32-bit mode LOOP RCX register usage
2024-03-25 12:18:14 -07:00
Alyssa Rosenzweig
949717a95f OpcodeDispatcher: rewrite DAA implementation
Based on https://www.righto.com/2023/01/

New implementation is branchless, which is theoretically easier to RA. It's also
massively simpler which is good for a demon opcode.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 13:00:59 -04:00
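In the same style as the exhaustive DAS check elsewhere in this log, a select-based DAA can be compared against the Intel pseudocode. This is a sketch under assumptions: `intel_daa` transcribes the Intel SDM pseudocode, while `branchless_daa` and `check_daa` are hypothetical names for one possible branch-free formulation, not necessarily the one FEX uses.

```python
def intel_daa(al, cf, af):
    """DAA following the Intel SDM pseudocode."""
    old_al, old_cf = al, cf
    cf = False
    if (al & 0x0F) > 9 or af:
        cf = old_cf or (al + 6 > 0xFF)  # carry out of AL + 6
        al = (al + 6) & 0xFF
        af = True
    else:
        af = False
    if old_al > 0x99 or old_cf:
        al = (al + 0x60) & 0xFF
        cf = True
    else:
        cf = False
    return (al, cf, af)


def branchless_daa(al, cf, af):
    """Hypothetical select-only formulation (no control flow needed)."""
    af = af or ((al & 0x0F) > 9)
    cf = cf or (al > 0x99)
    al = (al + 6) if af else al
    al = (al + 0x60) if cf else al
    return (al & 0xFF, cf, af)


def check_daa():
    """Compare both versions over every (AL, CF, AF) input."""
    checked = 0
    for al in range(256):
        for cf in (False, True):
            for af in (False, True):
                assert intel_daa(al, cf, af) == branchless_daa(al, cf, af)
                checked += 1
    return checked
```

Calling `check_daa()` walks all 1024 input combinations, which is small enough to make exhaustive checking the natural test for this kind of opcode.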
Alyssa Rosenzweig
693d86dd67 OpcodeDispatcher: add SetAFAndFixup helper
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-25 12:59:19 -04:00
Ryan Houdek
3034edb0aa
RA: Adds RIP when a block panic spills
I find myself adding this every time I find a game that panic spills.
Let's just print it out.
2024-03-24 17:11:29 -07:00
Ryan Houdek
64f47d1ec2
FEXCore: Adds more TSO control levers
Lets us control vector loadstores and memcpy/memset TSO visibility.
This just gives us a bit more configuration rather than TSO off or on.
2024-03-24 16:34:18 -07:00
Ryan Houdek
70befc216f
Telemetry: Allow redirecting directory that data is written to
This will be necessary
2024-03-24 00:47:35 -07:00
Ryan Houdek
4952b2e16c
Telemetry: Rename old file instead of copying
Since we do an immediate overwrite of the file we are copying, we can
instead do a rename. Failure on rename is fine; it will either mean the
telemetry file didn't exist initially, or some other permission error so
the telemetry will get lost regardless.
2024-03-21 22:51:20 -07:00
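The rename-then-overwrite pattern described above can be sketched as follows. The function name `rotate_and_write` and the `.1` backup suffix are assumptions made for illustration, not FEX's actual naming.

```python
import os


def rotate_and_write(path: str, data: str) -> None:
    """Keep the previous file as a backup, then write fresh contents."""
    try:
        # A rename is enough because `path` is rewritten immediately below.
        os.replace(path, path + ".1")  # ".1" suffix is hypothetical
    except OSError:
        # Old file missing or not renameable: that telemetry is lost either
        # way, so the failure is deliberately ignored.
        pass
    with open(path, "w") as f:
        f.write(data)
```

Compared with copying, the rename avoids reading and rewriting the old contents, and a failed rename leaves the caller no worse off than a failed copy would.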
Ryan Houdek
5a35e119fe
Telemetry: Adds tracker for non-canonical memory access crash
This may be useful for tracking TSO faulting when it manages to fetch
stale data. While most TSO crashes are due to nullptr dereferences, this
can still check for the corruption case.
2024-03-21 20:47:36 -07:00
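For context, an x86-64 virtual address is canonical when its upper bits are all copies of the highest implemented address bit. A minimal sketch of such a check, assuming 48-bit virtual addressing by default (the function name and the `va_bits` parameter are hypothetical, and this is not FEX's telemetry code):

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    addr &= (1 << 64) - 1                     # treat as unsigned 64-bit
    top = addr >> (va_bits - 1)               # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1  # the same field, fully set
    return top == 0 or top == all_ones
```

A pointer built from stale or torn data is likely to fail this check, which is what makes a non-canonical fault a useful signal separate from plain nullptr dereferences.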
Ryan Houdek
824f122680
OpcodeDispatcher: Fixes 32-bit mode LOOP RCX register usage
In 64-bit mode, the LOOP instruction's RCX register usage is 64-bit or
32-bit.
In 32-bit mode, the LOOP instruction's RCX register usage is 32-bit or
16-bit.

FEX wasn't handling the 16-bit case at all which was causing the LOOP
instruction to effectively always operate at 32-bit size. Now this is
correctly supported, and it also stops treating the operation as 64-bit.
2024-03-21 20:13:15 -07:00
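The width rule above can be sketched in a few lines. On x86 the LOOP count register follows the address-size attribute, so the address-size prefix selects the narrower width in each mode. The function and parameter names are hypothetical, and the sketch only models the counter value at the selected width, not what happens to the untouched upper register bits.

```python
def loop_counter_bits(is_64bit_mode: bool, addr_size_prefix: bool) -> int:
    """Width of the LOOP count register for the given mode and prefix."""
    if is_64bit_mode:
        return 32 if addr_size_prefix else 64
    return 16 if addr_size_prefix else 32


def loop_step(counter: int, bits: int):
    """Decrement the counter at `bits` width; return (new_value, taken)."""
    mask = (1 << bits) - 1
    new_value = (counter - 1) & mask
    return new_value, new_value != 0
```

For example, with a register value of `0x10001`, a 16-bit LOOP sees the counter reach zero and falls through, while treating it as 32-bit would keep looping.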
Ryan Houdek
8852d94416
Merge pull request #3503 from alyssarosenzweig/opt/loop
OpcodeDispatcher: optimize LOOP/N/E
2024-03-21 20:05:50 -07:00
Alyssa Rosenzweig
82ba16c6ed OpcodeDispatcher: optimize LOOP/N/E
Don't clobber NZCV.

Before/after assembly from the Primary_E1 unit test:

< 4340: [INFO] cset w20, ne
< 4340: [INFO] mrs x21, nzcv
< 4340: [INFO] cmp x5, #0x0 (0)
< 4340: [INFO] cset x22, ne
< 4340: [INFO] and x20, x22, x20
< 4340: [INFO] msr nzcv, x21
< 4340: [INFO] cbnz x20, #+0x8 (addr 0xffff896f8084)
< 4340: [INFO] b #+0x1c (addr 0xffff896f809c)
< 4340: [INFO] ldr x0, pc+8 (addr 0xffff896f808c)
---
> 4340: [INFO] csel x20, x5, xzr, ne
> 4340: [INFO] cbnz x20, #+0x8 (addr 0xfffed7308070)
> 4340: [INFO] b #+0x1c (addr 0xfffed7308088)
> 4340: [INFO] ldr x0, pc+8 (addr 0xfffed7308078)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-03-21 12:08:40 -04:00
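The shorter sequence works because selecting the counter when the condition holds (and zero otherwise) is nonzero exactly when both the condition holds and the counter is nonzero. A small sketch of that equivalence, with hypothetical function names standing in for the two instruction sequences:

```python
def taken_old(cond: bool, counter: int) -> bool:
    # cset from the flags, cmp counter against 0 (which clobbers NZCV, hence
    # the mrs/msr save and restore), then AND the two booleans.
    return cond and (counter != 0)


def taken_new(cond: bool, counter: int) -> bool:
    # csel picks the counter if the condition holds, else zero; cbnz then
    # tests the selected value without touching NZCV.
    selected = counter if cond else 0
    return selected != 0
```

Both functions agree for every input, so the flag save and restore around the compare can be dropped.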
Ryan Houdek
45ea0cd782
Removes false termux support
It was a funny joke that this was here, but it is fundamentally
incompatible with what we're doing. All those users are running proot
anyway because of how broken running under termux directly is.

Just remove this from here.
2024-03-20 22:04:32 -07:00
Billy Laws
d490cb1b79 FEXCore: Fallback to the memcpy slow path for overlaps within 32 bytes
Take e.g. a forward rep movsb copy from addr 0 to 1, the expected
behaviour since this is a bytewise copy is:
before: aaabbbb...
after: aaaaaaa...
but by copying in 32-byte chunks we end up with:
after: aaaabbbb...
due to the self overwrites not occurring within a single 32-byte copy.
2024-03-20 20:54:19 +00:00
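The overlap problem can be reproduced with a small model of the two copy strategies. This is a sketch with hypothetical helper names, modelling a forward copy inside one buffer; it is not FEX's memcpy implementation.

```python
def copy_bytewise(buf: bytearray, src: int, dst: int, n: int) -> None:
    """Forward byte-at-a-time copy, as rep movsb behaves."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def copy_chunked(buf: bytearray, src: int, dst: int, n: int, chunk: int = 32) -> None:
    """Forward copy that loads a whole chunk before storing it."""
    for off in range(0, n, chunk):
        size = min(chunk, n - off)
        data = bytes(buf[src + off:src + off + size])  # load the chunk first
        buf[dst + off:dst + off + size] = data         # then store it


def copies_agree(distance: int, n: int = 64) -> bool:
    """Compare both strategies for a copy from offset 0 to `distance`."""
    size = distance + n
    initial = bytes(i % 251 for i in range(size))  # distinct byte values
    a, b = bytearray(initial), bytearray(initial)
    copy_bytewise(a, 0, distance, n)
    copy_chunked(b, 0, distance, n)
    return a == b
```

In this model `copies_agree(1)` and `copies_agree(31)` are false while `copies_agree(32)` is true, which is consistent with falling back to the slow path only for overlaps within 32 bytes.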