27 Commits

Author SHA1 Message Date
Alyssa Rosenzweig
19a7b06b91 ConstProp: swallow up LongDivideElimination
as usual.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
6b9293979c ConstProp: swallow up InlineCallOptimization
No reason to have a separate pass for this. Merging should be a bit faster
since it eliminates an IR walk.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-06-04 10:09:51 -04:00
Alyssa Rosenzweig
cb00d9171f IR: merge general DCE with flag DCE
Flag DCE needs to do general DCE anyway to converge in one pass. So we can move
the special syscall/atomic logic over to flag DCE and then drop the second DCE
pass altogether. Now local dead code of both is eliminated in a single pass.

Flag DCE is carefully written to converge in a single iteration which makes this
scheme work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 15:44:49 -04:00
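
The single-pass scheme described in the commit above can be sketched roughly as follows. This is a minimal illustration, not FEXCore's actual pass: the `Inst` record, the flag bitmasks, and the `EliminateDeadCode` name are all hypothetical. It assumes instruction `i` defines SSA value `i` and that sources only refer to earlier instructions in the same block, which is what lets one backward walk converge.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical instruction record (not FEXCore's real IR layout).
// Instruction i defines SSA value i; srcs index earlier instructions.
struct Inst {
  std::vector<std::size_t> srcs;
  bool side_effects;           // e.g. stores, syscalls, atomics
  std::uint8_t flags_written;  // bitmask of flags this instruction defines
  std::uint8_t flags_read;     // bitmask of flags this instruction consumes
};

// Walks the block backwards once. An instruction is kept if it has side
// effects, if its value is used by a kept instruction, or if it writes a
// flag that is still live at that point.
std::vector<bool> EliminateDeadCode(const std::vector<Inst>& block,
                                    std::uint8_t flags_live_out) {
  std::vector<bool> keep(block.size(), false);
  std::vector<bool> used(block.size(), false);
  std::uint8_t live_flags = flags_live_out;

  for (std::size_t i = block.size(); i-- > 0;) {
    const Inst& inst = block[i];
    const bool flags_used = (inst.flags_written & live_flags) != 0;
    if (!inst.side_effects && !used[i] && !flags_used) {
      continue;  // dead: nothing after it observes its value or flags
    }
    keep[i] = true;
    // Flags written here are dead above this point; flags read become live.
    live_flags = static_cast<std::uint8_t>(
        (live_flags & ~inst.flags_written) | inst.flags_read);
    for (std::size_t s : inst.srcs) {
      assert(s < i);  // sources must be defined earlier in the block
      used[s] = true;
    }
  }
  return keep;
}
```

Because every use is visited before its definition in a backward walk, liveness of both values and flags is fully known by the time each instruction is reached, so no second iteration is needed in this simplified model.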
Alyssa Rosenzweig
24cb02f4ff FEXCore: remove IRCompaction
New RA does not need it for correctness, and the slight slowdown to new RA from
not compacting first is much smaller than the cost of compaction. Overall, this
speeds up node.js start time by ~6% on top of new RA.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-24 09:25:44 -04:00
Alyssa Rosenzweig
4448f84f29 IRValidation: merge in ValueDominanceValidation
All we actually need to validate is that each source has been previously defined
within the block. That checks everything we care about now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-21 19:34:31 -04:00
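
The check described in the commit above (every source must already be defined earlier in the same block) might look something like this. The `Inst` layout and the `SourcesDefinedInBlock` name are assumptions made for illustration, not the real IRValidation code.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical instruction record: dest is the SSA value defined by the
// instruction (or -1 if it defines nothing), srcs are the values it reads.
struct Inst {
  int dest;
  std::vector<int> srcs;
};

// Returns true if every source of every instruction was defined by an
// earlier instruction in the same block.
bool SourcesDefinedInBlock(const std::vector<Inst>& block) {
  std::unordered_set<int> defined;
  for (const Inst& inst : block) {
    for (int src : inst.srcs) {
      if (defined.count(src) == 0) {
        return false;  // use before definition, or a value from another block
      }
    }
    if (inst.dest >= 0) {
      defined.insert(inst.dest);
    }
  }
  return true;
}
```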
Ryan Houdek
9e1840e974 FEXCore: Moves CodeEmitter to FHU
Now that the vixl dependency is gone, this gets moved to FHU since the
frontend is going to need it for a microjit.
2024-05-13 12:48:10 -07:00
Alyssa Rosenzweig
7e663b91df IR: drop IRParser
Aside from its own self-test, the parser is unused and should remain that way,
since it's a maintenance burden with no real benefit. Burn it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2024-05-08 14:16:54 -04:00
Billy Laws
bd5b817c3a AllocatorHooks: Mark JIT code memory as EC code on ARM64EC
Executable mapped memory is treated as x86 code by default when
running under EC, VirtualAlloc2 needs to be used together with a
special flag to map JIT arm64 code.
2024-04-06 12:40:52 +00:00
Ryan Houdek
78a362581d Update xxhash to v0.8.2
Switches to using upstream cmake files.
2024-02-26 23:57:25 -08:00
Ryan Houdek
ab6c00bbcf FEXCore/Utils: Rename FutexSpinWait to SpinWaitLock
2024-01-17 10:19:38 -08:00
Ryan Houdek
136fa78825 FEXCore: Implements an efficient spin-loop API
This will only be used internally inside of FEXCore for efficient shared
code cache backpatch spin-loops.
2024-01-17 10:19:38 -08:00
Ryan Houdek
b115c144fb FEXCore: Removes NetStream from public API
Only used by GDBServer.
NFC.
2023-12-25 07:07:17 -08:00
Ryan Houdek
6e8af295c5 Merge pull request #3290 from Sonicadvance1/move_signaldelegator
FEXCore: Moves more SignalDelegator functions to the frontend
2023-11-27 13:46:11 -08:00
Ryan Houdek
2070056d16 FEXCore: Moves more SignalDelegator functions to the frontend
As we are moving more and more OS specific code to the frontend, this is
another set of functions that can be moved to FEXLoader from FEXCore.

No functional change here, only code moved from protected to private and
to FEXLoader's SignalDelegator.

Once more thread handling is moved to the frontend we can move even more
out of FEXCore. As follows:
- CheckXIDHandler can get moved.
  - First pthread FEX makes would just call this.
- Register/UnregisterTLSState
  - This can happen in the clone/thread handler once the frontend
    handles it.

This leaves very little in the backend and is mostly an interface for
passing signal data to the frontend that it needs once a signal has
occurred.
It is also used for `SignalThread`.
2023-11-27 12:59:46 -08:00
Ryan Houdek
b89c3a4573 FEXCore: Removes x86 DebugInfo table
This has long since been unused. It was originally implemented for some
fuzzing tests but has been abandoned, and that should likely be implemented
some other way.
2023-11-25 16:50:24 -08:00
Ryan Houdek
efc5eb2933 Merge pull request #3250 from Sonicadvance1/gdbserver_frontend_move
FEXLoader: Wire up gdbserver in the frontend
2023-11-09 14:48:59 -08:00
Ryan Houdek
0dcbdcc0e2 FEX: Only pass CPU tunables to FEXCore and FEXLoader
This fixes an issue where CPU tunables were ending up in the thunk
generator which means if your CPU doesn't support all the features on
the *Builder* then it would crash with SIGILL. This was happening with
Canonical's runners because they typically only support ARMv8.2 but we
are compiling packages to run on ARMv8.4 devices.

cc: FEX-2311.1
2023-11-08 05:50:33 -08:00
Ryan Houdek
5dee921300 FEXLoader: Wire up gdbserver in the frontend
Requires #3249 to be merged first

Library alerting has been disabled for now, and storing IR while
gdbserver is running is removed.

Otherwise no functional change.
2023-11-03 20:23:51 -07:00
Ryan Houdek
4cff3e5f1f FEXCore/IR: Changes over to automated IR dispatch generation
Suggested by Alyssa. Adding an IR operation can be a little tedious
since you need to add the definition to JIT.cpp for the dispatch switch,
add the function declaration to JITClass.h, and then actually define the
implementation in the correct file.

Instead, support the common case where an IR operation just gets
dispatched through to the regular handler. This lets the developer just
put the function definition into the JSON and the relevant cpp file, and
it just gets picked up.

Some minor things:
- Needs to support dynamic dispatch for {Load,Store}Register and
  {Load,Store}Mem
   - This is just a bool in the json
- It needs to not output JIT dispatch for some IR operations
   - SSE4.2 string instructions and x87 operations
   - These go down the "Unhandled" path
- Needs to support a Dispatcher function override
   - This is just for handling NoOp IR operations that get used for
     other reasons.
- Finally removes VSMul and VUMul, consolidating to VMul
   - Unlike V{U,S}Mull, signed or unsigned doesn't change behaviour here
- Fixed a couple random handler names not matching the IR operation
  name.
2023-10-07 15:01:47 -07:00
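
The idea in the commit above is that one table of IR operations drives both the handler declarations and the dispatch switch. The commit does this by generating code from a JSON file; the sketch below swaps in a plain C++ X-macro to show the same single-source-of-truth shape without a generator. The op list, the `JIT` struct, and the string return values are all hypothetical and exist only to make the example self-contained.

```cpp
#include <cstdint>
#include <string>

// Single list of operations. Adding an op here produces the enum entry,
// the handler declaration, and the dispatch case below.
#define IR_OPS(X) \
  X(Add)          \
  X(Sub)          \
  X(VMul)

enum class IROp : std::uint8_t {
#define X(name) name,
  IR_OPS(X)
#undef X
  Count  // not a real op; falls through to the "Unhandled" path
};

struct JIT {
  // One handler declaration per op, expanded from the table.
#define X(name) std::string Op_##name();
  IR_OPS(X)
#undef X

  std::string Op_Unhandled() { return "Unhandled"; }

  // Dispatch switch, also expanded from the table.
  std::string Dispatch(IROp op) {
    switch (op) {
#define X(name) \
  case IROp::name: return Op_##name();
      IR_OPS(X)
#undef X
      default: return Op_Unhandled();
    }
  }
};

// The developer still writes each implementation by hand.
std::string JIT::Op_Add() { return "Add"; }
std::string JIT::Op_Sub() { return "Sub"; }
std::string JIT::Op_VMul() { return "VMul"; }
```

A generator driven by JSON has more room for per-op options (such as the dynamic-dispatch bool or a dispatcher override mentioned in the commit) than a macro table does, which is one plausible reason to prefer it.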
Ryan Houdek
90570fd5f4 FEXCore: Merge Arm64Dispatcher in to Dispatcher
With the removal of the x86 JIT, there is no need to have these be
independent classes.

Merges the Arm64Dispatcher in to the base Dispatcher class.
No functional change, just moving code.
2023-09-30 09:31:55 -07:00
Ryan Houdek
9968e6431f Passes: Rename SyscallOptimization
This is now inlining multiple external calls out of the JIT. Rename it
to InlineCallOptimization.
2023-09-24 17:25:38 -07:00
Ryan Houdek
b5cc9a12f2 FEXCore: Removes x86 JIT.
This is blocking performance improvements. This backend is almost
entirely unused except for when I'm testing if games run on Radeon
video drivers.

Hopefully AmpereOne and Orin/Grace can fulfill this role when they
launch next year.
2023-09-21 18:30:02 -07:00
Ryan Houdek
fea72ce19c Merge pull request #3120 from Sonicadvance1/more_optimal_x87
FEXCore: Support preserve_all ABI for interpreter fallbacks
2023-09-21 15:35:37 -07:00
Alyssa Rosenzweig
c52741c813 FEXCore: Gut interpreter
It is scarcely used today, and like the x86 jit, it is a significant
maintenance burden complicating work on FEXCore and arm64 optimization. Remove
it, bringing us down to 2 backends.

1 down, 1 to go.

Some interpreter scaffolding remains for x87 fallbacks. That is not a problem
here.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-21 12:48:12 -04:00
Ryan Houdek
745729cdc2 SoftFloat-3e: Adds preserve_all attribute to all functions used
This will let FEX's JIT be more optimal
2023-09-18 17:42:48 -07:00
Alyssa Rosenzweig
e6db2d0b96 IR: Remove phi nodes
It turns out that pure SSA isn't a great choice for the sort of emulation we do.
On one hand, it discards information from the guest binary's register allocation
that would let us skip stuff. On the other hand, it doesn't have nearly as many
benefits in this setting as in a traditional compiler... We really *don't* want
to do global RA or really any global optimization. We assume the guest optimizer
did its job for x86, we just need to clean up the mess left from going x86 ->
arm. So we just need enough SSA to peephole optimize.

My concrete IR proposals are that:

  * SSA values must be killed in the same block that they are defined.
  * Explicit LoadGPR/StoreGPR instructions can be used for global persistence.
  * LoadGPR/StoreGPR are eliminated in favour of SSA within a block.
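
As a rough illustration of the third proposal, the sketch below forwards register values within a single block: a LoadGPR of a register whose value is already known becomes an alias of that SSA value, and only the last StoreGPR to each register survives. The `Inst` layout, `ForwardGPRs`, and `kNumGPRs` are hypothetical, and the sketch assumes nothing else in the block observes the register file (no calls, faults, or syscalls).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kNumGPRs = 16;  // assumption: x86-64 general purpose registers

enum class Op { LoadGPR, StoreGPR };

// Hypothetical IR record. For LoadGPR, ssa is the value defined.
// For StoreGPR, ssa is the value written to the register.
struct Inst {
  Op op;
  int gpr;
  int ssa;
};

struct Result {
  std::vector<bool> keep;  // keep[i] is false if instruction i is redundant
  std::vector<int> alias;  // alias[v] is the SSA value that replaces v
};

Result ForwardGPRs(const std::vector<Inst>& block, int num_values) {
  Result r{std::vector<bool>(block.size(), true), std::vector<int>(num_values)};
  for (int v = 0; v < num_values; ++v) r.alias[v] = v;

  std::array<int, kNumGPRs> current;     // SSA value known to be in each GPR
  std::array<int, kNumGPRs> last_store;  // index of latest StoreGPR per GPR
  current.fill(-1);
  last_store.fill(-1);

  for (std::size_t i = 0; i < block.size(); ++i) {
    const Inst& inst = block[i];
    assert(inst.gpr >= 0 && inst.gpr < kNumGPRs);
    if (inst.op == Op::LoadGPR) {
      if (current[inst.gpr] >= 0) {
        r.alias[inst.ssa] = current[inst.gpr];  // reuse the known value
        r.keep[i] = false;
      } else {
        current[inst.gpr] = inst.ssa;  // first load in the block is real
      }
    } else {
      current[inst.gpr] = r.alias[inst.ssa];
      if (last_store[inst.gpr] >= 0) {
        r.keep[last_store[inst.gpr]] = false;  // overwritten before block end
      }
      last_store[inst.gpr] = static_cast<int>(i);
    }
  }
  return r;
}
```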

This has a lot of nice properties for our setting:

  * Except for some internal REP instruction emulation (etc), we already have
    registers for everything that escapes block boundaries, so this form is very
    easy to go into -- straightforward local value numbering, not a full
    into-SSA pass.

  * Spilling is entirely local (if it happens at all), since everything is in
    registers at block boundaries. This is excellent, because Belady's algorithm
    lets us spill nearly optimally in linear-time for individual blocks. (And
    the global version of Belady's algorithm is massively more complicated...)
    A nice fit for a JIT.

    Relatedly, it turns out allowing spilling is probably a decent decision,
    since the same spiller code can be used to rematerialize constants in a
    straightforward way. This is an issue with the current RA.

  * Register assignment is entirely local. For the same reason, we can assign
    registers "optimally" in linear time & memory (e.g. with linear scan). And
    the impl is massively simpler than a full blown SSA-based tree scan RA. For
    example, we don't have to worry about parallel copies or coalescing phis or
    anything. Massively nicer algorithm to deal with.

  * SSA value names can be block local which makes the validation implicit :~)
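
To make the spilling point concrete, here is a small sketch of Belady's rule applied to one block: when a register is needed and none is free, evict the value whose next use is furthest away. `CountEvictions` and `NextUse` are hypothetical names, and the sketch searches for the next use on demand, which makes it quadratic; a linear-time version would precompute next-use distances in one backward walk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Position of the next use of value at or after index from, or uses.size()
// if the value is never used again in this block.
std::size_t NextUse(const std::vector<int>& uses, std::size_t from, int value) {
  for (std::size_t i = from; i < uses.size(); ++i) {
    if (uses[i] == value) return i;
  }
  return uses.size();
}

// uses is the sequence of SSA values touched by a block, in order.
// Returns how many times a resident value had to be evicted.
std::size_t CountEvictions(const std::vector<int>& uses, std::size_t num_regs) {
  assert(num_regs > 0);
  std::vector<int> in_regs;
  std::size_t evictions = 0;

  for (std::size_t i = 0; i < uses.size(); ++i) {
    const int value = uses[i];
    if (std::find(in_regs.begin(), in_regs.end(), value) != in_regs.end()) {
      continue;  // already resident
    }
    if (in_regs.size() < num_regs) {
      in_regs.push_back(value);  // a register is still free
      continue;
    }
    // Belady: evict the resident value whose next use is furthest away.
    std::size_t victim = 0;
    std::size_t furthest = 0;
    for (std::size_t r = 0; r < in_regs.size(); ++r) {
      const std::size_t next = NextUse(uses, i + 1, in_regs[r]);
      if (next > furthest) {
        furthest = next;
        victim = r;
      }
    }
    in_regs[victim] = value;
    ++evictions;
  }
  return evictions;
}
```

Since every value is dead at the block boundary under the proposals above, the use sequence of one block is all the information the spiller needs.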

It also has remarkably few drawbacks, because we didn't want to do CFG global
optimization anyway given our time budget and the diminishing returns. The few
global optimizations we might want (flag escape analysis?) don't necessarily
benefit from pure SSA anyway.

Anyway, we explicitly don't want phi nodes in any of this. They're currently
unused. Let's just remove them so nobody gets the bright idea of changing that.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-09-05 16:35:12 -04:00
Alyssa Rosenzweig
af21b8f3c7 Move External/FEXCore/ to FEXCore/
It is not an external component, and it makes paths needlessly long.
Ryan seemed amenable to this when we discussed on IRC earlier.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-08-17 16:32:16 -04:00