725 Commits

Author SHA1 Message Date
Ryan Houdek
190f7c27e0
Merge pull request #3243 from Sonicadvance1/loadfile_unsized
FEXCore/FileLoading: Updates helper to load file that is backed by memory
2023-11-03 20:21:12 -07:00
Ryan Houdek
b15f0b5d36 FEXCore/FileLoading: Updates helper to load file that is backed by memory
When attempting to read files that aren't backed by a filesystem, our
current read-file helpers fail because they query the file size
upfront.

Change the helper so that it doesn't query the size and just reads the file if it
can be opened. This lets us read `/proc/self/maps` using helpers.
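
The idea of reading without a size query can be sketched as follows. This is a minimal illustration in Python, not the FEXCore helper itself; the name `read_unsized_file` and the chunk size are assumptions made for the example.

```python
import os

def read_unsized_file(path, chunk_size=4096):
    """Read a whole file without asking for its size first (hypothetical helper)."""
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read signals end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)
```

On Linux, `read_unsized_file("/proc/self/maps")` should return the mapping list even though a size query on that file reports zero bytes.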
2023-11-03 07:01:39 -07:00
Ryan Houdek
e2c65189ff GDBServer: Preparation work to get this moved to the frontend
GDBServer is inherently OS specific which is why all this code is
removed when compiling for mingw/win32. This should get moved to the
frontend before we start landing more work to clean this interface up.

Not really any functional change.

Changes:

FEXCore/Context: Adds new public interfaces, these were previously
private.
- WaitForIdle
   - If `Pause` was called or the process is shutting down then this
     will wait until all threads have paused or exited.
- WaitForThreadsToRun
   - If `Pause` was previously called and then `Run` was called to get
     them running again, this waits until all the threads have come out
     of idle to avoid races.
- GetThreads
   - Returns the `InternalThreadData` for all the current threads.
   - GDBServer needs to know all the internal thread data state when the
     threads are paused which is what this gives it.

GDBServer:
- Removes usages of internal data structures where possible.
   - This gets it clean enough that moving it out of FEXCore is now
     possible.
2023-11-02 20:11:01 -07:00
Ryan Houdek
03f63f99a8 FEXCore: Moves StringUtils to FEXCore headers
Once gdbserver gets moved to the frontend this will need to be in the
includes.
2023-11-02 20:09:12 -07:00
Tony Wasserka
3a90dbbb35 Thunks: Print error if guest-provided callbacks are called asynchronously from the host 2023-11-02 19:50:58 +01:00
Ryan Houdek
5103f2d92b
Merge pull request #3247 from alyssarosenzweig/refactor/nzcv-prereq
Preparatory patches for nzcv
2023-11-01 14:07:14 -07:00
Alyssa Rosenzweig
d4a6b031ea
Merge pull request #3245 from Sonicadvance1/remove_gdbpausecheck
FEXCore: Removes gdb pause check handler
2023-11-01 15:46:40 -04:00
Alyssa Rosenzweig
319cf4bf3d Arm64: Preserve flags in ExitFunction
Mostly harmless (except for fusion on cortexes), prepares us for nzcv work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
d75c0f2c50 IR: Add missing flag clobbers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
a586d3823d IR: VFCMPEQ does not clobber flag
Oversight.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
5522c6db9c OpcodeDispatcher: Use jump wrappers
Mostly automated replacement + renaming for build fixing.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
367e1658ad OpcodeDispatcher: Add jump wrappers
These should always be used in the dispatcher rather than the raw jumps they
translate to, as they ensure that flags are flushed. Eliminates a class of bugs
that will become a lot easier to hit with the new nzcv work.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:36 -04:00
Alyssa Rosenzweig
bbad06f81a ArchHelpers: Add cfinv()
FlagM goodness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-11-01 15:44:35 -04:00
Mai
9612b2fe4b
Merge pull request #3240 from Sonicadvance1/optimize_palignr_zero
OpcodeDispatcher: Optimize palignr with zero immediate
2023-11-01 05:55:52 +01:00
Ryan Houdek
e5df636efd OpcodeDispatcher: Optimize pblendw
Requires #3238 to be merged first since this uses the tbx IR operation.

Worst case is now a three instruction sequence of ldr+ldr+tbx.
Some operations are special-cased. That definitely doesn't cover every
case we could handle without tbx, but the worst case is significantly
better than before.
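
As a rough model of why a table lookup covers every pblendw immediate: TBX leaves a destination byte untouched when its index is out of range, so an index vector can select bytes from the source per word and keep the rest. The Python sketch below is only an illustration; the function names and the 0xFF sentinel are assumptions, not the FEX implementation.

```python
def pblendw_reference(dst, src, imm8):
    """Reference PBLENDW on 16 bytes: word i comes from src when bit i of imm8 is set."""
    out = bytearray(dst)
    for word in range(8):
        if (imm8 >> word) & 1:
            out[2 * word:2 * word + 2] = src[2 * word:2 * word + 2]
    return bytes(out)

def tbx(dst, table, indices):
    """Model of a one-register TBX: out-of-range indices keep the dst byte."""
    return bytes(table[i] if i < len(table) else d for d, i in zip(dst, indices))

def pblendw_indices(imm8):
    """Index vector for an immediate (assumed layout: identity index or 0xFF)."""
    indices = []
    for byte in range(16):
        selected = (imm8 >> (byte // 2)) & 1
        indices.append(byte if selected else 0xFF)
    return bytes(indices)
```

With this model, `tbx(dst, src, pblendw_indices(imm8))` matches `pblendw_reference(dst, src, imm8)` for every immediate, which is why a precomputed index vector plus one tbx is enough in the general case.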
2023-10-31 21:18:02 -07:00
Ryan Houdek
de10cbad98 OpcodeDispatcher: Optimize blendps
A number of blendps swizzles weren't optimal. This makes all of them
optimal.

Two instructions can be more optimal without a tbx but the rest required
tbx to be optimal since they don't match ARM's swizzle mechanics.
2023-10-31 20:06:45 -07:00
Ryan Houdek
e4d9c264d8 IR: Adds support for tbx 2023-10-31 20:06:45 -07:00
Mai
18065199e3
Merge pull request #3242 from Sonicadvance1/gdbserver_threadnames
GdbServer: Fixes returning thread names
2023-11-01 04:02:14 +01:00
Mai
77d92872bc
Merge pull request #3212 from Sonicadvance1/dpp_opt
OpcodeDispatcher: Optimize 128-bit DPPS and DPPD
2023-11-01 04:01:05 +01:00
Ryan Houdek
460f13be71 FEXCore: Removes gdb pause check handler
gdbserver is currently entirely broken so this doesn't change behaviour.
The gdb pause check that we originally had carried an excessive amount
of overhead.
Instead, use the pending interrupt fault check that was wired
up for wine.
This makes the check very lightweight and makes it more reasonable to
implement a way to have gdbserver support attaching to a process.
2023-10-31 18:40:00 -07:00
Ryan Houdek
47c9463217 GdbServer: Fixes returning thread names
gdb gets angry if we return text with `<No Name>` in an xml text field.
Instead only return a name if we have one and gdb will take care of the
rest.

Additionally, change the formatting of the return packet; it doesn't need
the xml version header.
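
A small sketch of the approach, assuming the thread list is sent as a `<threads>` element with one `<thread>` child per thread and an optional `name` attribute. The function name and the hex id formatting are hypothetical and not taken from the GdbServer code.

```python
from xml.sax.saxutils import quoteattr

def build_thread_list(threads):
    """threads: iterable of (thread_id, name_or_None) pairs (hypothetical input shape)."""
    parts = ["<threads>"]
    for tid, name in threads:
        attrs = f'id="{tid:x}"'
        if name:
            # Only emit a name when one exists; quoteattr escapes '<', '>' and '&'.
            attrs += f" name={quoteattr(name)}"
        parts.append(f"<thread {attrs}/>")
    parts.append("</threads>")
    return "\n".join(parts)
```

Unnamed threads simply carry no `name` attribute, so no placeholder such as `<No Name>` ever reaches the XML.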
2023-10-29 17:16:46 -07:00
Alyssa Rosenzweig
bbd20b47ba
Merge pull request #3236 from Sonicadvance1/fix_strenum
Config: Fixes string enum parser with multiple arguments
2023-10-29 07:32:57 -04:00
Alyssa Rosenzweig
5b70209728
Merge pull request #3233 from Sonicadvance1/print_vixl
JIT: Implements Print support for vixl sim
2023-10-29 07:32:32 -04:00
Ryan Houdek
13cd8b33a2 OpcodeDispatcher: Optimize palignr with zero immediate
These turn into moves.
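
For reference, a byte-level model of 128-bit palignr shows why a zero immediate is just a copy of the source operand. This is an illustrative Python sketch; the function name is made up for the example.

```python
def palignr(dst, src, imm8):
    """128-bit PALIGNR: shift the 32-byte value dst:src right by imm8 bytes, keep the low 16."""
    # Little-endian byte order: src is the low half, dst the high half, zeros above.
    combined = bytes(src) + bytes(dst) + bytes(16)
    shift = min(imm8, 32)  # shifting by 32 or more leaves only zeros
    return combined[shift:shift + 16]
```

With `imm8 == 0` the result is exactly `src`, so no shuffle is needed.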
2023-10-27 15:09:31 -07:00
Ryan Houdek
09e3371a0d Config: Fixes string enum parser with multiple arguments
Messed up when originally implementing this: substr's second argument is
the requested substring length, not the ending position.

Noticed this while trying to parse multiple FEX_HOSTFEATURES options.
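
The bug class can be reproduced with a small model. The Python sketch below mimics the `substr(pos, count)` semantics with a helper and splits a comma-separated list both ways; the function names and the option strings used with it are placeholders, not the real parser or real FEX_HOSTFEATURES values.

```python
def substr(text, pos, count):
    """Mimic C++ std::string::substr(pos, count): the second argument is a length."""
    return text[pos:pos + count]

def split_options(text, sep=",", buggy=False):
    """Split a separator-delimited option string (hypothetical parser)."""
    out = []
    begin = 0
    while begin <= len(text):
        end = text.find(sep, begin)
        if end == -1:
            end = len(text)
        # The buggy form passes the end position where a length is expected.
        count = end if buggy else end - begin
        out.append(substr(text, begin, count))
        begin = end + 1
    return out
```

The first option always parses correctly because its start position is zero, which is why the mistake only shows up once several options are passed.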
2023-10-27 12:04:04 -07:00
Alyssa Rosenzweig
ff3f7345b6
Merge pull request #3235 from Sonicadvance1/cpu_names
CPUID: Adds some missing cpu core names
2023-10-27 08:00:43 -04:00
Alyssa Rosenzweig
8181e53727
Merge pull request #3234 from Sonicadvance1/minor_bfxil
Arm64: Minor optimization to bfxil and bfi
2023-10-27 08:00:28 -04:00
Alyssa Rosenzweig
5431aa5a28
Merge pull request #3232 from Sonicadvance1/assert_code
IR: Print assert code for IR EmitValidation
2023-10-27 07:58:51 -04:00
Ryan Houdek
1a293cc542 CPUID: Adds some missing cpu core names
EZ-PZ
2023-10-26 21:03:32 -07:00
Ryan Houdek
61f22911c7 Arm64: Minor optimization to bfxil and bfi
When the destination doesn't alias the source, we can remove a final mov
from both of these operations.

Does some minor code improvement.
2023-10-26 20:25:44 -07:00
Ryan Houdek
fe8778bb96 JIT: Implements Print support for vixl sim
Every time I want to quickly output a value for testing I tend to use
Print, which didn't work under the simulator.
Give this a quick fix to wire up the jump to the vixl sim.
2023-10-26 18:34:47 -07:00
Ryan Houdek
14e5ea1e22
Merge pull request #3230 from Sonicadvance1/optimize_atomic_fetch
IR: Optimize unused result atomic fetch mop to just atomic mop
2023-10-26 15:57:20 -07:00
Ryan Houdek
74f1205f33 IR: Print assert code for IR EmitValidation
Currently the assert message doesn't say why an IR emit failed. Put
the code in to the message so it is easier to see.
This also resolves the issue that in RelWithDebInfo the assert line
would typically be the end of the IR emission function, so you couldn't
see which assert actually triggered. Now that the message is printed,
this is easier.

Before:
```
[ASSERT]
```

After:
```
[ASSERT] Size == FEXCore::IR::OpSize::i32Bit || Size == FEXCore::IR::OpSize::i64Bit
```

This is a lot easier and better data than what #3227 proposed.
2023-10-26 15:51:56 -07:00
Ryan Houdek
4045bfd187
Merge pull request #3229 from Sonicadvance1/instcountci_3dnow
OpcodeDispatcher: Optimize a few 3DNow! operations
2023-10-26 15:42:50 -07:00
Ryan Houdek
9db93a43dd IR: Adds comments that atomic op layout must match
If the fetch and nonfetch versions mismatched then the DCE optimization
which changes the IR operation would break handily.

Add a comment as a reminder so if anyone touches this they will
understand.
2023-10-26 15:40:55 -07:00
Ryan Houdek
5028434292 IR: Optimize unused result atomic fetch mop to just atomic mop
When the result of an atomic fetch operation is unused then we can
safely convert it to a non-fetching version of the operation.

This happens hundreds of times per process as far as I can tell. No idea
if this actually helps any hardware, but theoretically it can allow CPUs
to not stall waiting for writeback of the atomic operation.
I couldn't detect any performance improvement in the various little
things I was poking at, but it is very trivial to support, so add it.

Unaligned variants are already handled in our unaligned fault handler
since the only difference is the acquire semantic is dropped and the
destination register is the zero register.
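
The rewrite itself can be pictured as a pass that swaps the opcode when nothing consumes the fetched value. The Python sketch below uses made-up op names and a made-up `Op` record; it is an assumption-laden illustration, not the FEX pass. It relies on the fetch and non-fetch forms sharing the same operand layout, so only the name changes.

```python
from dataclasses import dataclass

# Hypothetical op names; the pairing is the point, not the spelling.
FETCH_TO_NONFETCH = {
    "AtomicFetchAdd": "AtomicAdd",
    "AtomicFetchSub": "AtomicSub",
    "AtomicFetchAnd": "AtomicAnd",
    "AtomicFetchOr": "AtomicOr",
    "AtomicFetchXor": "AtomicXor",
}

@dataclass
class Op:
    name: str
    args: tuple
    uses: int = 0  # number of later ops that read this op's result

def drop_unused_fetch(ops):
    """Convert fetch atomics with an unused result to their non-fetch form."""
    for op in ops:
        if op.uses == 0 and op.name in FETCH_TO_NONFETCH:
            op.name = FETCH_TO_NONFETCH[op.name]  # operands stay as they are
    return ops
```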
2023-10-25 21:00:04 -07:00
Ryan Houdek
8f0461cac8 IR: Adds support for non-fetch Atomic CLR and NEG 2023-10-25 21:00:04 -07:00
Ryan Houdek
bab96b9441 OpcodeDispatcher: Allow garbage in upper bits for more ALU ops
Secondary ALU operations were missed and when the operation is 4-bytes
in size we can also allow garbage upper bits since the JIT will emit a
32-bit operation for this instruction which is safe.
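
Why garbage is harmless here can be checked with plain arithmetic: a 32-bit add only depends on the low 32 bits of each input. A minimal Python model, with names chosen for the example:

```python
MASK32 = 0xFFFF_FFFF

def add32(a, b):
    """Model of a 32-bit add on 64-bit registers: the result keeps only the low 32 bits."""
    return (a + b) & MASK32

def with_garbage(value, garbage):
    """Place arbitrary bits in the upper half of a 64-bit register value."""
    return (garbage << 32) | (value & MASK32)
```

`add32(with_garbage(x, g), y)` equals `add32(x, y)` for any `g`, so the inputs don't need to be zero-extended first.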

Optimizes some bad codegen around 32-bit ALU operations.
2023-10-25 20:44:36 -07:00
Ryan Houdek
6177290e9d OpcodeDispatcher: Optimize a few 3DNow! operations
Instead of using VInsElement in pi2fw and pf2iw, just use uzp1 to ensure
we don't unintentionally add to RA pressure.

Additionally we can generate the constant needed for pmulhrw directly
using the movi instruction. Converts two instructions in to one.

Under FEX's current constraints this makes all 3DNow! instructions
optimal.
2023-10-25 19:27:54 -07:00
Ryan Houdek
20a54913bd IR: Support shifted imms in VectorImm
We will want to use this in some cases.
2023-10-25 19:27:00 -07:00
Tony Wasserka
adcdb32d49 Thunks: Fix function pointer support on 32-bit 2023-10-25 14:37:22 +02:00
Ryan Houdek
4466c50c2b ConstProp: Optimize SubShift and Add with negative
When SubShift (LSL) occurs with both sources constant then optimize away
the calculation.

Additionally if add is found to have one immediate constant where the
inverse of the constant fits in to ImmAddSub range, then invert the
constant and change it in to a sub.

This optimizes the cases when direction flag is known upfront in an
instruction.
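
Both rewrites can be sketched in a few lines of Python. This is an illustration under assumptions: the function names are invented, and the immediate check uses the standard AArch64 ADD/SUB encoding (a 12-bit value, optionally shifted left by 12), which may differ from the exact range check used in the pass.

```python
def fits_add_sub_imm(value):
    """AArch64 ADD/SUB immediate: 12 bits, optionally shifted left by 12."""
    if 0 <= value < (1 << 12):
        return True
    return (value & 0xFFF) == 0 and 0 <= value < (1 << 24)

def fold_sub_shift(lhs, rhs, shift, bits=64):
    """Constant-fold lhs - (rhs << shift) with wrap-around at the register width."""
    mask = (1 << bits) - 1
    return (lhs - ((rhs << shift) & mask)) & mask

def rewrite_add(constant, bits=64):
    """Return (op, imm) for adding a constant, switching to sub when that encodes."""
    mask = (1 << bits) - 1
    constant &= mask
    if fits_add_sub_imm(constant):
        return ("add", constant)
    negated = (-constant) & mask
    if fits_add_sub_imm(negated):
        return ("sub", negated)
    return ("add", constant)  # does not encode either way; left unchanged
```

For example, adding the 64-bit constant for -8 cannot be encoded as an add immediate, but the same effect is a `sub` with immediate 8.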
2023-10-23 10:36:33 -07:00
Ryan Houdek
95c756b466 OpcodeDispatcher: Optimize DF pointer offset calculation
Previously this moved two constants, did a compare and a csel: four
instructions in total. It also corrupted NZCV, which we want to use for
other things.

This new codegen emits one constant and one subtract instruction, two
instructions total and doesn't touch NZCV.
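
One way a single constant and a single subtract can produce the offset is shown below. This is a guess at the shape of the calculation, assuming the direction flag is held as 0 or 1 and the element size is a power of two; the commit does not spell out the exact formula, and the function names are hypothetical.

```python
def df_offset_select(df, size):
    """Older shape: pick between two constants based on the direction flag."""
    return -size if df else size

def df_offset_sub(df, size):
    """Newer shape (assumed): size - (df << (log2(size) + 1))."""
    shift = size.bit_length()  # equals log2(size) + 1 for power-of-two sizes
    return size - (df << shift)
```

Both functions agree for element sizes 1, 2, 4 and 8, and the second form needs no comparison, so it leaves the flags alone.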

More optimal!
2023-10-23 09:27:41 -07:00
Ryan Houdek
99465faf63 IR: Implements support for subtract with shifted register
Will be used soon.
2023-10-23 09:27:41 -07:00
Ryan Houdek
e018917f76
Merge pull request #3216 from alyssarosenzweig/opt/nzcv-infra
Prep commits for NZCV modelling
2023-10-23 07:39:47 -07:00
Alyssa Rosenzweig
6de8bc6848 IR: Annotate instructions with implicit flag clobber
Audit the code base and mark any instruction that implicitly clobbers flags so
it can get special handling in the dispatcher to spill NZCV ahead of emitting.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:47 -04:00
Alyssa Rosenzweig
d87155e4ee IR: Add infrastructure for modelling flag clobbers
Lots of instructions clobber NZCV inadvertently but are not intended to write to
the host flags from the IR point-of-view. As an example, Abs logically has no
side effects but physically clobbers NZCV due to its cmp/csneg impl on non-CSSC
hw. Add infrastructure to model this in the IR so we can deal with it when we
start using NZCV for things.
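
A toy model of the idea, with invented op names and an invented `SpillNZCV` marker, might look like the Python sketch below. It only illustrates the bookkeeping of "spill live flags before an op that clobbers them implicitly" and makes no claim about how the IR or dispatcher is actually structured.

```python
# Hypothetical annotations: ops that trash NZCV as a side effect, and ops that define flags.
IMPLICIT_NZCV_CLOBBER = {"Abs"}
FLAG_WRITERS = {"Cmp"}

def lower(ops):
    """Insert a spill before any implicit clobber while flags are live."""
    out = []
    nzcv_live = False
    for op in ops:
        if op in IMPLICIT_NZCV_CLOBBER and nzcv_live:
            out.append("SpillNZCV")  # save the flags before they are lost
            nzcv_live = False
        out.append(op)
        if op in FLAG_WRITERS:
            nzcv_live = True
    return out
```

Ops without the annotation pass through unchanged, so the cost is only paid where a clobber meets live flags.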

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:47 -04:00
Alyssa Rosenzweig
42259974c4 Arm64: Preserve NZCV in VInsertElement
So we don't need to mark VInsertElement as implicit clobber in the common case.
Only affects sve256 which doesn't exist yet.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:21:34 -04:00
Alyssa Rosenzweig
e455996dbd OpcodeDispatcher: Remove silly shift branching
The flag generation code does this internally and more efficiently.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:16:38 -04:00
Alyssa Rosenzweig
b5dd1d05e9 Dispatcher: Preserve NZCV
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-23 10:16:38 -04:00