When attempting to read files that aren't backed by a regular filesystem,
our current read file helpers fail since they query the file size
upfront.
Change the helper so that it doesn't query the size and just reads the file if it
can be opened. This lets us read `/proc/self/maps` using helpers.
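
A minimal sketch of the size-less read described above, assuming a POSIX host. `ReadWholeFile` is a hypothetical name for illustration and not the actual FEX helper; it simply reads fixed-size chunks until `read` reports end of file, so it never needs to know the size in advance.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper (not FEX's real API): reads an entire file without
// querying its size first, so entries such as /proc/self/maps also work.
std::optional<std::string> ReadWholeFile(const std::string& Path) {
  const int FD = ::open(Path.c_str(), O_RDONLY);
  if (FD == -1) {
    // The file could not be opened at all.
    return std::nullopt;
  }

  std::string Data;
  char Buffer[4096];
  for (;;) {
    const ssize_t Result = ::read(FD, Buffer, sizeof(Buffer));
    if (Result > 0) {
      // Append however many bytes this read produced.
      Data.append(Buffer, static_cast<size_t>(Result));
    } else if (Result == 0) {
      // End of file reached.
      break;
    } else if (errno != EINTR) {
      // A real error; interrupted reads are simply retried.
      ::close(FD);
      return std::nullopt;
    }
  }

  ::close(FD);
  return Data;
}
```

On Linux, `ReadWholeFile("/proc/self/maps")` should return the mappings text even though a size query on that file reports zero.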
GDBServer is inherently OS specific which is why all this code is
removed when compiling for mingw/win32. This should get moved to the
frontend before we start landing more work to clean this interface up.
Not really any functional change.
Changes:
FEXCore/Context: Adds new public interfaces, these were previously
private.
- WaitForIdle
  - If `Pause` was called or the process is shutting down then this
    will wait until all threads have paused or exited.
- WaitForThreadsToRun
  - If `Pause` was previously called and then `Run` was called to get
    them running again, this waits until all the threads have come out
    of idle to avoid races.
- GetThreads
  - Returns the `InternalThreadData` for all the current threads.
  - GDBServer needs to know all the internal thread data state when the
    threads are paused which is what this gives it.

GDBServer:
- Removes usages of internal data structures where possible.
  - This gets it clean enough that moving it out of FEXCore is now
    possible.
These should always be used in the dispatcher rather than the raw jumps they
translate to, as they ensure that flags are flushed. Eliminates a class of bugs
that will become a lot easier to hit with the new nzcv work.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Requires #3238 to be merged first since this uses the tbx IR operation.
Worst case is now a three instruction sequence of ldr+ldr+tbx.
Some operations are special-cased. That definitely doesn't cover every
case we could handle without tbx, but it is a significant improvement to
the worst case.
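
For readers unfamiliar with tbx, here is a scalar model of the single-register AArch64 TBX semantics: an in-range index selects a table byte, and an out-of-range index leaves the destination byte untouched. Using it for a 32-bit lane blend, and the helper names `Tbx` and `BlendIndices32`, are assumptions for illustration only; this message does not say which operation is being lowered or what the two loads fetch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec128 = std::array<uint8_t, 16>;

// Scalar model of a one-register TBX: bytes whose index is in range are
// taken from Table, all other bytes of Dest are left unchanged.
Vec128 Tbx(Vec128 Dest, const Vec128& Table, const Vec128& Indices) {
  for (size_t i = 0; i < Dest.size(); ++i) {
    if (Indices[i] < Table.size()) {
      Dest[i] = Table[Indices[i]];
    }
  }
  return Dest;
}

// Hypothetical index builder for a blend of four 32-bit lanes: a set mask
// bit picks the lane from Table, a clear bit keeps the destination lane
// by using an out-of-range index (0xFF).
Vec128 BlendIndices32(uint8_t Mask) {
  Vec128 Indices{};
  for (size_t Lane = 0; Lane < 4; ++Lane) {
    const bool TakeFromTable = ((Mask >> Lane) & 1) != 0;
    for (size_t Byte = 0; Byte < 4; ++Byte) {
      const size_t i = Lane * 4 + Byte;
      Indices[i] = TakeFromTable ? static_cast<uint8_t>(i) : uint8_t{0xFF};
    }
  }
  return Indices;
}
```

Because untouched lanes come for free, a single tbx can express any such blend once the index vector is available, which is presumably why it bounds the worst case.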
A bunch of blendps swizzles weren't optimal. This makes all of them
optimal.
Two instructions can be more optimal without a tbx but the rest required
tbx to be optimal since they don't match ARM's swizzle mechanics.
gdbserver is currently entirely broken so this doesn't change behaviour.
The gdb pause check that we originally had carried an excessive amount
of overhead.
Instead use the pending interrupt fault check that was wired
up for wine.
This makes the check very lightweight and makes it more reasonable to
implement a way to have gdbserver support attaching to a process.
gdb gets angry if we return text with `<No Name>` in an xml text field.
Instead only return a name if we have one and gdb will take care of the
rest.
Additionally change the formatting of the return packet; it doesn't need
the xml version header.
Messed up when originally implementing this: substr's second argument is
the requested substring length, not the ending position.
Noticed this while trying to parse multiple FEX_HOSTFEATURES options.
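
A small sketch of the corrected pattern, using a hypothetical `SplitOptions` helper rather than the real option parser: when both a begin and an end position are known, the length passed to `substr` is `End - Begin`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: splits a comma-separated option string.
std::vector<std::string> SplitOptions(std::string_view Options) {
  std::vector<std::string> Result;
  size_t Begin = 0;
  while (Begin <= Options.size()) {
    size_t End = Options.find(',', Begin);
    if (End == std::string_view::npos) {
      End = Options.size();
    }
    // substr(pos, count): the second argument is a length, so pass
    // End - Begin rather than End.
    Result.emplace_back(Options.substr(Begin, End - Begin));
    Begin = End + 1;
  }
  return Result;
}
```

Passing `End` directly happens to work for the first element, since `Begin` is zero there, which is why the mistake only shows up once a second option is parsed.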
Every time I want to quickly output a value for testing I tend to use
Print, which didn't work under the simulator.
Give this a quick fix to wire up the jump to the vixl sim.
Currently we don't get why an IR emit failed in the assert message. Put
the code into the message so it is easier to see.
This also resolves the issue that when in RelWithDebInfo the assert line
would typically be the end of the IR emission function, so you couldn't
see which assert actually triggered. Now that the message is printed,
this is easier to tell.
Before:
```
[ASSERT]
```
After:
```
[ASSERT] Size == FEXCore::IR::OpSize::i32Bit || Size == FEXCore::IR::OpSize::i64Bit
```
This is a lot easier and better data than what #3227 proposed.
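
The general technique is preprocessor stringification of the asserted condition. The sketch below uses hypothetical names (`SKETCH_ASSERT`, `ReportAssert`, `LastAssertMessage`) and records the message instead of aborting so the effect is easy to observe; it is not FEX's actual assert macro.

```cpp
#include <string>

// Hypothetical stand-in for an assert handler: stores the message rather
// than aborting the process.
static std::string LastAssertMessage;

static void ReportAssert(const char* Message) {
  LastAssertMessage = std::string("[ASSERT] ") + Message;
}

// #Cond turns the condition's source text into a string literal, so the
// message identifies the failing check even when line info is unreliable.
#define SKETCH_ASSERT(Cond) \
  do {                      \
    if (!(Cond)) {          \
      ReportAssert(#Cond);  \
    }                       \
  } while (0)
```

With `int Size = 2;`, the statement `SKETCH_ASSERT(Size == 4 || Size == 8);` leaves `[ASSERT] Size == 4 || Size == 8` in `LastAssertMessage`.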
If the fetch and nonfetch versions mismatched then the DCE optimization
which changes the IR operation would break handily.
Add a comment as a reminder so if anyone touches this they will
understand.
When the result of an atomic fetch operation is unused then we can
safely convert it to a non-fetching version of the operation.
This happens hundreds of times per process as far as I can tell. No idea
if this actually helps any hardware, but theoretically it can allow CPUs
to not stall waiting for writeback of the atomic operation.
Couldn't detect any performance improvements, at least in the various
little things I was poking. Very trivial to support so add it.
Unaligned variants are already handled in our unaligned fault handler
since the only difference is the acquire semantic is dropped and the
destination register is the zero register.
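
The conversion of an unused atomic fetch can be sketched on a toy IR. Everything here (the `Op` enum, `Node`, `NonFetchVariant`, `DropUnusedFetches`) is hypothetical and much simpler than the real IR; it only shows the shape of the rewrite and why the fetch and non-fetch operations need a consistent pairing.

```cpp
#include <cstddef>
#include <vector>

// Toy IR with hypothetical names.
enum class Op { AtomicFetchAdd, AtomicAdd, AtomicFetchOr, AtomicOr, Other };

struct Node {
  Op Operation;
  unsigned NumUses; // Number of later nodes that consume this result.
};

// Maps a fetching atomic to its non-fetching twin. The pairs must stay in
// sync, otherwise the rewrite below would change behavior.
Op NonFetchVariant(Op Operation) {
  switch (Operation) {
    case Op::AtomicFetchAdd: return Op::AtomicAdd;
    case Op::AtomicFetchOr: return Op::AtomicOr;
    default: return Operation;
  }
}

// Rewrites every fetching atomic whose result is never used. The memory
// side effect is kept; only the unused result is dropped.
size_t DropUnusedFetches(std::vector<Node>& Nodes) {
  size_t Converted = 0;
  for (auto& N : Nodes) {
    const Op Replacement = NonFetchVariant(N.Operation);
    if (N.NumUses == 0 && Replacement != N.Operation) {
      N.Operation = Replacement;
      ++Converted;
    }
  }
  return Converted;
}
```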
Secondary ALU operations were missed and when the operation is 4-bytes
in size we can also allow garbage upper bits since the JIT will emit a
32-bit operation for this instruction which is safe.
Optimizes some bad codegen around 32-bit ALU operations.
Instead of using VInsElement in pi2fw and pf2iw, just use uzp1 to ensure
we don't unintentionally add to RA pressure.
Additionally we can generate the constant needed for pmulhrw directly
using the movi instruction. Converts two instructions into one.
Under FEX's current constraints this makes all 3DNow! instructions
optimal.
When SubShift (LSL) occurs with both sources constant then optimize away
the calculation.
Additionally if add is found to have one immediate constant where the
negation of the constant fits into ImmAddSub range, then negate the
constant and change it into a sub.
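
A sketch of both folds for 64-bit operations. The helper names are hypothetical, and reading SubShift (LSL) as `A - (B << Shift)` is an assumption. The range check uses the AArch64 add/sub immediate encoding: a 12-bit unsigned value, optionally shifted left by 12.

```cpp
#include <cstdint>
#include <optional>

// AArch64 add/sub immediates: 12 bits, optionally shifted left by 12.
bool IsImmAddSub(uint64_t Imm) {
  return (Imm & ~0xFFFULL) == 0 || (Imm & ~0xFFF000ULL) == 0;
}

// Assumed meaning of SubShift (LSL) with two constant sources.
uint64_t FoldSubShiftLSL(uint64_t A, uint64_t B, unsigned Shift) {
  return A - (B << (Shift & 63));
}

struct AddSubImm {
  bool IsSub;   // True if the add should be emitted as a sub.
  uint64_t Imm; // Immediate to encode.
};

// Picks an encodable form for `X + Constant`, if one exists.
std::optional<AddSubImm> SelectAddSubImm(uint64_t Constant) {
  if (IsImmAddSub(Constant)) {
    return AddSubImm{false, Constant};
  }
  // Two's complement negation: X + C == X - (-C).
  const uint64_t Negated = 0 - Constant;
  if (IsImmAddSub(Negated)) {
    return AddSubImm{true, Negated};
  }
  // Neither form fits; the constant has to be materialized separately.
  return std::nullopt;
}
```

For example, adding the 64-bit constant `0xFFFFFFFFFFFFFFFF` does not fit the immediate range, but its negation is 1, so the operation can be emitted as a subtract of 1.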
This optimizes the cases when direction flag is known upfront in an
instruction.
Previously this moved two constants, did a compare and a csel: four
instructions in total. It also corrupted NZCV, which we want to use for
other things.
This new codegen emits one constant and one subtract instruction, two
instructions total and doesn't touch NZCV.
More optimal!
Audit the code base and mark any instruction that implicitly clobbers flags so
it can get special handling in the dispatcher to spill NZCV ahead of emitting.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Lots of instructions clobber NZCV inadvertently but are not intended to write to
the host flags from the IR point-of-view. As an example, Abs logically has no
side effects but physically clobbers NZCV due to its cmp/csneg impl on non-CSSC
hw. Add infrastructure to model this in the IR so we can deal with it when we
start using NZCV for things.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
So we don't need to mark VInsertElement as implicit clobber in the common case.
Only affects sve256 which doesn't exist yet.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>