istringstream is a very slow way to parse this, let's make it a bit
quicker.
Some implementation numbers:
1. Original implementation - 1833556 calculations per second
2. std::strtoul implementation - 4666818 calculations per second
- 2.54x the istringstream implementation
3. std::from_chars implementation - 5120718 calculations per second
- 1.09x the std::strtoul implementation
- 2.79x the istringstream implementation
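As a rough illustration of the std::from_chars approach (not the actual
change; the `ParseU64` helper and its parameters are made up for this
sketch), parsing an unsigned integer without stream or locale overhead
could look like this:

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parse the whole of Text as an unsigned integer.
// Returns std::nullopt on empty input, trailing garbage, or overflow.
std::optional<uint64_t> ParseU64(std::string_view Text, int Base = 10) {
  uint64_t Value{};
  const char* Begin = Text.data();
  const char* End = Begin + Text.size();

  // std::from_chars never allocates, never throws, and ignores the locale.
  auto [Ptr, Ec] = std::from_chars(Begin, End, Value, Base);
  if (Ec != std::errc{} || Ptr != End) {
    return std::nullopt;
  }
  return Value;
}
```

Note that std::from_chars does not skip leading whitespace and does not
accept a `0x` prefix, so callers would need to strip those first.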
This message complains each time VFORK is used with clone, but we
are handling VFORK here now.
This is just causing debug messages for no reason.
Remove the message and remove the flag removal option.
Not sure how this ever managed to work before this point actually. We
were returning a 64-bit pointer when we were supposed to be returning a
32-bit pointer.
Seemingly this was overwriting the len stack variable so then Steam's
chromehtml.so library was checking the results thoroughly and detecting
that the robust list wasn't set up before this point.
SOMEHOW this worked if FEX was built locally, but broke from the PPA
builders? Not sure how that happened, but theoretically on the next PPA
release this is now fixed and Steam can run from those builds.
Also when setting the robust list, make sure to return EINVAL if the
size doesn't match what's expected there.
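A minimal sketch of both fixes for a 32-bit guest, assuming a
hypothetical `ThreadState` holder and hypothetical function names (the
real syscall handlers are laid out differently):

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Layout of robust_list_head as seen by a 32-bit guest: three 32-bit fields.
struct GuestRobustListHead32 {
  uint32_t next;
  int32_t futex_offset;
  uint32_t list_op_pending;
};
static_assert(sizeof(GuestRobustListHead32) == 12);

// Hypothetical per-thread storage for the guest's robust list pointer.
struct ThreadState {
  uint32_t RobustListHead{};
};

// set_robust_list: reject any size that isn't the guest's structure size.
int64_t SetRobustList32(ThreadState& Thread, uint32_t Head, size_t Len) {
  if (Len != sizeof(GuestRobustListHead32)) {
    return -EINVAL;
  }
  Thread.RobustListHead = Head;
  return 0;
}

// get_robust_list: write back a 32-bit pointer and a 32-bit length.
// Storing a 64-bit pointer through HeadOut would spill into whatever the
// guest placed next to it, which is the failure described above.
int64_t GetRobustList32(const ThreadState& Thread, uint32_t* HeadOut, uint32_t* LenOut) {
  *HeadOut = Thread.RobustListHead;
  *LenOut = sizeof(GuestRobustListHead32);
  return 0;
}
```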
FEXCore has no need to understand how to load these layers, which
requires json parsing.
Move these to the frontend which is already doing the configuration
layer setup and initialization tasks anyway.
Means FEXCore itself no longer needs to link to tiny-json which can be
left to the frontend.
This is just confusing end users these days and no longer matters as a
debug option.
Remove from the GUI initially, maybe afterwards we will even remove
setting this at all and always auto-detect.
This basically just means that we detect ArchLinux and set a flag that
it is a rolling release, skipping doing the version check for an "exact"
match in that instance.
When a signal handler is not installed and the signal is a terminal
failure, make sure to save telemetry before faulting.
We know when an application is going down in this case so we can make
sure to have the telemetry data saved.
Adds a telemetry signal mask data point as well to know which signal
took it down.
Noticed this while debugging Proton Experimental hanging and thought
this could be related. Didn't solve that issue but this should be merged
anyway.
vfork doesn't copy the host's process space into the child process,
which avoids Copy-On-Write overhead. It also puts the parent process
to sleep until the child terminates or executes a new program.
This is a major issue under FEX where we can't emulate vfork correctly
because we need to do other work before this process terminates or
executes a new process. We have been treating `vfork` as a `fork` this
entire time.
This can likely cause problems for applications that actually use vfork
to wait for a process to complete. So let's actually emulate that
feature by using a pipe with poll to determine when that FD gets
removed.
FEX can't use waitpid to wait for this process to terminate, since that
would interfere with a guest that also wants to use waitpid on it.
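The pipe-and-poll idea can be sketched roughly as follows. This is a
simplified illustration, not the actual implementation; `EmulatedVFork`
is a made-up name, and the sketch ignores the extra work FEX does in the
child:

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <sys/wait.h>  // only needed by callers that later reap the child
#include <unistd.h>

// Returns 0 in the child. In the parent, returns the child's pid only after
// the child has exited or exec'd. Returns -1 on failure.
pid_t EmulatedVFork() {
  int Pipe[2];
  // O_CLOEXEC makes the kernel close the write end on a successful exec.
  if (pipe2(Pipe, O_CLOEXEC) != 0) {
    return -1;
  }

  pid_t Pid = fork();
  if (Pid < 0) {
    close(Pipe[0]);
    close(Pipe[1]);
    return -1;
  }

  if (Pid == 0) {
    // Child: keep the write end open. It goes away on exit or exec.
    close(Pipe[0]);
    return 0;
  }

  // Parent: drop our copy of the write end, then sleep until the child's
  // copy disappears. poll reports POLLHUP once no writers remain.
  close(Pipe[1]);
  pollfd PollFD{};
  PollFD.fd = Pipe[0];
  PollFD.events = POLLIN;
  while (poll(&PollFD, 1, -1) < 0 && errno == EINTR) {
    // Interrupted by a signal, keep waiting.
  }
  close(Pipe[0]);
  return Pid;
}
```

Because the parent never calls waitpid here, the child's exit status is
still available for the guest to collect later.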
I originally wrote this emulation prior to me fully understanding how
the syscall works. So there are two optimizations here.
1) No need to consume the incoming buffer at all.
- Originally I thought the incoming dirent structures were used to
calculate offset.
- This is not the case, the FD's file position is used instead.
- This means we can remove the incoming buffer consuming overhead
entirely.
2) No need to allocate a temporary buffer at all.
- With getdents and getdents64 we are guaranteed to be dealing with
structures that are the same size or smaller than the host
structure.
- This lets us encode the real host dirents in to the provided
buffer.
- After the `getdents64` host syscall, we then iterate forward
through the list, modifying as we go.
- Need to make sure to shift the elements of the structure in order.
- Need to make sure to use memmove on the `d_name` member since the
movement region can overlap.
These two optimizations significantly reduce the amount of time spent in
getdents, which has a noticeable impact on load times.
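A rough sketch of the in-place conversion for the 32-bit `getdents`
case. The byte offsets follow the kernel's `linux_dirent64` and legacy
32-bit `linux_dirent` layouts; the function name is hypothetical, and
unlike a real implementation this sketch silently truncates `d_ino` and
`d_off` rather than reporting overflow:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Host record: u64 d_ino, s64 d_off, u16 d_reclen, u8 d_type, then d_name.
constexpr size_t HostHeaderSize = 19;
// Guest record: u32 d_ino, u32 d_off, u16 d_reclen, then d_name, with d_type
// stored in the final byte of the record.
constexpr size_t GuestHeaderSize = 10;

// Rewrites host dirent64 records in Buffer into guest records in place.
// Returns the number of bytes of guest records produced.
size_t ConvertDirentsInPlace(uint8_t* Buffer, size_t HostBytes) {
  size_t Read = 0;
  size_t Write = 0;
  while (Read + HostHeaderSize <= HostBytes) {
    // Copy the host header out first, since the guest record may overwrite it.
    uint64_t Ino;
    int64_t Off;
    uint16_t HostRecLen;
    memcpy(&Ino, Buffer + Read, sizeof(Ino));
    memcpy(&Off, Buffer + Read + 8, sizeof(Off));
    memcpy(&HostRecLen, Buffer + Read + 16, sizeof(HostRecLen));
    const uint8_t Type = Buffer[Read + 18];
    if (HostRecLen < HostHeaderSize) {
      break;  // Malformed record, stop rather than loop forever.
    }

    const char* HostName = reinterpret_cast<const char*>(Buffer + Read + HostHeaderSize);
    const size_t NameBytes = strlen(HostName) + 1;
    const size_t GuestRecLen = (GuestHeaderSize + NameBytes + 1 + 3) & ~size_t{3};

    // The guest record is never larger than the host one, so writes never
    // run ahead of reads.
    assert(Write <= Read && GuestRecLen <= HostRecLen);

    // The name moves toward the front and the regions can overlap: memmove.
    memmove(Buffer + Write + GuestHeaderSize, HostName, NameBytes);

    const uint32_t GuestIno = static_cast<uint32_t>(Ino);
    const uint32_t GuestOff = static_cast<uint32_t>(Off);
    const uint16_t GuestLen = static_cast<uint16_t>(GuestRecLen);
    memcpy(Buffer + Write, &GuestIno, sizeof(GuestIno));
    memcpy(Buffer + Write + 4, &GuestOff, sizeof(GuestOff));
    memcpy(Buffer + Write + 8, &GuestLen, sizeof(GuestLen));
    Buffer[Write + GuestRecLen - 1] = Type;

    Read += HostRecLen;
    Write += GuestRecLen;
  }
  return Write;
}
```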
Side-tangent: I noticed a fun quirk of how NFS operates with getdents.
If the FSCache hasn't populated the metadata for that folder, then it
will early return with "some" data, not fully maxing out the buffer. The
kernel will start prefetching metadata assuming directory iterating is
happening. The next `getdents` happens and it should return a larger
number of elements.
Very neat.
Currently WINE's longjump doesn't work, so instead set a flag that if
HLT is attempted, just exit the JIT.
This will get our unittests executing at least.
Needs to be aligned to the allocation size, which is a page on Linux or
64k on Windows.
In order to map at `0x1'0000` on Wine, we need to use a special case DOS
area allocation path.
New versions of CEF rely on this existing. It will get this value and
run strdup on it, even if it is nullptr.
Fixes a steamwebhelper process constantly crashing with the Steam Beta
client.
Only missing auxv values now
- AT_PAGESZ
- AT_EXECFD (for execveat?)
- AT_PHDR
- All the random cache information values.
From https://github.com/AsahiLinux/linux/commits/bits/220-tso
This fails gracefully in the case the upstream kernel doesn't support
this feature, so can go in early.
This feature allows FEX to use hardware's TSO emulation capability to
reduce emulation overhead from our atomic/lrcpc implementation.
In the case that the TSO emulation feature is enabled in FEX, we will
check if the hardware supports this feature and then enable it.
If the hardware feature is supported it will then use regular memory
accesses with the expectation that these are x86-TSO in strength.
The only hardware that anyone cares about that supports this is Apple's
M class SoCs. Theoretically NVIDIA Denver/Carmel supports sequential
consistency, which isn't quite the same thing. I haven't cared to check
if multithreaded SC has as strong of guarantees. But also since
Carmel/Denver hardware is fairly rare, it's hard to care about for our
use case.
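A sketch of how the probe can fail gracefully. The prctl constants below
are my recollection of the downstream patch set and should be treated as
assumptions, as should the `TryEnableHardwareTSO` name; a kernel without
the feature simply rejects the unknown prctl:

```cpp
#include <cassert>
#include <sys/prctl.h>

// Assumed values from the downstream TSO patch set, not upstream headers.
#ifndef PR_GET_MEM_MODEL
#define PR_GET_MEM_MODEL 0x6d4d444c
#endif
#ifndef PR_SET_MEM_MODEL
#define PR_SET_MEM_MODEL 0x4d4d444c
#endif
#ifndef PR_SET_MEM_MODEL_TSO
#define PR_SET_MEM_MODEL_TSO 1
#endif

// Returns true if the hardware TSO memory model is active for this thread.
// On kernels or hardware without support, prctl fails and we return false,
// leaving the caller on its atomic/lrcpc code path.
bool TryEnableHardwareTSO() {
  const int Model = prctl(PR_GET_MEM_MODEL, 0, 0, 0, 0);
  if (Model < 0) {
    return false;  // Feature not present.
  }
  if (Model == PR_SET_MEM_MODEL_TSO) {
    return true;  // Already enabled.
  }
  return prctl(PR_SET_MEM_MODEL, PR_SET_MEM_MODEL_TSO, 0, 0, 0) == 0;
}
```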
This can be done in an OS agnostic fashion. FEXCore knows the details of
its JIT, so this should be done in FEXCore itself.
The frontend is only necessary to inform FEXCore where the fault occurred
and provide the array of GPRs for accessing and modifying the signal
state.
This is necessary for supporting both Linux and Wine signal contexts
with their unaligned access handlers.
In the case that an unknown drm device shows up, send it down the
default handler. This handler is just a passthrough and assumes that the
kernel doesn't have any compat handlers for that device.
This is nicer than crashing or returning EPERM, since then downstream
drm drivers like Xe, Asahi, and PVR can still try to run.
We of course still want to run their struct definitions through CI once
they go upstream.
These strings don't actually null-terminate and previous checks were
working just because it would usually be null terminated due to
initialization.
Since this isn't guaranteed, I noticed a failure to determine drm device
due to some trailing garbage in the string.
Fixes #2560
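The general shape of a length-aware comparison, as a sketch only
(`DriverNameMatches` is a hypothetical helper; the actual check is
structured differently):

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Compare a driver name that comes with an explicit length against an
// expected name. Nothing past NameLen is read, so trailing garbage after
// the name cannot affect the result.
bool DriverNameMatches(const char* Name, size_t NameLen, std::string_view Expected) {
  return std::string_view(Name, NameLen) == Expected;
}
```

For example, a buffer holding `amdgpu` followed by stale bytes still
matches `"amdgpu"` when the reported length is 6, whereas a `strcmp`
would read into the stale bytes.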
This was a forgotten member of the context that needs to be saved and
restored when jumping around the signal state.
When FEX was receiving signals back to back, there was a chance that the
signals would ride the edge of having set `InSyscallInfo` in the JIT,
which meant the SRA state would get saved once, then another signal
would occur with the previous SRA data, saving SRA again. Then when
unwinding the frames it would corrupt the SRA registers. This would
result in trying to load SRA state that was no longer valid, looking
like a crash in the JIT that was hard to diagnose.
Should also make Mono games a little less crash happy.
Cleans up CPUState memcpy as well, since it was a little weird looking.
There are a handful of siginfo layout types. To determine which layout
to use we need to check a combination of si_code and signal number
because each one in isolation doesn't explain the layout type.
Once the layout is calculated then calculate the siginfo using that
layout type. Adds a couple of different layout types to our guest
siginfo_t to handle the previously missing types.
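A simplified sketch of the classification step, loosely modelled on how
the Linux kernel picks a layout. The enum and function names are
hypothetical, and the real code covers more cases:

```cpp
#include <cassert>
#include <csignal>

// Hypothetical names for the siginfo union members we might pick between.
enum class SigInfoLayout { Kill, Timer, Poll, Fault, Child, RT, Sys };

// Neither the signal nor si_code alone is enough: a positive si_code is
// kernel-generated and its meaning depends on the signal, while a zero or
// negative si_code describes how userspace sent it.
SigInfoLayout ClassifySigInfo(int Signal, int Code) {
  if (Code > SI_USER && Code < SI_KERNEL) {
    switch (Signal) {
      case SIGILL:
      case SIGFPE:
      case SIGSEGV:
      case SIGBUS:
      case SIGTRAP: return SigInfoLayout::Fault;
      case SIGCHLD: return SigInfoLayout::Child;
      case SIGPOLL: return SigInfoLayout::Poll;
      case SIGSYS: return SigInfoLayout::Sys;
      default: return SigInfoLayout::Kill;
    }
  }

  if (Code == SI_TIMER) return SigInfoLayout::Timer;
  if (Code == SI_SIGIO) return SigInfoLayout::Poll;
  if (Code < 0) return SigInfoLayout::RT;
  return SigInfoLayout::Kill;  // SI_USER and SI_KERNEL.
}
```

Once the layout is known, the guest siginfo_t can be filled field by
field from the matching host union member.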
In the case that the symlink following would immediately fail (Due to
the first query not existing or not being a symlink), then FEX would
immediately return the first query. Then the resulting syscall using
this result would try to run the syscall on that file, immediately
error, and try again on the file outside of the rootfs. This gives us
three syscalls for the price of one.
This adds an additional syscall cost to every syscall that takes
the `FollowSymlink` code path, which is quite a few.
Instead now, check to see if the filepath exists at all, and if the
filepath doesn't even exist for the first fstatat, return a `NoEntry`
immediately. This will cause the resulting syscall waiting for the
result to skip the EmulatedFDPath check and run regular syscall.
If the filepath is a symlink it will still loop to track through the
entire symlink tree to find the one that is in the rootfs.
If the filepath is just a file, then the `IsLink` step will fail,
returning the filepath to the file inside of the rootfs. (Getting this
step wrong will make wine/proton immediately break).
With this resolved, this converts a lot of three syscall operations down
to two.
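The early-out can be pictured with a small sketch. The `PathKind` enum
and `ClassifyPath` function are hypothetical stand-ins for the real
`FollowSymlink` logic, and this version treats every fstatat failure as
a missing entry:

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/stat.h>

// Hypothetical result of the first query on a path inside the rootfs.
enum class PathKind { NoEntry, Symlink, NotSymlink };

// One fstatat answers both questions: does the path exist, and is it a link?
PathKind ClassifyPath(int DirFD, const char* Path) {
  struct stat Buf{};
  if (fstatat(DirFD, Path, &Buf, AT_SYMLINK_NOFOLLOW) != 0) {
    // Nothing there: the caller can skip the rootfs attempt entirely and
    // run the regular syscall.
    return PathKind::NoEntry;
  }
  // A symlink still needs to be walked. A plain file is used as-is from
  // inside the rootfs.
  return S_ISLNK(Buf.st_mode) ? PathKind::Symlink : PathKind::NotSymlink;
}
```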
This is a very OS specific operation and it living in FEXCore doesn't
make much sense. This still requires some strong collaboration between
FEXCore and the frontend but it is now split between the locations.
There's still a bit more cleanup work that can be done after this is
merged, but we need to get this burning fire out of the way.
This is necessary for llvm-mingw, this requires all previous PRs to be
merged first.
After this is merged, most of the llvm-mingw work is complete, just some
minor cleanups.
To be merged first:
- #2602
- #2604
- #2605
- #2607
- #2610
- #2615
- #2619
- #2621
- #2622
- #2624
- #2625
- #2626
- #2627
- #2628
- #2629
This is supposed to be a fuzzing based test but has been unused and not
fully supported for a long time.
Just remove it since we won't be coming back to this.
This lets all the path generation for the config live in the frontend.
This then informs FEXCore where things should live.
This is for llvm-mingw. While paths aren't quite generated correctly,
this gets the code closer to compiling.