FEXCore doesn't need to track the TLS state of the SignalDelegator; this is
a frontend concept.
Removes the tracking from the backend and keeps it in the frontend.
Previously: one clone thread's stack was kept alive in order to delay its
teardown.
With aggressive cloning and teardown, this was unsafe.
Only reap the stack when told it is safe to do so.
The creation mutex could have been held if the parent thread was in the
middle of creating a thread when forking. This would result in a
deadlock once the fork child attempted to create another thread.
Forcefully dropping the lock in the fork child works around this
deadlock. This comes at the expense of potentially leaving resources
guarded by the thread creation mutex in an invalid state. Crashes caused
by this are easier to reason about than a delayed deadlock, though.
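
A minimal sketch of the workaround described above, assuming a plain
pthread mutex stands in for the thread creation mutex. The names
CreationMutex, ResetCreationMutexInChild and ChildCanLockAfterFork are
hypothetical, and reinitializing the mutex is only one way to force-drop
it; it is not necessarily how the source implements it.

```cpp
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the thread creation mutex.
static pthread_mutex_t CreationMutex = PTHREAD_MUTEX_INITIALIZER;

// Runs in the fork child only. The parent thread that held the mutex does
// not exist in the child, so nothing would ever unlock it. Reinitializing
// a held mutex is formally outside what POSIX guarantees, and whatever the
// mutex guarded may be half-updated at this point.
void ResetCreationMutexInChild() {
  pthread_mutex_init(&CreationMutex, nullptr);
}

// Returns true if the fork child could take the creation mutex even though
// it was held at the moment of the fork.
bool ChildCanLockAfterFork() {
  // In the real scenario a different parent thread holds the mutex; the
  // forking thread takes it here only to keep the sketch single-threaded.
  pthread_mutex_lock(&CreationMutex);

  pid_t Pid = fork();
  if (Pid == 0) {
    ResetCreationMutexInChild();
    // Without the reset this would fail (or block forever with lock()).
    int Result = pthread_mutex_trylock(&CreationMutex);
    _exit(Result == 0 ? 0 : 1);
  }

  pthread_mutex_unlock(&CreationMutex);
  if (Pid < 0) {
    return false;
  }

  int Status = 0;
  waitpid(Pid, &Status, 0);
  return WIFEXITED(Status) && WEXITSTATUS(Status) == 0;
}
```

The trade-off is the one stated above: the child can create threads again,
but any state the mutex protected is not guaranteed to be consistent.
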
Moves the CTX LockBeforeFork into the SyscallHandler's LockBeforeFork.
This lets the syscall handler just call its own LockBeforeFork and
UnlockAfterFork functions rather than two functions at each call site.
Also moves the CTX->UnlockAfterFork into the SyscallHandler's to be
consistent with the LockBeforeFork half.
No functional change.
When code invalidation is happening we currently have the issue that a
thread can acquire the code invalidation mutex in the middle of
invalidation. This is due to us acquiring and releasing the mutex
between each thread's code invalidation.
We need to hold the mutex for the entire duration of all threads' code
invalidation.
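
The difference between the two locking patterns can be sketched as
follows. Thread, InvalidationManager and the member names are
hypothetical and only illustrate the lock scope, not the actual FEX
data structures.

```cpp
#include <mutex>
#include <vector>

// Hypothetical per-thread state; the flag stands in for real invalidation work.
struct Thread {
  bool Invalidated = false;
};

struct InvalidationManager {
  std::mutex InvalidationMutex;
  std::vector<Thread> Threads;

  // Old pattern: the mutex is released between threads, so another thread
  // can acquire it while only some threads have been invalidated.
  void InvalidatePerThreadLock() {
    for (auto& T : Threads) {
      std::lock_guard<std::mutex> Lock(InvalidationMutex);
      T.Invalidated = true;
    }
  }

  // New pattern: one acquisition covers the invalidation of every thread.
  void InvalidateAll() {
    std::lock_guard<std::mutex> Lock(InvalidationMutex);
    for (auto& T : Threads) {
      T.Invalidated = true;
    }
  }
};
```
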
This fixes a rare hang on Proton startup and resolves a consistent hang
on Proton application shutdown.
This now puts us on par with FEX-2312.1 with regard to hangs.
This does not fix a relatively rare hang on fork (which also existed with FEX-2312.1).
This also does not fix the issue that the intersection of our mutexes
between frontend and backend is very convoluted. Part of the work that
is going to fix the rare fork mutex hang will change more of this.
ARM64, x86 (64-bit), and x86 (32-bit) each have different alignment
requirements, so this change ensures that consistent data layout is
used for packing and unpacking.
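
As an illustration of why the layout has to be pinned down: a 64-bit
integer inside a struct is 4-byte aligned on 32-bit x86 but 8-byte
aligned on x86-64 and ARM64, so the same declaration can have different
sizes. A minimal sketch, assuming a hypothetical PackedRecord type and
the GCC/Clang packed attribute (the actual structures touched by this
change are not shown in the source):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record. Packing plus explicit padding gives the same layout
// on every architecture instead of relying on the ABI's natural alignment.
struct __attribute__((packed)) PackedRecord {
  uint32_t Type;
  uint32_t Padding; // explicit, so Value always sits at offset 8
  uint64_t Value;
};

static_assert(sizeof(PackedRecord) == 16, "layout must not depend on the ABI");
static_assert(offsetof(PackedRecord, Value) == 8, "Value must be at offset 8");

// Copy the record into a byte buffer of at least sizeof(PackedRecord) bytes.
void Pack(const PackedRecord& Record, uint8_t* Buffer) {
  std::memcpy(Buffer, &Record, sizeof(Record));
}

// Read a record back from a byte buffer written by Pack.
PackedRecord Unpack(const uint8_t* Buffer) {
  PackedRecord Record;
  std::memcpy(&Record, Buffer, sizeof(Record));
  return Record;
}
```
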
If the thread object is added to the tracking vector immediately then
there ends up being a race condition before the thread manages to fill
out the thread-specific data that only occurs at the start of the new
thread.
This manifests in a crash when a thread is allocating memory while
another thread is getting constructed. Easy fix is to defer the tracking
until the thread has set up its state.
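
The deferred tracking can be sketched like this. ThreadState,
ThreadTracker and ThreadEntry are hypothetical names; the point is only
the ordering, where the new thread fills in its own state before it
becomes visible to other threads.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical thread object. ThreadLocalData can only be filled in by the
// new thread itself, once it has started running.
struct ThreadState {
  int Storage = 0;
  int* ThreadLocalData = nullptr;
};

class ThreadTracker {
public:
  // Only fully initialized threads may be tracked.
  void Track(ThreadState* State) {
    assert(State->ThreadLocalData != nullptr);
    std::lock_guard<std::mutex> Lock(Mutex);
    Threads.push_back(State);
  }

  // What another thread (for example one allocating memory) relies on:
  // every thread it can see has its thread-specific data in place.
  bool AllInitialized() {
    std::lock_guard<std::mutex> Lock(Mutex);
    for (const ThreadState* State : Threads) {
      if (State->ThreadLocalData == nullptr) {
        return false;
      }
    }
    return true;
  }

  std::size_t Count() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Threads.size();
  }

private:
  std::mutex Mutex;
  std::vector<ThreadState*> Threads;
};

// Entry point executed on the new thread.
void ThreadEntry(ThreadTracker& Tracker, ThreadState& State) {
  State.Storage = 42;
  State.ThreadLocalData = &State.Storage; // thread-specific setup first
  Tracker.Track(&State);                  // only now visible to other threads
}
```

If the creating thread called Track before starting the new thread, other
threads walking the tracker could observe a null ThreadLocalData.
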
This is used for instcountci to ensure instruction counts don't change
whether or not a compiler supports this feature. Always disable it at
runtime when running in instcountci.
CMake option from #3394 can still be useful so leaving that in place.
We are required in our syscall emulation to handle cases where pointers
are invalid. This means we need to pessimistically assume a memcpy will
fault when reading application memory.
This implements a signal handler based approach to catching the SIGSEGV
on memcpy and returning an EFAULT if it faults.
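
A minimal sketch of a fault-tolerant memcpy, assuming a
sigsetjmp/siglongjmp recovery path. This is a swapped-in technique chosen
for brevity; the source only says the approach is signal handler based,
and the real implementation may recover differently. SafeMemcpy,
InstallFaultHandler and the other names are hypothetical.

```cpp
#include <atomic>
#include <cerrno>
#include <csetjmp>
#include <csignal>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Per-thread recovery point and a flag marking the guarded region.
static thread_local sigjmp_buf FaultJump;
static thread_local volatile sig_atomic_t InGuardedCopy = 0;

static void SegvHandler(int Signal) {
  if (InGuardedCopy) {
    // The fault came from the guarded memcpy: unwind back to SafeMemcpy.
    siglongjmp(FaultJump, 1);
  }
  // Not ours: restore the default action; returning re-executes the
  // faulting instruction, which then terminates the process as usual.
  signal(Signal, SIG_DFL);
}

void InstallFaultHandler() {
  struct sigaction Action {};
  Action.sa_handler = SegvHandler;
  sigemptyset(&Action.sa_mask);
  sigaction(SIGSEGV, &Action, nullptr);
}

// Returns 0 on success, or -EFAULT if the copy faulted.
int SafeMemcpy(void* Dst, const void* Src, size_t Size) {
  // Saving the signal mask (second argument 1) lets siglongjmp unblock
  // SIGSEGV again after the handler was left without returning.
  if (sigsetjmp(FaultJump, 1) != 0) {
    InGuardedCopy = 0;
    return -EFAULT;
  }
  InGuardedCopy = 1;
  // Compiler barriers keep the copy inside the guarded window.
  std::atomic_signal_fence(std::memory_order_seq_cst);
  std::memcpy(Dst, Src, Size);
  std::atomic_signal_fence(std::memory_order_seq_cst);
  InGuardedCopy = 0;
  return 0;
}
```

A syscall wrapper would call SafeMemcpy on application-provided pointers
and return the negative errno to the guest instead of crashing.
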
This behaves exactly like pidof but only searches for FEX applications.
This fixes a long-standing annoyance of mine that pidof doesn't work for
FEX. It knows how to decode the command line options to pull out the
program data.
If the Linux kernel ever accepts the patches for binfmt_misc to change
how the interpreter is handled then this will become redundant, but
until that happens here is a utility that I want.