This can be used to allow automatically handling structures that require
special behavior for one member but are automatically repackable otherwise.
The feature is enabled using the new custom_repack annotation and requires
additional repacking functions to be defined in the host file for each
customized member.
To avoid performance traps, several conditions must hold for exit repacking
to apply:
* Argument must be a pointer
* The pointee type must have differing data layout between guest and host
* The pointee type must be non-const
Arguments that don't meet the first two conditions are safe *not* to repack
on exit, since they're either passed by copy or have consistent data layout.
The third condition is a heuristic: In principle, an API function could modify
data through nested pointers even if the argument pointer is const. However,
automatic repacking is not supported for such types anyway, so this is a safe
heuristic to use.
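The three conditions can be written as a compile-time predicate. This is a minimal sketch, not the generator's actual logic: `has_compatible_layout` is a hypothetical trait standing in for the result of the data layout analysis, and `GuestOnlyPadding` is a made-up example type.

```cpp
#include <type_traits>

// Hypothetical stand-in for the data layout analysis result:
// true when guest and host agree on the layout of T.
template<typename T>
struct has_compatible_layout : std::true_type {};

// Made-up example type that we pretend is laid out differently on the guest.
struct GuestOnlyPadding {
  int a;
  long long b;
};
template<>
struct has_compatible_layout<GuestOnlyPadding> : std::false_type {};

// Exit repacking applies only if all three conditions hold.
template<typename Arg>
constexpr bool needs_exit_repack() {
  if constexpr (!std::is_pointer_v<Arg>) {
    return false; // passed by copy: nothing to write back
  } else {
    using Pointee = std::remove_pointer_t<Arg>;
    return !has_compatible_layout<std::remove_cv_t<Pointee>>::value // layout differs
        && !std::is_const_v<Pointee>;                               // callee may write
  }
}
```

With these definitions, only `GuestOnlyPadding*` qualifies; `const GuestOnlyPadding*`, `GuestOnlyPadding` by value and `int*` are all skipped.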
Pointer types inherently cause data layout compatibility issues, so they're
worth special-casing here. The wrappers will type-pun pointers to 32-bit or
64-bit integers (matching the guest architecture) to avoid direct host-side
use of guest pointers without consideration.
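As an illustration of the type-punning idea, the sketch below stores a guest pointer as an integer of the guest's pointer width. `guest_ptr`, `guest_ptr32`, `guest_ptr64` and `to_host` are hypothetical names chosen for this example, not the wrappers' real interface.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: a guest pointer held as a plain integer whose width
// matches the guest architecture (32-bit or 64-bit).
template<typename T, bool GuestIs32Bit>
struct guest_ptr {
  using storage = std::conditional_t<GuestIs32Bit, uint32_t, uint64_t>;
  storage raw;

  // Host code has to ask for a host pointer explicitly, so a guest pointer
  // is never dereferenced by accident.
  T* to_host() const {
    return reinterpret_cast<T*>(static_cast<uintptr_t>(raw));
  }
};

template<typename T> using guest_ptr32 = guest_ptr<T, true>;
template<typename T> using guest_ptr64 = guest_ptr<T, false>;
```

Because the stored value is an integer, its size is the same no matter which host the code is compiled for.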
The guest_layout wrapper provides an architecture-agnostic representation of
the guest data layout of each struct used in a thunked library. A constructor
is added to host_layout to allow conversion of the data to the host layout.
For types that are already fully compatible, both layout wrappers are simple
type aliases to minimize overhead.
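A rough sketch of how the two wrappers relate, assuming a 32-bit x86 guest (where `uint64_t` is 4-byte aligned inside structs) and a 64-bit host. The `Example` struct and the member names of the wrappers are invented for illustration; only the names `guest_layout` and `host_layout` come from the change itself.

```cpp
#include <cstdint>

// Hypothetical API struct. On a 64-bit host, `value` is 8-byte aligned,
// so 4 padding bytes follow `id`.
struct Example {
  uint32_t id;
  uint64_t value;
};

template<typename T> struct guest_layout;
template<typename T> struct host_layout;

// Assumed guest view: no padding between the members (12 bytes total).
template<>
struct guest_layout<Example> {
  struct __attribute__((packed)) type {
    uint32_t id;
    uint64_t value;
  } data;
};

// Host view: the native struct, constructible from the guest layout.
template<>
struct host_layout<Example> {
  Example data;

  // Member-wise conversion from guest layout to host layout.
  explicit host_layout(const guest_layout<Example>& from)
    : data { from.data.id, from.data.value } {}
};
```

The conversion copies member by member rather than with `memcpy`, since the member offsets differ between the two layouts.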
This allows future changes to emit interdependent helper structures in the
same order.
The sort algorithm was chosen for simplicity rather than performance. It's
fast enough in practice even for APIs as large as Vulkan.
These are set up to be nullptr by default. Instead of providing no-op
lock instructions, just have them be nullptr.
It's already part of the API that they need to be nullptr checked before
calling and this matches behaviour of the real libX11 library. Removes
some spam that is unnecessary.
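The calling convention this relies on looks roughly like the sketch below. `LockFns`, `lock_display` and `counting_lock` are hypothetical names for illustration and are not libX11's actual structures.

```cpp
// Hypothetical stand-in for the lock callbacks attached to a display.
// Both default to nullptr instead of pointing at no-op functions.
struct LockFns {
  void (*lock)(void* display) = nullptr;
  void (*unlock)(void* display) = nullptr;
};

// Callers check for nullptr before calling, so an unset callback is skipped.
inline void lock_display(const LockFns& fns, void* display) {
  if (fns.lock) {
    fns.lock(display);
  }
}

inline void unlock_display(const LockFns& fns, void* display) {
  if (fns.unlock) {
    fns.unlock(display);
  }
}

// Example callback used only to observe whether a call happened.
inline int lock_calls = 0;
inline void counting_lock(void*) { ++lock_calls; }
```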
These annotations allow for a given type or parameter to be treated as
"compatible" even if data layout analysis can't infer this automatically.
assume_compatible_data_layout is more powerful than is_opaque, since it
allows for structs containing members of a certain type to be automatically
inferred as "compatible".
Conversely, is_opaque enforces that the underlying data is never
accessed directly, since non-pointer uses of the type would still be
detected as "incompatible".
This annotation can be used for data types that can't be repacked
automatically even with custom repack annotations. With ptr_passthrough,
the types are wrapped in guest_layout and passed to the host like that.
Previously, two functions with the same signature would always be wrapped
in the same logic. This change allows customizing one function with
annotations while leaving the other one unchanged.
A couple of games were hitting these. Not sure how they were missed in
PR #3159, but this adds the missing one.
Small rearrangement to make this easier as well. Hopefully thunk stuff
lands sooner rather than later to automate this for Vulkan.
Maybe `-isystem` needs to be used instead of `-I`, unlike what #2076
did; this might depend on what is installed on the host system.
This runs the data layout analysis pass added in the previous change twice:
Once for the host architecture and once for the guest architecture. This
allows the new DataLayoutCompareAction to query architecture differences for
each type, which can then be used to instruct code generation accordingly.
Currently, type compatibility is classified into 3 categories:
* Fully compatible (same size/alignment for the type itself and any members)
* Repackable (incompatibility can be resolved with emission of automatable
repacking code, e.g. when struct members are located at differing offsets
due to padding bytes)
* Incompatible
The set of these types is tracked in AnalysisAction, to which extensive
verification logic is added to detect potential incompatibilities and to
enforce use of annotations where needed.
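The three categories can be illustrated with a simplified comparison of per-architecture struct descriptions. This is a sketch under assumptions: `MemberInfo`, `StructInfo` and `classify` are hypothetical, and treating any member whose own size differs as "incompatible" is a simplification of whatever DataLayoutCompareAction really does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class TypeCompat { Full, Repackable, Incompatible };

// Hypothetical description of one struct as seen by one architecture.
struct MemberInfo {
  std::string name;
  std::size_t size;
  std::size_t offset;
};

struct StructInfo {
  std::size_t size;
  std::size_t align;
  std::vector<MemberInfo> members;
};

TypeCompat classify(const StructInfo& guest, const StructInfo& host) {
  if (guest.members.size() != host.members.size()) {
    return TypeCompat::Incompatible;
  }
  bool same_offsets = true;
  for (std::size_t i = 0; i < guest.members.size(); ++i) {
    const MemberInfo& g = guest.members[i];
    const MemberInfo& h = host.members[i];
    // A member that itself differs can't be fixed by moving it around.
    if (g.name != h.name || g.size != h.size) {
      return TypeCompat::Incompatible;
    }
    same_offsets = same_offsets && (g.offset == h.offset);
  }
  const bool same_shape = guest.size == host.size && guest.align == host.align;
  // Same members at different offsets (padding) can be repacked automatically.
  return (same_offsets && same_shape) ? TypeCompat::Full : TypeCompat::Repackable;
}
```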
Two bugs here that caused X11 thunking in Wine/Proton to not
work.
The easier of the two. The various variadic functions that we thunk
actually take key:value pairs where the first is a string pointer, and
the value can be various things.
We need to handle these as true key:value pairs rather than finding the
first nullptr and dropping the remainder.
Additionally, there are 12 keys that specify a callback that FEX needs
to catch and convert to host callable. Wine is the first application
that I have seen that actually uses this. If these callbacks aren't
wired up then it can miss events.
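Handling the arguments as true pairs could look like the sketch below. `collect_pairs` and the flattened `args` array are hypothetical and only illustrate the termination rule; the real thunk also has to recognize the callback keys and wrap their values.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Pair = std::pair<const char*, void*>;

// Walk a flattened list {key0, value0, key1, value1, ..., nullptr}.
// The list ends at the first nullptr *key*. A nullptr *value* is a
// legitimate value and must not end the walk.
std::vector<Pair> collect_pairs(void* const* args) {
  std::vector<Pair> out;
  for (std::size_t i = 0; args[i] != nullptr; i += 2) {
    out.emplace_back(static_cast<const char*>(args[i]), args[i + 1]);
  }
  return out;
}
```

A scan that stops at the first nullptr anywhere in the list would drop every pair after a key whose value happens to be nullptr.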
The harder of the two problems is that the `libX11_Variadic_u64`
function was subtly incorrect. Nothing had previously truly exercised
this and my test program didn't notice anything wrong while writing it.
The first incorrect thing was that it was subtracting the nullptr ender
variable before the stack size calculation, causing the value to
overwrite the stack if the number of remaining elements was even.
Secondly the assembly that was storing two elements per step was
decrementing the counter by 8 instead of 2. Didn't pick this up before
since I believe the code was only hitting the non-pair path before.
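Both mistakes are easier to see in plain C++ than in assembly. The sketch below is not the thunk's code: `stack_bytes` and `copy_slots` are hypothetical helpers, and the 16-byte stack alignment is an assumption based on the usual x86-64 and AArch64 ABIs.

```cpp
#include <cstddef>
#include <cstdint>

// Stack bytes needed for `count` 8-byte slots, rounded up to 16 bytes.
// `count` must include the nullptr terminator slot.
constexpr std::size_t stack_bytes(std::size_t count) {
  return (count * 8 + 15) & ~std::size_t{15};
}

// Copy two slots per step. The counter tracks elements, so it goes down
// by 2 per step, not by 8 (the size of one slot in bytes).
inline void copy_slots(uint64_t* dst, const uint64_t* src, std::size_t count) {
  while (count >= 2) {
    dst[0] = src[0];
    dst[1] = src[1];
    dst += 2;
    src += 2;
    count -= 2;
  }
  if (count == 1) {
    *dst = *src; // odd element left over
  }
}
```

With two elements plus the terminator, `stack_bytes(3)` reserves 32 bytes, while sizing for only the two elements reserves 16 and leaves no room for the terminator. With an odd element count, the round-up happens to hide the mistake.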
This gets Proton thunking working under FEX now.
Fixes #2754
These panicking fallbacks are at times not ending up as PLT calls,
for some reason that I haven't been able to reproduce locally.
So far the only way I can reproduce is building with Canonical's PPA
build system, since rebuilding locally didn't resolve the issue.
This will change the failure mode from these panicking asserts happening
at call time to dlopen failing during relocation when loading the
thunk. LD_DEBUG=all can be used to debug the relocation failure in that
case.
Our regex would only ever capture a single digit, so versions that had
more than one digit per section would lose additional digits.
Fixes and moves the helper to a cmake file to be shared between
GuestLibs and HostLibs.
Uses the fix in xcb because Fedora ships an older version that doesn't
have some of FEX's newer symbols.
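The helper itself is CMake, but the single-digit problem is the same in any regex dialect. The C++ sketch below only illustrates the pattern fix; `parse_version` is a hypothetical function and not part of the build scripts.

```cpp
#include <array>
#include <optional>
#include <regex>
#include <string>

// Parse "major.minor.patch" out of a string. `[0-9]+` captures every digit
// of a section; a bare `[0-9]` would turn "1.17.0" into 1, 1 and lose digits.
std::optional<std::array<int, 3>> parse_version(const std::string& text) {
  static const std::regex pattern(R"(([0-9]+)\.([0-9]+)\.([0-9]+))");
  std::smatch m;
  if (!std::regex_search(text, m, pattern)) {
    return std::nullopt;
  }
  return std::array<int, 3>{ std::stoi(m[1].str()),
                             std::stoi(m[2].str()),
                             std::stoi(m[3].str()) };
}
```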
Forgot to initialize CBDone to false before the helper thread is
started.
This fixes an issue where an XCB context is created, then stopped, then
another is created but immediately exits because CBDone was still true
from the previous run.
Also adds the handler for `xcb_connect_to_fd` so we don't miss that
usage.
Fixes a crash in xcb thunks since more stuff on FEX side has moved over
to jemalloc.
Destructors don't actually get called when a shared library is removed.
It's some weirdo quirk that we can't work around. Instead, refcount
Display connections as they are created and disconnected, creating the
thread on the first display creation and tearing it down on the final
display teardown.
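The refcounting scheme amounts to something like the following sketch. `DisplayRefCount` and its callbacks are hypothetical; in the thunk, the start and stop actions would create and join the helper thread.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical helper: runs `start` when the first display is opened and
// `stop` when the last one is closed, instead of relying on destructors.
class DisplayRefCount {
public:
  DisplayRefCount(std::function<void()> start, std::function<void()> stop)
    : start_(std::move(start)), stop_(std::move(stop)) {}

  void on_open() {
    std::lock_guard<std::mutex> lk(mutex_);
    if (count_++ == 0) {
      start_(); // first display: bring the helper up
    }
  }

  void on_close() {
    std::lock_guard<std::mutex> lk(mutex_);
    if (count_ > 0 && --count_ == 0) {
      stop_(); // last display: tear the helper down
    }
  }

private:
  std::mutex mutex_;
  std::size_t count_ = 0;
  std::function<void()> start_;
  std::function<void()> stop_;
};
```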
Must be merged before #2564 otherwise that PR will break thunks.
These need to be bit-exact following exactly what is shown in the
assembly.
libunwind parses where EIP is to see if it is in a stack frame.
Also needs to live in VDSO otherwise backtrace doesn't work.
1) The host library needs to be loaded in the global namespace.
2) We need to use `RTLD_DEFAULT` instead of querying the object
directly.
We need to load the host library in the global namespace so the symbols
end up in the global symbol table. This follows how all these symbols
/usually/ get loaded. Either by linking directly to the library or how
loaders will end up loading these.
We need to use RTLD_DEFAULT to follow symbol overriding rules that tend
to occur. For example, MangoHUD will LD_PRELOAD a library that provides
GLX and EGL symbols, and FEX's thunk libraries need to pick up this
override.
If we are querying the host library directly then we fail to pick up
these overrides, thus breaking MangoHUD and other overlays.
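In terms of the `dlfcn.h` API, the two points look roughly like this. `load_host_library` and `resolve_symbol` are hypothetical helper names, and `RTLD_LAZY` is an assumption since the binding mode isn't stated here; `dlopen`, `dlsym`, `RTLD_GLOBAL` and `RTLD_DEFAULT` are the standard glibc interfaces.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // RTLD_DEFAULT is a GNU extension in glibc's <dlfcn.h>
#endif
#include <dlfcn.h>

// 1) Load the host library into the global namespace so its symbols join
//    the global symbol table, as if the program had linked against it.
inline void* load_host_library(const char* soname) {
  return dlopen(soname, RTLD_GLOBAL | RTLD_LAZY);
}

// 2) Resolve through the default search order rather than through the
//    library handle, so an LD_PRELOAD'ed override is found first.
inline void* resolve_symbol(const char* name) {
  return dlsym(RTLD_DEFAULT, name);
}
```

Calling `dlsym` with the handle returned by `dlopen` would search only that library and its dependencies, which is why a preloaded override would be missed.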
Fixes a crash in `_XInitDisplayLock` caused by the display lock
function being initialized to our own handler.
After XInitThreads has been called once, it becomes a no-op.
steamwebhelper was hitting this.
This was generating GOT prologues even on naked functions which was
breaking VDSO on 32-bit.
Fixes almost every 32-bit application when running with debug options.
Clang thunks already have these enabled by default, but let's also
enable this on the GCC side.
sse2 will enable most things we care about, which matches ASIMD quite
closely.
fpmath=sse removes some x87 usage for 32-bit thunks specifically.
Should effectively be a non-functional change.
Each one of these is sorted through the DefinitionExtract.py script
running over a temporary header file for each set of includes.
eg:
```bash
$ cat test.h
#include <X11/Xproto.h>
#include <X11/XKBlib.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xresource.h>
#include <X11/ImUtil.h>
$ ./Scripts/DefinitionExtract.py test.h > out.txt
```
Any custom defined types have been sorted appropriately.
A bunch of XKB definitions were missing and were added in the
process.
I've had this stashed in my git stash for a while now, I just haven't
cleaned it up.
Fixes a bunch of thunks around X11 applications missing symbols.
Due to how we use a modified ABI for these indirect functions, we don't
have a clean way to say that the host_addr lives in a side-argument.
The previous inline asm that moved the value from r11 into a variable
worked up until you hit functions with 8 or more arguments. At that
point the compiler was generating code before our inline assembly and
using r11 as a temporary, thus destroying our value.
Then a crash would occur and it was very hard to determine why. It would
end up calling some random function (0x1 in this case) from an indirect
call.
This made it /look/ like it was calling an invalid function returned
from the loader but in reality it was a corrupt register loading bad
data.
To work around this case, we can use an inline asm register variable and
a volatile asm block that "sets" the variable. In this case GCC and
Clang both seem to extend the live range of the register from the start
of the function to the use of the variable.
This resolves the issue for now, and I tested quite a large number of
function signatures to see if it would break in the future.
Theoretically our functional testing should catch this, but we don't
currently have something that abuses all the functions like this.
Fairly straightforward, just requires enabling lld in this case since
cross-compiling doesn't work well with the GNU linker.
Also lld doesn't understand the linker script program header symbolic
names for read/write/execute. So we need to use the raw number there.
Works around an issue where GCC 11 generates a broken `init_array`
section and also plt sections that glibc doesn't understand.
The `mov ebp, ecx` was breaking vsyscall and was expected to be used
with the `syscall` instruction rather than `int 0x80`.
Remove that to fix it.
Also remove the pushes and pops around the syscall instruction. These
are unnecessary in an emulated environment since we won't clobber the
registers.
Fixes Steam execution with VDSO.