Moves from FEXCore to FEX::HLE. Also moves the ThunkFunctions that get
exposed to a namespace to make it more obvious that these are
thunk handlers rather than just static functions.
Alternative to #3638. This is theoretically better for side-by-side diffs. In
practice it may make other diffs worse, since all the \'s change when part of
the macro changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
X11 displays and xcb connections managed by the guest libX11 can't be used by
the host, but we can create intermediary objects using the host libX11. This
allows connecting guest-managed objects to the host window system integration
APIs in OpenGL/Vulkan.
This is required for host-side calls to guest functions on 32-bit guests.
Since the host stack is allocated before FEX blocks memory inaccessible to
the guest, the guest would otherwise fail to read the packed argument data.
ARM64, x86 (64-bit), and x86 (32-bit) each have different alignment
requirements, so this change ensures that consistent data layout is
used for packing and unpacking.
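
A minimal sketch of the layout problem, using a hypothetical argument block
(the struct and member names are assumptions, not FEX's actual definitions).
On 32-bit x86, uint64_t is only 4-byte aligned inside a struct, while x86-64
and ARM64 align it to 8 bytes, so natural alignment would give different
offsets. Forcing the same alignment on every slot is one way to make the
layout identical everywhere:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical packed-argument block for a callback taking
// (uint32_t, uint64_t) and returning uint32_t.
// With natural alignment, `b` would land at offset 4 on 32-bit x86 but at
// offset 8 on x86-64 and ARM64. Explicit 8-byte alignment on every slot
// removes that difference.
struct PackedCallbackArgs {
  alignas(8) uint32_t a;
  alignas(8) uint64_t b;
  alignas(8) uint32_t result;
};

// These hold regardless of which of the three architectures compiles this.
static_assert(offsetof(PackedCallbackArgs, a) == 0);
static_assert(offsetof(PackedCallbackArgs, b) == 8);
static_assert(offsetof(PackedCallbackArgs, result) == 16);
```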
Generally, implicit integer conversions are prohibited for data wrapped in
guest_layout/host_layout, but a few types are exceptional:
* char vs signed char vs unsigned char vs other 8-bit ints
* wchar_t vs other 32-bit ints
* size_t vs uint32_t (32-bit only)
* long long vs other 64-bit ints
* long vs long long (64-bit only)
These combinations have the same data size, so conversions between them are
explicitly allowed now.
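
The "same data size" criterion can be sketched as a small trait. The trait
name is hypothetical, and the real rules only whitelist the specific pairs
listed above, so this is an illustration of the idea rather than the actual
check:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical trait: a conversion is acceptable only between integral
// types that occupy the same number of bytes.
template<typename From, typename To>
inline constexpr bool is_same_size_integer_conversion_v =
  std::is_integral_v<From> && std::is_integral_v<To> &&
  sizeof(From) == sizeof(To);

// Allowed: all of these pairs have identical size.
static_assert(is_same_size_integer_conversion_v<char, unsigned char>);
static_assert(is_same_size_integer_conversion_v<signed char, uint8_t>);
static_assert(is_same_size_integer_conversion_v<long long, uint64_t>);

// Still rejected: the sizes differ, so data would be truncated or extended.
static_assert(!is_same_size_integer_conversion_v<uint32_t, uint64_t>);
```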
Some types (notably size_t on 32-bit) have different sizes on the guest than on
the host. This template function must be aware of these differences, so a
second parameter list with fixed-size types must be provided to describe the
guest types.
Note that this information can't be queried through type traits: To a C++
compiler, size_t is indistinguishable from uint64_t. For this reason, the
correct guest type must indeed be provided externally.
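
A minimal sketch of what a second, externally supplied parameter list could
look like. Both `guest_types` and `any_size_differs` are hypothetical names
introduced for illustration; uint64_t stands in for the host's size_t so the
example does not depend on the host platform:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical tag type carrying the fixed-size guest-side parameter types.
template<typename... GuestArgs>
struct guest_types {};

// Hypothetical helper: the host parameter list is given explicitly, the
// guest parameter list is deduced from the tag. Returns true if any
// argument has a different size on the guest than on the host.
template<typename... HostArgs, typename... GuestArgs>
constexpr bool any_size_differs(guest_types<GuestArgs...>) {
  static_assert(sizeof...(HostArgs) == sizeof...(GuestArgs),
                "both parameter lists must describe the same arguments");
  return ((sizeof(HostArgs) != sizeof(GuestArgs)) || ...);
}

// 32-bit guest: the second argument (a size) is 4 bytes on the guest but
// 8 bytes on the host, which the caller has to spell out.
static_assert(any_size_differs<int32_t, uint64_t>(guest_types<int32_t, uint32_t>{}));

// 64-bit guest: both lists agree, so nothing needs adjusting.
static_assert(!any_size_differs<int32_t, uint64_t>(guest_types<int32_t, uint64_t>{}));
```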
This can be used to allow automatically handling structures that require
special behavior for one member but are automatically repackable otherwise.
The feature is enabled using the new custom_repack annotation and requires
additional repacking functions to be defined in the host file for each
customized member.
To avoid performance traps, several conditions must hold for exit repacking
to apply:
* Argument must be a pointer
* The pointee type must have differing data layout between guest and host
* The pointee type must be non-const
Arguments that don't meet the first two conditions are safe *not* to repack
on exit, since they're either passed by copy or have consistent data layout.
The third condition is a heuristic: In principle, an API function could modify
data through nested pointers even if the argument pointer is const. However,
automatic repacking is not supported for such types anyway, so this is a safe
heuristic to use.
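
The three conditions can be sketched as a compile-time predicate.
`has_compatible_layout_v`, `needs_exit_repack_v`, and the `Event` struct are
hypothetical stand-ins; in particular, the layout flag would come from data
layout analysis rather than from a hand-written specialization:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the result of data layout analysis: true when
// a type is laid out identically on guest and host.
template<typename T>
inline constexpr bool has_compatible_layout_v = true;

// Hypothetical struct assumed to differ between guest and host.
struct Event { long timestamp; };
template<>
inline constexpr bool has_compatible_layout_v<Event> = false;

// Exit repacking applies only if all three conditions hold.
template<typename Arg>
inline constexpr bool needs_exit_repack_v =
  std::is_pointer_v<Arg> &&                                   // 1) pointer argument
  !has_compatible_layout_v<
    std::remove_cv_t<std::remove_pointer_t<Arg>>> &&          // 2) layouts differ
  !std::is_const_v<std::remove_pointer_t<Arg>>;               // 3) pointee is non-const

static_assert(needs_exit_repack_v<Event*>);
static_assert(!needs_exit_repack_v<const Event*>);  // const pointee: skipped
static_assert(!needs_exit_repack_v<Event>);         // passed by copy
static_assert(!needs_exit_repack_v<int32_t*>);      // consistent layout
```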
Pointer types inherently cause data layout compatibility issues, so they're
worth special-casing here. The wrappers will type-pun pointers to 32-bit or
64-bit integers (matching the guest architecture) to avoid direct host-side
use of guest pointers without consideration.
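
A minimal sketch of the type-punning idea. The wrapper name, the
`GuestIs32Bit` parameter, and the accessor are all assumptions made for this
example; a real build would presumably fix the guest width per target rather
than pass it as a template argument:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: stores a guest pointer as a fixed-width integer so
// that host code cannot dereference it by accident.
template<typename T, bool GuestIs32Bit>
struct guest_pointer {
  using storage_type = std::conditional_t<GuestIs32Bit, uint32_t, uint64_t>;
  storage_type data;

  // Turning the value back into a host pointer is an explicit, named step.
  T* as_host_pointer() const {
    return reinterpret_cast<T*>(static_cast<uintptr_t>(data));
  }
};

// The wrapper has the guest's pointer size, independent of the host.
static_assert(sizeof(guest_pointer<int, true>) == 4);
static_assert(sizeof(guest_pointer<int, false>) == 8);
```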
The guest_layout wrapper provides an architecture-agnostic representation of
the guest data layout of each struct used in a thunked library. A constructor
is added to host_layout to allow conversion of the data to the host layout.
For types that are already fully compatible, both layout wrappers are simple
type aliases to minimize overhead.
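
A minimal sketch of the wrapper pair for one struct, assuming a 32-bit guest.
The `Rect` type and the hand-written specializations are hypothetical; the
source describes these wrappers as being provided for each struct in a
thunked library, and this only shows the shape of the conversion:

```cpp
#include <cstdint>

template<typename T> struct guest_layout;
template<typename T> struct host_layout;

// Hypothetical library struct as the host sees it.
struct Rect { long x, y; };

// Guest view: fixed-size members, here assuming `long` is 4 bytes on the guest.
template<>
struct guest_layout<Rect> {
  struct type { int32_t x, y; } data;
};

// Host view: constructed from the guest view, converting member by member.
template<>
struct host_layout<Rect> {
  Rect data;

  constexpr explicit host_layout(const guest_layout<Rect>& from)
    : data{from.data.x, from.data.y} {}
};

// The guest representation does not depend on the host's size of `long`.
static_assert(sizeof(guest_layout<Rect>) == 8);
```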
These annotations allow for a given type or parameter to be treated as
"compatible" even if data layout analysis can't infer this automatically.
assume_compatible_data_layout is more powerful than is_opaque, since it
allows for structs containing members of a certain type to be automatically
inferred as "compatible".
Conversely, is_opaque enforces that the underlying data is never
accessed directly, since non-pointer uses of the type would still be
detected as "incompatible".
This annotation can be used for data types that can't be repacked
automatically even with custom repack annotations. With ptr_passthrough,
the types are wrapped in guest_layout and passed to the host like that.
Previously, two functions with the same signature would always be wrapped
in the same logic. This change allows customizing one function with
annotations while leaving the other one unchanged.
Fixes #2754
These panicking fallbacks sometimes don't end up as PLT calls, for
a reason that I haven't been able to reproduce locally.
So far the only way I can reproduce is building with Canonical's PPA
build system, since rebuilding locally didn't resolve the issue.
This changes the failure mode: instead of these panicking asserts firing
at call time, dlopen will fail during relocation when loading the
thunk. LD_DEBUG=all can then be used to debug the relocation failure.
1) The host library needs to be loaded in the global namespace.
2) We need to use `RTLD_DEFAULT` instead of querying the object
directly.
We need to load the host library in the global namespace so the symbols
end up in the global symbol table. This follows how all these symbols
/usually/ get loaded. Either by linking directly to the library or how
loaders will end up loading these.
We need to use RTLD_DEFAULT to follow symbol overriding rules that tend
to occur. For example, MangoHUD will LD_PRELOAD a library that provides
GLX and EGL symbols, and FEX's thunk libraries need to pick up this
override.
If we query the host library directly, we fail to pick up
these overrides, thus breaking MangoHUD and other overlays.
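
The two requirements can be sketched with the standard dlfcn API. The helper
names are hypothetical and error handling is reduced to a boolean; this is an
illustration of the lookup order, not the thunk loader itself:

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // RTLD_DEFAULT is a GNU extension in glibc's <dlfcn.h>
#endif
#include <cassert>
#include <dlfcn.h>

// 1) Load the host library into the global namespace so that its symbols
//    join the global symbol table. Returns false if loading failed.
inline bool LoadHostLibraryGlobally(const char* library_name) {
  assert(library_name != nullptr);
  return dlopen(library_name, RTLD_GLOBAL | RTLD_LAZY) != nullptr;
}

// 2) Resolve through the default search order instead of through the
//    library handle, so an LD_PRELOAD-ed override is found first.
inline void* ResolveHostSymbol(const char* symbol_name) {
  return dlsym(RTLD_DEFAULT, symbol_name);
}
```

Calling `dlsym` with the handle returned by `dlopen` would instead search only
that library and its dependencies, which is why a preloaded overlay would be
skipped.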
Due to how we use a modified ABI for these indirect functions, we don't
have a clean way to say that the host_addr lives in a side-argument.
The previous inline asm that moved the value from r11 into a variable
worked up until you hit functions with 8 or more arguments. At that
point the compiler was generating code before our inline assembly and
using r11 as a temporary, thus destroying our value.
Then a crash would occur and it was very hard to determine why. It would
end up calling some random function (0x1 in this case) from an indirect
call.
This made it /look/ like it was calling an invalid function returned
from the loader but in reality it was a corrupt register loading bad
data.
To work around this case, we can use an inline asm register variable and
a volatile asm block that "sets" the variable. In this case GCC and
Clang both seem to extend the live range of the register from the start
of the function to the use of the variable.
This resolves the issue for now, and I tested quite a large number of
function signatures to see if it would break in the future.
Theoretically our functional testing should catch this, but we don't
currently have something that abuses all the functions like this.
Fairly straightforward; it just requires enabling lld, since
cross-compiling doesn't work well with the GNU linker.
lld also doesn't understand the linker script's symbolic program header
names for read/write/execute, so we need to use the raw numbers there.
Works around an issue where GCC 11 generates a broken `init_array` section
as well as PLT sections that glibc doesn't understand.
Use the fastcall ABI for 32-bit x86 to make our lives easier.
Fastcall ABI puts the first two 32-bit arguments in ECX and EDX
respectively.
Compilers are nice today and allow us to do cross-abi function calls
like this.
This avoids the need to provide a fallback definition for platform-specific
macros. The definitions are only added host-side, since only Host.h is
included in any interface files.