We have supported this since #163 but we haven't been exposing the
feature in hwcap2.
We have exposed it in CPUID this entire time, just not in hwcap2.
This is a new procfs symlink path that changes behaviour of binfmt_misc
when exposed. We need to check both procfs/exe and procfs/interpreter
and see if they exist AND also differ.
Once both exist and differ, we can disable a bunch of path checking.
The fallback when none of this is supported has the same behaviour as
previously: it still does all the regular checking.
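A minimal sketch of the "exist AND also differ" check described above. The function name is hypothetical and the two symlink locations are passed in by the caller, since the message only names them as procfs/exe and procfs/interpreter; this is not the actual FEX implementation.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns true only when both symlinks can be read
// and they point at different targets. Any failure to read either link
// returns false so the caller keeps the regular path checking.
bool InterpreterDiffersFromExe(const fs::path& ExeLink, const fs::path& InterpreterLink) {
  std::error_code ec;

  const fs::path Exe = fs::read_symlink(ExeLink, ec);
  if (ec) {
    return false; // Link missing or unreadable: fall back.
  }

  const fs::path Interpreter = fs::read_symlink(InterpreterLink, ec);
  if (ec) {
    return false; // Kernel does not expose the interpreter link: fall back.
  }

  return Exe != Interpreter;
}
```

A caller would only skip the extra path checks when this returns true, and otherwise take the same route as before.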
During binfmt_misc install, cmake will check the kernel version for the
raw binfmt_misc writing. This check will never pass until there is a real
kernel version that the feature is upstreamed in.
For update-binfmts we add a new optional argument where the tool will
drop the flag if the host kernel version isn't new enough to handle the
option.
Now that PF calculation is deferred, the cost of calculating PF correctly should
be tolerable. Remove the speed hack to skip PF. It's fundamentally broken, and
there are enough broken things in FEX as it is that we don't need to maintain
this one ;-)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Currently FEX's internal EFLAGS representation is a perfect 1:1 mapping
between bit offset and byte offset. This is going to change with #3038.
There should be no reason that the frontend needs to understand how to
reconstruct the compacted flags from the internal representation.
Adds context helpers and moves all the logic to FEXCore. The locations
that previously needed to handle this have been converted over to use
this.
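As a rough illustration of what such a helper has to do, the sketch below packs a one-byte-per-flag layout into a compacted 32-bit EFLAGS value and back. The type and function names are hypothetical, and the layout is assumed to be the current 1:1 mapping described above; the real FEXCore helpers may look different.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed internal layout: one byte per EFLAGS bit, array index == bit offset.
using FlagBytes = std::array<uint8_t, 32>;

// Pack the byte-per-flag representation into a compacted EFLAGS value.
uint32_t ReconstructCompactedEFLAGS(const FlagBytes& Flags) {
  uint32_t EFLAGS = 0;
  for (std::size_t Bit = 0; Bit < Flags.size(); ++Bit) {
    EFLAGS |= static_cast<uint32_t>(Flags[Bit] & 1) << Bit;
  }
  return EFLAGS;
}

// Unpack a compacted EFLAGS value into the byte-per-flag representation.
void SetFlagsFromCompactedEFLAGS(FlagBytes& Flags, uint32_t EFLAGS) {
  for (std::size_t Bit = 0; Bit < Flags.size(); ++Bit) {
    Flags[Bit] = static_cast<uint8_t>((EFLAGS >> Bit) & 1);
  }
}
```

Keeping both directions behind helpers like these means only one place needs to change if the internal layout stops being 1:1.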
When the application calls exit_group we no longer need to care about
cleanup because the entire process group is leaving.
Just immediately call exit group and get out. Might revisit this in the
future.
Fixes #2752
This relies on wine's behaviour passing through linux paths and env vars,
so that the config in the user's home directory can be accessed outside
of the wine prefix.
Instead of allocating a temporary copy of the string, return a view of
it. This should improve the performance of system calls that take file
paths, since previously a string was allocated for every single syscall
that uses them.
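The difference can be sketched as follows. Both function names are hypothetical stand-ins and not the functions this change touches; the point is only that the view variant never allocates.

```cpp
#include <string>
#include <string_view>

// Before (illustrative): every call allocates and copies the path.
std::string GetPathCopy(const char* Path) {
  return std::string(Path);
}

// After (illustrative): a non-owning view over the caller's buffer.
// The view is only valid while the underlying buffer is alive and unmodified.
std::string_view GetPathView(const char* Path) {
  return std::string_view(Path);
}
```

The trade-off is lifetime: the caller has to keep the original buffer alive for as long as the view is used, which holds for a guest path pointer during a single syscall.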
This will allow investigating the Arm64 code directly next to the test,
plus publicly linking directly to badly behaving tests.
Perfect for nerdsniping implementations.
Implements CI for tracking instruction counts for generated blocks of
code when transforming from x86 to ARM64 assembly.
This will end up encompassing every instruction in our instruction
tables similarly to how our assembly tests try to test everything in our
instruction tables.
Incidentally, the data for this CI is generated using our assembly
tests. By enabling disassembly and instruction stats when executing a
suite of instructions, this gives the stats that can be added to a json
file.
The current implementation only implements the SecondGroup table of
instructions because it is a relatively small table and has known
inefficiencies in the instruction implementations. As this gets merged I
will be adding more tables of instructions to additional json files for
testing.
These JSON files will support adjusting CPU features regardless of the
host features so it can test implementations depending on different CPU
features. This will let us test things like one instruction having
different "optimal" implementations depending on if it supports SVE128,
SVE256, SVEI8MM, etc.
This initial instruction auditing is what found the bug in our vector
shift instructions by size of zero. If inspecting the result of the CI
run, you can tell that these instructions still aren't "optimal" because
they are doing loads and stores that can be eliminated.
The "Optimal" field in the JSON is purely for human readability and for
grepping to see what is optimal versus not. Same with the "Comment"
section.
According to my auditing spreadsheet, the total number of instructions
that will end up in these json files will be about 1000, but we will
likely end up with more since there will be edge cases that can be more
optimal depending on arguments.
Moves the dummy handlers over to this library. This will end up getting
used for more than the mingw test harness runner once the instruction
count CI is operational.
Allows us to consume an array of strings and convert it to a mask of
enum values. This is a quality of life change that allows us to specify
a mask of options.
The first configuration option added to support this is to control the
vixl disassembler. Now by default the vixl disassembler doesn't
disassemble any blocks and needs to be enabled individually.
eg:
```
FEXLoader --disassemble=blocks <args>
FEXLoader --disassemble=dispatcher <args>
FEXLoader --disassemble=blocks,dispatcher <args>
```
Has the additional convenience option of just passing in numbers as
well.
```
FEXLoader --disassemble=2 <args>
FEXLoader --disassemble=1 <args>
FEXLoader --disassemble=3 <args>
```
Also of course all of this works through environment variables.
```
FEX_DISASSEMBLE=blocks FEXInterpreter <args>
FEX_DISASSEMBLE=dispatcher FEXInterpreter <args>
FEX_DISASSEMBLE=blocks,dispatcher FEXInterpreter <args>
```
While only used sparingly now, additional configurations are likely to
use this in the future, since we already have some configs that are
basically enums but are implemented with string comparisons.
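A minimal sketch of the string-to-mask conversion, accepting either a comma-separated list of names or a plain number. The bit values (dispatcher = 1, blocks = 2) are an assumption inferred from the ordering of the examples above, and `ParseDisassembleMask` is a hypothetical name rather than the actual config parser.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Assumed bit assignments, inferred from the examples in this message.
constexpr uint64_t DISASSEMBLE_DISPATCHER = 1U << 0;
constexpr uint64_t DISASSEMBLE_BLOCKS = 1U << 1;

// Hypothetical parser: returns the mask, or nullopt on an unknown name.
std::optional<uint64_t> ParseDisassembleMask(std::string_view Arg) {
  // Convenience path: the whole argument is a number.
  uint64_t Numeric {};
  const char* End = Arg.data() + Arg.size();
  auto [Ptr, Ec] = std::from_chars(Arg.data(), End, Numeric);
  if (Ec == std::errc {} && Ptr == End) {
    return Numeric;
  }

  static constexpr std::pair<std::string_view, uint64_t> Names[] = {
    {"dispatcher", DISASSEMBLE_DISPATCHER},
    {"blocks", DISASSEMBLE_BLOCKS},
  };

  // Otherwise walk the comma-separated names and OR their bits together.
  uint64_t Mask = 0;
  while (!Arg.empty()) {
    const auto Comma = Arg.find(',');
    const std::string_view Token = Arg.substr(0, Comma);

    bool Found = false;
    for (const auto& [Name, Value] : Names) {
      if (Token == Name) {
        Mask |= Value;
        Found = true;
        break;
      }
    }
    if (!Found) {
      return std::nullopt;
    }

    Arg = Comma == std::string_view::npos ? std::string_view {} : Arg.substr(Comma + 1);
  }
  return Mask;
}
```

Under these assumed values, `blocks,dispatcher` and `3` produce the same mask, which is what makes the numeric form a convenience rather than a separate option.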
This was asked for by a developer, so I figured I would throw it
together quick.
There was one holdout variable that was in a TLS object in FEXCore. Move
it to the frontend with the rest of the TLS variables.
This allows the "Frontend" TLS management to be the only TLS
management.
Removes the @PREFIX_ARCH@ replacement string in the thunks path.
The library prefix paths now get generated upfront and everything gets
replaced to handle the differences between multiarch distros.
Fixes Thunks on Arch and Fedora.
When a fork occurs FEX needs to be incredibly careful as any thread
(that isn't forking) that holds a lock will vanish when the fork occurs.
At this point if the newly forked process tries to use these mutexes
then the process hangs indefinitely.
The three major mutexes that need to be held during a fork:
- Code Invalidation mutex
  - This is the highest priority and causes us to hang frequently.
  - This is highly likely to occur when one thread is loading shared
    libraries and another thread is forking.
  - Happens frequently with Wine and Steam.
- VMA tracking mutex
  - This one happens when one thread is allocating memory while a fork
    occurs.
  - This closely relates to the code invalidation mutex, just happens at
    the syscall layer instead of the FEXCore layer.
  - Happens as frequently as the code invalidation mutex.
- Allocation mutex
  - This mutex is used for FEX's 64-bit Allocator, this happens when FEX
    is allocating memory on one thread and a fork occurs.
  - Fairly infrequent because jemalloc doesn't allocate VMA regions that
    often.
While this likely doesn't hit all of the FEX mutexes, this hits the ones
that are burning fires and are happening frequently.
- FEXCore: Adds forkable mutex/locks
Necessary since we have a few locations in FEX that need to be locked
before and after a fork.
When a fork occurs the locks must be locked prior to the fork. Then
afterwards they either need to unlock or be set to default
initialization state.
- Parent
  - Does an unlock
- Child
  - Sets the lock to default initialization state
  - This is because pthreads does TID based ownership checking on
    unique locks and refcount based waiting for shared locks.
  - No way to "unlock" after fork in this case other than default
    initializing.
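The lock-before-fork, then unlock-in-parent or reset-in-child pattern can be sketched as below. The class and helper names are hypothetical, and reinitializing the mutex with `pthread_mutex_init` is one assumed way to get back to the default initialization state; the actual FEXCore types may differ.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical forkable mutex wrapping a plain pthread mutex.
class ForkableUniqueMutex {
public:
  void lock() { pthread_mutex_lock(&Mutex); }
  void unlock() { pthread_mutex_unlock(&Mutex); }
  bool try_lock() { return pthread_mutex_trylock(&Mutex) == 0; }

  // Child side only: the owning TID no longer matches after fork, so the
  // mutex is put back into its default state instead of being unlocked.
  // Only safe when no other thread can touch the mutex, which holds in a
  // freshly forked child.
  void ResetAfterFork() { pthread_mutex_init(&Mutex, nullptr); }

private:
  pthread_mutex_t Mutex = PTHREAD_MUTEX_INITIALIZER;
};

// Hypothetical helper showing the ordering around fork().
pid_t ForkWithLockHeld(ForkableUniqueMutex& M) {
  M.lock(); // Make sure no other thread holds the lock across the fork.
  const pid_t Pid = fork();
  if (Pid == 0) {
    M.ResetAfterFork(); // Child: default initialize.
  } else {
    M.unlock(); // Parent (or fork failure): regular unlock.
  }
  return Pid;
}
```

A shared (reader/writer) lock would follow the same shape, with the child reset replacing the unlock for the refcount reason given above.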
Fixes a spurious `No such file or directory` error when `ls` is trying
to query a path's xattributes that come from the emulated rootfs.
These syscalls don't support the *at variants, so it can't use the optimized `GetEmulatedFDPath` implementation.
It must also return an error on a found file path, which makes their
implementation slightly different than the other user of
`GetEmulatedPath`. In the case of error, it must only return an error
from the emulated path if it is /not/ ENOENT.
Before:
```
$ FEXInterpreter /usr/bin/ls -alth /usr/bin/wine-stable
/usr/bin/ls: /usr/bin/wine-stable: No such file or directory
-rwxr-xr-x 1 ryanh ryanh 1.1K Sep 24 2022 /usr/bin/wine-stable
```
After:
```
$ FEXInterpreter /usr/bin/ls -alth /usr/bin/wine-stable
-rwxr-xr-x 1 ryanh ryanh 1.1K Sep 24 2022 /usr/bin/wine-stable
```
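The error-handling rule for these xattr syscalls can be sketched as follows. `XattrWithEmulatedFallback` is a hypothetical helper, and the syscall itself is passed in as a callable so the sketch stays independent of which xattr call is being wrapped; it is not the actual FEX code.

```cpp
#include <cerrno>
#include <sys/types.h>

// Hypothetical helper: try the emulated rootfs path first. Any result other
// than an ENOENT failure (success, or an error on a file that was found) is
// returned as-is. Only ENOENT falls through to the host path.
template<typename Fn>
ssize_t XattrWithEmulatedFallback(const char* EmulatedPath, const char* HostPath, Fn&& Syscall) {
  if (EmulatedPath != nullptr) {
    const ssize_t Result = Syscall(EmulatedPath);
    if (Result != -1 || errno != ENOENT) {
      return Result;
    }
  }
  return Syscall(HostPath);
}
```

With this shape, a file that exists only on the host no longer surfaces the spurious ENOENT from the rootfs lookup, while real errors on rootfs files are still reported.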
istringstream is a very slow way to parse this, let's make it a bit
quicker.
Some implementation numbers:
1. Original implementation - 1833556 calculations per second
2. std::strtoul implementation - 4666818 calculations per second
- 2.54x the istringstream implementation
3. std::from_chars implementation - 5120718 calculations per second
- 1.09x the std::strtoul implementation
- 2.79x the istringstream implementation
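For reference, a `std::from_chars` based parse generally looks like the sketch below. `ParseUnsigned` is a hypothetical name, and the integer type and base are assumptions since the message does not say what is being parsed.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses the whole string as an unsigned integer.
// Returns nullopt on empty input, trailing garbage, or overflow.
std::optional<uint64_t> ParseUnsigned(std::string_view Text, int Base = 10) {
  uint64_t Value {};
  const char* End = Text.data() + Text.size();
  auto [Ptr, Ec] = std::from_chars(Text.data(), End, Value, Base);
  if (Ec != std::errc {} || Ptr != End) {
    return std::nullopt;
  }
  return Value;
}
```

Unlike istringstream, this path does no allocation and ignores locale, which is where most of the speedup typically comes from.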
This message was complaining each time VFORK was used with clone, but we
are handling VFORK here now.
This is just causing debug messages for no reason.
Remove the message and remove the flag removal option.