Now that we do everything via NZCV, this is mostly vestigial. DF/x87 flags are
sufficiently rare to be "don't care"s here, and we don't even have multiblock
enabled yet!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This is no longer necessary and no longer provides any useful
information. Since we expose the AVX CPUID flag, nearly everything
uses VEX encoding now, so it is almost always set.
Some locations could end up with SRA registers that only spilled one
register.
Allow passing in temporaries from the call site.
Fixes rpid and syscalls asserting.
When AFP is supported, we can actually support DAZ. This might also
fix the audio corruption in Animal Well, but I can't test it until Steam
is running on Oryon. Requires a bit of plumbing for MXCSR, which we were
hacking around before; now we actually want to store the value.
Fixes #3856
A disowned buffer could be unclaimed or claimed by a different thread in
the time between the !IsFree check and locking the allocation mutex.
Fix this and prevent such errors in the future by always checking
ownership with the allocator locked before attempting to unclaim
buffers.
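The check-then-lock race described above can be sketched as follows. This is a minimal illustration, not the actual allocator: `Buffer`, `Allocator`, `Add`, and `UnclaimIfOwned` are hypothetical names, and the real data structures may differ. The point is only that the ownership test and the state change happen under the same lock.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical buffer record; field names are illustrative only.
struct Buffer {
  bool IsFree = false;
  std::thread::id Owner {};
};

// Hypothetical allocator that guards all buffer state with one mutex.
class Allocator {
public:
  // Registers a buffer owned by `Owner` and returns its index.
  size_t Add(std::thread::id Owner) {
    std::lock_guard<std::mutex> lk {Mutex};
    Buffers.push_back(Buffer {false, Owner});
    return Buffers.size() - 1;
  }

  // Unclaims the buffer only if `Caller` still owns it. The ownership
  // check runs with the mutex held, so no other thread can claim or
  // unclaim the buffer between the check and the update.
  bool UnclaimIfOwned(size_t Index, std::thread::id Caller) {
    std::lock_guard<std::mutex> lk {Mutex};
    Buffer& Buf = Buffers[Index];
    if (Buf.IsFree || Buf.Owner != Caller) {
      return false;
    }
    Buf.IsFree = true;
    Buf.Owner = std::thread::id {};
    return true;
  }

private:
  std::mutex Mutex;
  std::vector<Buffer> Buffers;
};
```

Checking `IsFree` before taking the lock would reintroduce the window in which another thread can change the owner.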
With a brute-force search of methods between 1 and 3 instructions, we cover
a lot more cases more optimally.
There are definitely still more cases (and probably some that can be reduced
from 3 instructions to 2), but covering 44 cases is a pretty good margin
already.
VPSHUFD and VPERMILPS are aliases of each other.
Reuses the implementation path from the PSHUFD implementation which has
a few swizzles and then a table lookup.
VPERMILPD is a very simple swizzle per 128-bit lane.
Fixes #3797
Fixes #3784
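For reference, the immediate-controlled forms of these instructions can be modelled in scalar code roughly as below. This is a sketch of the architectural behaviour only, not the JIT implementation (which uses swizzles and a table lookup); `Vec8x32`, `Vec4x64`, `ShuffleDwordsPerLane`, and `PermuteQwordsPerLane` are made-up names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// A 256-bit vector viewed as eight 32-bit or four 64-bit elements.
using Vec8x32 = std::array<uint32_t, 8>;
using Vec4x64 = std::array<uint64_t, 4>;

// Immediate form of VPSHUFD / VPERMILPS: within each 128-bit lane, every
// destination dword selects a source dword from the same lane using a
// 2-bit field of the immediate. Both lanes use the same immediate.
inline Vec8x32 ShuffleDwordsPerLane(const Vec8x32& Src, uint8_t Imm) {
  Vec8x32 Dst {};
  for (size_t Lane = 0; Lane < 2; ++Lane) {
    for (size_t i = 0; i < 4; ++i) {
      const size_t Sel = (Imm >> (2 * i)) & 0b11;
      Dst[Lane * 4 + i] = Src[Lane * 4 + Sel];
    }
  }
  return Dst;
}

// Immediate form of VPERMILPD: destination qword i selects the low or
// high qword of its own 128-bit lane according to immediate bit i.
inline Vec4x64 PermuteQwordsPerLane(const Vec4x64& Src, uint8_t Imm) {
  Vec4x64 Dst {};
  for (size_t i = 0; i < 4; ++i) {
    const size_t Lane = i / 2;
    const size_t Sel = (Imm >> i) & 1;
    Dst[i] = Src[Lane * 2 + Sel];
  }
  return Dst;
}
```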
Nothing is optimizing around this; it's just adding pointless complexity. If we
want to actually optimize F80Cmp, the right way would be to lift the
implementation into the OpcodeDispatcher or JIT. It wouldn't be terribly
difficult. This kludge doesn't get us any closer to that.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This causes a global initializer that registers an atexit handler.
Be smarter: use a std::array and pass its data around using a span
instead.
Removes the global initializer and the atexit installation.
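A rough sketch of the pattern, under assumptions: `Entry`, `EntrySpan`, `Table`, `GetTable`, and `Find` are hypothetical names, and `EntrySpan` is a tiny stand-in for `std::span<const Entry>` so the example also builds before C++20. A `constexpr std::array` of trivially destructible elements is constant-initialized and needs no destructor call at exit, unlike a global container with a non-trivial destructor.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical table entry; the real data layout is not shown in the commit.
struct Entry {
  uint32_t Key;
  uint32_t Value;
};

// Minimal stand-in for std::span<const Entry> (assumption: the real code
// uses an actual span type).
struct EntrySpan {
  const Entry* Data;
  size_t Size;
  const Entry* begin() const { return Data; }
  const Entry* end() const { return Data + Size; }
};

// Constant-initialized and trivially destructible: no global initializer
// runs at startup and no atexit handler is registered for it.
constexpr std::array<Entry, 3> Table {{
  {1, 100},
  {2, 200},
  {3, 300},
}};

static_assert(std::is_trivially_destructible<decltype(Table)>::value,
              "Table must not need a destructor at exit");

// Hands out a non-owning view instead of the container itself.
inline EntrySpan GetTable() {
  return EntrySpan {Table.data(), Table.size()};
}

// Linear lookup over the view; returns nullptr when the key is absent.
inline const Entry* Find(EntrySpan Entries, uint32_t Key) {
  for (const Entry& E : Entries) {
    if (E.Key == Key) {
      return &E;
    }
  }
  return nullptr;
}
```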
We never use more than one logging method at a time so this was
overengineered for what it is doing.
Instead, only allow one handler each for messages and throw messages,
each of which is just a pointer.
Removes a global initializer and an atexit handler being installed.
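The single-handler design could look roughly like this. It is only a sketch: the `Log` namespace, the handler typedefs, and the function names are assumptions for illustration and may not match the real logging interface.

```cpp
#include <cstdio>

namespace Log {
// Hypothetical handler signatures.
using MsgHandler = void (*)(const char* Message);
using ThrowHandler = void (*)(const char* Message);

// Plain function pointers are constant-initialized and trivially
// destructible, so they need neither a global initializer nor atexit.
static MsgHandler Handler = nullptr;
static ThrowHandler Thrower = nullptr;

// Installing a handler replaces the previous one; only one is active.
inline void InstallHandler(MsgHandler NewHandler) { Handler = NewHandler; }
inline void InstallThrowHandler(ThrowHandler NewHandler) { Thrower = NewHandler; }

// Messages are dropped when no handler is installed.
inline void Msg(const char* Message) {
  if (Handler) {
    Handler(Message);
  }
}

inline void Throw(const char* Message) {
  if (Thrower) {
    Thrower(Message);
  }
}

// Example handler that writes to stderr.
inline void StderrHandler(const char* Message) {
  std::fprintf(stderr, "%s\n", Message);
}
} // namespace Log
```

A frontend would call `Log::InstallHandler(Log::StderrHandler)` once at startup; a later install simply overwrites the pointer.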
This is the initial split to decouple AVX256 composed operations from
their MMX/SSE counterparts. This is to work around the subtle
differences with AVX/SSE zext/insert behaviour.
This was doing a 128-bit load from memory followed by a 64-bit zero extend.
That looked like a spurious move, but it was trying to match the
behaviour of vmovq, which needs the zero extend.
Also adds a unit test to ensure that we aren't loading too much data by
loading right up against a page boundary.
Fixes #3787
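The intended vmovq load behaviour, modelled in scalar code: read exactly 8 bytes and zero the upper half of the 128-bit result. `Vec128` and `LoadQwordZeroExtend` are hypothetical names used only for this sketch; the emitted code in the JIT will look different.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// A 128-bit vector as two 64-bit halves: [0] is the low half.
using Vec128 = std::array<uint64_t, 2>;

// Reads exactly 8 bytes from Src and zero-extends to 128 bits. Nothing
// past Src + 8 is touched, so the load is safe even when the qword is
// the last 8 bytes of a mapped page.
inline Vec128 LoadQwordZeroExtend(const void* Src) {
  uint64_t Low;
  std::memcpy(&Low, Src, sizeof(Low));
  return Vec128 {Low, 0};
}
```

A page-boundary test along the lines of the one mentioned above would place the 8 source bytes at the very end of a mapped page with the following page unmapped, so that any wider load faults.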