Instead of clearing a hardcoded 16 bytes, adjust for the actual number
of instructions modified. The implementation will still only clear a
single cacheline so it doesn't change behaviour.
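
A minimal sketch of sizing the clear by instruction count, assuming
AArch64's fixed 4-byte instruction width and a GCC/Clang toolchain. The
helper names are hypothetical and are not FEX's actual functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// AArch64 instructions are a fixed 4 bytes wide.
constexpr size_t INSTRUCTION_SIZE = 4;

// Hypothetical helper: bytes to invalidate for a patch of N instructions.
constexpr size_t PatchSizeInBytes(size_t NumInstructions) {
  return NumInstructions * INSTRUCTION_SIZE;
}

// Hypothetical helper: clear only the range that was actually rewritten,
// using the GCC/Clang builtin rather than a hardcoded 16-byte range.
inline void ClearICacheForPatch(uint32_t* Start, size_t NumInstructions) {
  char* Begin = reinterpret_cast<char*>(Start);
  __builtin___clear_cache(Begin, Begin + PatchSizeInBytes(NumInstructions));
}
```

For a two-instruction patch this clears 8 bytes starting at the first
patched instruction.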
This used to exist in the FEXCore header since the unaligned handler was
done in the frontend. Once the handler got moved into FEXCore, this
stayed in the header. Move it over now.
A visibility tear can occur when one thread is backpatching while
another is executing. The executing thread can /potentially/ observe the
instruction writes in progress, depending on coherency rules or filling
of cachelines.
Backpatching the DMB instructions over the NOP instructions first
ensures correct atomic visibility even on a tear.
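
The write ordering can be sketched as follows. This assumes a
hypothetical two-word layout for an emitted atomic store, [NOP][atomic
store], and models the code buffer as std::atomic words. BackpatchStore
and the constants are illustrative names, not FEX's implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative AArch64 encodings.
constexpr uint32_t NOP     = 0xD503201F; // nop
constexpr uint32_t DMB_ISH = 0xD5033BBF; // dmb ish

// Hypothetical layout: Code[0] is the NOP slot, Code[1] the atomic store.
// The barrier is written before the access is downgraded. A thread that
// observes only part of the patch then runs DMB followed by the
// still-atomic store, which is redundant but correct. It should not run
// the plain store without the barrier in front of it.
inline void BackpatchStore(std::atomic<uint32_t>* Code, uint32_t NonAtomicStore) {
  Code[0].store(DMB_ISH, std::memory_order_release);
  Code[1].store(NonAtomicStore, std::memory_order_release);
}
```

Writing the words in the opposite order would open a window where the
plain store is visible while the slot before it still holds a NOP.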
When code buffers are shared between threads, FEX needs to be careful
around backpatching its code buffers, since one thread might have
backpatched the code that another thread was also planning on
backpatching.
To handle this case, when the handler fails to find a backpatchable
instruction, check if it was already backpatched. This can be determined
by atomically reading the instructions back and seeing if they have
turned into the non-atomic variants.
In most cases we can just return saying that it has been handled. In the
case of a store we need to back the PC up 4 bytes to ensure the DMB is
executed before the non-atomic store.
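
A minimal sketch of that fallback check, assuming the only non-atomic
variants are 32-bit LDR/STR with an unsigned immediate offset. The real
handler would have to recognise every variant it emits. All names and
masks here are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum class PatchedKind { NotPatched, Load, Store };

// Illustrative masks for 32-bit LDR/STR (unsigned immediate offset).
constexpr uint32_t LDST_UIMM_MASK = 0xFFC00000;
constexpr uint32_t STR_W_UIMM     = 0xB9000000;
constexpr uint32_t LDR_W_UIMM     = 0xB9400000;

inline PatchedKind Classify(uint32_t Instr) {
  if ((Instr & LDST_UIMM_MASK) == STR_W_UIMM) return PatchedKind::Store;
  if ((Instr & LDST_UIMM_MASK) == LDR_W_UIMM) return PatchedKind::Load;
  return PatchedKind::NotPatched;
}

// Returns true if another thread already backpatched the instruction at
// *PC. For a store, PC is moved back one instruction (4 bytes) so the
// DMB in front of it runs before the non-atomic store.
inline bool HandleAlreadyBackpatched(uintptr_t* PC) {
  const auto* Instr = reinterpret_cast<const std::atomic<uint32_t>*>(*PC);
  switch (Classify(Instr->load(std::memory_order_acquire))) {
  case PatchedKind::Load:
    return true;
  case PatchedKind::Store:
    *PC -= 4;
    return true;
  case PatchedKind::NotPatched:
    break;
  }
  return false;
}
```

A load needs no PC adjustment in this sketch; only the store case rewinds
to pick up the DMB that precedes it.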
These handlers don't do any code backpatching so locking the spinlock
futex isn't necessary. Move them before the lock to make them a bit more
efficient once code buffers get shared.
Somewhere there was an assumption made that INC and DEC supported the
repeat prefix. This isn't actually the case. While the prefix can be
encoded, it is a nop and should only be expected to be used for padding.
Add a unit test to ensure that the behaviour is as expected.
We now have two types of destinations:
* regular destinations. These are SSA. You get exactly 1 per instruction. This
is what almost every instruction should use.
* special destinations, introduced here. These are *not* SSA. They must be
allocated with a special instruction (added later in this PR), and then they
are mutated by the instruction. There are two types, either pure destinations
("out") or read-modify-write source+destinations ("in-out"). The former are
useful for instructions that return multiple destinations, like Memcpy. The
latter are useful for instructions that need a source tied with a special
destination (currently just Pop, introduced later in this series).
Special destinations reuse the mechanism of sources, to get around the
limitations on regular destinations in our current IR. Ops with special
destinations desugar to ops with no destination but extra sources prefixed Out
or Inout.
They further require HasSideEffects so we don't optimize ourselves into corners.
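
As a rough illustration of the desugared form, the sketch below
classifies an op's arguments by name prefix. The enum, function, and the
argument names used with it are hypothetical and do not reflect the
actual IR definition format.

```cpp
#include <cassert>
#include <string_view>

enum class ArgKind { Source, Out, InOut };

// Hypothetical classifier: after desugaring, a special destination is
// just an extra source whose name starts with "Out" or "Inout".
inline ArgKind ClassifyArg(std::string_view Name) {
  if (Name.substr(0, 5) == "Inout") return ArgKind::InOut;
  if (Name.substr(0, 3) == "Out") return ArgKind::Out;
  return ArgKind::Source;
}
```

Under this scheme an argument named OutDstAddr would be a pure
destination, InoutAddr a read-modify-write source+destination, and Addr
an ordinary source.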
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
I don't know what I was thinking when I wrote that code. Drop the silly logic
and let ConstProp inline the immediates. This fixes a lot of silly code
generated for 32-bit.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>