This is a quality of life improvement for people who want to tinker
with InstCountCI but don't necessarily have an Arm64 device
immediately available for poking.
As long as the vixl disassembler is enabled, the InstCountCI tests
can run and get bit-accurate encodings just like on an Arm64 device.
This also ensures that behaviour is consistent with or without the vixl
simulator enabled, which is very important when running on x86 hosts.
This was only required on x86 devices trying to escape the emulation.
Since x86 is now removed, this is entirely unnecessary.
When Steam launches applications with `/bin/sh`, the shell now remains
under the emulation and no longer escapes.
With the removal of the x86 JIT, there is no need to have these be
independent classes.
Merges the Arm64Dispatcher into the base Dispatcher class.
No functional change, just moving code.
Similar to previous tests, vpgatherqq and vgatherqpd are equivalent
instructions. So the tests are the same with the mnemonic changed.
This adds tests for an additional two sets of instructions, getting us
full coverage of all eight instructions if we include the tests from
PR #3167 and #3166.
Tests the same things as described in #3165
In addition, since these tests use 64-bit indices for address
calculation, we can easily generate an index vector that tests
overflow. So every test at every displacement ALSO gains an additional
overflow test to ensure correct behaviour when the pointer calculation
overflows.
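As a rough illustration of what an overflowing 64-bit index looks like, here is a minimal sketch. `GatherAddress` and `OverflowingIndex` are hypothetical names made up for this example and are not taken from the tests themselves; the sketch only assumes that the element address is `base + index * scale` truncated to 64 bits.

```cpp
#include <cstdint>

// Hypothetical helper (not from the test suite): effective address of one
// gather element. Unsigned arithmetic wraps modulo 2^64, which models the
// truncation to a 64-bit address size.
constexpr uint64_t GatherAddress(uint64_t Base, uint64_t Index, uint64_t Scale) {
  return Base + Index * Scale;
}

// An index whose scaled value does not fit in 64 bits:
// (2^61 + 1) * 8 = 2^64 + 8, which wraps to 8.
constexpr uint64_t OverflowingIndex = (uint64_t{1} << 61) + 1;

static_assert(GatherAddress(0x1000, OverflowingIndex, 8) == 0x1008,
              "scaled index wraps past 2^64");

// An index of -16 viewed as unsigned: the addition itself wraps.
static_assert(GatherAddress(0x1000, 0xFFFFFFFFFFFFFFF0ULL, 1) == 0x0FF0,
              "base + index wraps past 2^64");
```

Both cases still resolve to an address close to the base, so a test can point them at valid memory and check that the expected element was loaded.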
Similar to previous tests, vpgatherqd and vgatherqps are equivalent
instructions. So the tests are the same with the mnemonic changed.
This adds tests for an additional two sets of instructions, getting us
up to six of the eight if we include the tests from #3166.
Tests the same things as described in #3165
In addition, since these tests use 64-bit indices for address
calculation, we can easily generate an index vector that tests
overflow. So every test at every displacement ALSO gains an additional
overflow test to ensure correct behaviour when the pointer calculation
overflows.
Just like the previous tests, vpgatherdq and vgatherdpd are equivalent
instructions. So the tests are the same except for the instruction
mnemonic again.
This adds unit tests for two more of the eight gather instructions,
getting us up to testing four in total.
Specifically this adds tests for the instructions that use 32-bit
indices to load 64-bit elements.
Same thing as PR #3165 for what it tests versus doesn't.
vpgatherdd and vgatherdps are effectively the same instructions, so the
tests are the same except for the instruction mnemonic.
This adds unit tests for two of the eight gather instructions.
Specifically this adds tests for the instructions that use 32-bit
indices to load 32-bit elements.
What it tests:
- Tests all displacement scales
- Tests multiple mask arrangements
- Ensures the mask register is zeroed after the instruction
What it doesn't test:
- Doesn't test address size calculation overflow
  - Would only happen on 32-bit with 32-bit indices, or /really/ high
    base addresses
  - The instruction should behave as a mask to the address size
  - Effectively behaves like `(uint64_t)(base + (index << ilog2(scale)))`
  - Better idea is to just not expose AVX to 32-bit applications
- Doesn't test VSIB immediate displacement
  - This just ends up being base_addr + imm so it isn't too interesting
  - We can add more tests in the future if we think we messed that up
- Doesn't test partial fault behaviour
  - Because that's a nightmare.
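
For reference, the behaviour covered by the lists above can be sketched as a small scalar model of the 128-bit `vpgatherdd` form. This is only an illustration under stated assumptions: `GatherResult` and `ModelGatherDD` are hypothetical names invented here, not code from FEX or from the tests, and the model ignores the immediate displacement and faults, just as the tests do.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type: destination and mask registers after the gather.
struct GatherResult {
  std::array<uint32_t, 4> Dest;
  std::array<uint32_t, 4> Mask;
};

// Hypothetical scalar model of a 128-bit vpgatherdd
// (four 32-bit indices loading four 32-bit elements).
inline GatherResult ModelGatherDD(const uint8_t* Base, std::array<int32_t, 4> Index,
                                  uint32_t Scale, std::array<uint32_t, 4> Dest,
                                  std::array<uint32_t, 4> Mask) {
  for (size_t i = 0; i < 4; ++i) {
    // Only the most significant bit of each mask element selects a load.
    if (Mask[i] & 0x80000000u) {
      // 32-bit indices are sign-extended before scaling.
      const uint8_t* Addr = Base + static_cast<int64_t>(Index[i]) * Scale;
      std::memcpy(&Dest[i], Addr, sizeof(uint32_t));
    }
    // The mask element is cleared whether or not the element was loaded.
    Mask[i] = 0;
  }
  return {Dest, Mask};
}
```

Elements whose mask bit is clear keep their previous destination value, which is why the tests vary the mask arrangement and then check that the mask register ends up zeroed.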
Specifically keeps each instruction test small and isolated so if a
single register fails it is very easy to nail down which operation did
it.
I know some of our ASM tests do a chunk of work and spit out a result at
the end which can be difficult to debug in some cases. Didn't want to do
that, which is why the tests are spread out across 16 files for this
single class of instructions.
If we ever get around to fusing ops with shifts in the ConstProp optimizer (may
or may not be worthwhile), this will delete an instruction from things like "or
al, bh".
Even though lsr is the same speed as bfe on Firestorm, I feel if you ask for
garbage you should get garbage C:
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Pointless, upper bits ignored anyway. Deletes piles of uxt and even some 32-bit
instruction moves.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
To load 8-bit sources without bfe'ing for al/bl/cl if the caller knows it
doesn't need masking behaviour, but without lying about the size so the extract
for ah/bh/ch will still work properly.
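
A minimal sketch of why the masking can be skipped when only the low 8 bits of the result are kept, using "or al, bh" as the example. All names here (`Bfe`, `InsertLow8`, `OrAlBhMasked`, `OrAlBhUnmasked`) are hypothetical helpers for illustration and do not correspond to FEX's actual IR or emitter API.

```cpp
#include <cstdint>

// Bitfield extract in the style of bfe: Width bits starting at bit Lsb.
// Assumes Width < 64.
constexpr uint64_t Bfe(uint64_t Value, unsigned Lsb, unsigned Width) {
  return (Value >> Lsb) & ((uint64_t{1} << Width) - 1);
}

// Writes the low 8 bits of Result into bits [7:0] of Dest, like a write to al.
constexpr uint64_t InsertLow8(uint64_t Dest, uint64_t Result) {
  return (Dest & ~uint64_t{0xFF}) | (Result & 0xFF);
}

// "or al, bh" with both sources fully masked to 8 bits.
constexpr uint64_t OrAlBhMasked(uint64_t Rax, uint64_t Rbx) {
  return InsertLow8(Rax, Bfe(Rax, 0, 8) | Bfe(Rbx, 8, 8));
}

// Same operation with al left unmasked and bh fetched with a plain shift
// (lsr-style). The upper bits are garbage, but the 8-bit insert drops them.
constexpr uint64_t OrAlBhUnmasked(uint64_t Rax, uint64_t Rbx) {
  return InsertLow8(Rax, Rax | (Rbx >> 8));
}

static_assert(OrAlBhMasked(0x1122334455667788ULL, 0xAABBCCDDEEFF1234ULL) ==
                OrAlBhUnmasked(0x1122334455667788ULL, 0xAABBCCDDEEFF1234ULL),
              "upper source bits do not affect the low 8 bits of the result");
```

The high-byte source still needs the shift by 8, which is why the source size cannot simply be misreported as wider.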
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>