244 Commits

Author SHA1 Message Date
Ryan Houdek
75f7bc0abd Fixes FindMSB op for the interpreter for 8bit/16bit types
Was zexting to 32bit and then doing a findMSB which was wrong
2020-03-06 07:56:17 +02:00
Ryan Houdek
ec5f3a6854 Fixes LEA instruction
LEA changes behaviour based on the ordering of the prefixes on the
instruction

eg:
66 48 8d 3d ffffffff: lea rdi, [rip - 1]
48 66 8d 3d ffffffff: lea di, [rip - 1]

So we need to know the order of a few of the prefixes.
In the future this should probably be switched to a three deep stack to
have the ordering but for now this'll do
2020-03-06 07:56:17 +02:00
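
The prefix-ordering rule described in the LEA fix above can be sketched as follows. This is a minimal illustration, not the project's decoder: `IsLegacyPrefix` and `OperandSizeFromPrefixes` are hypothetical names, and the sketch assumes 64-bit mode, where a REX prefix only takes effect if it immediately precedes the opcode, so any legacy prefix that follows it cancels it.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true for the x86 legacy prefix bytes
// (lock/rep, segment overrides, operand-size and address-size overrides).
bool IsLegacyPrefix(uint8_t Byte) {
  switch (Byte) {
  case 0xF0: case 0xF2: case 0xF3:
  case 0x26: case 0x2E: case 0x36: case 0x3E: case 0x64: case 0x65:
  case 0x66: case 0x67:
    return true;
  default:
    return false;
  }
}

// Hypothetical helper: returns the operand size in bytes (2, 4, or 8)
// implied by the prefixes in front of the opcode, assuming 64-bit mode
// and a default operand size of 32 bits.
int OperandSizeFromPrefixes(const uint8_t *Bytes, size_t Length) {
  bool Has66 = false;
  bool RexW = false;
  for (size_t i = 0; i < Length; ++i) {
    const uint8_t Byte = Bytes[i];
    if (IsLegacyPrefix(Byte)) {
      if (Byte == 0x66) {
        Has66 = true;
      }
      // A REX prefix seen earlier is ignored once a legacy prefix follows it.
      RexW = false;
    } else if ((Byte & 0xF0) == 0x40) {
      // REX prefix (0x40-0x4F); bit 3 is REX.W.
      RexW = (Byte & 0x08) != 0;
    } else {
      break; // First non-prefix byte is the opcode.
    }
  }
  if (RexW) {
    return 8; // An effective REX.W overrides 66h.
  }
  return Has66 ? 2 : 4;
}
```

Fed the two byte sequences from the commit message, this returns 8 for `66 48 8d 3d ...` (REX.W is adjacent to the opcode) and 2 for `48 66 8d 3d ...` (the REX is cancelled by the 66h that follows it).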
Ryan Houdek
64e863a5a0 Adds a few more instructions to the OpcodeDispatcher 2020-03-06 07:56:17 +02:00
Ryan Houdek
e854112927 Implements Unified memory in the x86 JIT 2020-03-06 07:56:17 +02:00
Ryan Houdek
c336a09641 Adds new IR ops to interpreter and x86 JIT 2020-03-06 07:56:17 +02:00
Ryan Houdek
3bad1eafbc Adds new IR ops to the JSON 2020-03-06 07:56:17 +02:00
Ryan Houdek
7359e44952 Fixes FCVTZ{U,S} Destination size 2020-03-06 07:56:17 +02:00
Ryan Houdek
fe170846f4 Fixes LEA X86Table flag
Default operating size was wrong
2020-03-06 07:56:17 +02:00
Ryan Houdek
ae2689d9e2 Fleshes out a couple x87 instruction table ops 2020-03-06 07:56:17 +02:00
Ryan Houdek
0dcd4bc0b3 Removes some CPUID feature flags
We don't support AVX512, XOP, or FMA4
2020-03-06 07:56:17 +02:00
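
One way to hide unsupported features is to clear their bits from a CPUID result before it is returned to the guest. The sketch below assumes that design; `CPUIDResult` and `MaskUnsupported` are hypothetical names, and only AVX512F is cleared as a stand-in for the larger set of AVX-512 bits. The bit positions used are the architectural ones: AVX512F is leaf 7 (subleaf 0) EBX bit 16, and XOP and FMA4 are leaf 0x80000001 ECX bits 11 and 16.

```cpp
#include <cstdint>

// Hypothetical container for the four registers CPUID returns.
struct CPUIDResult {
  uint32_t eax, ebx, ecx, edx;
};

constexpr uint32_t Leaf7EbxAVX512F = 1u << 16; // CPUID.(EAX=7,ECX=0):EBX[16]
constexpr uint32_t Ext1EcxXOP = 1u << 11;      // CPUID.80000001h:ECX[11]
constexpr uint32_t Ext1EcxFMA4 = 1u << 16;     // CPUID.80000001h:ECX[16]

// Hypothetical filter: clears the feature bits the emulator does not
// implement and leaves every other bit of the result untouched.
CPUIDResult MaskUnsupported(uint32_t Leaf, uint32_t Subleaf, CPUIDResult Res) {
  if (Leaf == 0x7 && Subleaf == 0) {
    Res.ebx &= ~Leaf7EbxAVX512F;
  } else if (Leaf == 0x80000001u) {
    Res.ecx &= ~(Ext1EcxXOP | Ext1EcxFMA4);
  }
  return Res;
}
```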
Ryan Houdek
6f5b9358e2 Syscall additions and fixes
Adds unified memory support for most things
Fixes a couple of bugs
2020-03-06 07:56:17 +02:00
Ryan Houdek
8d293e9184 Adds support for unified memory to the interpreter core 2020-03-06 07:56:17 +02:00
Ryan Houdek
a79197bd39 Supports unified memory in the Core 2020-03-06 07:56:17 +02:00
Ryan Houdek
d5402bce2b Lets the block cache understand the unified address space option
There is a limitation that all executable code must live above the
memory base at this moment
2020-03-06 07:56:16 +02:00
Ryan Houdek
692a93c604 Adds an optional SetMemoryBase CodeLoader function
This is necessary for the code loader to know where the memory base is
for unified memory
2020-03-06 07:56:16 +02:00
Ryan Houdek
0eb5e2b332 Adds unified memory option without supporting it yet
This allows you to set the option from the frontend but it doesn't yet
do anything.
2020-03-06 07:56:16 +02:00
Ryan Houdek
c7c5d8bdd7 Cleanup the CPUID interface
I saw LLVM using CPUID function 0x14 so I decided to fill out the map
with the basic leaves
2020-03-06 07:56:16 +02:00
Ryan Houdek
d2c3468cb5 Fixes accurate-std for writev 2020-03-06 07:56:16 +02:00
Ryan Houdek
ccd7a18445 Fixes a few SSE instruction flags 2020-03-06 07:56:16 +02:00
Ryan Houdek
669aa99d3e Fixes stale data bug in the interpreter 2020-03-06 07:56:16 +02:00
Ryan Houdek
d8b3e3bd84 Adds Popcount x86 instruction to dispatcher 2020-03-06 07:56:16 +02:00
Ryan Houdek
ed12a8a242 Fixes x86 instruction decoding.
In the case of modrm + immediate, the immediate would end up
overwriting Src1 due to the order of the decoding.
Changes Src1 and Src2 to an array and uses a variable to index the array.
Causes a bit of code churn but fixes instruction decoding and allows
easier expansion in the future for instructions that have more sources
like AVX
2020-03-06 07:56:15 +02:00
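
The array-plus-index layout described in the decoding fix above might look roughly like this. It is a sketch under stated assumptions, not the project's actual structures: `Operand`, `DecodedInst`, `MaxSources`, and `AddSource` are all hypothetical names. The point is that each decoded source takes the next free slot, so a trailing immediate cannot overwrite an operand that modrm decoding already filled in.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical operand description.
struct Operand {
  enum class Kind : uint8_t { None, Register, Memory, Immediate };
  Kind Type = Kind::None;
  uint64_t Value = 0;
};

// Hypothetical decoded-instruction record with indexed sources.
struct DecodedInst {
  static constexpr size_t MaxSources = 3; // Assumed limit; AVX-style ops need more than two.
  std::array<Operand, MaxSources> Src{};
  size_t NumSources = 0; // Index of the next free source slot.

  // Stores the operand in the next free slot.
  // Returns false, leaving the record unchanged, if every slot is in use.
  bool AddSource(Operand::Kind Type, uint64_t Value) {
    if (NumSources >= MaxSources) {
      return false;
    }
    Src[NumSources].Type = Type;
    Src[NumSources].Value = Value;
    ++NumSources;
    return true;
  }
};
```

With this shape, decoding a modrm memory operand and then an immediate fills `Src[0]` and `Src[1]` in that order, whatever order the decoder visits the two fields in.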
Ryan Houdek
9135661861 Fixes edge case when instruction declares 66h and REX.W
REX.W takes precedence over 66h
2020-03-06 07:56:15 +02:00
Ryan Houdek
2b237d9801 Add EmulatedFiles file to cmake 2020-03-06 07:56:15 +02:00
Ryan Houdek
7e2e6fc27f Adds basic emulated files support
Allows us to override some files explicitly
2020-03-06 07:56:15 +02:00
Ryan Houdek
bdda36496b Implements FindTrailingZeros IROp in the interpreter 2020-03-06 07:56:15 +02:00
Ryan Houdek
8a2bf00fb6 Enforces register classes on loadstores
Register class for loadstores must be declared upfront.
This was causing a pain point when we were trying to load <16 bytes into
FPRs, which was requiring a GPR<->FPR dance.
2020-03-06 07:56:15 +02:00
Ryan Houdek
472ec5f5cf Adds 11 new emulated syscalls 2020-03-06 07:56:15 +02:00
Ryan Houdek
d2bfb39b54 Disable RDRand in CPUID 2020-03-06 07:56:15 +02:00
Ryan Houdek
7ade645af2 Adds vector compare ops to OpDispatcher 2020-03-06 07:56:14 +02:00
Ryan Houdek
39802cc464 Adds a couple of nop implementations of ops in OpDispatcher 2020-03-06 07:56:14 +02:00
Ryan Houdek
67ac4f015f Adds ANDN to OpDispatcher 2020-03-06 07:56:14 +02:00
Ryan Houdek
1f4418b463 Adds missing move to OpDispatcher 2020-03-06 07:56:14 +02:00
Ryan Houdek
134fa50f8b Adds fsqrt and frsqrt to OpDispatcher 2020-03-06 07:56:14 +02:00
Ryan Houdek
e260378736 Fix a couple x86 instruction definitions 2020-03-06 07:56:14 +02:00
Ryan Houdek
dacd20f44b Fixes interpreter temp allocation size 2020-03-06 07:56:14 +02:00
Ryan Houdek
9bc899aba0 Changes CMPS to not break RA
RA isn't happy with how this is arranged, so it needs to be changed
until RA is good enough to support it
2020-03-06 07:56:14 +02:00
Ryan Houdek
c39f23802c Fixes ULE Select on x86 JIT 2020-03-06 07:56:14 +02:00
Ryan Houdek
9b19d23977 Fixes FindMSB in Interpreter and Arm64 JIT 2020-03-06 07:56:14 +02:00
Ryan Houdek
2889c860d4 Adds new vector IR ops 2020-03-06 07:56:14 +02:00
Ryan Houdek
4523ec52de Fixes struct stat having different definitions on x86 and Arm 2020-03-06 07:56:13 +02:00
Ryan Houdek
299947e3bf Change syscall time debug to a define 2020-03-06 07:56:13 +02:00
Ryan Houdek
eb1f643e16 Use header provided syscall defines 2020-03-06 07:56:13 +02:00
Ryan Houdek
a8d7549b13 Fixes CPUID call in Arm64 JIT 2020-03-06 07:56:13 +02:00
Ryan Houdek
69d6c09436 Fixes AArch64 JIT compiling 2020-03-06 07:56:13 +02:00
Ryan Houdek
a272970559 Rewrites the Dead Context Loadstore elimination passes to be more generic
Accesses are now classified more generically, which allows them to be
optimized away more easily.
XMM still needs a special case but maybe that can be changed in the
future.
2020-03-06 07:56:13 +02:00
Ryan Houdek
07faaaa0fd Make sure to not do pops at function end if from within custom dispatch for x86 2020-03-06 07:56:13 +02:00
Ryan Houdek
18c5a3d6e5 Fixes a couple of bugs in the RA 2020-03-06 07:56:13 +02:00
Ryan Houdek
03a32ef2aa Disables the redundant context load elimination pass again, since it is broken once again. 2020-03-06 07:56:13 +02:00
Ryan Houdek
92e4be828f Implements new RA pass that supports PHI nodes
Has a heuristic that switches between a map lookup and a linear scan
depending on the number of SSA values. Map lookup is faster for larger
blocks while for smaller blocks linear scan is faster.

Block scan is 1.5x - 2x faster for a 30k SSA value block I was looking
at.
2020-03-06 07:56:13 +02:00
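
The size-based heuristic in the RA commit above can be illustrated with a small lookup wrapper that picks its strategy once, from the number of SSA values in the block. This is only a sketch: `LiveValueLookup` and its members are hypothetical names, and the threshold of 64 is an assumed tuning value, not one taken from the commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical SSA-id -> register lookup that chooses between a hash map
// and a linear scan based on how many SSA values the block contains.
class LiveValueLookup {
public:
  static constexpr size_t MapThreshold = 64; // Assumed cut-over point.

  explicit LiveValueLookup(size_t SSACount)
    : UseMap(SSACount >= MapThreshold) {}

  // Records a register for an SSA id.
  // Assumes each id is inserted at most once, as SSA values are defined once.
  void Insert(uint32_t SSAId, uint32_t Reg) {
    if (UseMap) {
      Map[SSAId] = Reg;
    } else {
      List.emplace_back(SSAId, Reg);
    }
  }

  // Writes the register to *Reg and returns true if the id is known.
  bool Find(uint32_t SSAId, uint32_t *Reg) const {
    if (UseMap) {
      const auto It = Map.find(SSAId);
      if (It == Map.end()) {
        return false;
      }
      *Reg = It->second;
      return true;
    }
    for (const auto &Entry : List) {
      if (Entry.first == SSAId) {
        *Reg = Entry.second;
        return true;
      }
    }
    return false;
  }

  bool UsesMap() const { return UseMap; }

private:
  bool UseMap;
  std::unordered_map<uint32_t, uint32_t> Map;
  std::vector<std::pair<uint32_t, uint32_t>> List;
};
```

For a handful of values the linear scan avoids hashing and stays in one contiguous allocation; for very large blocks the map keeps each lookup roughly constant-time. Where the cut-over belongs would need to be measured on real blocks.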