I do not want predefined compiler macros like __clang__ or __linux__
scattered everywhere to detect the platform Ballistic is running on, so
I added preprocessor directives to make things a lot cleaner.
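
As a rough illustration of the idea, a central header could wrap the
predefined macros once and let the rest of the code test the wrappers.
The BAL_COMPILER_* and BAL_PLATFORM_* names and bal_compiler_name() are
hypothetical and may not match what Ballistic actually defines; only
__clang__, __GNUC__, __linux__, __APPLE__ and _WIN32 are real
predefined macros.

```c
#include <assert.h> /* static_assert (C11) */

/* Hypothetical wrapper names: every BAL_* macro here is an assumption. */
#if defined(__clang__)
#define BAL_COMPILER_CLANG 1
#else
#define BAL_COMPILER_CLANG 0
#endif

/* Clang also defines __GNUC__, so it is excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#define BAL_COMPILER_GCC 1
#else
#define BAL_COMPILER_GCC 0
#endif

#if defined(__linux__)
#define BAL_PLATFORM_LINUX 1
#else
#define BAL_PLATFORM_LINUX 0
#endif

#if defined(__APPLE__)
#define BAL_PLATFORM_APPLE 1
#else
#define BAL_PLATFORM_APPLE 0
#endif

#if defined(_WIN32)
#define BAL_PLATFORM_WINDOWS 1
#else
#define BAL_PLATFORM_WINDOWS 0
#endif

static_assert(BAL_COMPILER_CLANG + BAL_COMPILER_GCC <= 1,
              "at most one compiler may be detected");

/* Callers test the wrappers with #if instead of the raw macros. */
static const char *bal_compiler_name(void)
{
#if BAL_COMPILER_CLANG
    return "clang";
#elif BAL_COMPILER_GCC
    return "gcc";
#else
    return "unknown";
#endif
}
```

Defining every wrapper to 0 or 1 (rather than leaving it undefined)
means a misspelled name in an #if can be caught with -Wundef.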
Signed-off-by: Ronald Caesar <github43132@proton.me>
Also remove bal_translate_block() body. It will need to be redesigned
to be used by bal_engine_run().
Signed-off-by: Ronald Caesar <github43132@proton.me>
Everything has been set up except for the main translation loop, which
will have to be done another day. It's after midnight for me right now :(
Signed-off-by: Ronald Caesar <github43132@proton.me>
Instead of using strcmp() on each decoded instruction's mnemonic to
translate it, we embed an IR opcode into the struct. This is a very
barebones implementation and does not cover the entire ARM instruction
set. ARM instructions that do not have an IR opcode equivalent will be
marked with `OPCODE_TRAP` and should be implemented in the future.
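
A minimal sketch of the approach, assuming a mask/expected style
decoder table. The decoder_entry_t layout, the table contents,
OPCODE_ADD, OPCODE_SUB and ir_opcode_for() are hypothetical; only
`OPCODE_TRAP` is taken from this commit. The three bit patterns are the
A64 encodings of ADD (immediate), SUB (immediate) and SVC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IR opcodes; only OPCODE_TRAP is named in the commit. */
typedef enum {
    OPCODE_TRAP = 0,
    OPCODE_ADD,
    OPCODE_SUB
} ir_opcode_t;

/* Zero-initialized entries then default to a trap. */
static_assert(OPCODE_TRAP == 0, "OPCODE_TRAP must be the zero value");

/* Hypothetical decoder entry: the IR opcode sits next to the mnemonic. */
typedef struct {
    const char *mnemonic;  /* kept for disassembly and debugging only */
    uint32_t    mask;      /* which bits of the word identify the form */
    uint32_t    expected;  /* value of those bits for this form */
    ir_opcode_t ir_opcode; /* translation target, no strcmp() needed */
} decoder_entry_t;

static const decoder_entry_t k_decoder_table[] = {
    { "ADD", 0x7F800000u, 0x11000000u, OPCODE_ADD  }, /* ADD (immediate) */
    { "SUB", 0x7F800000u, 0x51000000u, OPCODE_SUB  }, /* SUB (immediate) */
    { "SVC", 0xFFE0001Fu, 0xD4000001u, OPCODE_TRAP }, /* no IR equivalent */
};

/* Returns the IR opcode for a raw A64 word; unknown words trap. */
static ir_opcode_t ir_opcode_for(uint32_t word)
{
    size_t count = sizeof k_decoder_table / sizeof k_decoder_table[0];
    for (size_t i = 0; i < count; ++i) {
        const decoder_entry_t *entry = &k_decoder_table[i];
        if ((word & entry->mask) == entry->expected) {
            return entry->ir_opcode;
        }
    }
    return OPCODE_TRAP;
}
```

The translator can then switch on the returned enum instead of
comparing mnemonic strings for every decoded instruction.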
Signed-off-by: Ronald Caesar <github43132@proton.me>
At first I wondered how we will know the bit width of the SSA variable
we're creating. Should we hardcode the bit width in the opcode and
create a large switch statement or hash table? To keep things simple
I just added a new bit width parameter to emit_instruction(), and it
will be the frontend's responsibility to find the correct bit width.
This should be better for x86 lowering.
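
Roughly, the call shape could look like the sketch below. The struct
layouts, MAX_INSTRUCTIONS, the accepted widths and the exact signature
of emit_instruction() are assumptions made for illustration, not the
real Ballistic definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTRUCTIONS 64 /* arbitrary capacity for this sketch */

/* Hypothetical layouts: the field names are assumptions. */
typedef struct {
    uint16_t opcode;
    uint32_t src1;
    uint32_t src2;
} bal_instruction_t;

typedef struct {
    uint8_t bit_width; /* width of the value this SSA version holds */
} ssa_version_t;

typedef struct {
    bal_instruction_t instructions[MAX_INSTRUCTIONS];
    ssa_version_t     ssa_versions[MAX_INSTRUCTIONS];
    size_t            instruction_count;
} bal_engine_t;

/* Appends one instruction and returns the index of the SSA value it
 * defines. The frontend supplies bit_width, so the opcode itself does
 * not need a separate variant per width. */
static uint32_t emit_instruction(bal_engine_t *engine, uint16_t opcode,
                                 uint32_t src1, uint32_t src2,
                                 uint8_t bit_width)
{
    assert(engine != NULL);
    assert(engine->instruction_count < MAX_INSTRUCTIONS);
    assert(bit_width == 8 || bit_width == 16 || bit_width == 32 ||
           bit_width == 64 || bit_width == 128);

    size_t index = engine->instruction_count++;
    engine->instructions[index].opcode = opcode;
    engine->instructions[index].src1 = src1;
    engine->instructions[index].src2 = src2;
    engine->ssa_versions[index].bit_width = bit_width;
    return (uint32_t)index;
}
```

A frontend decoding a 32-bit register form would pass 32 and one
decoding a 64-bit form would pass 64, while both share one opcode.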
Signed-off-by: Ronald Caesar <github43132@proton.me>
A simple program that prints to stdout the top 20 most common
instructions in an ARM64 binary file.
Signed-off-by: Ronald Caesar <github43132@proton.me>
This function's only responsibility is writing opcodes and operands to a
bal_instruction_t and adding it to the instruction stream.
Signed-off-by: Ronald Caesar <github43132@proton.me>
Also added the size of each array to bal_engine_t to make finding the
end of the array in memory simple and easy.
Signed-off-by: Ronald Caesar <github43132@proton.me>
I want users of Ballistic to design their own memory allocators. Their memory
allocation requirements will most likely be different from ours, so providing
users the API to write their own allocators gives them a lot of freedom
to do whatever they feel is best. This idea was inspired in part by the Zig
programming language.
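
To give a feel for what this allows, here is a small sketch of an
allocator interface in that style together with a user-written bump
allocator. bal_allocator_t's fields, bump_arena_t, bump_allocate() and
bump_release() are hypothetical names for this example and are not
necessarily the API Ballistic exposes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface: a context pointer plus two callbacks. */
typedef struct {
    void *context;
    void *(*allocate)(void *context, size_t size, size_t alignment);
    void  (*release)(void *context, void *pointer, size_t size);
} bal_allocator_t;

/* Example user allocator: hands out slices of one fixed buffer. */
typedef struct {
    unsigned char *buffer;
    size_t         capacity;
    size_t         offset;
} bump_arena_t;

/* Alignment is applied to the offset, so the buffer itself is assumed
 * to be suitably aligned. Returns NULL when the arena is exhausted. */
static void *bump_allocate(void *context, size_t size, size_t alignment)
{
    bump_arena_t *arena = context;
    assert(arena != NULL);
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    size_t aligned = (arena->offset + alignment - 1) & ~(alignment - 1);
    if (aligned > arena->capacity || size > arena->capacity - aligned) {
        return NULL;
    }
    arena->offset = aligned + size;
    return arena->buffer + aligned;
}

/* A bump arena frees everything at once, so release is a no-op. */
static void bump_release(void *context, void *pointer, size_t size)
{
    (void)context;
    (void)pointer;
    (void)size;
}
```

A user would fill a bal_allocator_t with { &arena, bump_allocate,
bump_release } and hand it to the library, which then only ever calls
through the two function pointers.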
Signed-off-by: Ronald Caesar <github43132@proton.me>
Having 18-bit opcodes means a block has at most 131,072 instructions
which is simply too many. That many instructions cannot fit into the
L1 cache, which results in cache thrashing.
Signed-off-by: Ronald Caesar <github43132@proton.me>
We have enough bits in the opcode bitfield in instruction_t
to encode register classes (ADD_INT, ADD_FLOAT, ADD_VECTOR). However,
encoding the bit width (ADD_INT8, ADD_INT32) will massively increase the
number of opcodes needed. So we replace `type` in ssa_version_t with
`bit_width`.
Signed-off-by: Ronald Caesar <github43132@proton.me>
Rule 4.2 states: "If a basic block is deemed cold, it should move to a
separate buffer." This violates Rule 3.1 Implicit Indexing. If v100 is
located at instructions[100] and we move it to a cold buffer, it is no
longer at index 100. If we keep the index 100 but store the data
elsewhere, we lose the performance benefits of the linear memory array.
Hot-cold splitting will be done during code generation.
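
The conflict is easier to see with a small sketch of implicit
indexing, where the SSA id doubles as the array index. The
instruction_t layout, instruction_stream_t and definition_of() are
hypothetical and only illustrate the rule as described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: operands are SSA ids, which are array indices. */
typedef struct {
    uint16_t opcode;
    uint32_t src1;
    uint32_t src2;
} instruction_t;

typedef struct {
    const instruction_t *instructions;
    size_t               count;
} instruction_stream_t;

/* With implicit indexing, vN is defined by instructions[N]. There is
 * no id field and no lookup table, just one array access. */
static const instruction_t *definition_of(const instruction_stream_t *stream,
                                          uint32_t ssa_id)
{
    assert(stream != NULL);
    assert(ssa_id < stream->count);
    return &stream->instructions[ssa_id];
}
```

If an instruction were moved into a separate cold buffer, every
operand that names its old index would have to be rewritten, or
definition_of() would need an indirection table, which is exactly the
cost the linear array is meant to avoid.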
Signed-off-by: Ronald Caesar <github43132@proton.me>