docs: Organize IR design doc

Signed-off-by: Ronald Caesar <github43132@proton.me>
2026-01-31 01:15:21 +01:00 · 2026-01-01 20:17:23 -04:00
parent 4679593839
commit 547bb249db
1 changed files with 190 additions and 157 deletions
--- a/docs/IR_DESIGN_DOC.md
+++ b/docs/IR_DESIGN_DOC.md
@@ -1,12 +1,184 @@
-All memory will be allocated from a contiguous memory arena before the pass begins. No `malloc` calls inside the loop.

-# <a name="scfg" /> Structured SSA Model
+# Structured SSA Model

-This replicates Dynarmic's IR layer but it respects SSA by using Block Arguments. Branches push values into the target scope like function parameters.
+This replicates [Dynarmic's](https://github.com/pound-emu/dynarmic) IR layer 
+but it respects SSA by using Block Arguments. Branches push values into the
+target scope like function parameters.

-## Proof of Concept
+# Data Structures

-### Scenario
+## Variables
+
+```c
+// This struct is *only* used for SSA construction. It maps the program'
+// original state (like Guest Registers) to the current SSA variable. 
+typedef struct
+{
+    uint32_t current_ssa_index;
+    uint32_t original_variable_index;
+} source_variable_t;
+
+sourve_variable_t source_variables[???];
+
+typedef struct
+{
+    uint16_t use_count;
+} ssa_version_t;
+
+ssa_version_t ssa_versions[???];
+```
+
+## Instruction Encoding
+
+```text
+63               54 53        36 35        18 17        00
+|-----------------| |----------| |----------| |----------|
+        opc             src1         src2         src3
+```
+
+### Encoding Symbols
+
+<**src3**>  18-bit index for `ssa_versions[]`.
+
+<**src2**>  18-bit index for `ssa_versions[]`.
+
+<**src1**>  18-bit index for `ssa_versions[]`.
+
+<**opc**>   10-bit opcode.
+
+### Operational Information
+
+If Bit[17] in `src1`, `src2`, or `src` is 1, the operand is a constant.  It has
+no SSA index. It has no entry in `ssa_versions`.
+
+## Instructions
+
+```c
+typedef uint64_t instruction_t;
+instruction_t instructions[???];
+uint32_t instruction_count;
+```
+
+# Instruction Set Architecture
+
+## Control Instructions
+
+Control Instructions sefines nested scopes (Basic Blocks). They produce SSA
+variables. These will replace phi-nodes and terminals.
+
+1. `OPCODE_IF`
+    * **Input**: Condition variable.
+    * **Structure**: Creates "Then" and "Else" blocks.
+    * **Output**: Defines SSA variables representing the result of the executed
+      branch.
+
+2. `OPCODE_LOOP`
+    * **Input**: Initial loop arguments (optional).
+    * **Structure**: Creates a "Body" block.
+    * **Output**: Defines SSA variables representing the state when the loop
+      terminates.
+
+3. `OPCODE_BLOCK`
+    * **Structure**: Creates a single nested scope.
+    * **Output**: Defines SSA variables yielded by the block.
+
+4. `OPCODE_YIELD`
+    * **Role**: Data Flow.
+    * **Behaviour**: Pushes a value from inside a child scope (Then/Else/Body)
+      to the parent Control Instruction (IF/LOOP), resolving tbe SSA merge.
+
+5. `OPCODE_BREAK`
+    * **Role**: Control Flow.
+    * **Behaviour**: Exits a `LOOP` or `BLOCK` scope immediately. Can carry
+      values to the target scope.
+
+6. `OPCODE_CONTINUE`
+    * **Role**: Control Flow.
+    * **Behaviour**: Jumps to the header of the nearest enclosing `LOOP` scope.
+      Can carry values to update loop arguments.
+
+7. `OPCODE_RETURN`
+    * **Role**: Function Exit
+    * **Behaviour**: Not exactly sure about this one yet. How would function
+      inlining work?
+
+## Extension Instructions
+
+If an operation requires more than 3 operands (like `YIELD` returning 5 values),
+we insert instruction **immediately** preceding the consumer to carry the extra
+load.
+
+### Opcode Design
+
+1. `OPCODE_ARG_EXTENSION`
+    * **Role**: Holds 3 operands that are pushed to the next instruction.
+    * **Output**: `TYPE_VOID
+
+2. `OPCODE_DEF_EXTENSION`
+    * **Role**: Extends the definitiom list of the preceding Control Instruction
+      to support multiple merge values.
+    * **Output**: Defines a valid SSA variable representing the next value in
+      the merge set.
+
+### Scenario 1: `OPCODE_YIELD v1, v2, v3, v4, v5`,
+
+We cannot fit 5 operands into one `instruction_t`. We split them.
+
+Memory Layout in `instructions[]`
+
+| Index | Opcode               | src1 | src2 | src3 | SSA Def | Comment                    |
+|-------|----------------------|------|------|------|---------|----------------------------|
+| 100   | OPCODE_ARG_EXTENSION | v4   | v5   | NULL | v100    | Carries args 4 and 5       |
+| 101   | OPCODE_YIELD         | v1   | v2   | v3   | v101    | Carries args 1-3 & Executes|
+
+
+### Scenario 2: `x1, y1 = OPCODE_IF (vcondition) TARGET_TYPE: INT`
+
+This `IF` block defines 2 variables. Since this IR is designed to define one
+variable per instruction, we split `x1`, and `y1` into two seperate instructions.
+
+Memory Layout in `instructions[]`
+
+| Index | Opcode               | src1 | src2 | src3 | SSA Def | Comment         |
+|-------|----------------------|------|------|------|---------|-----------------|
+| 100   | OPCODE_IF            |vcond | NULL | NULL | v100    | Definition of x |
+| 101   | OPCODE_DEF_EXTENSION | v100 | NULL | NULL | v101    | Definition of y |
+
+# SSA Construction Rules
+
+1. All memory will be allocated from a contiguous memory arena before the pass 
+   begins. No `malloc` calls inside the loop.
+2. Every instruction defines exactly one SSA ID (or Void).
+3. Multi-return values and multi-instruction arguments are handled via
+   [extension instructions](#extension-instructions).
+4. Constants are loaded via pool indices, not raw literals in operands.
+
+
+# Tiered Compilation Strategy
+
+## Tier 1: Dumb Translation
+
+* Greedy Register Allocator.
+* Pre-defined machine code templates for code generation.
+* No optimizations **except** for Peepholes. To make peepholing as fast as
+  possible, we use a sliding window while emitting the machine code.
+
+We switch to tier 2 when a basic block turns hot.
+
+## Tier 2: Optimized Translation
+
+* Run all required optimizations passes.
+
+## Required Optimization Passes
+
+1. Register Allocation
+2. Constant Folding & Propagation
+3. Dead Code Elimination
+4. Peepholes
+
+# Proof of Concept
+
+## Scenario

 ```c
 // Calculate sum of array[i] where array[i] > 0
@@ -22,7 +194,7 @@ while (i < limit) {
 return sum;
 ```

-### Structured SSA Representation
+## Structured SSA Representation

 ```text
 // ---------------------------------------------------------
@@ -130,164 +302,25 @@ OPCODE_END_BLOCK // Terminantes the entire loop.
 OPCODE_RETURN v4
 ```

-## Two Tiered Architecture
-
-### Tier 1: Dumb Translation
-
-* Greedy Register Allocator.
-* Pre-defined machine code templates for code generation.
-* No optimizations **except** for Peepholes. To make peepholing as fast as possible, we use a sliding window while emitting the machine code.
-
-We switch to tier 2 when a basic block turns hot.
-
-### Tier 2: Optimized Translation
-
-* Run all [required](#rop) optimizations passes.
-
-## Variable Design
-
-```c
-// This struct is *only* used for SSA construction. It maps the program's original state (like Guest Registers) to the current SSA variable. 
-typedef struct
-{
-    uint32_t current_ssa_index;
-    uint32_t original_variable_index;
-} source_variable_t;
-
-sourve_variable_t source_variables[???];
-
-typedef struct
-{
-    uint16_t use_count;
-} ssa_version_t;
-
-ssa_version_t ssa_versions[???];
-```
-
-## Instruction Encoding
-
-```text
-63               54 53        36 35        18 17        00
-|-----------------| |----------| |----------| |----------|
-        opc             src1         src2         src3
-```
-
-### Encoding Symbols
-
-<**src3**>  18-bit index for `ssa_versions[]`.
-
-<**src2**>  18-bit index for `ssa_versions[]`.
-
-<**src1**>  18-bit index for `ssa_versions[]`.
-
-<**opc**>   10-bit opcode.
-
-### Operational Information
-
-If Bit[17] in `src1`, `src2`, or `src` is 1, the operand is a constant.  It has not SSA index. It has no entry in `ssa_versions`.
-
-## Instruction Design
-
-```c
-typedef uint64_t instruction_t;
-instruction_t instructions[???];
-uint32_t instruction_count;
-```
-
-## Control Instructions
-
-Control Instructions defines nested scopes (Basic Blocks). They produce SSA variables. These will replace phi-nodes and terminals.
-
-1. `OPCODE_IF`
-    * **Input**: Condition variable.
-    * **Structure**: Creates "Then" and "Else" blocks.
-    * **Output**: Defines SSA variables representing the result of the executed branch.
-
-2. `OPCODE_LOOP`
-    * **Input**: Initial loop arguments (optional).
-    * **Structure**: Creates a "Body" block.
-    * **Output**: Defines SSA variables representing the state when the loop terminates.
-
-3. `OPCODE_BLOCK`
-    * **Structure**: Creates a single nested scope.
-    * **Output**: Defines SSA variables yielded by the block.
-
-4. `OPCODE_YIELD`
-    * **Role**: Data Flow.
-    * **Behaviour**: Pushes a value from inside a child scope (Then/Else/Body) to the parent Control Instruction (IF/LOOP), resolving tbe SSA merge.
-
-5. `OPCODE_BREAK`
-    * **Role**: Control Flow.
-    * **Behaviour**: Exits a `LOOP` or `BLOCK` scope immediately. Can carry values to the target scope.
-
-6. `OPCODE_CONTINUE`
-    * **Role**: Control Flow.
-    * **Behaviour**: Jumps to the header of the nearest enclosing `LOOP` scope. Can carry values to update loop arguments.
-
-7. `OPCODE_RETURN`
-    * **Role**: Function Exit
-    * **Behaviour**: Not exactly sure about this one yet. How would function inlining work?
-
-## Extention Instructions
-
-If an operation requires more than 3 operands (like `YIELD` returning 5 values), we insert instruction **immediately** preceding the consumer to carry the extra load.
-
-### Opcode Design
-
-1. `OPCODE_ARG_EXTENSION`
-    * **Role**: Holds 3 operands that are pushed to the next instruction.
-    * **Output**: `TYPE_VOID
-
-2. `OPCODE_DEF_EXTENSION`
-    * **Role**: Extends the definitiom list of the preceding Control Instruction to support multiple merge values.
-    * **Output**: Defines a valid SSA variable representing the next value in the merge set.
-
-### Scenario 1: `OPCODE_YIELD v1, v2, v3, v4, v5`,
-
-We cannot fit 5 operands into one `instruction_t`. We split them.
-
-Memory Layout in `instructions[]`
-
-| Index | Opcode               | src1 | src2 | src3 | SSA Def | Comment                    |
-|-------|----------------------|------|------|------|---------|----------------------------|
-| 100   | OPCODE_ARG_EXTENSION | v4   | v5   | NULL | v100    | Carries args 4 and 5       |
-| 101   | OPCODE_YIELD         | v1   | v2   | v3   | v101    | Carries args 1-3 & Executes|
-
-
-### Scenario 2: `x1, y1 = OPCODE_IF (vcondition) TARGET_TYPE: INT`
-
-This `IF` block defines 2 variables. Since this IR is designed to define one variable per instruction, we split `x1`, and `y1` into two seperate instructions.
-
-Memory Layout in `instructions[]`
-
-| Index | Opcode               | src1 | src2 | src3 | SSA Def | Comment         |
-|-------|----------------------|------|------|------|---------|-----------------|
-| 100   | OPCODE_IF            |vcond | NULL | NULL | v100    | Definition of x |
-| 101   | OPCODE_DEF_EXTENSION | v100 | NULL | NULL | v101    | Definition of y |
+# Frequently Asked Questions

 ## How do we know when a variable is created?

 We use **Implicit Indexing**:

-1. When we create an instruction at `instructions[100]`, we have implicitly created the variable definition at `ssa_versions[100]`.
+1. When we create an instruction at `instructions[100]`, we have implicitly
+   created the variable definition at `ssa_versions[100]`.
+2. We do not "check" if it is created. The act of incrementing
+   `instruction_count` creates it.

-2. We do not "check" if it is created. The act of incrementing `instruction_count` creates it.
-
-This removes the need for an explicit `definition` bitfield in `instructions_t`, creating space to expand `src1`, `src2`, and `src3` to 18 bits.
+This removes the need for an explicit `definition` bitfield in
+`instructions_t`, creating space to expand `src1`, `src2`, and `src3` to 18
+bits.

 ## What about instructions that don't define anything?

-If `instructions[200]` is `STORE v1, [v2]`, it defines no variable for other instructions to use. If we strictly follow implicit indexing, we create `ssa_values[200]`. Is this waste?
+If `instructions[200]` is `STORE v1, [v2]`, it defines no variable for other
+instructions to use. If we strictly follow implicit indexing, we create
+`ssa_values[200]`. We handle these void instructions by marking the SSA
+variable as `VOID`: `ssa_versions[200].type = TYPE_VOID`.

-**Yes. Its 8 bytes of waste. But this is the best choice we have.**
-
-Its either this or keep track of a `definition` field in `instruction_t` which then shrinks `src1`, `src2`, and `src3` to a tiny 14 bits.
-
-We handle these void instructions by marking the SSA variable as `VOID`: `ssa_versions[200].type = TYPE_VOID`.
-
-## <a name="rop"/> Required Optimization Passes
-
-1. Register Allocation
-2. Constant Folding & Propagation
-3. Dead Code Elimination
-4. Peepholes