FEXCore IR
FEXCore's IR is an SSA-based IR that is generated from the incoming x86-64 assembly. SSA is quite nice to work with when translating the x86-64 code to the IR, when optimizing that code with custom optimization passes, and when passing that IR to our CPU backends.
Emulation IR considerations
- We have explicitly sized IR variables
- Supports traditional element sizes of 1,2,4,8 bytes and some 16byte ops
- Supports arbitrary number of vector elements
- The op determines if something is float or integer based.
- Clear separation of scalar IR ops and vector IR ops
- ex, MUL versus VMUL
- We have explicit Load/Store context IR ops
- This allows us to have a clear separation between guest memory and tracked x86-64 state
- We have an explicit CPUID IR op
- This allows us to return fairly complex data (4 registers of data) and also makes constant CPUID functions easier to optimize
- So if we const-prop the CPUID function then it'll just const-prop further along
- We have an explicit syscall op
- The syscall op is fairly complex as well. As with CPUID, if the syscall function is const-propped then we can directly call the syscall handler
- This can save time by removing call overhead
- The IR supports branching from one block to another
- Has a conditional branch instruction that either branches to the target branch or falls through to the next block
- Has an unconditional branch to explicitly jump to a block instead of falling through
- There is a desire to follow LLVM semantics around block limitations but it isn't currently strictly enforced
- Supports a debug Print Op for printing out values for debug viewing
- Supports explicit Load/Store memory IR ops
- This is for accessing guest memory and will do the memory offset translation in to the VM's memory space
- This is done by just adding the VM memory base to the 64bit address passed in
- This is done in a manner that the application can escape from the VM and isn't meant to be safe
- There is an option for JITs to validate the memory region prior to accessing for ensuring correctness
- IR is generated from a JSON file, fairly straightforward to extend.
- Read the python generation file to determine the extent of what it can do
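The guest memory translation described for the Load/Store memory ops above can be sketched as follows. This is a minimal illustration and not FEXCore code: `GuestMemory`, `TranslateUnchecked`, and `TranslateChecked` are hypothetical names, and the bounds check only stands in for the optional validation a JIT might perform.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical description of where the VM's guest memory lives on the host.
struct GuestMemory {
  uint64_t HostBase; // host address that guest address 0 maps to
  uint64_t Size;     // size of the guest region in bytes
};

// Unchecked translation: just add the VM memory base to the 64-bit guest address.
// Nothing stops the result from pointing outside the VM region.
inline uint64_t TranslateUnchecked(const GuestMemory &Mem, uint64_t GuestAddr) {
  return Mem.HostBase + GuestAddr;
}

// Optional validated translation: reject accesses that do not fit entirely
// inside the guest region before doing the same addition.
inline std::optional<uint64_t> TranslateChecked(const GuestMemory &Mem,
                                                uint64_t GuestAddr,
                                                uint64_t AccessSize) {
  if (AccessSize > Mem.Size || GuestAddr > Mem.Size - AccessSize) {
    return std::nullopt;
  }
  return Mem.HostBase + GuestAddr;
}
```

The unchecked path is a single addition, which is why it is cheap and also why it is not a safety boundary.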
IR function considerations
The first SSA node is a special case node that is considered invalid. This means %0 will always be invalid for "null" node checks. The first real SSA node also has to be an IRHeader node. This means it is safe to assume that %1 will always be an IRHeader.
(%%1) IRHeader 0x41a9a0, %%2, 5
The header provides information about that function like the entry point address.
Additionally it also points to the first CodeBlock IROp
(%%2) CodeBlock %%7, %%168, %%3
- The CodeBlock Op is a jump target and must be treated as if it'll be jumped to from other blocks
- It contains pointers to the starting op and ending op and they are inclusive
- It also contains a pointer to the next CodeBlock in a singly linked list
- The last CodeBlock will point to the InvalidNode as the next block
Example code block
(%%3) CodeBlock %%169, %%173, %%4
(%%169) BeginBlock %3
%170 i64 = Constant 0x41a9e1
(%%171) StoreContext %170 i64, 0x8, 0x0
(%%172) ExitFunction
(%%173) EndBlock %3
- BeginBlock points back to the CodeBlock SSA which helps with iterating across multiple blocks
- EndBlock is the ending op of a CodeBlock and also points back to the CodeBlock SSA.
- ExitFunction will leave the function immediately and return back to the dispatcher
- Every IR Op has an SSA value associated with it used for tracking the op itself
- If the IROp doesn't have a real destination then it is invalid to use it as an argument in most other ops
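Walking the singly linked list of CodeBlocks until the InvalidNode is reached can be sketched with a toy model. Everything here is an assumption for illustration: `ToyCodeBlock`, `CollectBlocks`, and the map keyed by SSA number are hypothetical and do not mirror FEXCore's real data layout.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// SSA number 0 is the invalid node, so it doubles as the end-of-list marker.
constexpr uint32_t InvalidNode = 0;

// Hypothetical flattened CodeBlock: three SSA numbers, as in the printed IR.
struct ToyCodeBlock {
  uint32_t Begin; // SSA number of the first op in the block (inclusive)
  uint32_t Last;  // SSA number of the last op in the block (inclusive)
  uint32_t Next;  // SSA number of the next CodeBlock, or InvalidNode
};

// Follow the Next links starting at FirstBlock and return the visit order.
inline std::vector<uint32_t> CollectBlocks(const std::map<uint32_t, ToyCodeBlock> &Blocks,
                                           uint32_t FirstBlock) {
  std::vector<uint32_t> Order;
  for (uint32_t Cur = FirstBlock; Cur != InvalidNode; Cur = Blocks.at(Cur).Next) {
    Order.push_back(Cur);
  }
  return Order;
}
```

In this model the IRHeader would supply `FirstBlock`, and the last block's `Next` is `InvalidNode`, which ends the loop.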
In-memory representation
The in-memory representation of the IR can be confusing, both when initially viewed and later when dealing with optimizations.
Currently the IR Generation is tied to the OpDispatchBuilder class. This class handles translating decoded x86 to our IR representation.
When generating IR inside of the OpDispatchBuilder it is straightforward, just call the IR generation ops.
FEXCore::IR::IntrusiveAllocator
This is an intrusive allocator that is used by the OpDispatchBuilder for storing IR data. It is a simple linear arena allocator without resizing capabilities.
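A linear arena allocator without resizing can be sketched in a few lines. This is a toy under stated assumptions, not the real `IntrusiveAllocator`: the class name `ToyLinearArena` and its members are hypothetical, and alignment is ignored on the assumption that callers allocate fixed-size objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical bump allocator: hands out consecutive bytes from one fixed buffer.
class ToyLinearArena {
public:
  explicit ToyLinearArena(size_t Size)
    : Storage(new uint8_t[Size]), Capacity(Size) {}

  // Returns the next Size bytes, or nullptr when the arena is full.
  // The arena never grows, so a failed allocation is the caller's problem.
  void *Allocate(size_t Size) {
    if (Size > Capacity - Offset) {
      return nullptr;
    }
    void *Ptr = Storage.get() + Offset;
    Offset += Size;
    return Ptr;
  }

  // Base address, used to turn a pointer into a compact offset and back.
  uintptr_t Begin() const { return reinterpret_cast<uintptr_t>(Storage.get()); }
  size_t Used() const { return Offset; }

  // Discard everything at once; individual frees are not supported.
  void Reset() { Offset = 0; }

private:
  std::unique_ptr<uint8_t[]> Storage;
  size_t Capacity;
  size_t Offset = 0;
};
```

Because allocations are consecutive, an object is fully identified by its offset from `Begin()`.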
OpDispatchBuilder
OpDispatchBuilder provides two routines for handling the IR outside of the class
IRListView ViewIR();
- Returns a wrapper container class that allows you to view the IR. This doesn't take ownership of the IR data.
- If the OpDispatchBuilder changes its IR then changes are also visible to this class
IRListView *CreateIRCopy()
- As the name says, it creates a new copy of the IR that is in the OpDispatchBuilder
- Copying the IR only copies the memory in use, so the copy has no free space left for optimizations afterwards
- Useful for tiered recompilers, AOT, and offline analysis
This class uses two IntrusiveAllocator objects for tracking IR data. ListData and Data are the object names.
- ListData is for tracking the doubly linked list of nodes
- This ONLY allocates FEXCore::IR::OrderedNode objects
- When an OrderedNode is allocated its allocation location (NodeOffset) is just the offset from the base pointer
- This allows us to only use uint32_t memory offsets to compact the IR
- Additionally using offsets allows us the freedom to freely move our IR in memory without costly pointer adjustment
- This means everything is fixed size allocated (SSA Node number calculation is just AllocationOffset / sizeof(OrderedNode))
- OrderedNodes are what the SSA arguments are pointing to in the end
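The offset arithmetic for ListData can be illustrated with a small sketch. `ToyOrderedNode`, `SSANumber`, and `NodeAt` are hypothetical stand-ins; the real OrderedNode has a different layout, and only the division by the fixed node size is taken from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size node; every allocation in the list has this size.
struct ToyOrderedNode {
  uint32_t Value;
  uint32_t Next;
  uint32_t Previous;
  uint32_t NumUses;
};
static_assert(sizeof(ToyOrderedNode) == 16, "toy node is expected to be 16 bytes");

// SSA node number is the allocation offset divided by the node size.
inline uint32_t SSANumber(uint32_t NodeOffset) {
  return NodeOffset / static_cast<uint32_t>(sizeof(ToyOrderedNode));
}

// Resolve an offset against whichever base the list currently lives at.
inline ToyOrderedNode *NodeAt(uintptr_t ListBase, uint32_t NodeOffset) {
  return reinterpret_cast<ToyOrderedNode *>(ListBase + NodeOffset);
}
```

Since only offsets are stored, copying the whole buffer elsewhere keeps every reference valid; the caller just passes the new base to `NodeAt`.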
OrderedNode
This is a doubly linked list of all of our IR nodes. This allows us to walk forward or backward over the IR and they must be ordered correctly to ensure dominance of SSA values.
- Contains OrderedNodeHeader, which holds the following three members
- OpNodeWrapper Value points to the IROp_Header backing op for this SSA node
- OrderedNodeWrapper Next points to the next OrderedNode
- OrderedNodeWrapper Previous points to the previous OrderedNode
- Contains the NumUses
- This allows us to easily walk the list backwards and DCE the ops that have NumUses == 0
IROp_Header *Op(uintptr_t Base)
- Allows you to get the backing IR data for this SSA value
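The backwards walk that removes ops with NumUses == 0 can be sketched on a toy list. This is an assumption-laden illustration, not a FEXCore pass: `ToyNode`, its `HasSideEffects` flag, and `DeadCodeEliminate` are hypothetical, and nodes are addressed by vector index with index 0 reserved as the invalid node.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical node: list links, a use count, and the nodes it consumes.
struct ToyNode {
  uint32_t Prev = 0;           // 0 == invalid node
  uint32_t Next = 0;           // 0 == invalid node
  uint32_t NumUses = 0;        // how many later ops consume this value
  bool HasSideEffects = false; // e.g. stores must stay even if unused
  std::vector<uint32_t> Args;  // SSA arguments, always earlier in the list
  bool Removed = false;
};

// Walk from the tail to the head and unlink every unused, side-effect-free op.
// Going backwards means an op's users are visited before the op itself, so
// removing a user can make its arguments dead within the same walk.
inline uint32_t DeadCodeEliminate(std::vector<ToyNode> &Nodes, uint32_t Tail) {
  uint32_t RemovedCount = 0;
  for (uint32_t Cur = Tail; Cur != 0;) {
    ToyNode &Node = Nodes[Cur];
    const uint32_t Prev = Node.Prev;
    if (Node.NumUses == 0 && !Node.HasSideEffects) {
      for (uint32_t Arg : Node.Args) {
        assert(Nodes[Arg].NumUses > 0);
        --Nodes[Arg].NumUses;
      }
      if (Node.Prev != 0) Nodes[Node.Prev].Next = Node.Next;
      if (Node.Next != 0) Nodes[Node.Next].Prev = Node.Prev;
      Node.Removed = true;
      ++RemovedCount;
    }
    Cur = Prev;
  }
  return RemovedCount;
}
```

A single backwards pass is enough in this model because SSA dominance guarantees arguments appear before their users.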
NodeWrapperBase - Type for OrderedNodeWrapper and OpNodeWrapper
using OpNodeWrapper = NodeWrapperBase<IROp_Header>
using OrderedNodeWrapper = NodeWrapperBase<OrderedNode>
- This is a class to let you more easily convert NodeOffsets in to their real backing pointer
- GetNode(uintptr_t Base) allows you to pass in the base pointer from the backing Intrusive allocator and get the object
- This can be confusing
- A good rule of thumb is to only ever use GetNode(ListDataBegin) with OrderedNodeWrapper
- Then once you have the OrderedNode* from GetNode, use the Op(IRDataBegin) function to get the IR data.
- I do NOT recommend using GetNode directly from OpNodeWrapper as it is VERY easy to mess it up
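The rule of thumb above, resolve the OrderedNodeWrapper against the list base and then call Op with the data base, can be shown with a simplified model. The `Toy*` types are hypothetical and far smaller than the real ones; only the two-step lookup with two different base pointers follows the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper: stores a 32-bit offset and resolves it against a base.
template <typename Type>
struct ToyNodeWrapper {
  uint32_t NodeOffset;
  Type *GetNode(uintptr_t Base) const {
    return reinterpret_cast<Type *>(Base + NodeOffset);
  }
};

// Hypothetical op header living in the "Data" allocator.
struct ToyOpHeader {
  uint8_t Op;
  uint8_t Size;
};

struct ToyOrderedNode;
using ToyOpWrapper = ToyNodeWrapper<ToyOpHeader>;
using ToyOrderedWrapper = ToyNodeWrapper<ToyOrderedNode>;

// Hypothetical list node living in the "ListData" allocator.
struct ToyOrderedNode {
  ToyOpWrapper Value;     // offset into the op data buffer
  ToyOrderedWrapper Next; // offset into the list buffer

  // Mirrors the Op(Base) helper: the caller passes the op data base,
  // never the list base, so the two buffers are not mixed up.
  ToyOpHeader *Op(uintptr_t IRDataBase) const {
    return Value.GetNode(IRDataBase);
  }
};
```

Passing the list base to `Op`, or the data base to an ordered-node wrapper, compiles fine and silently yields a wrong pointer, which is the mistake the rule of thumb is meant to prevent.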
NodeIterator
Provides a fairly straightforward interface that allows easily walking the IR nodes with C++ increment and decrement operations. It only iterates over a single block.
Example usage
IR::NodeIterator After = ...;
IR::NodeIterator End = ...;
while (After != End) {
  // NodeIterator() returns a pair of pointers to the OrderedNode and IROp data
  // You can unpack the result with structured bindings
  auto [CodeNode, IROp] = After();
  // IROp_Header contains a bunch of information about the IR object
  // We can convert it with the object's C<typename Type> or CW<typename Type> functions
  switch (IROp->Op) {
    case IR::OP_ADD: {
      FEXCore::IR::IROp_Add const *Op = IROp->C<FEXCore::IR::IROp_Add>();
      /* We can now access members inside of IROp_Add that were previously unavailable
         You can still access the header definitions from Op->Header */
      break;
    }
    /* ... */
  }
  // Go to the next IR Op
  ++After;
}
AllNodesIterator
This is like NodeIterator, except that it will cross block boundaries.
IRListView.GetBlocks()
Provides a range for easy iterating over all the blocks in a multi-block with NodeIterator
Example usage
for (auto [BlockNode, BlockHeader] : CurrentIR.GetBlocks()) {
  // Do stuff for each block
}
IRListView.GetCode(BlockNode)
Provides a range for easy iterating over all the code in a block
Example usage
for (auto [CodeNode, IROp] : CurrentIR.GetCode(BlockNode)) {
  // Do stuff for each op
  switch (IROp->Op) {
    case IR::OP_ADD: {
      FEXCore::IR::IROp_Add const *Op = IROp->C<FEXCore::IR::IROp_Add>();
      // Do stuff for each Add op.
      break;
    }
  }
}
IRListView.GetAllCode()
Like GetCode, except it uses AllNodesIterator to allow easy iterating over every single op in the entire Multiblock
Example usage
for (auto [CodeNode, IROp] : CurrentIR.GetAllCode()) {
  // Do stuff for each op
}
JSON file
An example of what the IR json looks like
"StoreContext": {
  "SSAArgs": "1",
  "Args": [
    "uint8_t", "Size",
    "uint32_t", "Offset"
  ]
},
The json entry name will be the name of the IR op and the dispatcher function.
For example, an entry named Add means you'll get a _Add(...) dispatcher function generated.
JSON IR element options
HasDest
- This is used on ops that return a value. It tracks whether an op returns data
SSAArgs
- These are the number of arguments that the op consumes that are SSA based
- Needs to come from previous ops that had a destination
SSANames
- Allows you to name the SSA arguments in an op
- Otherwise the SSA arguments can only be accessed through the arguments array in the op's Header
Args
- These are defined arguments that are stored in the IR encoding that aren't SSA based
- Useful for things that are constant encoded and won't change after the fact
FixedDestSize
- This allows you to override the op's destination size in bytes
- Most ops will implicitly calculate their destination size through the maximum sizes of the IR arguments passed in
DestSize
- This allows an IR size override that isn't just a size in bytes
- This can let the size of the op be another argument or something more extensive
RAOverride
- This allows an op to take regular SSA arguments (So optimization passes will still be aware of them) but also not have them be register allocated
- Useful for block handling ops, where blocks aren't something that get register allocated but still need to have their uses tracked
HelperGen
- If there is a complex IR Op that needs to be defined but you don't want an automatic dispatcher generated then this disables the generation of the dispatcher
Last
- This is a special element only used for the last element in the list
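The interaction between the implicit destination size and FixedDestSize described above can be sketched as a small helper. This is a hypothetical illustration of the rule only: `CalculateDestSize` is not a FEXCore function, and the generated dispatchers may compute the size differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <optional>

// Hypothetical helper: sizes are in bytes.
// With no override, the destination is as wide as the widest SSA argument.
// With an override (the FixedDestSize case), the override wins.
inline uint8_t CalculateDestSize(std::initializer_list<uint8_t> ArgSizes,
                                 std::optional<uint8_t> FixedDestSize = std::nullopt) {
  if (FixedDestSize.has_value()) {
    return *FixedDestSize;
  }
  uint8_t Size = 0;
  for (uint8_t ArgSize : ArgSizes) {
    Size = std::max(Size, ArgSize);
  }
  return Size;
}
```

For example, an op taking a 4-byte and an 8-byte argument would get an 8-byte destination under this rule unless the JSON entry fixes the size.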