========================================
Machine IR (MIR) Format Reference Manual
========================================

.. contents::
   :local:

.. warning::
  This is a work in progress.

Introduction
============

This document is a reference manual for the Machine IR (MIR) serialization
format. MIR is a human readable serialization format that is used to represent
LLVM's :ref:`machine specific intermediate representation
<machine code representation>`.

The MIR serialization format is designed to be used for testing the code
generation passes in LLVM.

Overview
========

The MIR serialization format uses a YAML container. YAML is a standard
data serialization language, and the full YAML language spec can be read at
`yaml.org
<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.

A MIR file is split up into a series of `YAML documents`_. The first document
can contain an optional embedded LLVM IR module, and the rest of the documents
contain the serialized machine functions.

.. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132

High Level Structure
====================

Embedded Module
---------------

When the first YAML document contains a `YAML block literal string`_, the MIR
parser will treat this string as an LLVM assembly language string that
represents an embedded LLVM IR module.
Here is an example of a YAML document that contains an LLVM module:

.. code-block:: llvm

       --- |
         define i32 @inc(i32* %x) {
         entry:
           %0 = load i32, i32* %x
           %1 = add i32 %0, 1
           store i32 %1, i32* %x
           ret i32 %1
         }
       ...

.. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688

Machine Functions
-----------------

The remaining YAML documents contain the machine functions. This is an example
of such a YAML document:

.. code-block:: llvm

     ---
     name: inc
     tracksRegLiveness: true
     liveins:
       - { reg: '%rdi' }
     body: |
       bb.0.entry:
         liveins: %rdi

         %eax = MOV32rm %rdi, 1, _, 0, _
         %eax = INC32r killed %eax, implicit-def dead %eflags
         MOV32mr killed %rdi, 1, _, 0, _, %eax
         RETQ %eax
     ...

The document above consists of attributes that represent the various
properties and data structures in a machine function.

The attribute ``name`` is required, and its value should be identical to the
name of the function that this machine function is based on.

The attribute ``body`` is a `YAML block literal string`_. Its value represents
the function's machine basic blocks and their machine instructions.
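
The layout described above (an optional module document followed by one
document per machine function) can be illustrated with a small Python sketch.
It is not how the MIR parser is implemented; the helper names
``split_mir_documents`` and ``classify`` are hypothetical, and the sketch
assumes that document markers (``---`` and ``...``) always start in the first
column.

```python
def split_mir_documents(text):
    """Split MIR text into YAML documents on '---' / '...' marker lines."""
    documents = []
    current = None
    for line in text.splitlines():
        if line == "---" or line.startswith("--- "):
            # A '---' line starts a new document. Text after the marker is
            # kept, for example the '|' that introduces a block literal.
            current = {"header": line[3:].strip(), "lines": []}
            documents.append(current)
        elif line.rstrip() == "...":
            current = None  # explicit end of the current document
        elif current is not None:
            current["lines"].append(line)
    return documents


def classify(document):
    """Label a document as the embedded module or a machine function."""
    return "module" if document["header"].startswith("|") else "function"


example = """\
--- |
  define i32 @inc(i32* %x) {
  entry:
    %0 = load i32, i32* %x
    ret i32 %0
  }
...
---
name: inc
body: |
  bb.0.entry:
    RETQ %eax
...
"""

print([classify(d) for d in split_mir_documents(example)])
# ['module', 'function']
```

A real consumer would hand each document to a YAML parser; the sketch only
shows where the document boundaries fall.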

Machine Instructions Format Reference
=====================================

The machine basic blocks and their instructions are represented using a custom,
human readable serialization language. This language is used in the
`YAML block literal string`_ that corresponds to the machine function's body.

A source string that uses this language contains a list of machine basic
blocks, which are described in the section below.

Machine Basic Blocks
--------------------

A machine basic block is defined in a single block definition source construct
that contains the block's ID.
The example below defines two blocks that have an ID of zero and one:

.. code-block:: llvm

    bb.0:
      <instructions>
    bb.1:
      <instructions>

A machine basic block can also have a name. It should be specified after the ID
in the block's definition:

.. code-block:: llvm

    bb.0.entry:       ; This block's name is "entry"
       <instructions>

The block's name should be identical to the name of the IR block that this
machine block is based on.

Block References
^^^^^^^^^^^^^^^^

The machine basic blocks are identified by their ID numbers. Individual
blocks are referenced using the following syntax:

.. code-block:: llvm

    %bb.<id>[.<name>]

Examples:

.. code-block:: llvm

    %bb.0
    %bb.1.then
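
As a rough illustration of the reference syntax above, the following Python
sketch matches ``%bb.<id>[.<name>]`` with a regular expression. The function
name ``parse_block_ref`` is hypothetical, and the set of characters accepted
in ``<name>`` is an assumption made for this sketch rather than something this
manual specifies.

```python
import re

# %bb.<id>[.<name>]: a numeric ID, optionally followed by a block name.
# The characters allowed in <name> are an assumption for this sketch.
BLOCK_REF = re.compile(r"%bb\.(?P<id>\d+)(?:\.(?P<name>[A-Za-z0-9_.$-]+))?")


def parse_block_ref(text):
    """Return (id, name) for a block reference; name is None when absent."""
    match = BLOCK_REF.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a block reference: {text!r}")
    return int(match.group("id")), match.group("name")


print(parse_block_ref("%bb.0"))       # (0, None)
print(parse_block_ref("%bb.1.then"))  # (1, 'then')
```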

Successors
^^^^^^^^^^

The machine basic block's successors have to be specified before any of the
instructions:

.. code-block:: llvm

    bb.0.entry:
      successors: %bb.1.then, %bb.2.else
      <instructions>
    bb.1.then:
      <instructions>
    bb.2.else:
      <instructions>

The branch weights can be specified in brackets after the successor blocks.
The example below defines a block that has two successors with branch weights
of 32 and 16:

.. code-block:: llvm

    bb.0.entry:
      successors: %bb.1.then(32), %bb.2.else(16)
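
To make the successor syntax above concrete, here is a minimal Python sketch
that reads one ``successors:`` line into block references and optional branch
weights. The name ``parse_successors`` is hypothetical, and the characters
accepted in block names are an assumption of the sketch.

```python
import re

# One successor: a block reference with an optional "(weight)" suffix.
SUCCESSOR = re.compile(r"(%bb\.\d+(?:\.[\w.$-]+)?)(?:\((\d+)\))?")


def parse_successors(line):
    """Parse 'successors: ...' into a list of (block_ref, weight_or_None)."""
    keyword, _, rest = line.strip().partition(":")
    if keyword != "successors":
        raise ValueError(f"not a successors list: {line!r}")
    result = []
    for item in (part.strip() for part in rest.split(",")):
        if not item:
            continue  # tolerate an empty list
        match = SUCCESSOR.fullmatch(item)
        if match is None:
            raise ValueError(f"bad successor: {item!r}")
        ref, weight = match.groups()
        result.append((ref, int(weight) if weight is not None else None))
    return result


print(parse_successors("successors: %bb.1.then(32), %bb.2.else(16)"))
# [('%bb.1.then', 32), ('%bb.2.else', 16)]
```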

Live In Registers
^^^^^^^^^^^^^^^^^

The machine basic block's live in registers have to be specified before any of
the instructions:

.. code-block:: llvm

    bb.0.entry:
      liveins: %edi, %esi

The lists of live in registers and successors can be empty. The language also
allows multiple live in register lists and multiple successor lists; the
parser combines the lists of each kind into one list.
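
The combining behaviour described above can be sketched in Python as follows.
This is only an illustration: ``collect_live_ins`` is a hypothetical helper,
and keeping duplicate registers as written is a choice made for the sketch,
not documented parser behaviour.

```python
def collect_live_ins(block_lines):
    """Merge every 'liveins:' line of one block into a single register list."""
    registers = []
    for line in block_lines:
        keyword, _, rest = line.strip().partition(":")
        if keyword != "liveins":
            continue  # block definitions and instructions are skipped
        # An empty list contributes nothing; duplicates are kept as written.
        registers.extend(r.strip() for r in rest.split(",") if r.strip())
    return registers


block = [
    "bb.0.entry:",
    "  liveins: %edi",
    "  liveins:",
    "  liveins: %esi",
    "  RETQ %eax",
]
print(collect_live_ins(block))  # ['%edi', '%esi']
```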

Miscellaneous Attributes
^^^^^^^^^^^^^^^^^^^^^^^^

The attributes ``IsAddressTaken``, ``IsLandingPad`` and ``Alignment`` can be
specified in brackets after the block's definition:

.. code-block:: llvm

    bb.0.entry (address-taken):
      <instructions>
    bb.2.else (align 4):
      <instructions>
    bb.3(landing-pad, align 4):
      <instructions>
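
The block definition forms shown above (ID, optional name, optional bracketed
attributes) can be summarised with a short Python sketch. The function name
``parse_block_definition`` and the dictionary keys are hypothetical, and the
characters accepted in block names are an assumption; only the three
attributes named in this section are recognised.

```python
import re

# bb.<id>[.<name>] [(attribute, attribute, ...)]:
BLOCK_DEF = re.compile(
    r"bb\.(?P<id>\d+)(?:\.(?P<name>[\w.$-]+))?\s*(?:\((?P<attrs>[^)]*)\))?\s*:"
)


def parse_block_definition(line):
    """Return the ID, name, and attribute fields of a block definition."""
    match = BLOCK_DEF.match(line.strip())
    if match is None:
        raise ValueError(f"not a block definition: {line!r}")
    block = {
        "id": int(match.group("id")),
        "name": match.group("name"),
        "address_taken": False,
        "landing_pad": False,
        "alignment": None,
    }
    attrs = match.group("attrs") or ""
    for attr in (a.strip() for a in attrs.split(",")):
        if not attr:
            continue
        if attr == "address-taken":
            block["address_taken"] = True
        elif attr == "landing-pad":
            block["landing_pad"] = True
        elif attr.startswith("align "):
            block["alignment"] = int(attr.split()[1])
        else:
            raise ValueError(f"unknown block attribute: {attr!r}")
    return block


print(parse_block_definition("bb.3(landing-pad, align 4):"))
```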

.. TODO: Describe the way the reference to an unnamed LLVM IR block can be
   preserved.

Machine Instructions
--------------------

A machine instruction is composed of a name, machine operands,
:ref:`instruction flags <instruction-flags>`, and machine memory operands.

The instruction's name is usually specified before the operands. The example
below shows an instance of the X86 ``RETQ`` instruction with a single machine
operand:

.. code-block:: llvm

    RETQ %eax

However, if the machine instruction has one or more explicitly defined register
operands, the instruction's name has to be specified after them. The example
below shows an instance of the AArch64 ``LDPXpost`` instruction with three
defined register operands:

.. code-block:: llvm

    %sp, %fp, %lr = LDPXpost %sp, 2

The instruction names are serialized using the exact definitions from the
target's ``*InstrInfo.td`` files, and they are case sensitive. This means that
similar instruction names like ``TSTri`` and ``tSTRi`` represent different
machine instructions.

.. _instruction-flags:

Instruction Flags
^^^^^^^^^^^^^^^^^

The flag ``frame-setup`` can be specified before the instruction's name:

.. code-block:: llvm

    %fp = frame-setup ADDXri %sp, 0, 0
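
The ordering rules above (defined registers first, then flags, then the
case-sensitive name, then the remaining operands) can be sketched in Python.
The helper ``parse_instruction`` is hypothetical and deliberately simplistic:
it assumes operands are separated by commas, recognises only the
``frame-setup`` flag mentioned in this section, and leaves register operand
flags such as ``killed`` as raw text.

```python
def parse_instruction(line):
    """Split an instruction into defined registers, flags, name, operands."""
    text = line.strip()
    defs = []
    if " = " in text:
        # Explicitly defined register operands come before the name.
        lhs, text = text.split(" = ", 1)
        defs = [d.strip() for d in lhs.split(",")]
    tokens = text.split(None, 1)
    flags = []
    while tokens and tokens[0] == "frame-setup":
        flags.append(tokens[0])
        tokens = tokens[1].split(None, 1) if len(tokens) > 1 else []
    if not tokens:
        raise ValueError(f"missing instruction name: {line!r}")
    name = tokens[0]  # kept verbatim because names are case sensitive
    operands = []
    if len(tokens) > 1:
        operands = [o.strip() for o in tokens[1].split(",")]
    return {"defs": defs, "flags": flags, "name": name, "operands": operands}


print(parse_instruction("%fp = frame-setup ADDXri %sp, 0, 0"))
print(parse_instruction("%sp, %fp, %lr = LDPXpost %sp, 2"))
print(parse_instruction("RETQ %eax"))
```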

.. TODO: Describe the parser's default behaviour when optional YAML attributes
   are missing.

.. TODO: Describe the syntax for the bundled instructions.

.. TODO: Describe the syntax of the immediate machine operands.

.. TODO: Describe the syntax of the register machine operands.

.. TODO: Describe the syntax of the virtual register operands and their YAML
   definitions.

.. TODO: Describe the syntax of the register operand flags and the subregisters.

.. TODO: Describe the machine function's YAML flag attributes.

.. TODO: Describe the syntax for the global value, external symbol and register
   mask machine operands.

.. TODO: Describe the frame information YAML mapping.

.. TODO: Describe the syntax of the stack object machine operands and their
   YAML definitions.

.. TODO: Describe the syntax of the constant pool machine operands and their
   YAML definitions.

.. TODO: Describe the syntax of the jump table machine operands and their
   YAML definitions.

.. TODO: Describe the syntax of the block address machine operands.

.. TODO: Describe the syntax of the CFI index machine operands.

.. TODO: Describe the syntax of the metadata machine operands, and the
   instruction's debug location attribute.

.. TODO: Describe the syntax of the target index machine operands.

.. TODO: Describe the syntax of the register live out machine operands.

.. TODO: Describe the syntax of the machine memory operands.