========================================
Machine IR (MIR) Format Reference Manual
========================================

.. contents::
   :local:

.. warning::
  This is a work in progress.

Introduction
============
This document is a reference manual for the Machine IR (MIR) serialization
format. MIR is a human-readable serialization format that is used to represent
LLVM's :ref:`machine specific intermediate representation
<machine code representation>`.
The MIR serialization format is designed to be used for testing the code
generation passes in LLVM.
Overview
========
The MIR serialization format uses a YAML container. YAML is a standard
data serialization language, and the full YAML language spec can be read at
`yaml.org
<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.
A MIR file is split up into a series of `YAML documents`_. The first document
can contain an optional embedded LLVM IR module, and the rest of the documents
contain the serialized machine functions.
.. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132
High Level Structure
====================
Embedded Module
---------------
When the first YAML document contains a `YAML block literal string`_, the MIR
parser will treat this string as an LLVM assembly language string that
represents an embedded LLVM IR module.
Here is an example of a YAML document that contains an LLVM module:

.. code-block:: llvm

   --- |
     define i32 @inc(i32* %x) {
     entry:
       %0 = load i32, i32* %x
       %1 = add i32 %0, 1
       store i32 %1, i32* %x
       ret i32 %1
     }
   ...

.. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688
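
The document layout described above can be illustrated with a short Python
sketch. It is not how the MIR parser is implemented: ``split_mir_documents``
and ``has_embedded_module`` are hypothetical helpers, and the sketch assumes
that every document starts with a ``---`` line in the first column and that an
optional ``...`` line in the first column ends it, as in the examples in this
manual.

.. code-block:: python

   def split_mir_documents(text):
       """Split MIR text into raw YAML documents (hypothetical helper)."""
       documents = []
       current = None
       for line in text.splitlines():
           if line.startswith("---"):
               # A new document starts; flush the previous one, if any.
               if current is not None:
                   documents.append("\n".join(current))
               current = [line]
           elif line.rstrip() == "...":
               # Explicit end-of-document marker.
               if current is not None:
                   documents.append("\n".join(current))
                   current = None
           elif current is not None:
               current.append(line)
       if current is not None:
           documents.append("\n".join(current))
       return documents


   def has_embedded_module(documents):
       """Guess whether the first document is a block literal string.

       Assumption: a block literal document is introduced by '--- |'.
       """
       if not documents:
           return False
       return documents[0].split("\n", 1)[0].strip() == "--- |"

With this sketch, the first element of the returned list would hold the
embedded LLVM IR module when one is present, and the remaining elements would
hold the machine functions.
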
Machine Functions
-----------------
The remaining YAML documents contain the machine functions. This is an example
of such a YAML document:

.. code-block:: llvm

   ---
   name: inc
   tracksRegLiveness: true
   liveins:
     - { reg: '%rdi' }
   body: |
     bb.0.entry:
       liveins: %rdi
       %eax = MOV32rm %rdi, 1, _, 0, _
       %eax = INC32r killed %eax, implicit-def dead %eflags
       MOV32mr killed %rdi, 1, _, 0, _, %eax
       RETQ %eax
   ...

The document above consists of attributes that represent the various
properties and data structures in a machine function.
The attribute ``name`` is required, and its value should be identical to the
name of the function that this machine function is based on.
The attribute ``body`` is a `YAML block literal string`_. Its value represents
the function's machine basic blocks and their machine instructions.
Machine Instructions Format Reference
=====================================
The machine basic blocks and their instructions are represented using a custom,
human-readable serialization language. This language is used in the
`YAML block literal string`_ that corresponds to the machine function's body.
A source string that uses this language contains a list of machine basic
blocks, which are described in the section below.
Machine Basic Blocks
--------------------
A machine basic block is defined in a single block definition source construct
that contains the block's ID.
The example below defines two blocks with IDs of zero and one:

.. code-block:: llvm

   bb.0:
     <instructions>
   bb.1:
     <instructions>

A machine basic block can also have a name. It should be specified after the ID
in the block's definition:

.. code-block:: llvm

   bb.0.entry:       ; This block's name is "entry"
     <instructions>

The block's name should be identical to the name of the IR block that this
machine block is based on.
Block References
^^^^^^^^^^^^^^^^
The machine basic blocks are identified by their ID numbers. Individual
blocks are referenced using the following syntax:

.. code-block:: llvm

   %bb.<id>[.<name>]

Examples:

.. code-block:: llvm

   %bb.0
   %bb.1.then

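
The reference syntax can be checked with a small regular expression. The
Python sketch below is only an illustration: ``parse_block_ref`` is a
hypothetical helper, and the set of characters accepted in ``<name>`` is an
assumption, since this manual does not define it.

.. code-block:: python

   import re

   # %bb.<id>[.<name>]; the characters allowed in <name> are an assumption.
   BLOCK_REF = re.compile(r"%bb\.(\d+)(?:\.([\w.$-]+))?")


   def parse_block_ref(text):
       """Return (id, name) for a block reference; name is None if absent."""
       match = BLOCK_REF.fullmatch(text.strip())
       if match is None:
           raise ValueError(f"not a block reference: {text!r}")
       return int(match.group(1)), match.group(2)

Because blocks are identified by their ID numbers, a consumer of the result
would normally look a block up by the first element of the tuple and treat the
name as optional.
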
Successors
^^^^^^^^^^
The machine basic block's successors have to be specified before any of the
instructions:

.. code-block:: llvm

   bb.0.entry:
     successors: %bb.1.then, %bb.2.else
     <instructions>
   bb.1.then:
     <instructions>
   bb.2.else:
     <instructions>

The branch weights can be specified in parentheses after the successor blocks.
The example below defines a block that has two successors with branch weights
of 32 and 16:

.. code-block:: llvm

   bb.0.entry:
     successors: %bb.1.then(32), %bb.2.else(16)

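
As a rough illustration of the successor syntax, the Python sketch below splits
a ``successors`` line into ``(id, name, weight)`` tuples. ``parse_successors``
is a hypothetical helper, not part of LLVM, and the characters accepted in a
block name are an assumption.

.. code-block:: python

   import re

   # %bb.<id>[.<name>][(<weight>)]; the name character set is an assumption.
   SUCCESSOR = re.compile(r"%bb\.(\d+)(?:\.([\w.$-]+))?(?:\((\d+)\))?")


   def parse_successors(line):
       """Parse 'successors: ...' into a list of (id, name, weight) tuples."""
       label, _, rest = line.strip().partition(":")
       if label != "successors":
           raise ValueError(f"not a successors list: {line!r}")
       result = []
       for item in rest.split(","):
           item = item.strip()
           if not item:
               continue  # tolerate an empty list
           match = SUCCESSOR.fullmatch(item)
           if match is None:
               raise ValueError(f"bad successor: {item!r}")
           weight = int(match.group(3)) if match.group(3) is not None else None
           result.append((int(match.group(1)), match.group(2), weight))
       return result

A successor without an explicit weight is reported with a weight of ``None``
in this sketch; how the real parser treats a missing weight is not covered
here.
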
Live In Registers
^^^^^^^^^^^^^^^^^
The machine basic block's live-in registers have to be specified before any of
the instructions:

.. code-block:: llvm

   bb.0.entry:
     liveins: %edi, %esi

The lists of live-in registers and successors can be empty. The language also
allows multiple live-in register lists and multiple successor lists in one
block; the parser combines them into a single list of each kind.
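
The combining behaviour can be pictured with the Python sketch below.
``combine_lists`` is a hypothetical helper, and keeping the entries in source
order without removing duplicates is an assumption rather than documented
parser behaviour.

.. code-block:: python

   def combine_lists(block_lines, keyword):
       """Merge every '<keyword>: a, b' line of one block into a single list."""
       combined = []
       prefix = keyword + ":"
       for line in block_lines:
           stripped = line.strip()
           if not stripped.startswith(prefix):
               continue
           for item in stripped[len(prefix):].split(","):
               item = item.strip()
               if item:  # an empty list contributes nothing
                   combined.append(item)
       return combined


   block = [
       "liveins: %edi",
       "successors: %bb.1.then",
       "liveins: %esi",
       "liveins:",
   ]
   live_ins = combine_lists(block, "liveins")

Here ``live_ins`` holds ``%edi`` and ``%esi`` from the two non-empty lists,
and the empty third list adds nothing.
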
Miscellaneous Attributes
^^^^^^^^^^^^^^^^^^^^^^^^
The attributes ``IsAddressTaken``, ``IsLandingPad`` and ``Alignment`` can be
specified in parentheses after the block's definition:

.. code-block:: llvm

   bb.0.entry (address-taken):
     <instructions>
   bb.2.else (align 4):
     <instructions>
   bb.3(landing-pad, align 4):
     <instructions>

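
To tie the pieces of a block definition together, the Python sketch below
parses the ID, the optional name and the optional attribute list from a
definition line. ``parse_block_definition`` is a hypothetical helper, the
name character set is an assumption, and only the three attribute spellings
shown in the examples above are recognised.

.. code-block:: python

   import re

   # bb.<id>[.<name>] [(<attributes>)]:
   BLOCK_DEF = re.compile(r"bb\.(\d+)(?:\.([\w.$-]+))?\s*(?:\(([^)]*)\))?\s*:")


   def parse_block_definition(line):
       """Return (id, name, attributes) for a block definition line."""
       # Drop a trailing ';' comment, as in the "entry" example.
       header = line.split(";", 1)[0].strip()
       match = BLOCK_DEF.fullmatch(header)
       if match is None:
           raise ValueError(f"not a block definition: {line!r}")
       attrs = {"address-taken": False, "landing-pad": False, "align": None}
       for attr in (match.group(3) or "").split(","):
           attr = attr.strip()
           if not attr:
               continue
           if attr in ("address-taken", "landing-pad"):
               attrs[attr] = True
           elif attr.startswith("align "):
               attrs["align"] = int(attr.split()[1])
           else:
               raise ValueError(f"unknown block attribute: {attr!r}")
       return int(match.group(1)), match.group(2), attrs

The sketch stores the attributes under their textual spellings
(``address-taken``, ``landing-pad``, ``align``) rather than under the names
``IsAddressTaken``, ``IsLandingPad`` and ``Alignment``.
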

.. TODO: Describe the way the reference to an unnamed LLVM IR block can be
   preserved.

.. TODO: Describe the parser's default behaviour when optional YAML attributes
   are missing.

.. TODO: Describe the syntax of the machine instructions.

.. TODO: Describe the syntax of the immediate machine operands.

.. TODO: Describe the syntax of the register machine operands.

.. TODO: Describe the syntax of the virtual register operands and their YAML
   definitions.

.. TODO: Describe the syntax of the register operand flags and the subregisters.

.. TODO: Describe the machine function's YAML flag attributes.

.. TODO: Describe the syntax for the global value, external symbol and register
   mask machine operands.

.. TODO: Describe the frame information YAML mapping.

.. TODO: Describe the syntax of the stack object machine operands and their
   YAML definitions.

.. TODO: Describe the syntax of the constant pool machine operands and their
   YAML definitions.

.. TODO: Describe the syntax of the jump table machine operands and their
   YAML definitions.

.. TODO: Describe the syntax of the block address machine operands.

.. TODO: Describe the syntax of the CFI index machine operands.

.. TODO: Describe the syntax of the metadata machine operands, and the
   instruction's debug location attribute.

.. TODO: Describe the syntax of the target index machine operands.

.. TODO: Describe the syntax of the register live out machine operands.

.. TODO: Describe the syntax of the machine memory operands.