Remove canonical assembly format in syntax.md.

Lei Zhang 2015-11-06 10:10:28 -05:00 committed by David Neto
parent 8f6ba14b58
commit 0c00eb2cdf

114
syntax.md

@ -2,67 +2,15 @@
## Overview
The assembly attempts to adhere the binary form as closely as possible
using text names from section 3 of the SPIR-V spec.
The assembly attempts to adhere to the binary form from Section 3 of the SPIR-V
spec as closely as possible, with one exception aiming at improving the text's
readability. The `<result-id>` generated by an instruction is moved to the
beginning of that instruction and followed by an `=` sign. This allows us to
distinguish between variable definitions and uses and locate value definitions
more easily.
Here is an example:
```
OpCapability Shader
OpMemoryModel Logical Simple
OpEntryPoint GLCompute %3 "main"
OpExecutionMode %3 LocalSize 64 64 1
OpTypeVoid %1
OpTypeFunction %2 %1
OpFunction %1 %3 None %2
OpLabel %4
OpReturn
OpFunctionEnd
```
A module is a sequence of instructions, separated by whitespace.
An instruction is an opcode name followed by operands, separated by
whitespace. Typically each instruction is presented on its own line,
but the assembler does not enforce this rule.
The opcode names and expected operands are described in section 3 of
the SPIR-V specification. An operand is one of:
* a literal integer: A decimal integer, or a hexadecimal integer.
A hexadecimal integer is indicated by a leading `0x` or `0X`. A hex
integer supplied for a signed integer value will be sign-extended.
For example, `0xffff` supplied as the literal for an `OpConstant`
on a signed 16-bit integer type will be interpreted as the value `-1`.
* a literal floating point number.
* a literal string.
* A literal string is everything following a double-quote `"` until the
following un-escaped double-quote. This includes special characters such as
newlines.
* A backslash `\` may be used to escape characters in the string. The `\`
may be used to escape a double-quote or a `\` but is simply ignored when
preceding any other character.
* a named enumerated value, specific to that operand position. For example,
the `OpMemoryModel` takes a named Addressing Model operand (e.g. `Logical` or
`Physical32`), and a named Memory Model operand (e.g. `Simple` or `OpenCL`).
Named enumerated values are only meaningful in specific positions, and will
otherwise generate an error.
* a mask expression, consisting of one or more mask enum names separated
by `|`. For example, the expression `NotNaN|NotInf|NSZ` denotes the mask
which is the combination of the `NotNaN`, `NotInf`, and `NSZ` flags.
* an injected immediate integer: `!<integer>`. See [below](#immediate).
* an ID, e.g. `%foo`. See [below](#id).
## Assignment-oriented Assembly Form
<a name="assignment-form"></a>
The description and examples from above describe the Canonical Assembly
Form for SPIR-V assembly language.
We also define the Assignment-oriented Assembly Form, aimed at improving
the text's readability. In AAF, the `<result-id>` generated by an
instruction is moved to the beginning of that instruction and followed by
an `=` sign. This allows us to distinguish between variable definitions
and uses and locate value definitions more easily. So, the above example
can also be written as:
```
OpCapability Shader
OpMemoryModel Logical Simple
@ -76,11 +24,42 @@ can also be written as:
OpFunctionEnd
```
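The move of the `<result-id>` to the front of an instruction can be sketched in Python as below. This is only an illustration, not the assembler's implementation: the `RESULT_ID_INDEX` table and the `to_assignment_form` helper are hypothetical names, and the table covers only the opcodes used in the example above.

```python
# Hypothetical table: zero-based operand index of the <result-id> for the
# opcodes in the example. A real assembler would derive this from the SPIR-V
# grammar; it is hard-coded here purely for illustration.
RESULT_ID_INDEX = {
    "OpTypeVoid": 0,
    "OpTypeFunction": 0,
    "OpFunction": 1,  # operand 0 is the result type
    "OpLabel": 0,
}


def to_assignment_form(line: str) -> str:
    """Rewrite one instruction so its <result-id> comes first, followed by '='.

    Assumes one instruction per line and whitespace-separated tokens, so
    string literals containing spaces are not handled by this sketch.
    """
    tokens = line.split()
    if not tokens:
        return line
    opcode, operands = tokens[0], tokens[1:]
    index = RESULT_ID_INDEX.get(opcode)
    if index is None or index >= len(operands):
        # The instruction defines no result id (or is not in the table).
        return " ".join(tokens)
    result_id = operands[index]
    rest = operands[:index] + operands[index + 1:]
    return " ".join([result_id, "=", opcode] + rest)


print(to_assignment_form("OpFunction %1 %3 None %2"))
```

Instructions without a result id, such as `OpReturn`, pass through unchanged.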
A module is a sequence of instructions, separated by whitespace.
An instruction is an opcode name followed by operands, separated by
whitespace. Typically each instruction is presented on its own line,
but the assembler does not enforce this rule.
The opcode names and expected operands are described in Section 3 of
the SPIR-V specification. An operand is one of:
* a literal integer: A decimal integer, or a hexadecimal integer.
A hexadecimal integer is indicated by a leading `0x` or `0X`. A hex
integer supplied for a signed integer value will be sign-extended.
For example, `0xffff` supplied as the literal for an `OpConstant`
on a signed 16-bit integer type will be interpreted as the value `-1`.
* a literal floating point number.
* a literal string.
* A literal string is everything following a double-quote `"` until the
following un-escaped double-quote. This includes special characters such
as newlines.
* A backslash `\` may be used to escape characters in the string. The `\`
may be used to escape a double-quote or a `\` but is simply ignored when
preceding any other character.
* a named enumerated value, specific to that operand position. For example,
the `OpMemoryModel` takes a named Addressing Model operand (e.g. `Logical` or
`Physical32`), and a named Memory Model operand (e.g. `Simple` or `OpenCL`).
Named enumerated values are only meaningful in specific positions, and will
otherwise generate an error.
* a mask expression, consisting of one or more mask enum names separated
by `|`. For example, the expression `NotNaN|NotInf|NSZ` denotes the mask
which is the combination of the `NotNaN`, `NotInf`, and `NSZ` flags.
* an injected immediate integer: `!<integer>`. See [below](#immediate).
* an ID, e.g. `%foo`. See [below](#id).
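Three of the operand rules above (hex sign-extension, string escapes, and mask expressions) can be sketched in Python as follows. The function names are hypothetical and the code is not taken from the assembler. It assumes a hex literal gives the full bit pattern of the target type, and the mask table lists only the three flags named above.

```python
def parse_integer_literal(text: str, bit_width: int, signed: bool) -> int:
    """Parse a decimal or hex literal; sign-extend hex for signed types."""
    if text.lower().startswith("0x"):
        value = int(text, 16)
        # Assumption: the hex digits are the bit pattern of a bit_width-wide
        # value, so a set top bit means the value is negative.
        if signed and value & (1 << (bit_width - 1)):
            value -= 1 << bit_width
        return value
    return int(text, 10)


def unescape_string(body: str) -> str:
    """Drop each escaping backslash and keep the character that follows it."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            # The backslash itself is dropped; a trailing one yields nothing.
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


# Illustrative subset of mask enum names and their bit values.
FP_FAST_MATH_MODE = {"NotNaN": 0x1, "NotInf": 0x2, "NSZ": 0x4}


def parse_mask(expr: str, names: dict) -> int:
    """Combine the '|'-separated enum names; unknown names raise KeyError."""
    value = 0
    for name in expr.split("|"):
        value |= names[name]
    return value


print(parse_integer_literal("0xffff", 16, signed=True))
print(parse_mask("NotNaN|NotInf|NSZ", FP_FAST_MATH_MODE))
```

In this sketch `unescape_string` receives the text between the quotes, so locating the closing un-escaped double-quote is left to the caller.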
## ID Definitions & Usage
<a name="id"></a>
An ID definition pertains to the `<result-id>` of an instruction, and ID usage is a
use of an ID as an input to an instruction.
An ID _definition_ pertains to the `<result-id>` of an instruction, and ID
_usage_ is a use of an ID as an input to an instruction.
An ID in the assembly language begins with `%` and must be followed by a name
consisting of one or more letters, numbers or underscore characters.
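The ID syntax just described can be checked with a short regular expression. The sketch below is illustrative only: `is_id` is a hypothetical helper, and it assumes "letters" and "numbers" mean ASCII letters and digits.

```python
import re

# '%' followed by one or more ASCII letters, digits, or underscores
# (ASCII-only is an assumption made for this sketch).
ID_PATTERN = re.compile(r"%[A-Za-z0-9_]+")


def is_id(token: str) -> bool:
    """Return True if the whole token is a well-formed ID such as %foo."""
    return ID_PATTERN.fullmatch(token) is not None


print(is_id("%foo"), is_id("%"))
```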
@ -148,13 +127,11 @@ with `!<integer>` in them:
When a token in the assembly program is a `!<integer>`, that integer value is
emitted into the binary output, and parsing proceeds differently than before:
each subsequent token not recognized as an OpCode is emitted into the binary
output without any checking; when a recognizable OpCode is eventually
encountered, it begins a new instruction and parsing returns to normal. (If a
subsequent OpCode is never found, then this alternate parsing mode handles all
the remaining tokens in the program. If a subsequent OpCode is in an
[assignment form](#assignment-form), the ID preceding it begins a new
instruction.)
each subsequent token not recognized as an OpCode or a `<result-id>` is emitted
into the binary output without any checking; when a recognizable OpCode or a
`<result-id>` is eventually encountered, it begins a new instruction and parsing
returns to normal. (If a subsequent OpCode is never found, then this alternate
parsing mode handles all the remaining tokens in the program.)
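The extent of the alternate parsing mode can be sketched in Python as below. This is a simplified model, not the assembler's code: `starts_instruction` and `collect_unchecked` are hypothetical helpers, the set of known opcode names is passed in by the caller, and a `<result-id>` is assumed to be recognizable as an ID immediately followed by `=`.

```python
def starts_instruction(tokens, i, opcodes):
    """Return True if tokens[i] begins a new instruction."""
    token = tokens[i]
    if token in opcodes:
        return True
    # Assumption: a <result-id> is an ID immediately followed by "=".
    return (
        token.startswith("%")
        and i + 1 < len(tokens)
        and tokens[i + 1] == "="
    )


def collect_unchecked(tokens, start, opcodes):
    """Collect the tokens handled by alternate parsing mode.

    tokens[start] is assumed to be the `!<integer>` token that triggers the
    mode. Returns the unchecked tokens and the index where normal parsing
    resumes (len(tokens) if no later instruction is found).
    """
    end = start + 1
    while end < len(tokens) and not starts_instruction(tokens, end, opcodes):
        end += 1
    return tokens[start:end], end


tokens = ["OpCapability", "!1", "foo", "%x", "%3", "=", "OpTypeVoid"]
raw, resume = collect_unchecked(tokens, 1, {"OpCapability", "OpTypeVoid"})
print(raw, resume)
```

Here `%x` is swallowed by the alternate mode because no `=` follows it, while `%3` ends the mode because it is the `<result-id>` of the next instruction.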
The assembler processes the tokens encountered in alternate parsing mode as
follows:
@ -187,8 +164,7 @@ Note that this has some interesting consequences, including:
* The `<result-id>` on the left-hand side of an assignment cannot be a
`!<integer>`. The `<result-id>` can still be manually controlled if desired
by using the [Canonical Assembly Form](#assignment-form) or by simply
expressing the entire instruction as `!<integer>` tokens for its opcode and
by expressing the entire instruction as `!<integer>` tokens for its opcode and
operands.
* The `=` sign cannot be processed by the alternate parsing mode if the OpCode