mirror of https://github.com/RPCS3/llvm.git (synced 2025-04-03 13:51:39 +00:00)
Move the "High Level Structure" to before "Type System"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@18695 91177308-0d34-0410-b5e6-96231b3b80d8
parent d4f0f9849a
commit fa73021cf1
@@ -17,6 +17,13 @@
 <li><a href="#abstract">Abstract</a></li>
 <li><a href="#introduction">Introduction</a></li>
 <li><a href="#identifiers">Identifiers</a></li>
+<li><a href="#highlevel">High Level Structure</a>
+<ol>
+<li><a href="#modulestructure">Module Structure</a></li>
+<li><a href="#globalvars">Global Variables</a></li>
+<li><a href="#functionstructure">Function Structure</a></li>
+</ol>
+</li>
 <li><a href="#typesystem">Type System</a>
 <ol>
 <li><a href="#t_primitive">Primitive Types</a>
@@ -35,12 +42,7 @@
 </li>
 </ol>
 </li>
-<li><a href="#highlevel">High Level Structure</a>
-<ol>
-<li><a href="#modulestructure">Module Structure</a></li>
-<li><a href="#globalvars">Global Variables</a></li>
-<li><a href="#functionstructure">Function Structure</a></li>
-</ol>
+<li><a href="#constants">Constants</a>
 </li>
 <li><a href="#instref">Instruction Reference</a>
 <ol>
@@ -279,10 +281,172 @@ exactly. For example, NaN's, infinities, and other special cases are
 represented in their IEEE hexadecimal format so that assembly and
 disassembly do not cause any bits to change in the constants.</p>
 </div>
+
+<!-- *********************************************************************** -->
+<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div>
+<!-- *********************************************************************** -->
+
+<!-- ======================================================================= -->
+<div class="doc_subsection"> <a name="modulestructure">Module Structure</a>
+</div>
+
+<div class="doc_text">
+
+<p>LLVM programs are composed of "Module"s, each of which is a
+translation unit of the input programs. Each module consists of
+functions, global variables, and symbol table entries. Modules may be
+combined together with the LLVM linker, which merges function (and
+global variable) definitions, resolves forward declarations, and merges
+symbol table entries. Here is an example of the "hello world" module:</p>
+
+<pre><i>; Declare the string constant as a global constant...</i>
+<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a
+href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00" <i>; [13 x sbyte]*</i>
+
+<i>; External declaration of the puts function</i>
+<a href="#functionstructure">declare</a> int %puts(sbyte*) <i>; int(sbyte*)* </i>
+
+<i>; Definition of main function</i>
+int %main() { <i>; int()* </i>
+<i>; Convert [13x sbyte]* to sbyte *...</i>
+%cast210 = <a
+href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i>
+
+<i>; Call puts function to write out the string to stdout...</i>
+<a
+href="#i_call">call</a> int %puts(sbyte* %cast210) <i>; int</i>
+<a
+href="#i_ret">ret</a> int 0<br>}<br></pre>
+
+<p>This example is made up of a <a href="#globalvars">global variable</a>
+named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>"
+function, and a <a href="#functionstructure">function definition</a>
+for "<tt>main</tt>".</p>
+
+<a name="linkage"> In general, a module is made up of a list of global
+values, where both functions and global variables are global values.
+Global values are represented by a pointer to a memory location (in
+this case, a pointer to an array of char, and a pointer to a function),
+and have one of the following linkage types:</a>
+
+<p> </p>
+
+<dl>
+<dt><tt><b><a name="linkage_internal">internal</a></b></tt> </dt>
+<dd>Global values with internal linkage are only directly accessible
+by objects in the current module. In particular, linking code into a
+module with an internal global value may cause the internal to be
+renamed as necessary to avoid collisions. Because the symbol is
+internal to the module, all references can be updated. This
+corresponds to the notion of the '<tt>static</tt>' keyword in C, or the
+idea of "anonymous namespaces" in C++.
+<p> </p>
+</dd>
+<dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>
+<dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt>
+linkage, with the twist that linking together two modules defining the
+same <tt>linkonce</tt> globals will cause one of the globals to be
+discarded. This is typically used to implement inline functions.
+Unreferenced <tt>linkonce</tt> globals are allowed to be discarded.
+<p> </p>
+</dd>
+<dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt>
+<dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt>
+linkage, except that unreferenced <tt>weak</tt> globals may not be
+discarded. This is used to implement constructs in C such as "<tt>int
+X;</tt>" at global scope.
+<p> </p>
+</dd>
+<dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt>
+<dd>"<tt>appending</tt>" linkage may only be applied to global
+variables of pointer to array type. When two global variables with
+appending linkage are linked together, the two global arrays are
+appended together. This is the LLVM, typesafe, equivalent of having
+the system linker append together "sections" with identical names when
+.o files are linked.
+<p> </p>
+</dd>
+<dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>
+<dd>If none of the above identifiers are used, the global is
+externally visible, meaning that it participates in linkage and can be
+used to resolve external symbol references.
+<p> </p>
+</dd>
+</dl>
+
+<p> </p>
+
+<p><a name="linkage_external">For example, since the "<tt>.LC0</tt>"
+variable is defined to be internal, if another module defined a "<tt>.LC0</tt>"
+variable and was linked with this one, one of the two would be renamed,
+preventing a collision. Since "<tt>main</tt>" and "<tt>puts</tt>" are
+external (i.e., lacking any linkage declarations), they are accessible
+outside of the current module. It is illegal for a function <i>declaration</i>
+to have any linkage type other than "externally visible".</a></p>
+</div>
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+<a name="globalvars">Global Variables</a>
+</div>
+
+<div class="doc_text">
+
+<p>Global variables define regions of memory allocated at compilation
+time instead of run-time. Global variables may optionally be
+initialized. A variable may be defined as a global "constant", which
+indicates that the contents of the variable will never be modified
+(enabling better optimization, allowing the global data to be placed in the
+read-only section of an executable, etc).</p>
+
+<p>As SSA values, global variables define pointer values that are in
+scope (i.e. they dominate) all basic blocks in the program. Global
+variables always define a pointer to their "content" type because they
+describe a region of memory, and all memory objects in LLVM are
+accessed through pointers.</p>
+
+</div>
+
+
+<!-- ======================================================================= -->
+<div class="doc_subsection">
+<a name="functionstructure">Functions</a>
+</div>
+
+<div class="doc_text">
+
+<p>LLVM function definitions are composed of a (possibly empty) argument list,
+an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM
+function declarations are defined with the "<tt>declare</tt>" keyword, a
+function name, and a function signature.</p>
+
+<p>A function definition contains a list of basic blocks, forming the CFG for
+the function. Each basic block may optionally start with a label (giving the
+basic block a symbol table entry), contains a list of instructions, and ends
+with a <a href="#terminators">terminator</a> instruction (such as a branch or
+function return).</p>
+
+<p>The first basic block in program is special in two ways: it is immediately
+executed on entrance to the function, and it is not allowed to have predecessor
+basic blocks (i.e. there can not be any branches to the entry block of a
+function). Because the block can have no predecessors, it also cannot have any
+<a href="#i_phi">PHI nodes</a>.</p>
+
+<p>LLVM functions are identified by their name and type signature. Hence, two
+functions with the same name but different parameter lists or return values are
+considered different functions, and LLVM will resolves references to each
+appropriately.</p>
+
+</div>
+
+
+
 <!-- *********************************************************************** -->
 <div class="doc_section"> <a name="typesystem">Type System</a> </div>
 <!-- *********************************************************************** -->
+
 <div class="doc_text">
+
 <p>The LLVM type system is one of the most important features of the
 intermediate representation. Being typed enables a number of
 optimizations to be performed on the IR directly, without having to do
@@ -290,9 +454,9 @@ extra analyses on the side before the transformation. A strong type
 system makes it easier to read the generated code and enables novel
 analyses and transformations that are not feasible to perform on normal
 three address code representations.</p>
-<!-- The written form for the type system was heavily influenced by the
-syntactic problems with types in the C language<sup><a
-href="#rw_stroustrup">1</a></sup>.<p> --> </div>
+
+</div>
+
 <!-- ======================================================================= -->
 <div class="doc_subsection"> <a name="t_primitive">Primitive Types</a> </div>
 <div class="doc_text">
@@ -557,152 +721,6 @@ be any integral or floating point type.</p>
 </table>
 </div>
 
-<!-- *********************************************************************** -->
-<div class="doc_section"> <a name="highlevel">High Level Structure</a> </div>
-<!-- *********************************************************************** -->
-<!-- ======================================================================= -->
-<div class="doc_subsection"> <a name="modulestructure">Module Structure</a>
-</div>
-<div class="doc_text">
-<p>LLVM programs are composed of "Module"s, each of which is a
-translation unit of the input programs. Each module consists of
-functions, global variables, and symbol table entries. Modules may be
-combined together with the LLVM linker, which merges function (and
-global variable) definitions, resolves forward declarations, and merges
-symbol table entries. Here is an example of the "hello world" module:</p>
-<pre><i>; Declare the string constant as a global constant...</i>
-<a href="#identifiers">%.LC0</a> = <a href="#linkage_internal">internal</a> <a
-href="#globalvars">constant</a> <a href="#t_array">[13 x sbyte]</a> c"hello world\0A\00" <i>; [13 x sbyte]*</i>
-
-<i>; External declaration of the puts function</i>
-<a href="#functionstructure">declare</a> int %puts(sbyte*) <i>; int(sbyte*)* </i>
-
-<i>; Definition of main function</i>
-int %main() { <i>; int()* </i>
-<i>; Convert [13x sbyte]* to sbyte *...</i>
-%cast210 = <a
-href="#i_getelementptr">getelementptr</a> [13 x sbyte]* %.LC0, long 0, long 0 <i>; sbyte*</i>
-
-<i>; Call puts function to write out the string to stdout...</i>
-<a
-href="#i_call">call</a> int %puts(sbyte* %cast210) <i>; int</i>
-<a
-href="#i_ret">ret</a> int 0<br>}<br></pre>
-<p>This example is made up of a <a href="#globalvars">global variable</a>
-named "<tt>.LC0</tt>", an external declaration of the "<tt>puts</tt>"
-function, and a <a href="#functionstructure">function definition</a>
-for "<tt>main</tt>".</p>
-<a name="linkage"> In general, a module is made up of a list of global
-values, where both functions and global variables are global values.
-Global values are represented by a pointer to a memory location (in
-this case, a pointer to an array of char, and a pointer to a function),
-and have one of the following linkage types:</a>
-<p> </p>
-<dl>
-<dt><tt><b><a name="linkage_internal">internal</a></b></tt> </dt>
-<dd>Global values with internal linkage are only directly accessible
-by objects in the current module. In particular, linking code into a
-module with an internal global value may cause the internal to be
-renamed as necessary to avoid collisions. Because the symbol is
-internal to the module, all references can be updated. This
-corresponds to the notion of the '<tt>static</tt>' keyword in C, or the
-idea of "anonymous namespaces" in C++.
-<p> </p>
-</dd>
-<dt><tt><b><a name="linkage_linkonce">linkonce</a></b></tt>: </dt>
-<dd>"<tt>linkonce</tt>" linkage is similar to <tt>internal</tt>
-linkage, with the twist that linking together two modules defining the
-same <tt>linkonce</tt> globals will cause one of the globals to be
-discarded. This is typically used to implement inline functions.
-Unreferenced <tt>linkonce</tt> globals are allowed to be discarded.
-<p> </p>
-</dd>
-<dt><tt><b><a name="linkage_weak">weak</a></b></tt>: </dt>
-<dd>"<tt>weak</tt>" linkage is exactly the same as <tt>linkonce</tt>
-linkage, except that unreferenced <tt>weak</tt> globals may not be
-discarded. This is used to implement constructs in C such as "<tt>int
-X;</tt>" at global scope.
-<p> </p>
-</dd>
-<dt><tt><b><a name="linkage_appending">appending</a></b></tt>: </dt>
-<dd>"<tt>appending</tt>" linkage may only be applied to global
-variables of pointer to array type. When two global variables with
-appending linkage are linked together, the two global arrays are
-appended together. This is the LLVM, typesafe, equivalent of having
-the system linker append together "sections" with identical names when
-.o files are linked.
-<p> </p>
-</dd>
-<dt><tt><b><a name="linkage_external">externally visible</a></b></tt>:</dt>
-<dd>If none of the above identifiers are used, the global is
-externally visible, meaning that it participates in linkage and can be
-used to resolve external symbol references.
-<p> </p>
-</dd>
-</dl>
-<p> </p>
-<p><a name="linkage_external">For example, since the "<tt>.LC0</tt>"
-variable is defined to be internal, if another module defined a "<tt>.LC0</tt>"
-variable and was linked with this one, one of the two would be renamed,
-preventing a collision. Since "<tt>main</tt>" and "<tt>puts</tt>" are
-external (i.e., lacking any linkage declarations), they are accessible
-outside of the current module. It is illegal for a function <i>declaration</i>
-to have any linkage type other than "externally visible".</a></p>
-</div>
-
-<!-- ======================================================================= -->
-<div class="doc_subsection">
-<a name="globalvars">Global Variables</a>
-</div>
-
-<div class="doc_text">
-
-<p>Global variables define regions of memory allocated at compilation
-time instead of run-time. Global variables may optionally be
-initialized. A variable may be defined as a global "constant", which
-indicates that the contents of the variable will never be modified
-(opening options for optimization).</p>
-
-<p>As SSA values, global variables define pointer values that are in
-scope (i.e. they dominate) for all basic blocks in the program. Global
-variables always define a pointer to their "content" type because they
-describe a region of memory, and all memory objects in LLVM are
-accessed through pointers.</p>
-
-</div>
-
-
-<!-- ======================================================================= -->
-<div class="doc_subsection">
-<a name="functionstructure">Functions</a>
-</div>
-
-<div class="doc_text">
-
-<p>LLVM function definitions are composed of a (possibly empty) argument list,
-an opening curly brace, a list of basic blocks, and a closing curly brace. LLVM
-function declarations are defined with the "<tt>declare</tt>" keyword, a
-function name, and a function signature.</p>
-
-<p>A function definition contains a list of basic blocks, forming the CFG for
-the function. Each basic block may optionally start with a label (giving the
-basic block a symbol table entry), contains a list of instructions, and ends
-with a <a href="#terminators">terminator</a> instruction (such as a branch or
-function return).</p>
-
-<p>The first basic block in program is special in two ways: it is immediately
-executed on entrance to the function, and it is not allowed to have predecessor
-basic blocks (i.e. there can not be any branches to the entry block of a
-function). Because the block can have no predecessors, it also cannot have any
-<a href="#i_phi">PHI nodes</a>.</p>
-
-<p>LLVM functions are identified by their name and type signature. Hence, two
-functions with the same name but different parameter lists or return values are
-considered different functions, and LLVM will resolves references to each
-appropriately.</p>
-
-</div>
-
 
 <!-- *********************************************************************** -->
 <div class="doc_section"> <a name="instref">Instruction Reference</a> </div>