From 77ddce97adaaad1e8001acb9a9a28d7c3062fd93 Mon Sep 17 00:00:00 2001
From: Mikhail Glushenkov
Date: Tue, 6 May 2008 18:17:19 +0000
Subject: [PATCH] Update documentation to reflect the current state of affairs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@50763 91177308-0d34-0410-b5e6-96231b3b80d8
---
 tools/llvmc2/doc/LLVMC-Tutorial.rst | 218 ++++++++++++++++++++--------
 1 file changed, 157 insertions(+), 61 deletions(-)

diff --git a/tools/llvmc2/doc/LLVMC-Tutorial.rst b/tools/llvmc2/doc/LLVMC-Tutorial.rst
index 9f66e743136..91a0561bd45 100644
--- a/tools/llvmc2/doc/LLVMC-Tutorial.rst
+++ b/tools/llvmc2/doc/LLVMC-Tutorial.rst
@@ -1,44 +1,141 @@
-Tutorial - Writing LLVMCC Configuration files
+Tutorial - Writing LLVMC Configuration files
 =============================================

-LLVMCC is a generic compiler driver (just like ``gcc``), designed to be
-customizable and extensible. Its job is essentially to transform a set
-of input files into a set of targets, depending on configuration rules
-and user options. This tutorial describes how one can write
-configuration files for ``llvmcc``.
+LLVMC is a generic compiler driver, designed to be customizable and
+extensible. It plays the same role for LLVM as the ``gcc`` program
+does for GCC - LLVMC's job is essentially to transform a set of input
+files into a set of targets depending on configuration rules and user
+options. What makes LLVMC different is that these transformation rules
+are completely customizable - in fact, LLVMC knows nothing about the
+specifics of transformation (even the command-line options are mostly
+not hard-coded) and regards the transformation structure as an
+abstract graph. This makes it possible to adapt LLVMC for other
+purposes - for example, as a build tool for game resources. This
+tutorial describes the basic usage and configuration of LLVMC.

-Because LLVMCC uses TableGen [1]_ as the language of its configuration
-files, you need to be familiar with it.
+Because LLVMC employs TableGen [1]_ as its configuration language, you
+need to be familiar with it to customize LLVMC.

-Describing a toolchain
-----------------------
+Compiling with LLVMC
+--------------------

-The main concept that ``llvmcc`` operates with is a *toolchain*, which
-is just a list of tools that process input files in a pipeline-like
-fashion. Toolchain definitions look like this::
+In general, LLVMC tries to be command-line compatible with ``gcc`` as
+much as possible, so most of the familiar options work::

-    def ToolChains : ToolChains<[
-        ToolChain<[llvm_gcc_c, llc, llvm_gcc_assembler, llvm_gcc_linker]>,
-        ToolChain<[llvm_gcc_cpp, llc, llvm_gcc_assembler, llvm_gcc_linker]>,
-        ...
-        ]>;
+    $ llvmc2 -O3 -Wall hello.cpp
+    $ ./a.out
+    hello

-Every configuration file should have a single toolchains list called
-``ToolChains``.
+One nice feature of LLVMC is that you don't have to distinguish
+between different compilers for different languages (think ``g++`` and
+``gcc``) - the right toolchain is chosen automatically based on input
+language names (which are, in turn, determined from file extensions). If
+a file's extension does not match its actual language, use the ``-x``
+option to specify the language explicitly, just as you would with ``gcc``::

-At the time of writing, ``llvmcc`` does not support mixing various
-toolchains together - in other words, all input files should be in the
-same language.
+    $ llvmc2 -x c hello.cpp
+    $ # hello.cpp is really a C file
+    $ ./a.out
+    hello

-Another temporary limitation is that every toolchain should end with a
-"join" node - a linker-like program that combines its inputs into a
-single output file.
+On the other hand, when using LLVMC as a linker to combine several C++
+object files, you should provide the ``--linker`` option, since it's
+impossible for LLVMC to choose the right linker in that case::

-Describing a tool
------------------
+    $ llvmc2 -c hello.cpp
+    $ llvmc2 hello.o
+    [A lot of link-time errors skipped]
+    $ llvmc2 --linker=c++ hello.o
+    $ ./a.out
+    hello

-A single element of a toolchain is a tool. A tool definition looks
-like this (taken from the Tools.td file)::
+For further help on command-line LLVMC usage, refer to the ``llvmc2
+--help`` output.
+
+Customizing LLVMC: the compilation graph
+----------------------------------------
+
+At the time of writing, LLVMC does not support on-the-fly reloading of
+configuration, so to customize LLVMC you'll have to edit and recompile
+the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
+relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.
+
+Internally, LLVMC stores information about possible transformations in
+the form of a graph. Nodes in this graph represent tools, and edges
+between two nodes represent a transformation path. A special "root"
+node represents entry points for the transformations. LLVMC also
+assigns a weight to each edge (more on that below) to choose between
+several alternative edges.
+
+The definition of the compilation graph (see file ``Example.td``) is
+just a list of edges::
+
+    def CompilationGraph : CompilationGraph<[
+        Edge,
+        Edge,
+        ...
+
+        Edge,
+        Edge,
+        ...
+
+        OptionalEdge,
+        OptionalEdge,
+        ...
+
+        OptionalEdge,
+        ...
+
+        ]>;
+
+As you can see, the edges can be either default or optional, where
+optional edges are distinguished by carrying a list of patterns (or
+edge properties) which are used to calculate the edge's weight. The
+default edges are assigned a weight of 1, and optional edges get a
+weight of 0 + 2*N where N is the number of successful edge property
+matches.
+When passing an input file through the graph, LLVMC picks the edge with the
+maximum weight. To avoid ambiguity, there should be only one default edge
+between two nodes (with the exception of the root node, which gets special
+treatment - there you are allowed to specify one default edge *per language*).
+
+* Possible edge properties are:
+
+  - ``switch_on`` - Returns true if a given command-line option is
+    provided by the user. Example: ``(switch_on "opt")``. Note that
+    you have to define all possible command-line options separately in
+    the tool descriptions. See the next section for a discussion of
+    different kinds of command-line options.
+
+  - ``parameter_equals`` - Returns true if a command-line parameter equals
+    a given value. Example: ``(parameter_equals "W", "all")``.
+
+  - ``element_in_list`` - Returns true if a command-line parameter list
+    includes a given value. Example: ``(element_in_list "l", "pthread")``.
+
+  - ``if_input_languages_contain`` - Returns true if a given input
+    language belongs to the current input language set.
+
+  - ``and`` - Edge property combinator. Returns true if all of its
+    arguments return true. Example:
+    ``(and (prop1), (prop2), ... (propN))``. Nesting is not allowed.
+
+  - ``or`` - Edge property combinator that returns true if any one of its
+    arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.
+
+To get a visual representation of the compilation graph (useful for
+debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
+``gsview`` installed for this to work properly.
+
+
+Writing a tool description
+--------------------------
+
+As mentioned earlier, nodes in the compilation graph represent tools. A
+tool definition looks like this (taken from the ``Tools.td`` file)::

     def llvm_gcc_cpp : Tool<[
      (in_language "c++"),
@@ -57,19 +154,21 @@ aren't handled by the other tools.
 The complete list of the currently implemented tool properties follows:

 * Possible tool properties:
-  - in_language - input language name.

-  - out_language - output language name.
+  - ``in_language`` - input language name.

-  - output_suffix - output file suffix.
+  - ``out_language`` - output language name.

-  - cmd_line - the actual command used to run the tool. You can use
-    ``$INFILE`` and ``$OUTFILE`` variables.
+  - ``output_suffix`` - output file suffix.

-  - join - this tool is a "join node" in the graph, i.e. it gets a
+  - ``cmd_line`` - the actual command used to run the tool. You can use
+    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
+    redirection with ``>``.
+
+  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
     list of input files and joins them together. Used for linkers.

-  - sink - all command-line options that are not handled by other
+  - ``sink`` - all command-line options that are not handled by other
     tools are passed to this tool.

 The next tool definition is slightly more complex::
@@ -93,43 +192,48 @@ attributes: a name and a (possibly empty) list of properties. All
 currently implemented option types and properties are described below:

 * Possible option types:
-  - switch_option - a simple boolean switch, for example ``-time``.

-  - parameter_option - option that takes an argument, for example ``-std=c99``;
+  - ``switch_option`` - a simple boolean switch, for example ``-time``.

-  - parameter_list_option - same as the above, but more than one
+  - ``parameter_option`` - option that takes an argument, for example
+    ``-std=c99``;
+
+  - ``parameter_list_option`` - same as the above, but more than one
     occurence of the option is allowed.

-  - prefix_option - same as the parameter_option, but the option name
+  - ``prefix_option`` - same as the parameter_option, but the option name
     and parameter value are not separated.

-  - prefix_list_option - same as the above, but more than one
+  - ``prefix_list_option`` - same as the above, but more than one
     occurence of the option is allowed; example: ``-lm -lpthread``.

+
 * Possible option properties:
-  - append_cmd - append a string to the tool invocation command.

-  - forward - forward this option unchanged.
+  - ``append_cmd`` - append a string to the tool invocation command.

-  - stop_compilation - stop compilation after this phase.
+  - ``forward`` - forward this option unchanged.

-  - unpack_values - used for for splitting and forwarding
+  - ``stop_compilation`` - stop compilation after this phase.
+
+  - ``unpack_values`` - used for splitting and forwarding
     comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
     converted to ``-foo=bar -baz`` and appended to the tool invocation
     command.

-  - help - help string associated with this option.
+  - ``help`` - help string associated with this option.
+
+  - ``required`` - this option is obligatory.

-  - required - this option is obligatory.

 Language map
 ------------

-One last bit that you probably should change is the language map,
-which defines mappings between language names and file extensions. It
-is used internally to choose the proper toolchain based on the names
-of the input files. Language map definition is located in the file
-``Tools.td`` and looks like this::
+One last thing that you need to modify when adding support for a new
+language to LLVMC is the language map, which defines mappings from
+file extensions to language names. It is used to choose the proper
+toolchain based on the input. The language map definition is located in
+the file ``Tools.td`` and looks like this::

     def LanguageMap : LanguageMap<
         [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
@@ -138,14 +242,6 @@ of the input files.
     ]>;

-Putting it all together
------------------------
-
-Since at the time of writing LLVMCC does not support on-the-fly
-reloading of the configuration, the only way to test your changes is
-to recompile the program. To do this, ``cd`` to the source code
-directory and run ``make``.
-
 References
 ==========