LLVM 2.5 Release Notes
- Introduction
- Sub-project Status Update
- External Projects Using LLVM 2.5
- What's New in LLVM 2.5?
- Installation Instructions
- Portability and Supported Platforms
- Known Problems
- Additional Information
This document contains the release notes for the LLVM Compiler
Infrastructure, release 2.5. Here we describe the status of LLVM, including
major improvements from the previous release and significant known problems.
All LLVM releases may be downloaded from the LLVM releases web site.
For more information about LLVM, including information about the latest
release, please check out the main LLVM
web site. If you have questions or comments, the LLVM Developer's Mailing
List is a good place to send them.
Note that if you are reading this file from a Subversion checkout or the
main LLVM web page, this document applies to the next release, not the
current one. To see the release notes for a specific release, please see the
releases page.
The LLVM 2.5 distribution currently consists of code from the core LLVM
repository (which roughly includes the LLVM optimizers, code generators
and supporting tools) and the llvm-gcc repository. In addition to this
code, the LLVM Project includes other sub-projects that are in development. The
two which are the most actively developed are the Clang
Project and the VMKit Project.
The Clang project is an effort to build
a set of new 'LLVM native' front-end technologies for the LLVM optimizer and
code generator. While Clang is not included in the LLVM 2.5 release, it is
continuing to make major strides forward in all areas. Its C and Objective-C
parsing and code generation support is now very solid. For example, it is
capable of successfully building many real-world applications for X86-32
and X86-64,
including the FreeBSD
kernel. C++ is also
making incredible progress,
and work on templates has recently started.
While Clang is not yet production quality, it is progressing very nicely and
is quite usable for building many C and Objective-C applications. If you are
interested in fast compiles and good diagnostics, we encourage you to try it out
by building from mainline
and reporting any issues you hit to the Clang front-end mailing
list.
In the LLVM 2.5 time-frame, the Clang team has made many improvements:
- Clang now has a new driver, which is focused on providing a GCC-compatible
interface.
- The X86-64 ABI is now supported.
- Precompiled header support is now implemented.
- Objective-C support is significantly improved beyond LLVM 2.4, supporting
many features, such as Objective-C Garbage Collection.
- Many bugs have been fixed and many features have been added.
Previously announced in the last LLVM release, the Clang project also
includes an early stage static source code analysis tool for automatically finding bugs
in C and Objective-C programs. The tool performs a growing set of checks to find
bugs that occur on a specific path within a program.
In the LLVM 2.5 time-frame there have been many significant improvements to
the analyzer's core path simulation engine and machinery for generating
path-based bug reports to end-users. Particularly noteworthy improvements
include experimental support for full field-sensitivity and reasoning about heap
objects as well as an improved value-constraints subengine that does a much
better job of reasoning about inequality relationships (e.g., x > 2)
between variables and constants.
The set of checks performed by the static analyzer continues to expand, and
future plans for the tool include full source-level inter-procedural analysis
and deeper checks such as buffer overrun detection. There are many opportunities
to extend and enhance the static analyzer, and anyone interested in working on
this project is encouraged to get involved!
The VMKit project is an implementation of
a JVM and a CLI virtual machine (Microsoft .NET is an
implementation of the CLI) using the Just-In-Time compiler of LLVM.
Following LLVM 2.5, VMKit has its first release that you can find on its
webpage. The release includes
bug fixes, cleanup and new features. The major changes are:
- Ahead of Time compiler: compiles .class files to LLVM .bc files. VMKit uses
this functionality to natively compile the standard classes (e.g.
java.lang.String). Users can compile .class files ahead of time into dynamic
libraries and run them with the help of VMKit.
- New exception model: the DWARF exception model is very slow for
exception-intensive applications, so the JVM has a new implementation of
exceptions that checks at each function call whether an exception occurred.
This imposes a small performance penalty on applications that do not use
exceptions, but it is a big gain for exception-intensive applications. For
example, the jack benchmark in Spec JVM98 runs 6x faster (an 83% reduction in
run time).
- New support for OSX/X64, Linux/X64 (with the Boehm GC), Linux/ppc32.
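The explicit-check exception model described above can be pictured with a small C++ sketch. This is not VMKit's implementation, only an illustration under assumptions: the names pending_exception, parse_digit, sum_digits and try_sum_digits are hypothetical, and the "exception" is reduced to a message pointer. The point is that a throwing callee records the exception and returns normally, and each caller pays one compare-and-branch after the call instead of invoking a stack unwinder.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-thread slot: the pending exception, or nullptr if none.
thread_local const char *pending_exception = nullptr;

// A callee "throws" by recording the exception and returning a dummy value.
int parse_digit(char c) {
  if (c < '0' || c > '9') {
    pending_exception = "NumberFormatException";
    return 0;
  }
  return c - '0';
}

// The caller tests the slot after every call that may throw and propagates
// the exception by returning early to its own caller.
int sum_digits(const std::string &text) {
  int sum = 0;
  for (char c : text) {
    int digit = parse_digit(c);
    if (pending_exception != nullptr)
      return 0;
    sum += digit;
  }
  return sum;
}

// A "catch" site: test the slot, then clear it once the exception is handled.
bool try_sum_digits(const std::string &text, int &result) {
  assert(pending_exception == nullptr); // nothing may be pending on entry
  result = sum_digits(text);
  if (pending_exception != nullptr) {
    pending_exception = nullptr;
    return false;
  }
  return true;
}
```

Calling try_sum_digits("1234", r) takes only the cheap checks, while try_sum_digits("12x4", r) reports failure without any unwinding work, which is where exception-heavy programs would be expected to gain.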
Pure
is an algebraic/functional programming language based on term rewriting.
Programs are collections of equations which are used to evaluate expressions in
a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation,
lexical closures, a hygienic macro system (also based on term rewriting),
built-in list and matrix support (including list and matrix comprehensions) and
an easy-to-use C interface. The interpreter uses LLVM as a backend to
JIT-compile Pure programs to fast native code.
In addition to the usual algebraic data structures, Pure also has
MATLAB-style matrices in order to support numeric computations and signal
processing in an efficient way. Pure is mainly aimed at mathematical
applications right now, but it has been designed as a general purpose language.
The dynamic interpreter environment and the C interface make it possible to use
it as a kind of functional scripting language for many application areas.
LDC is an implementation of
the D Programming Language using the LLVM optimizer and code generator.
The LDC project works well with the LLVM 2.5 release. General improvements in this
cycle have included new inline asm constraint handling, better debug info
support, general bugfixes, and better x86-64 support. This has allowed
some major improvements in LDC, getting us much closer to being as
fully featured as the original DMD compiler from DigitalMars.
Roadsend PHP (rphp) is an open
source compiler for the PHP programming language that uses LLVM for its
optimizer, JIT, and static compiler. This is a reimplementation of an earlier
project that is now based on LLVM.
This release includes a huge number of bug fixes, performance tweaks, and
minor improvements. Some of the major improvements and new features are listed
in this section.
LLVM 2.5 includes several major new capabilities:
- LLVM 2.5 includes a brand new XCore backend.
- llvm-gcc now generally supports the GFortran front-end, and the precompiled
release binaries now support Fortran, even on Mac OS X.
- CMake is now used by the LLVM build process
on Windows. It automatically generates Visual Studio project files (and
more) from a set of simple text files. This makes it much easier to
maintain. In time, we'd like to standardize on CMake for everything.
- LLVM 2.5 now uses (and includes) Google Test for unit testing.
- The LLVM native code generator now supports arbitrary precision integers.
Types like i33 have long been valid in the LLVM IR, but were previously
only supported by the interpreter. Note that the C backend still does not
support these.
- LLVM 2.5 no longer uses 'bison', so it is easier to build on Windows.
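To make the arbitrary precision integer item concrete, here is a rough C++ model of what arithmetic on a type like i33 means when it is carried out in a 64-bit register: compute in the wider register, then discard the bits above bit 32. This is a sketch of the semantics only, not of the code the native code generator emits; low_bits_mask, add_wrapping and sign_extend are illustrative names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Mask selecting the low `bits` bits of a 64-bit register (0 < bits < 64).
constexpr uint64_t low_bits_mask(unsigned bits) {
  return (uint64_t{1} << bits) - 1;
}

// Wrapping add on `bits`-wide values: add in 64 bits, then drop the high bits.
uint64_t add_wrapping(uint64_t a, uint64_t b, unsigned bits) {
  assert(bits > 0 && bits < 64);
  return (a + b) & low_bits_mask(bits);
}

// Reinterpret the low `bits` bits as a two's-complement signed value.
int64_t sign_extend(uint64_t value, unsigned bits) {
  assert(bits > 0 && bits < 64);
  uint64_t sign_bit = uint64_t{1} << (bits - 1);
  value &= low_bits_mask(bits);
  return static_cast<int64_t>(value ^ sign_bit) -
         static_cast<int64_t>(sign_bit);
}
```

With bits set to 33, adding 1 to the all-ones pattern wraps to 0, and adding 1 to the largest positive signed value (2^32 - 1) yields the most negative one (-2^32), just as it would for i32 or i64 at their own widths.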
LLVM fully supports the llvm-gcc 4.2 front-end, which marries the GCC
front-ends and driver with the LLVM optimizer and code generator. It currently
includes support for the C, C++, Objective-C, Ada, and Fortran front-ends.
- In this release, the GCC inliner is completely disabled. Previously the GCC
inliner was used to handle always-inline functions and other cases, which caused
problems with code size growth.
- llvm-gcc (and LLVM in general) now support code generation for stack
canaries, which is an effective form of buffer overflow
protection. llvm-gcc supports this with the -fstack-protector
command line option (just like GCC). In LLVM IR, you can request code
generation for stack canaries with function attributes.
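For readers unfamiliar with stack canaries, the following toy C++ model shows the idea. It is only a simulation under stated assumptions: the "frame" is an ordinary byte array laid out as a local buffer followed by a canary, the canary constant is fixed (real implementations typically use a value chosen at run time), and Frame, write_canary, unchecked_copy and canary_intact are hypothetical names. In a real build the prologue and epilogue checks are emitted by the code generator, not written by hand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kBufferSize = 8;                    // toy local buffer
constexpr std::uint64_t kCanary = 0x0123456789ABCDEFULL;  // fixed for the demo

// Toy frame layout: [ local buffer | canary ].
struct Frame {
  std::array<unsigned char, kBufferSize + sizeof(std::uint64_t)> bytes;
};

// "Prologue": store the canary just past the local buffer.
void write_canary(Frame &frame) {
  std::memcpy(frame.bytes.data() + kBufferSize, &kCanary, sizeof(kCanary));
}

// Models a copy that does not respect the buffer size. It is bounded by the
// whole frame so that the toy itself stays well defined.
void unchecked_copy(Frame &frame, const unsigned char *src, std::size_t count) {
  assert(count <= frame.bytes.size());
  std::memcpy(frame.bytes.data(), src, count);
}

// "Epilogue": the function may only return normally if this holds.
bool canary_intact(const Frame &frame) {
  std::uint64_t current;
  std::memcpy(&current, frame.bytes.data() + kBufferSize, sizeof(current));
  return current == kCanary;
}
```

A 4-byte copy leaves the canary untouched, while a 12-byte copy runs past the 8-byte buffer and changes it, which is the condition a protected function detects before returning.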
LLVM IR has several new features that are used by our existing front-ends and
can be useful if you are writing a front-end for LLVM:
- The shufflevector instruction
has been generalized to allow a shuffle mask whose width differs from that of
its input vectors. This allows you to use shufflevector to combine two
"<4 x float>" vectors into a "<8 x float>" for example.
- LLVM IR now supports new intrinsics for computing and acting on overflow of integer operations. This allows
efficient code generation for languages that must trap or throw an exception on
overflow. While these intrinsics work on all targets, they only generate
efficient code on X86 so far.
- LLVM IR now supports a new private
linkage type to produce labels that are stripped by the assembler before it
produces a .o file (thus they are invisible to the linker).
- LLVM IR supports two new attributes for better alias analysis. The noalias attribute can now be used on the
return value of a function to indicate that it returns new memory (e.g.
'malloc', 'calloc', etc).
- The new nocapture attribute can be
used on pointer arguments to functions that access memory through the pointer
but do not store it in a data structure that outlives the call (e.g. 'strlen',
'memcpy', and many others). The simplifylibcalls pass applies these attributes
to standard libc functions.
- The parser for ".ll" files in lib/AsmParser is now completely rewritten as a
recursive descent parser. This parser produces better error messages (including
caret diagnostics), is less fragile (less likely to crash on strange things),
does not leak memory, is more efficient, and eliminates LLVM's last use of the
'bison' tool.
- Debug information representation and manipulation internals have been
consolidated to use a new set of classes in
llvm/Analysis/DebugInfo.h. These classes are more
efficient, robust, and extensible and replace the older mechanisms.
llvm-gcc, clang, and the code generator now use them to create and process
debug information.
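As an illustration of what the overflow intrinsics mentioned above compute, the C++ sketch below returns both the wrapped result of a 32-bit signed addition and a flag saying whether the exact sum did not fit. It models the semantics a front-end would rely on, not the intrinsics themselves; CheckedSum, checked_add and add_or_throw are hypothetical names, and the LLVM Language Reference remains the authority on the actual intrinsic names and types.

```cpp
#include <cstdint>
#include <stdexcept>

// Illustrative result pair: wrapped sum plus an overflow flag.
struct CheckedSum {
  int32_t value;
  bool overflowed;
};

CheckedSum checked_add(int32_t a, int32_t b) {
  // Add in unsigned arithmetic, where wraparound is well defined.
  uint32_t ua = static_cast<uint32_t>(a);
  uint32_t ub = static_cast<uint32_t>(b);
  uint32_t wrapped = ua + ub;
  // Signed overflow occurs exactly when both operands have the same sign and
  // the sign of the result differs from it.
  bool overflowed = (((ua ^ wrapped) & (ub ^ wrapped)) >> 31) != 0;
  // Converting back assumes the usual two's-complement representation.
  return {static_cast<int32_t>(wrapped), overflowed};
}

// How a language that must raise an exception on overflow could use the pair.
int32_t add_or_throw(int32_t a, int32_t b) {
  CheckedSum sum = checked_add(a, b);
  if (sum.overflowed)
    throw std::overflow_error("integer overflow in addition");
  return sum.value;
}
```

The benefit of having this as an intrinsic is that a target can implement the check with its native overflow flag instead of the bit manipulation shown here.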
In addition to a huge array of bug fixes and minor performance tweaks, this
release includes a few major enhancements and additions to the optimizers:
- The loop optimizer now improves floating point induction variables in
several ways, including adding shadow induction variables to avoid
"integer <-> floating point" conversions in loops when safe.
- The "-mem2reg" pass is now much faster on code with huge basic blocks.
- The "-jump-threading" pass is more powerful: it is iterative
and handles threading based on values with fully and partially redundant
loads.
- The "-memdep" memory dependence analysis pass (used by GVN and memcpyopt) is
both faster and more aggressive.
- The "-scalarrepl" scalar replacement of aggregates pass is more aggressive
about promoting unions to registers.
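The shadow induction variable idea from the first item above can be shown at the source level. This is a hand-written before/after sketch, not output of the loop optimizer, and both function names are hypothetical. In the first loop the integer counter is converted to floating point on every iteration; in the second, a floating point variable advances in lockstep with the counter so the conversion disappears. The rewrite is only safe while the floating point type represents every counter value exactly, which the assertion below states as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: each iteration performs an integer to floating point conversion.
double weighted_sum_with_conversion(const std::vector<double> &weights) {
  double sum = 0.0;
  for (std::size_t i = 0; i < weights.size(); ++i)
    sum += static_cast<double>(i) * weights[i];
  return sum;
}

// After: `shadow` mirrors `i` as a double, so no conversion is needed.
double weighted_sum_with_shadow(const std::vector<double> &weights) {
  // A double holds every integer up to 2^53 exactly, so below that bound the
  // shadow value equals the converted counter on every iteration.
  assert(static_cast<unsigned long long>(weights.size()) <= (1ULL << 53));
  double sum = 0.0;
  double shadow = 0.0;
  for (std::size_t i = 0; i < weights.size(); ++i, shadow += 1.0)
    sum += shadow * weights[i];
  return sum;
}
```

Both functions return the same value for the same input; only the per-iteration work differs.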
We have put a significant amount of work into the code generator
infrastructure, which allows us to implement more aggressive algorithms and make
it run faster:
- The Writing an LLVM Compiler
Backend document has been greatly expanded and is substantially more
complete.
- The SelectionDAG type legalization logic has been completely rewritten, is
now more powerful (it supports arbitrary precision integer types for example),
and more correct in several corner cases. The type legalizer converts
operations on types that are not natively supported by the target machine into
equivalent code sequences that only use natively supported types. The old type
legalizer is still available (for now) and will be used if
-disable-legalize-types is passed to the code generator.
- The code generator now supports widening illegal vectors to larger legal
ones (for example, converting operations on <3 x float> to work on
<4 x float>) which is very important for common graphics
applications.
- The assembly printers for each target are now split out into their own
libraries that are separate from the main code generation logic. This reduces
code size of JIT compilers by not requiring them to be linked in.
- The 'fast' instruction selection path (used at -O0 and for fast JIT
compilers) now supports accelerating codegen for code that uses exception
handling constructs.
- The optional PBQP register allocator now supports register coalescing.
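Vector widening, mentioned in the list above, can be modelled in a few lines of C++. The sketch assumes a machine whose only legal float vector has four lanes; Float3, Float4, widen, narrow, add4 and add3_via_widening are illustrative names, and the real transformation happens inside the code generator on SelectionDAG nodes rather than on arrays.

```cpp
#include <array>

using Float3 = std::array<float, 3>;  // stands in for <3 x float> (illegal)
using Float4 = std::array<float, 4>;  // stands in for <4 x float> (legal)

// Widen: copy the three real lanes; the fourth lane is padding whose value
// never reaches the result, so 0 is as good as anything.
Float4 widen(const Float3 &v) { return {{v[0], v[1], v[2], 0.0f}}; }

// Narrow: drop the padding lane again.
Float3 narrow(const Float4 &v) { return {{v[0], v[1], v[2]}}; }

// The only addition the modelled machine supports: four lanes at once.
Float4 add4(const Float4 &a, const Float4 &b) {
  Float4 result{};
  for (int lane = 0; lane < 4; ++lane)
    result[lane] = a[lane] + b[lane];
  return result;
}

// A <3 x float> add expressed entirely with legal <4 x float> operations.
Float3 add3_via_widening(const Float3 &a, const Float3 &b) {
  return narrow(add4(widen(a), widen(b)));
}
```

This works for lane-wise operations because each result lane depends only on the matching input lanes, so the padding lane cannot influence the three lanes that are kept.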
New features of the X86 target include:
- The "llvm.returnaddress"
intrinsic (which is used to implement "__builtin_return_address") now supports
non-zero stack depths on X86.
- The X86 backend now supports code generation of vector shift operations
using SSE instructions.
- X86-64 code generation now takes advantage of the red zone (unless the
-mno-red-zone option is specified).
- The X86 backend now supports using address space #256 in LLVM IR as a way of
performing memory references off the GS segment register. This allows a
front-end to take advantage of very low-level programming techniques when
targeting X86 CPUs. See test/CodeGen/X86/movgs.ll for a simple example.
- The X86 backend now supports a -disable-mmx command line option to
prevent use of MMX even on chips that support it. This is important for cases
where code does not contain the proper "llvm.x86.mmx.emms" intrinsics.
- The X86 JIT now detects the new Intel "Core i7" and "Atom" chips,
auto-configuring itself appropriately for the features of these chips.
- The JIT now supports exception handling constructs on Linux/X86-64 and
Darwin/X86-64.
- The JIT supports Thread Local Storage (TLS) on Linux/X86-32 but not yet on
X86-64.
New features of the PIC16 target include:
- Both direct and indirect load/stores work now.
- Logical, bitwise and conditional operations now work for integer data
types.
- Function calls involving basic types work now.
- Support for integer arrays.
- The compiler can now emit libcalls for operations not supported by machine
instructions.
- Support for both data and rom address spaces.
Things not yet supported:
- Floating point.
- Passing/returning aggregate types to/from functions.
- Variable arguments.
- Indirect function calls.
- Interrupts/programs.
- Debug info.
New features include:
- Beginning with LLVM 2.5, llvmc2 is known as
just llvmc. The old llvmc driver was removed.
- The Clang plugin was substantially improved and is now enabled
by default. The command llvmc --clang can now be used as a
synonym for ccc.
- There is now a --check-graph option which is supposed to catch
common errors like multiple default edges, mismatched output/input language
names and cycles. In general, these checks can't be done at compile-time
because of the need to support plugins.
- Plugins are now more flexible and can refer to compilation graph nodes and
options defined in other plugins. To manage dependencies, a priority-sorting
mechanism was introduced. This change affects the TableGen file syntax; see the
documentation for details.
- Hooks can now be provided with arguments. The syntax is "$CALL(MyHook,
'Arg1', 'Arg2', 'Arg #3')".
- A new option type: multi-valued option, for options that take more than one
argument (for example, "-foo a b c").
- New option properties: 'one_or_more', 'zero_or_more',
'hidden' and 'really_hidden'.
- The 'case' expression gained an 'error' action and
an 'empty' test (equivalent to "(not (not_empty ...))").
- Documentation now looks more consistent with the rest of the LLVM
docs. There is also a man page now.
If you're already an LLVM user or developer with out-of-tree changes based
on LLVM 2.4, this section lists some "gotchas" that you may run into upgrading
from the previous release.
- llvm-gcc defaults to -fno-math-errno on all X86 targets.
In addition, many APIs have changed in this release. Some of the major LLVM
API changes are:
LLVM is known to work on the following platforms:
- Intel and AMD machines (IA32, X86-64, AMD64, EM64T) running Red Hat
Linux, Fedora Core and FreeBSD (and probably other unix-like systems).
- PowerPC and X86-based Mac OS X systems, running 10.3 and above in 32-bit
and 64-bit modes.
- Intel and AMD machines running on Win32 using MinGW libraries (native).
- Intel and AMD machines running on Win32 with the Cygwin libraries (limited
support is available for native builds with Visual C++).
- Sun UltraSPARC workstations running Solaris 10.
- Alpha-based machines running Debian GNU/Linux.
- Itanium-based (IA64) machines running Linux and HP-UX.
The core LLVM infrastructure uses GNU autoconf to adapt itself
to the machine and operating system on which it is built. However, minor
porting may be required to get LLVM to work on new platforms. We welcome your
portability patches and reports of successful builds or error messages.
This section contains significant known problems with the LLVM system,
listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.
The following components of this LLVM release are either untested, known to
be broken or unreliable, or are in early development. These components should
not be relied on, and bugs should not be filed against them, but they may be
useful to some people. In particular, if you would like to work on one of these
components, please contact us on the LLVMdev list.
- The MSIL, IA64, Alpha, SPU, MIPS, and PIC16 backends are experimental.
- The llc "-filetype=asm" (the default) is the only supported
value for this option.
- The X86 backend does not yet support all inline assembly that uses the X86
floating point stack. It supports the 'f' and 't' constraints, but not 'u'.
- The X86 backend generates inefficient floating point code when configured
to generate code for systems that don't have SSE2.
- Win64 code generation has not been widely tested. Everything should work, but
we expect small issues to happen. Also, llvm-gcc currently cannot build the
mingw64 runtime because of several bugs caused by the lack of support for the
'u' inline assembly constraint and X87 floating point inline assembly.
- The X86-64 backend does not yet support the LLVM IR instruction
va_arg. Currently, llvm-gcc and other front-ends support variadic
argument constructs on X86-64 by lowering them manually.
- The Linux PPC32/ABI support needs testing for the interpreter and static
compilation, and lacks support for debug information.
- Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6
processors, thumb programs can crash or produce wrong
results (PR1388).
- Compilation for ARM Linux OABI (old ABI) is supported, but not fully tested.
- There is a bug in QEMU-ARM (<= 0.9.0) which causes it to incorrectly execute
programs compiled with LLVM. Please use more recent versions of QEMU.
- The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not
support the 64-bit SPARC ABI (-m64).
- The O32 ABI is not fully supported.
- 64-bit MIPS targets are not supported yet.
- On 21164s, some rare FP arithmetic sequences which may trap do not have the
appropriate nops inserted to ensure restartability.
- The Itanium backend is highly experimental, and has a number of known
issues. We are looking for a maintainer for the Itanium backend. If you
are interested, please contact the LLVMdev mailing list.
llvm-gcc does not currently support Link-Time
Optimization on most platforms "out-of-the-box". Please inquire on the
LLVMdev mailing list if you are interested.
The only major language feature of GCC not supported by llvm-gcc is
the __builtin_apply family of builtins. However, some extensions
are only supported on some targets. For example, trampolines are only
supported on some targets (these are used when you take the address of a
nested function).
If you run into GCC extensions which are not supported, please let us know.
The C++ front-end is considered to be fully
tested and works for a number of non-trivial programs, including LLVM
itself, Qt, Mozilla, etc.
- Exception handling works well on the X86 and PowerPC targets. Currently
only Linux and Darwin targets are supported (both 32 and 64 bit).
- Fortran support generally works, but there are still several unresolved bugs
in Bugzilla. Please see the tools/gfortran component for details.
- The Fortran front-end currently does not build on Darwin (without tweaks)
due to unresolved dependencies on the C front-end.
The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature
technology and problems should be expected.
- The Ada front-end currently only builds on X86-32. This is mainly due
to lack of trampoline support (pointers to nested functions) on other platforms.
However, it also fails to build on X86-64, which does support trampolines.
- The Ada front-end fails to bootstrap.
This is due to lack of LLVM support for setjmp/longjmp style
exception handling, which is used internally by the compiler.
Workaround: configure with --disable-bootstrap.
- The c380004, c393010
and cxg2021 ACATS tests fail
(c380004 also fails with gcc-4.2 mainline).
If the compiler is built with checks disabled then c393010
causes the compiler to go into an infinite loop, using up all system memory.
- Some gcc specific Ada tests continue to crash the compiler.
- The -E binder option (exception backtraces)
does not work and will result in programs
crashing if an exception is raised. Workaround: do not use -E.
- Only discrete types are allowed to start
or finish at a non-byte offset in a record. Workaround: do not pack records
or use representation clauses that result in a field of a non-discrete type
starting or finishing in the middle of a byte.
- The lli interpreter considers
'main' as generated by the Ada binder to be invalid.
Workaround: hand edit the file to use pointers for argv and
envp rather than integers.
- The -fstack-check option is
ignored.
A wide variety of additional information is available on the LLVM web page, in
particular in the documentation section. The web page also contains versions of
the API documentation which are up-to-date with the Subversion version of the
source code.
You can access versions of these documents specific to this release by going
into the "llvm/doc/" directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact
us via the mailing
lists.
LLVM Compiler Infrastructure