This patch adds a -post-link-opts option to llvm-ld which allows an arbitrary
program to optimize bytecode after linking. The program is passed two file
names. The first is the input (linked bytecode) the second is where it must
place its output (presumably after optimizing). If the output file is bytecode,
it is used as a substitute for the input. This will allow things like poolalloc
to be written as a separate program instead of a loadable module or built into
LLVM.
llvm-svn: 24893
* Add --enable-debug-runtime option, defaults to disabled
* Pass the new config var, DEBUG_RUNTIME, to Makefiles
* Don't use -Wa,-strip-debug if debug-runtime is enabled
llvm-svn: 24891
for Darwin.
* Added lowering hook for ISD::RET. It inserts CopyToRegs for the return
value (or store / fld / copy to ST(0) for floating point value). This
eliminate the need to write C++ code to handle RET with variable number
of operands.
llvm-svn: 24888
use too much stack space, overflowing the stack for large functions. Instead
of emitting new SDOperands in each match block, we emit some common ones at
the top of SelectCode then reuse them when possible.
This reduces the stack size of SelectCode from 28K to 21K. Note that GCC
compiles it to 512 bytes :-/
I've filed GCC PR 25505 to track this.
llvm-svn: 24882
it keeps it from trying to add the same node to the node set
over and over if it matches multiple given patterns. Also in cases where there
are a lot of patterns to be matched, and it matches an early one, this
will make the script run slightly faster. It's more there because it logically
should be, than anything else, I mean, Python is never going to be fast ;-)
llvm-svn: 24876
last night, breaking crafty and twolf. Make sure that the newly found
legal nodes are themselves not re-legalized until the next iteration.
Also, since this functionality exists now, we can reduce number of legalizer
iterations by depending on this behavior instead of having to misuse 'do
another iteration' to get the same effect.
llvm-svn: 24875
us to load and store vectors directly at a pointer (offset of zero) by
using r0 as the base register. This also requires some asm printer work
to satisfy the darwin assembler.
For
void %foo(<4 x float> * %a) {
entry:
%tmp1 = load <4 x float> * %a;
%tmp2 = add <4 x float> %tmp1, %tmp1
store <4 x float> %tmp2, <4 x float> *%a
ret void
}
We now produce:
_foo:
lvx v0, 0, r3
vaddfp v0, v0, v0
stvx v0, 0, r3
blr
Instead of:
_foo:
li r2, 0
lvx v0, r2, r3
vaddfp v0, v0, v0
stvx v0, r2, r3
blr
llvm-svn: 24872
we were storing into [FP+88] instead of [FP+92].
Improve codegen by emitting [FP+92], instead of emitting a copy of FP into
another GPR which wouldn't be coallesced because FP isn't register allocated.
llvm-svn: 24859
from a dot file that is the output of DSA. Nodes to extract
are specified by giving the name of the node seen in the graphical
representation, i.e. in the .ps if the node is specified %xyz
asking for just x, xy, or xyz will retain it in the output file.
Because it operates on substrings underspecifying may result
in additional unexpected nodes. Be as specific as possible.
Obviously, however, if you ask for %xyz and there is a
getelementptr of %xyz you will get both nodes. Some manual
editing may still be necessary because of this, but this script
can pare down 10,000 line files to 20 line files, making like easier.
llvm-svn: 24851