prototypes, even if they don't precisely match what it would prefer to use.
This fixes: CBackend/2004-02-15-PreexistingExternals.llx compiling it into:
ltmp_0_30 = memcpy(l14_C, 4u, 17);
ltmp_1_30 = memcpy(((int *)l27_A), ((unsigned )(long)l27_B), ((int )123u));
instead of:
ltmp_0_30 = memcpy(l14_C, 4u, 17);
ltmp_1_27 = l43_memcpy(l27_A, l27_B, 123u);
Which does the wrong thing as you could imagine.
llvm-svn: 11481
initializers for constant structs and arrays take constant space, instead of
space proportinal to the number of elements. This reduces the memory usage of
the LLVM compiler by hundreds of megabytes when compiling some nasty SPEC95
benchmarks.
llvm-svn: 11470
'Constant', instead of specific subclass pointers. In the future, these will
return an instance of ConstantAggregateZero if all of the inputs are zeros.
llvm-svn: 11467
this speeds up a release llvm-as from 21.95s to 11.21s, because before it
would do an expensive traversal of the type-graph of every type resolved.
llvm-svn: 11242
consistent across the various type classes, we can factor out a LOT more
almost-identical code. Also, add a couple of temporary statistics.
llvm-svn: 11232
all of the ad-hoc storage of contained types. This allows getContainedType to
not be virtual, and allows us to entirely delete the TypeIterator class.
llvm-svn: 11230
1. The "work" was not in the assert, so it was punishing the optimized release
2. getNamedFunction is _very_ expensive in large programs. It is not designed
to be used like this, and was taking 7% of the execution time of the code
generator on perlbmk.
Since the assert "can never fail", I'm just killing it.
llvm-svn: 11214
instead of a loop that is really inefficient with large basic blocks.
This speeds up the inliner pass on the testcase in PR209 from 13.8s to 2.24s
which still isn't exactly speedy, but is a lot better. :)
llvm-svn: 11105
Basically we store floating point values as their integral components, instead of relying
on the semantics of floating point < to differentiate between values. This is likely to
make the map search be faster anyway.
llvm-svn: 11064
out that the problem was actually the writer writing out a 'null' value
because it didn't normalize it. This fixes:
test/Regression/Assembler/2004-01-22-FloatNormalization.ll
llvm-svn: 10967
fact "profitable" to do so. This makes compactification "free" for small
programs (ie, it is completely disabled) and even helps large programs by
not having to encode pointless compactification planes.
On 176.gcc, this saves 50K from the bytecode file, which is, alas only
a couple percent.
This concludes my head bashing against the bytecode format, at least for
now.
llvm-svn: 10922
the bytecode file for 176.gcc by about 200K (10%), and 254.gap by about 167K,
a 25% reduction. There is still a lot of room for improvement in the encoding
of the compaction table.
llvm-svn: 10913
Since this really only makes sense for these two, change hte instance variable
to reflect whether we are writing a bytecode file or not. This makes it
reasonable to add bcwriter specific stuff to it as necessary.
llvm-svn: 10837
testcase test/Regression/Assembler/ConstantExprFold.llx
Note that these kinds of things only rarely show up in source code, but are
exceedingly common in the intermediate stages of algorithms like SCCP. By
folding things (especially relational operators) that use symbolic constants,
we are able to speculatively fold more conditional branches, which can
lead to some big simplifications.
It would be easy to add a lot more special cases here, so if you notice
SCCP missing anything "obvious", you know what to make smarter. :)
llvm-svn: 10812
Move a bunch of (now) private stuff from ConstantFolding.h into
ConstantFolding.cpp.
This _finally_ gets us to a place where we have a sane constant folder. The
rules are:
1. LLVM clients now use ConstantExpr::get* methods to fold constants. If they
cannot be folded, a constantexpr is created, so these methods always return
valid Constant*'s.
2. The implementation of ConstantExpr::get* uses the functions exposed by
ConstantFolding.h to try to fold constants. If they cannot be folded,
they should return a null pointer.
3. The implementation of ConstantFolding can do whatever it wants, and only
has one client (Constants.cpp)
This cuts down on the wierd dependencies, and eliminates the two interfaces.
The old constanthandling interface was especially bad for clients to use
because almost none of them took the failure condition into consideration,
thus leading to obscure problems.
llvm-svn: 10807
this whole refactoring: allow constant folding methods to return something
other than predefined classes, allow them to return generic Constant*'s.
llvm-svn: 10806
The first change (which is disabled) compactifies all of the function constant
pools into the global constant pool, in an attempt to reduce the amount of
duplication and overhead. Unfortunately, as the comment indicates, this is
not yet a win, so it is disabled.
The second change sorts the typeid's so that those types that can be used
by instructions in the program appear earlier in the table than those that
cannot (such as structures and arrays). This causes the instructions to
be able to use the dense encoding more often, saving about 5K on 254.gap.
This is only a .65% savings though, unfortunately. :(
llvm-svn: 10754
on the algorithm for directly computing immediate dominators presented in this
paper:
A Fast Algorithm for Finding Dominators in a Flowgraph
T. Lengauer & R. Tarjan, ACM TOPLAS July 1979, pgs 121-141.
This _substantially_ speeds up construction of all dominator related information.
Post-dominators to follow.
llvm-svn: 10301
* Add new constructors to allow insertion of terminator instructions at the
end of basic blocks.
* Move a ReturnInst method out-of-line, so that the vtable and type info don't
need to be emitted to every translation unit that uses the class.
llvm-svn: 10107
This change speeds up type resolution by checking to see if a type is
recursive, and if it's not, using a more efficient algorithm.
This dramatically reduces bytecode loading time of kc++, reducing time-to-jit
kc++ --version to 17s from 33s
llvm-svn: 10088
Rename SlotCalculator::getValSlot() to SlotCalculator::getSlot(),
SlotCalculator::insertValue() to SlotCalculator::getOrCreateSlot(),
SlotCalculator::insertVal() to SlotCalculator::insertValue(), and
SlotCalculator::doInsertVal() to SlotCalculator::doInsertValue().
llvm-svn: 9190
this list (except use_size()) are constant time. Before the killUse method
(used whenever something stopped using a value) was linear time, and thus
very very slow for large programs.
This speeds GCCAS up _substantially_ on large programs: almost 2x for 176.gcc:
176.gcc: 77.07s -> 37.38s
177.mesa: 7.59s -> 5.57s
252.eon: 21.02s -> 19.52s (*)
253.perlbmk: 11.40s -> 13.05s
254.gap: 7.25s -> 7.42s
252.eon would speed up a whole lot more, but optimization time is being
dominated by the inlining pass, which needs to be fixed.
llvm-svn: 9160
* Fix a nasty initializer ordering bug. Any only-CFG passes which registered
themselves before the CFGOnlyAnalysis vector initialized got forgotten and
thus got invalidated and recomputed.
In particular, in my compiled version of gccas, the Loop information pass was
being recomputed unnecessarily.
llvm-svn: 9074
constants as necessary due to type resolution. With this change, the
following spec benchmarks now link: 176.gcc, 177.mesa, 252.eon,
253.perlbmk, & 300.twolf. IOW, all SPEC INT and FP benchmarks now link.
llvm-svn: 8853
machinery. This dramatically simplifies how things works, removes irritating
little corner cases, and overall improves speed and reliability.
Highlights of this change are:
1. The exponential algorithm built into the code is now gone. For example
the time to disassemble one bytecode file from the mesa benchmark went
from taking 12.5s to taking 0.16s.
2. The linker bugs should be dramatically reduced. The one remaining bug
has to do with constant handling, which I actually introduced in
"union-find" checkins.
3. The code is much easier to follow, as a result of fewer special cases.
It's probably also smaller. yaay.
llvm-svn: 8842
This makes use of the new PATypeHolder's to keep types from being deleted
prematurely, instead of the wierd "self reference" garbage. This is easier
to understand and more efficient as well.
llvm-svn: 8834
because it can add a module ID which we do not have at this time.
* Check to see if the module has been initialized when materializing it.
llvm-svn: 8674
We want to check for length 5 because we might get the "llvm." string as the
name. That string is in the LLVM namespace and should be checked as such.
We also don't have to worry about garbage data because (I believe) the string
class will return a valid value. So, the switch statement will work and we
don't have to worry about the code wandering into segfault land.
llvm-svn: 8419
be at least 6 characters, since something must follow the "llvm." string in the
function name.
This seems to fix an assertion failure with the SingleSource tests, too.
llvm-svn: 8418
not correctly calculated, and calculating it wrong for fun seems rather
pointless. This also speeds up my favorite testcase by .25 seconds.
llvm-svn: 8330
we need to know anyway. This reduces the 2002-07-08-HugePerformanceProblem.llx
down to 3.210u:0.010s, which is back in the acceptable range again
llvm-svn: 8323
the type is analyzed. Instead, only compute it when requested (with
getDescription), and cached for reuse later.
This dramatically speeds up LLVM in general because these descriptions almost
_never_ need to be constructed. The only time they are used is when a type is
<<'d. Printing of modules by themselves uses other code to print symbolic
types when possible, so these descriptions are really only used for debugging.
Also, this fixes the particularly bad case when lots of types get resolved to
each other, such as during linking of large programs. In these cases, the type
descriptions would be repeatedly recomputed and discarded even though: A. noone
reads the description before it gets resolved, and B. many many resolutions
happen at intermediate steps, causing a HUGE waste of time.
Overall, this makes the getTypeDesc function much more light-weight, and fixes
bug: Assembler/2002-07-08-HugePerformanceProblem.llx, which went from taking
1048.770u/19.150s (which is 17.5 MINUTES, on apoc), to taking 0.020u/0.000s,
which is a nice little speedup. :)
llvm-svn: 8320