contains the type we are looking for, just search the immediately used types.
We can only do this because we keep the "current" type in the nesting level
as we decrement upreferences.
This change speeds up the testcase in PR224 from 50.4s to 22.08s, not
too shabby.
llvm-svn: 11221
the Virt2PhysRegMap std::map with an std::vector. This speeds up the
register allocator another (almost) 40%, from .72->.45s in a release build
of LLC on 253.perlbmk.
llvm-svn: 11219
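A minimal sketch of the std::map to std::vector replacement described above,
assuming virtual register numbers are dense and start at a known first value.
The class name DenseVirtRegMap, its members, and the "0 means unassigned"
convention are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense map from virtual register number to physical register.
// Assumes virtual register numbers are contiguous starting at FirstVirtReg,
// and that physical register 0 can stand for "not assigned".
class DenseVirtRegMap {
  unsigned FirstVirtReg;          // assumed lowest virtual register number
  std::vector<unsigned> PhysRegs; // indexed by (VirtReg - FirstVirtReg)

public:
  explicit DenseVirtRegMap(unsigned First) : FirstVirtReg(First) {}

  // Record a mapping, growing the vector on demand.
  void assign(unsigned VirtReg, unsigned PhysReg) {
    assert(VirtReg >= FirstVirtReg && "not a virtual register");
    std::size_t Idx = VirtReg - FirstVirtReg;
    if (Idx >= PhysRegs.size())
      PhysRegs.resize(Idx + 1, 0);
    PhysRegs[Idx] = PhysReg;
  }

  // Constant-time lookup with no tree traversal; returns 0 if unassigned.
  unsigned lookup(unsigned VirtReg) const {
    assert(VirtReg >= FirstVirtReg && "not a virtual register");
    std::size_t Idx = VirtReg - FirstVirtReg;
    return Idx < PhysRegs.size() ? PhysRegs[Idx] : 0;
  }
};
```

A lookup here is a subtraction and an array index, where std::map walks a
red-black tree and allocates a node per entry.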
from physical registers, and they are always dense, it makes sense to not have
a ton of RBtree overhead. This change speeds up regalloclocal about 30% on
253.perlbmk, from .35s -> .27s in the JIT (in LLC, it goes from .74 -> .55).
Now live variable analysis is the slowest codegen pass. Of course it doesn't
help that we have to run it twice, because regalloclocal doesn't update it,
but even if it did it would be the slowest pass (now it's just the 2x slowest
pass) :(
llvm-svn: 11215
1. The "work" was not in the assert, so it was punishing the optimized release
2. getNamedFunction is _very_ expensive in large programs. It is not designed
to be used like this, and was taking 7% of the execution time of the code
generator on perlbmk.
Since the assert "can never fail", I'm just killing it.
llvm-svn: 11214
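To illustrate point 1 above, a minimal sketch of the difference between doing
the work outside the assert and inside it. expensiveLookup is a hypothetical
stand-in for a costly by-name search, not the real getNamedFunction, and both
wrapper functions are invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an expensive lookup; counts how often it runs.
static unsigned LookupCalls = 0;
static bool expensiveLookup(const std::string &Name) {
  ++LookupCalls;
  return !Name.empty();
}

// The lookup runs unconditionally. Defining NDEBUG removes only the check,
// so a release build still pays for the lookup.
void checkOutsideAssert(const std::string &Name) {
  bool Found = expensiveLookup(Name);
  assert(Found && "function must exist");
  (void)Found; // avoid an unused-variable warning when NDEBUG is defined
}

// The lookup is part of the assert expression, so a build with NDEBUG
// defined does not evaluate it at all.
void checkInsideAssert(const std::string &Name) {
  assert(expensiveLookup(Name) && "function must exist");
  (void)Name; // avoid an unused-parameter warning when NDEBUG is defined
}
```

If the check can never fail, removing it entirely (as this change does) is
cheaper than either form.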
removeDeadNodes is called, only call it at the end of the pass being run.
This saves 1.3 seconds running DSA on 177.mesa (5.3->4.0s), which is
pretty big. This is only possible because of the automatic garbage
collection done on forwarding nodes.
llvm-svn: 11178
DSGraphs while they are forwarding. When the last reference to the forwarding
node is dropped, the forwarding node is autodeleted. This should simplify
removeTriviallyDead nodes, and is only (efficiently) possible because we are
using an ilist of dsnodes now.
llvm-svn: 11175
slots each. As a consequence they get numbered as 0, 2, 4 and so
on. The first slot is used for operand uses and the second for
defs. Here's an example:
0: A = ...
2: B = ...
4: C = A + B ;; last use of A
The live intervals should look like:
A = [1, 5)
B = [3, x)
C = [5, y)
llvm-svn: 11141
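The numbering in the example above can be sketched as follows. This assumes,
as inferred from the example only, that an interval is half-open, starts at
the def slot of the defining instruction, and ends one past the use slot of
the last use. The helper names useSlot, defSlot and liveInterval are
hypothetical.

```cpp
#include <cassert>
#include <utility>

// Instructions are numbered 0, 2, 4, ... Each one owns two slots:
// the first (even) slot is for operand uses, the second (odd) one for defs.
inline unsigned useSlot(unsigned InstrIndex) { return InstrIndex; }
inline unsigned defSlot(unsigned InstrIndex) { return InstrIndex + 1; }

// Half-open interval [start, end) for a value defined by the instruction
// numbered DefInstr whose last use is in the instruction numbered
// LastUseInstr. The end is one past the use slot of the last use
// (an assumption read off the example, not the actual implementation).
inline std::pair<unsigned, unsigned> liveInterval(unsigned DefInstr,
                                                  unsigned LastUseInstr) {
  return {defSlot(DefInstr), useSlot(LastUseInstr) + 1};
}
```

With A defined at 0 and last used at 4, liveInterval(0, 4) gives [1, 5).
C is defined at 4, so its interval starts at 5 and does not overlap A's.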
The problem is that the dominator update code didn't "realize" that it's
possible for the newly inserted basic block to dominate anything. Because
it IS possible, stuff was getting updated wrong.
llvm-svn: 11137
complete rewrite of load-vn will make it a bit faster. This change speeds up
the gcse pass (which uses load-vn) from 25.45s to 0.42s on the testcase in
PR209.
I've also verified that this gives the exact same results as the old one.
llvm-svn: 11132