The coalescer can in very rare cases leave too large live intervals around after
rematerializing cheap-as-a-move instructions.
Linear scan doesn't really care, but live range splitting gets very confused
when a live range is killed by a ghost instruction.
I will fix this properly in the coalescer after 2.9 branches.
llvm-svn: 127096
regs. This is the only change in this checkin that may affects the
default scheduler. With better register tracking and heuristics, it
doesn't make sense to artificially lower the register limit so much.
Added -sched-high-latency-cycles and X86InstrInfo::isHighLatencyDef to
give the scheduler a way to account for div and sqrt on targets that
don't have an itinerary. It is currently defaults to 10 (the actual
number doesn't matter much), but only takes effect on non-default
schedulers: list-hybrid and list-ilp.
Added several heuristics that can be individually disabled for the
non-default sched=list-ilp mode. This helps us determine how much
better we can do on a given benchmark than the default
scheduler. Certain compute intensive loops run much faster in this
mode with the right set of heuristics, and it doesn't seem to have
much negative impact elsewhere. Not all of the heuristics are needed,
but we still need to experiment to decide which should be disabled by
default for sched=list-ilp.
llvm-svn: 127067
This simplifies the code and makes it faster too.
The interference patterns are saved for each candidate register. It will be
reused for actually executing the split. Work in progress.
llvm-svn: 127054
Initially, slot indexes are quad-spaced. There is room for inserting up to 3
new instructions between the original instructions.
When we run out of indexes between two instructions, renumber locally using
double-spaced indexes. The original quad-spacing means that we catch up quickly,
and we only have to renumber a handful of instructions to get a monotonic
sequence. This is much faster than renumbering the whole function as we did
before.
llvm-svn: 127023
You can't really predict how many indexes will be needed from the number of
defs, so let's keep it simple.
Also remove an extra empty index that was inserted after each basic block. It
was intended for live-out ranges, but it was never used that way.
llvm-svn: 127014
type after type legalization has completed. Before then it may simply not be big
enough to hold the shift amount, particularly on x86 which uses a very small type
for shifts (this issue broke stuff in the past which is why LegalizeTypes carefully
uses a large type for shift amounts).
llvm-svn: 127000
Fix the PendingQueue, then disable it because it's not required for
the current schedulers' heuristics.
Fix the logic for the unused list-ilp scheduler.
llvm-svn: 126981
it. It's been assumed up til now that it would be in its immediate
successor. However, this isn't necessarily the case. It could be in one of its
successor's successors.
Modify the code to more thoroughly check for an 'eh.selector' call in
successors. It only looks at a successor if we get there as a result of an
unconditional branch.
Testcase ObjC/exceptions-4.m in r126968.
llvm-svn: 126969
There are probably much larger speedups to be had by renumbering locally instead
of looping over the whole function. For now, the greedy register allocator is
25% faster.
llvm-svn: 126926
This is much faster than using a pointer to a ManagedStatic object accessed with
a function call. The greedy register allocator is 5% faster overall just from
the SlotIndex default constructor savings.
llvm-svn: 126925
The SlotIndex created by the default construction does not represent a position
in the function, and it doesn't make sense to compare it to other indexes.
llvm-svn: 126924
We need to wait until we meet a PHIDef in its defining block before resurrecting
PHIKills in the predecessors.
This should unbreak the llvm-gcc-build-x86_64-darwin10-x-mingw32-x-armeabi bot.
llvm-svn: 126905
David Greene changed CannotYetSelect() to print the full DAG including multiple
copies of operands reached through different paths in the DAG. Unfortunately
this blows up exponentially in some cases. The depth limit of 100 is way too
high to prevent this -- I'm seeing a message string of 150MB with a depth of
only 40 in one particularly bad case, even though the DAG has less than 200
nodes. Part of the problem is that the printing code is following chain
operands, so if you fail to select an operation with a chain, the printer will
follow all the chained operations back to the entry node.
llvm-svn: 126899
Values that map to a single new value in a new interval after splitting don't
need new PHIDefs, and if the parent value was never rematerialized the live
range will be the same.
llvm-svn: 126894
Extract the updateSSA() method from the too long extendRange().
LiveOutCache can be shared among all the new intervals since there is at most
one of the new ranges live out from each basic block.
llvm-svn: 126818