The live range of a register defined by an early clobber starts at the use slot,
not the def slot.
Except when it is an early clobber tied to a use operand. Then it starts at the
def slot like a standard def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119305 91177308-0d34-0410-b5e6-96231b3b80d8
live ranges for the spill register are also defined at the use slot instead of
the normal def slot.
This fixes PR8612 for the inline spiller. A use was being allocated to the same
register as a spilled early clobber def.
This problem exists in all the spillers. A fix for the standard spiller is
forthcoming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119182 91177308-0d34-0410-b5e6-96231b3b80d8
since it is trivial and will be shared between ppc and x86.
This substantially simplifies the X86 backend also.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119089 91177308-0d34-0410-b5e6-96231b3b80d8
catastrophic compilation time in the event of unreasonable LLVM
IR. Code quality is a separate issue--someone upstream needs to do a
better job of reducing to llvm.memcpy. If the situation can be reproduced with
any supported frontend, then it will be a separate bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118904 91177308-0d34-0410-b5e6-96231b3b80d8
it makes no sense for allocation_order iterators to visit reserved regs.
The inline spiller depends on AliasAnalysis.
Manage the Query state to avoid uninitialized or stale results.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118800 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first small step towards using closed intervals for liveness instead
of the half-open intervals we're using now.
We want to be able to distinguish between a SlotIndex that represents a variable
being live-out of a basic block, and an index representing a variable live-in to
its successor.
That requires two separate indexes between blocks. One for live-outs and one for
live-ins.
With this change, getMBBEndIdx(MBB).getPrevSlot() becomes stable so it stays
greater than any instructions inserted at the end of MBB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118747 91177308-0d34-0410-b5e6-96231b3b80d8
Whenever splitting wants to insert a copy, it checks if the value can be
rematerialized cheaply instead.
Missing features:
- Delete instructions when all uses have been rematerialized.
- Truncate live ranges to the remaining uses after rematerialization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118702 91177308-0d34-0410-b5e6-96231b3b80d8
benchmarks hitting an assertion.
Adds LiveIntervalUnion::collectInterferingVRegs.
Fixes "late spilling" by checking for any unspillable live vregs among
all physReg aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118701 91177308-0d34-0410-b5e6-96231b3b80d8
handle cases in which a register is unavailable for spill code.
Adds LiveIntervalUnion::extract. While processing interferences on a
live virtual register, reuses the same Query object for each
physcial reg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118423 91177308-0d34-0410-b5e6-96231b3b80d8
to perform the copy, which may be of lots of memory [*]. It would be good if the
fall-back code generated something reasonable, i.e. did the copy in a loop, rather
than vast numbers of loads and stores. Add a note about this. Currently target
specific code seems to always kick in so this is more of a theoretical issue rather
than a practical one now that X86 has been fixed.
[*] It's amazing how often people pass mega-byte long arrays by copy...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118275 91177308-0d34-0410-b5e6-96231b3b80d8
and as such can be represented by an MVT - the more complicated
EVT is not needed. Use MVT for ValVT everywhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118245 91177308-0d34-0410-b5e6-96231b3b80d8
This way, InlineSpiller does the same amount of splitting as the standard
spiller. Splitting should really be guided by the register allocator, and
doesn't belong in the spiller at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118216 91177308-0d34-0410-b5e6-96231b3b80d8
with a SimpleValueType, while an EVT supports equality and
inequality comparisons with SimpleValueType.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118169 91177308-0d34-0410-b5e6-96231b3b80d8
value type, so there is no point in passing it around using
an EVT. Use the simpler MVT everywhere. Rather than trying
to propagate this information maximally in all the code that
using the calling convention stuff, I chose to do a mainly
low impact change instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118167 91177308-0d34-0410-b5e6-96231b3b80d8
1. Fix pre-ra scheduler so it doesn't try to push instructions above calls to
"optimize for latency". Call instructions don't have the right latency and
this is more likely to use introduce spills.
2. Fix if-converter cost function. For ARM, it should use instruction latencies,
not # of micro-ops since multi-latency instructions is completely executed
even when the predicate is false. Also, some instruction will be "slower"
when they are predicated due to the register def becoming implicit input.
rdar://8598427
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118135 91177308-0d34-0410-b5e6-96231b3b80d8
breaker needs to check all definitions of the antidepenent register to
avoid multiple defs of the same new register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@118032 91177308-0d34-0410-b5e6-96231b3b80d8
BB#1: derived from LLVM BB %bb.nph28
Live Ins: %AL
Predecessors according to CFG: BB#0
TEST8rr %reg16384<kill>, %reg16384, %EFLAGS<imp-def>; GR8:%reg16384
JNE_4 <BB#2>, %EFLAGS<imp-use,kill>
JMP_4 <BB#2>
Successors according to CFG: BB#2 BB#2
These double CFG edges only ever occur in bugpoint-generated code, so there is
no need to attempt something clever.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117992 91177308-0d34-0410-b5e6-96231b3b80d8
It is legal for an instruction to have two operands using the same register,
only one a kill. This is interpreted as a kill.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117981 91177308-0d34-0410-b5e6-96231b3b80d8
source, and let rewrite() clean it up.
This way, kill flags on the inserted copies are fixed as well during rewrite().
We can't just assume that all the copies we insert are going to be kills since
critical edges into loop headers sometimes require both source and dest to be
live out of a block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117980 91177308-0d34-0410-b5e6-96231b3b80d8
At least X86FloatingPoint requires correct kill flags after register allocation,
and targets using register scavenging benefit. Conservative kill flags are not
enough.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117960 91177308-0d34-0410-b5e6-96231b3b80d8
at more than those which define CPSR. You can have this situation:
(1) subs ...
(2) sub r6, r5, r4
(3) movge ...
(4) cmp r6, 0
(5) movge ...
We cannot convert (2) to "subs" because (3) is using the CPSR set by
(1). There's an analogous situation here:
(1) sub r1, r2, r3
(2) sub r4, r5, r6
(3) cmp r4, ...
(5) movge ...
(6) cmp r1, ...
(7) movge ...
We cannot convert (1) to "subs" because of the intervening use of CPSR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117950 91177308-0d34-0410-b5e6-96231b3b80d8
When an instruction refers to a spill slot with a LiveStacks entry, check that
the spill slot is live at the instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117944 91177308-0d34-0410-b5e6-96231b3b80d8
looks like is happening:
Without the peephole optimizer:
(1) sub r6, r6, #32
orr r12, r12, lr, lsl r9
orr r2, r2, r3, lsl r10
(x) cmp r6, #0
ldr r9, LCPI2_10
ldr r10, LCPI2_11
(2) sub r8, r8, #32
(a) movge r12, lr, lsr r6
(y) cmp r8, #0
LPC2_10:
ldr lr, [pc, r10]
(b) movge r2, r3, lsr r8
With the peephole optimizer:
ldr r9, LCPI2_10
ldr r10, LCPI2_11
(1*) subs r6, r6, #32
(2*) subs r8, r8, #32
(a*) movge r12, lr, lsr r6
(b*) movge r2, r3, lsr r8
(1) is used by (x) for the conditional move at (a). (2) is used by (y) for the
conditional move at (b). After the peephole optimizer, these the flags resulting
from (1*) are ignored and only the flags from (2*) are considered for both
conditional moves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117876 91177308-0d34-0410-b5e6-96231b3b80d8
operand and one of them has a single use that is a live out copy, favor the
one that is live out. Otherwise it will be difficult to eliminate the copy
if the instruction is a loop induction variable update. e.g.
BB:
sub r1, r3, #1
str r0, [r2, r3]
mov r3, r1
cmp
bne BB
=>
BB:
str r0, [r2, r3]
sub r3, r3, #1
cmp
bne BB
This fixed the recent 256.bzip2 regression.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117675 91177308-0d34-0410-b5e6-96231b3b80d8
We don't want unused values forming their own equivalence classes, so we lump
them all together in one class, and then merge them with the class of the last
used value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117670 91177308-0d34-0410-b5e6-96231b3b80d8
in SSAUpdaterImpl.h
Verifying live intervals revealed that the old method was completely wrong, and
we need an iterative approach to calculating PHI placemant. Fortunately, we have
MachineDominators available, so we don't have to compute that over and over
like SSAUpdaterImpl.h must.
Live-out values are cached between calls to mapValue() and computed in a greedy
way, so most calls will be working with very small block sets.
Thanks to Bob for explaining how this should work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117599 91177308-0d34-0410-b5e6-96231b3b80d8
proper SSA updating.
This doesn't cause MachineDominators to be recomputed since we are already
requiring MachineLoopInfo which uses dominators as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117598 91177308-0d34-0410-b5e6-96231b3b80d8
There are currently 100 references to COFF::IMAGE_SCN in 6 files
and 11 different functions. Section to attribute mapping really
needs to happen in one place to avoid problems like this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117473 91177308-0d34-0410-b5e6-96231b3b80d8
Critical edges going into a loop are not as bad as critical exits. We can handle
them by splitting the critical edge, or by having both inside and outside
registers live out of the predecessor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117423 91177308-0d34-0410-b5e6-96231b3b80d8
memory, so a MachineMemOperand is useful (not propagated
into the MachineInstr yet). No functional change except
for dump output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117413 91177308-0d34-0410-b5e6-96231b3b80d8
the remainder register.
Example:
bb0:
x = 1
bb1:
use(x)
...
x = 2
jump bb1
When x is isolated in bb1, the inner part breaks into two components, x1 and x2:
bb0:
x0 = 1
bb1:
x1 = x0
use(x1)
...
x2 = 2
x0 = x2
jump bb1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117408 91177308-0d34-0410-b5e6-96231b3b80d8
do not double-count the duplicate instructions by counting once from the
beginning and again from the end. Keep track of where the duplicates from
the beginning ended and don't go past that point when counting duplicates
at the end. Radar 8589805.
This change causes one of the MC/ARM/simple-fp-encoding tests to produce
different (better!) code without the vmovne instruction being tested.
I changed the test to produce vmovne and vmoveq instructions but moving
between register files in the opposite direction. That's not quite the same
but predicated versions of those instructions weren't being tested before,
so at least the test coverage is not any worse, just different.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117333 91177308-0d34-0410-b5e6-96231b3b80d8
instructions separately from the count of non-predicated instructions. The
instruction count is used in places to determine how many instructions to
copy, predicate, etc. and things get confused if that count includes the
extra cost for microcoded ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117332 91177308-0d34-0410-b5e6-96231b3b80d8
2) live-outs.
Previously the post-RA schedulers completely ignore these dependencies since
returns, branches, etc. are all scheduling barriers. This patch model the
latencies between instructions being scheduled and the barriers. It also
handle calls by marking their register uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117193 91177308-0d34-0410-b5e6-96231b3b80d8
framework. It's purpose is not to improve register allocation per se,
but to make it easier to develop powerful live range splitting. I call
it the basic allocator because it is as simple as a global allocator
can be but provides the building blocks for sophisticated register
allocation with live range splitting.
A minimal implementation is provided that trivially spills whenever it
runs out of registers. I'm checking in now to get high-level design
and style feedback. I've only done minimal testing. The next step is
implementing a "greedy" allocation algorithm that does some register
reassignment and makes better splitting decisions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117174 91177308-0d34-0410-b5e6-96231b3b80d8
When a block has exactly two uses and the register is both live-in and live-out,
don't isolate the block. We would be inserting two copies, so we haven't really
made any progress.
If the live-in and live-out values separate into disconnected components after
splitting, we would be making progress. We can't detect that for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117169 91177308-0d34-0410-b5e6-96231b3b80d8
An exit block with a critical edge must only have predecessors in the loop, or
just before the loop. This guarantees that the inserted copies in the loop
predecessors dominate the exit block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@117144 91177308-0d34-0410-b5e6-96231b3b80d8
- Initial register pressure in the loop should be all the live defs into the
loop. Not just those from loop preheader which is often empty.
- When an instruction is hoisted, update register pressure from loop preheader
to the original BB.
- Treat only use of a virtual register as kill since the code is still SSA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116956 91177308-0d34-0410-b5e6-96231b3b80d8
operand, also check if subregisters are killed.
Add <imp-def> operands for subregisters that remain alive after a super register
is killed.
I don't have a testcase for this that reproduces on trunk. <rdar://problem/8441758>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116940 91177308-0d34-0410-b5e6-96231b3b80d8
setup they require. Use this for ARM/Darwin to rematerialize the base
pointer from the frame pointer when required. rdar://8564268
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116879 91177308-0d34-0410-b5e6-96231b3b80d8
Pull an unsigned out of the Contents union such that it has the same size as two
pointers and no padding.
Arrange members such that the Contents union and all pointers can be 8-byte
aligned without padding.
This speeds up code generation by 0.8% on a 64-bit host. 32-bit hosts should be
unaffected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116857 91177308-0d34-0410-b5e6-96231b3b80d8
must be called in the pass's constructor. This function uses static dependency declarations to recursively initialize
the pass's dependencies.
Clients that only create passes through the createFooPass() APIs will require no changes. Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.
I have tested this with all standard configurations of clang and llvm-gcc on Darwin. It is possible that there are problems
with the static dependencies that will only be visible with non-standard options. If you encounter any crash in pass
registration/creation, please send the testcase to me directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116820 91177308-0d34-0410-b5e6-96231b3b80d8
in MultiSource/Benchmarks/VersaBench/beamformer/beamformer.
SmallSet.insert returns true if the element is inserted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116790 91177308-0d34-0410-b5e6-96231b3b80d8
"long latency" enough to hoist even if it may increase spilling. Reloading
a value from spill slot is often cheaper than performing an expensive
computation in the loop. For X86, that means machine LICM will hoist
SQRT, DIV, etc. ARM will be somewhat aggressive with VFP and NEON
instructions.
- Enable register pressure aware machine LICM by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116781 91177308-0d34-0410-b5e6-96231b3b80d8
does normal initialization and normal chaining. Change the default
AliasAnalysis implementation to NoAlias.
Update StandardCompileOpts.h and friends to explicitly request
BasicAliasAnalysis.
Update tests to explicitly request -basicaa.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116720 91177308-0d34-0410-b5e6-96231b3b80d8
All registers created during splitting or spilling are assigned to the same
stack slot as the parent register.
When splitting or rematting, we may not spill at all. In that case the stack
slot is still assigned, but it will be dead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116546 91177308-0d34-0410-b5e6-96231b3b80d8
splitting or spillling, and to help with rematerialization.
Use LiveRangeEdit in InlineSpiller and SplitKit. This will eventually make it
possible to share remat code between InlineSpiller and SplitKit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116543 91177308-0d34-0410-b5e6-96231b3b80d8
Before we would also split around a loop if any peripheral block had multiple
uses. This could cause repeated splitting when splitting a different live range
would insert uses into the periphery.
Now -spiller=inline passes the nightly test suite again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116494 91177308-0d34-0410-b5e6-96231b3b80d8
perform initialization without static constructors AND without explicit initialization
by the client. For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve. I hope to be able to relax
the latter requirement in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116334 91177308-0d34-0410-b5e6-96231b3b80d8
LocalRewriter.
This is a bit of a hack that adds an implicit use operand to model the
read-modify-write nature of a partial redef. Uses and defs are rewritten in
separate passes, and a single operand would never be processed twice.
<rdar://problem/8518892>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116210 91177308-0d34-0410-b5e6-96231b3b80d8
functions: computeRemainder and rewrite.
When the remainder breaks up into multiple components, remember to rewrite those
uses as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116121 91177308-0d34-0410-b5e6-96231b3b80d8
Such a check does not make any sense in presense of inlining and other compiler-dependent stuff.
This should fix bunch of warnings on mingw32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116113 91177308-0d34-0410-b5e6-96231b3b80d8
implicit. e.g.
%D6<def>, %D7<def> = VLD1q16 %R2<kill>, 0, ..., %Q3<imp-def>
%Q1<def> = VMULv8i16 %Q1<kill>, %Q3<kill>, ...
The real definition indices are 0,1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116080 91177308-0d34-0410-b5e6-96231b3b80d8
connected components. These components should be allocated different virtual
registers because there is no reason for them to be allocated together.
Add the ConnectedVNInfoEqClasses class to calculate the connected components,
and move values to new LiveIntervals.
Use it from SplitKit::rewrite by creating new virtual registers for the
components.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@116006 91177308-0d34-0410-b5e6-96231b3b80d8
This function is intended to be used when inserting a machine instruction that
trivially restricts the legal registers, like LEA requiring a GR32_NOSP
argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115875 91177308-0d34-0410-b5e6-96231b3b80d8
allow target to correctly compute latency for cases where static scheduling
itineraries isn't sufficient. e.g. variable_ops instructions such as
ARM::ldm.
This also allows target without scheduling itineraries to compute operand
latencies. e.g. X86 can return (approximated) latencies for high latency
instructions such as division.
- Compute operand latencies for those defined by load multiple instructions,
e.g. ldm and those used by store multiple instructions, e.g. stm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@115755 91177308-0d34-0410-b5e6-96231b3b80d8