Commit Graph

11355 Commits

Author SHA1 Message Date
Eric Christopher
f7579ff174 Expand invalid return values for umulo and smulo. Handle these similarly
to add/sub by doing the normal operation and then checking for overflow
afterwards. This generally relies on the DAG handling the later invalid
operations as well.

Fixes the 64-bit part of rdar://8622122 and rdar://8774702.

llvm-svn: 123908
2011-01-20 08:54:28 +00:00
Evan Cheng
6dc21c7358 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.

llvm-svn: 123905
2011-01-20 08:34:58 +00:00
Andrew Trick
bf079d8831 Selection DAG scheduler register pressure heuristic fixes.
Added a check for already live regs before claiming HighRegPressure.
Fixed a few cases of checking the wrong number of successors.
Added some tracing until these heuristics are better understood.

llvm-svn: 123892
2011-01-20 06:21:59 +00:00
Jakob Stoklund Olesen
ea33059ff5 Check that a live range exists before shortening it. This fixes PR8989.
The live range may have been deleted earlier because of rematerialization.

llvm-svn: 123891
2011-01-20 06:20:02 +00:00
Jakob Stoklund Olesen
bb94da29b2 Add hidden -verify-coalescing to run the machine code verifier before and after
register coalescing.

llvm-svn: 123890
2011-01-20 06:20:00 +00:00
Jakob Stoklund Olesen
c387993232 Fix bug found by new clang warning.
llvm-svn: 123872
2011-01-20 02:43:19 +00:00
Eric Christopher
58f8058502 Use only one API at a time.
llvm-svn: 123866
2011-01-20 01:29:23 +00:00
Eric Christopher
1b0e5debb4 If we can, lower the multiply part of a umulo/smulo call to a libcall
with an invalid type then split the result and perform the overflow check
normally.

Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.

llvm-svn: 123864
2011-01-20 00:29:24 +00:00
Devang Patel
729c5e59af Fix debug info for merged global.
llvm-svn: 123862
2011-01-20 00:02:16 +00:00
Jakob Stoklund Olesen
69294ae8d7 Divert Hopfield network debug output. It is very noisy.
llvm-svn: 123859
2011-01-19 23:14:59 +00:00
Jakob Stoklund Olesen
c47bd85657 Don't accidentally leave small gaps in the live ranges when leaving the active
interval after an instruction. The leaveIntvAfter() method only adds liveness
from the instruction's boundary index to the inserted copy.

Ideally, SplitKit should be smarter about this, perhaps by combining useIntv()
and leaveIntvAfter() into one method that guarantees continuity.

llvm-svn: 123858
2011-01-19 23:14:56 +00:00
Devang Patel
574e10fa1e Fix register address expression. Patch by Ken Dyck.
llvm-svn: 123856
2011-01-19 23:04:47 +00:00
Jakob Stoklund Olesen
77738dd84e Implement RAGreedy::splitAroundRegion and remove loop splitting.
Region splitting includes loop splitting as a subset, and it is more generic.
The splitting heuristics for variables that are live in more than one block are
now:

1. Try to create a region that covers multiple basic blocks.
2. Try to create a new live range for each block with multiple uses.
3. Spill.

Steps 2 and 3 are similar to what the standard spiller is doing.

llvm-svn: 123853
2011-01-19 22:11:48 +00:00
Jakob Stoklund Olesen
c0ff5356d4 Add RAGreedy methods for splitting live ranges around regions.
Analyze the live range's behavior entering and leaving basic blocks. Compute an
interference pattern for each allocation candidate, and use SpillPlacement to
find an optimal region where that register can be live.

This code is still not enabled.

llvm-svn: 123774
2011-01-18 21:13:27 +00:00
Jeffrey Yasskin
5f5e1f5ef1 Remove unused variables found by gcc-4.6's -Wunused-but-set-variable.
llvm-svn: 123707
2011-01-18 00:51:23 +00:00
Stuart Hastings
f5f8318eb6 Remove checking that prevented overlapping CALLSEQ_START/CALLSEQ_END
ranges, add legalizer support for nested calls.  Necessary for ARM
byval support.  Radar 7662569.

llvm-svn: 123704
2011-01-18 00:09:27 +00:00
Benjamin Kramer
869dc645f1 Fix an off-by-one error in ctpop combining.
llvm-svn: 123664
2011-01-17 18:00:28 +00:00
Benjamin Kramer
e9488ed8eb Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0.
This shaves off 4 popcounts from the hacked 186.crafty source.

This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.

llvm-svn: 123621
2011-01-17 12:04:57 +00:00
Chris Lattner
c4d1d86d3e reapply my fix for PR8961 with a tweak to properly handle
multi-instruction sequences like calls.  Many thanks to Jakob for
finding a testcase.

llvm-svn: 123559
2011-01-16 02:27:38 +00:00
Benjamin Kramer
2e7ead5bb5 Add an assert so we don't silently miscompile ctpop for bit widths > 128.
llvm-svn: 123549
2011-01-15 21:19:37 +00:00
Benjamin Kramer
b48a048de6 Reimplement CTPOP legalization with the "best" algorithm from
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old
code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter,
especially when counting 64 bit population on a 32 bit target.

I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.

llvm-svn: 123547
2011-01-15 20:30:30 +00:00
Ted Kremenek
c9d2425c5a Update CMake build.
llvm-svn: 123491
2011-01-14 22:58:11 +00:00
Dan Gohman
a4f2631ea9 Delete an assignment to ThisBB which isn't needed, and tidy up some
comments.

llvm-svn: 123479
2011-01-14 22:26:16 +00:00
Anton Korobeynikov
1f9df99db1 Add a possibility to switch between CFI directives- and table-based frame description emission. Currently all the backends use table-based stuff.
llvm-svn: 123476
2011-01-14 21:58:08 +00:00
Anton Korobeynikov
ef11a77938 Add CFI directives-based frame information emission. Not hooked yet.
llvm-svn: 123474
2011-01-14 21:57:53 +00:00
Anton Korobeynikov
e53322ef91 Split stuff as a preparation for CFI directives-based frame information emission
llvm-svn: 123473
2011-01-14 21:57:45 +00:00
Andrew Trick
a0e69757d1 Support for precise scheduling of the instruction selection DAG,
disabled in this checkin. Sorry for the large diffs due to
refactoring. New functionality is all guarded by EnableSchedCycles.

Scheduling the isel DAG is inherently imprecise, but we give it a best
effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only
  if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to
  latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics
  for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to
  prioritize nodes by their depth rather than height. As long as it
  doesn't stall, height is irrelevant. Depth represents the critical
  path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
  adding them to the available queue.

Related Cleanup: most of the register reduction routines do not need
to be templates.

llvm-svn: 123468
2011-01-14 21:11:41 +00:00
Jakob Stoklund Olesen
9f5e00f957 Try for the third time to teach getFirstTerminator() about debug values.
This time let's rephrase to trick gcc-4.3 into not miscompiling.

llvm-svn: 123432
2011-01-14 06:33:45 +00:00
Jakob Stoklund Olesen
99ad62ed9e Revert r123419. It still breaks llvm-gcc-i386-linux-selfhost.
llvm-svn: 123423
2011-01-14 02:12:54 +00:00
Chris Lattner
a0074ca5fc Set the insertion point correctly for instructions generated by load folding:
they should go *before* the new instruction not after it. 

llvm-svn: 123420
2011-01-14 01:33:40 +00:00
Jakob Stoklund Olesen
3d8deb13ee Try again to teach getFirstTerminator() about debug values.
Fix some callers to better deal with debug values.

llvm-svn: 123419
2011-01-14 01:17:53 +00:00
Jakob Stoklund Olesen
b5e12bb37c Better terminator avoidance.
This approach also works when the terminator doesn't have a slot index. (Which
can happen??)

llvm-svn: 123413
2011-01-13 23:35:53 +00:00
Jakob Stoklund Olesen
d63287ff98 Temporary workaround for an i386 crash in LiveDebugVariables.
llvm-svn: 123400
2011-01-13 21:28:55 +00:00
Jakob Stoklund Olesen
0f2b9d9dc4 Teach frame lowering to ignore debug values after the terminators.
llvm-svn: 123399
2011-01-13 21:28:52 +00:00
Devang Patel
8e59113036 Speculatively revert r123384 to make llvm-gcc-i386-linux-selfhost buildbot happy.
llvm-svn: 123389
2011-01-13 19:27:50 +00:00
Jakob Stoklund Olesen
6aa35206e7 Teach MachineBasicBlock::getFirstTerminator to ignore debug values.
It will still return an iterator that points to the first terminator or end(),
but there may be DBG_VALUE instructions following the first terminator.

llvm-svn: 123384
2011-01-13 18:41:05 +00:00
Dan Gohman
f4ec824435 Fix r123346 to handle scalar types too.
llvm-svn: 123352
2011-01-13 01:06:51 +00:00
Jakob Stoklund Olesen
6cdcc6287b Add missing space in debug output
llvm-svn: 123351
2011-01-13 00:57:35 +00:00
Dan Gohman
5bbd766a7b Apply the patch from PR8958, which allows llc to get slightly
further on the associated testcase before aborting.

llvm-svn: 123346
2011-01-12 23:56:26 +00:00
Jakob Stoklund Olesen
3987889b61 Try again enabling LiveDebugVariables.
llvm-svn: 123342
2011-01-12 23:36:21 +00:00
Jakob Stoklund Olesen
953b1b115d Don't emit a DBG_VALUE for a spill slot that the rewriter decided not to use after all.
llvm-svn: 123339
2011-01-12 23:14:07 +00:00
Jakob Stoklund Olesen
48c7a5cf7e Fix braino in dominator tree walk.
llvm-svn: 123338
2011-01-12 23:14:04 +00:00
Jakob Stoklund Olesen
7a13190a2e Sometimes, old virtual registers can linger on DBG_VALUE instructions.
Make sure we don't crash in that case, but simply turn them into %noreg instead.

llvm-svn: 123335
2011-01-12 22:37:49 +00:00
Jakob Stoklund Olesen
59d3b89873 Teach VirtRegRewriter to update slot indexes when erasing instructions.
It was leaving dangling pointers in the slot index maps.

llvm-svn: 123334
2011-01-12 22:28:51 +00:00
Jakob Stoklund Olesen
8c5c268f05 Annotate VirtRegRewriter debug output with slot indexes.
llvm-svn: 123333
2011-01-12 22:28:48 +00:00
Jakob Stoklund Olesen
c1a042a528 Verify slot index ordering.
The slot indexes must be monotonically increasing through the function.

llvm-svn: 123324
2011-01-12 21:27:48 +00:00
Jakob Stoklund Olesen
764cce86f0 Verify that machine instruction parent pointers are consistent.
llvm-svn: 123322
2011-01-12 21:27:41 +00:00
Jakob Stoklund Olesen
1f7052b53b The world is not ready for LiveDebugVariables yet.
llvm-svn: 123290
2011-01-11 23:20:33 +00:00
Jakob Stoklund Olesen
d7a523358c Enable LiveDebugVariables by default.
llvm-svn: 123282
2011-01-11 22:45:28 +00:00
Jakob Stoklund Olesen
1cd577b435 Don't insert DBG_VALUE instructions after the first terminator.
For one, MachineBasicBlock::getFirstTerminator() doesn't understand what is
happening, and it also makes sense to have all control flow run through the
DBG_VALUE.

llvm-svn: 123277
2011-01-11 22:11:16 +00:00
Devang Patel
7b5cf4eafc Appropriately truncate debug info range in dwarf output.
This is not yet completely enabled.

llvm-svn: 123274
2011-01-11 21:42:10 +00:00
Eric Christopher
5a4d64216f Move ExpandAtomic into the integer expansion routines - it's only used there.
llvm-svn: 123202
2011-01-11 00:36:08 +00:00
Dale Johannesen
cd78621861 Fix PR 8916 (qv for analysis), at least the immediate problem.
There's an inherent tension in DAGCombine between assuming
that things will be put in canonical form, and the Depth
mechanism that disables transformations when recursion gets
too deep.  It would not surprise me if there's a lot of little
bugs like this one waiting to be discovered.  The mechanism
seems fragile and I'd suggest looking at it from a design viewpoint.

llvm-svn: 123191
2011-01-10 21:53:07 +00:00
Anton Korobeynikov
cf5967630b Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there.
llvm-svn: 123170
2011-01-10 12:39:04 +00:00
Chris Lattner
0e49a35bd2 fit in 80 cols and use MBB::isSuccessor instead of a hand
rolled std::find.

llvm-svn: 123164
2011-01-10 07:51:31 +00:00
Jakob Stoklund Olesen
32f1783ca1 Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic.
These functions not longer assert when passed 0, but simply return false instead.

No functional change intended.

llvm-svn: 123155
2011-01-10 02:58:51 +00:00
Jakob Stoklund Olesen
785d31a2d2 Remove MachineRegisterInfo::getLastVirtReg(), it was giving wrong results
when no virtual registers have been allocated.

It was only used to resize IndexedMaps, so provide an IndexedMap::resize()
method such that

 Map.grow(MRI.getLastVirtReg());

can be replaced with the simpler

 Map.resize(MRI.getNumVirtRegs());

This works correctly when no virtuals are allocated, and it bypasses the to/from
index conversions.

llvm-svn: 123130
2011-01-09 21:58:20 +00:00
Chris Lattner
f26e71fa4c sort this.
llvm-svn: 123129
2011-01-09 21:31:39 +00:00
Jakob Stoklund Olesen
957748e7ac Teach TargetRegisterInfo how to cram stack slot indexes in with the virtual and
physical register numbers.

This makes the hack used in LiveInterval official, and lets LiveInterval be
oblivious of stack slots.

The isPhysicalRegister() and isVirtualRegister() predicates don't know about
this, so when a variable may contain a stack slot, isStackSlot() should always
be tested first.

llvm-svn: 123128
2011-01-09 21:17:37 +00:00
Jakob Stoklund Olesen
0088b6ffb6 Add a forgotten VireReg2IndexFunctor.
llvm-svn: 123123
2011-01-09 18:58:33 +00:00
Cameron Zwarich
3e060bd398 Eliminate some extra hash table lookups.
llvm-svn: 123115
2011-01-09 10:54:21 +00:00
Cameron Zwarich
4625675112 Add an informative comment.
llvm-svn: 123114
2011-01-09 10:32:30 +00:00
Jakob Stoklund Olesen
d4dcf22b65 Simplify LiveDebugVariables by storing MachineOperand copies locations instead
of using a Location class with the same information.

When making a copy of a MachineOperand that was already stored in a
MachineInstr, it is necessary to clear the parent pointer on the copy. Otherwise
the register use-def lists become inconsistent.

Add MachineOperand::clearParent() to do that. An alternative would be a custom
MachineOperand copy constructor that cleared ParentMI. I didn't want to do that
because of the performance impact.

llvm-svn: 123109
2011-01-09 05:33:21 +00:00
Jakob Stoklund Olesen
c20baa8f1d Shrink a BitVector that didn't mean to store bits for all physical registers.
llvm-svn: 123108
2011-01-09 03:45:44 +00:00
Jakob Stoklund Olesen
ed53ab1635 Replace TargetRegisterInfo::printReg with a PrintReg class that also works without a TRI instance.
Print virtual registers numbered from 0 instead of the arbitrary
FirstVirtualRegister. The first virtual register is printed as %vreg0.
TRI::NoRegister is printed as %noreg.

llvm-svn: 123107
2011-01-09 03:05:53 +00:00
Jakob Stoklund Olesen
9a7e67d141 Use IndexedMap for MachineRegisterInfo as well. No functional change.
llvm-svn: 123106
2011-01-09 03:05:46 +00:00
Jakob Stoklund Olesen
f43442c9f7 Fix VirtRegMap to use TRI::index2VirtReg and TRI::virtReg2Index instead of
depending on TRI::FirstVirtualRegister.

Also use TRI::printReg instead of printing virtual registers directly.

llvm-svn: 123101
2011-01-08 23:11:07 +00:00
Jakob Stoklund Olesen
b04c78d5ea Fix a MachineVerifier loop that probably didn't mean to skip the last two
virtual registers.

llvm-svn: 123100
2011-01-08 23:11:02 +00:00
Jakob Stoklund Olesen
fb2b53c0de Use an IndexedMap for LiveVariables::VirtRegInfo.
Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow
iteration over virtual registers without depending on the representation of
virtual register numbers.

llvm-svn: 123098
2011-01-08 23:10:57 +00:00
Jakob Stoklund Olesen
b3820cdc22 Use an IndexedMap for LiveOutRegInfo to hide its dependence on TargetRegisterInfo::FirstVirtualRegister.
llvm-svn: 123096
2011-01-08 23:10:50 +00:00
Cameron Zwarich
33c137a88b Fix coding style.
llvm-svn: 123093
2011-01-08 22:36:53 +00:00
Cameron Zwarich
a40df277f1 Make more passes preserve dominators (or state that they preserve dominators if
they all ready do). This removes two dominator recomputations prior to isel,
which is a 1% improvement in total llc time for 403.gcc.

The only potentially suspect thing is making GCStrategy recompute dominators if
it used a custom lowering strategy.

llvm-svn: 123064
2011-01-08 17:01:52 +00:00
Evan Cheng
1afd04fc59 Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call.
llvm-svn: 123048
2011-01-08 01:24:27 +00:00
Evan Cheng
aa16fd02ad Do not model all INLINEASM instructions as having unmodelled side effects.
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.

This allows memory instructions to be moved around INLINEASM instructions.

llvm-svn: 123044
2011-01-07 23:50:32 +00:00
Devang Patel
d3ba97949a Speculatively revert r123032.
llvm-svn: 123039
2011-01-07 22:33:41 +00:00
Devang Patel
a52d6c216d Appropriately truncate debug info range in dwarf output.
Enable live debug variables pass.

llvm-svn: 123032
2011-01-07 21:30:41 +00:00
Evan Cheng
8b58b77d06 DBG_VALUE does not have any side effects; it also makes no sense to mark it cheap as a copy.
llvm-svn: 123031
2011-01-07 21:08:26 +00:00
Bob Wilson
22f18a7e94 Add ARM patterns to match EXTRACT_SUBVECTOR nodes.
Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle
vectors from being translated to EXTRACT_SUBVECTOR.
Patch by Tim Northover.

The test changes are needed to keep those spill-q tests from testing aligned
spills and restores.  If the only aligned stack objects are spill slots, we
no longer realign the stack frame.  Prior to this patch, an EXTRACT_SUBVECTOR
was legalized by loading from the stack, which created an aligned frame index.
Now, however, there is nothing except the spill slot in the stack frame, so
I added an aligned alloca.

llvm-svn: 122995
2011-01-07 04:59:04 +00:00
Bob Wilson
d9a324ac11 Fix a comment typo.
llvm-svn: 122994
2011-01-07 04:58:58 +00:00
Bob Wilson
bcbb3375dd Change EXTRACT_SUBVECTOR to require a constant index.
We were never generating any of these nodes with variable indices, and there
was one legalizer function asserting on a non-constant index.  If we ever have
a need to support variable indices, we can add this back again.

llvm-svn: 122993
2011-01-07 04:58:56 +00:00
Bill Wendling
0bf94c2188 Early exit if we don't have invokes. The 'Unwinds' vector isn't modified unless
we have invokes, so there is no functionality change here.

llvm-svn: 122990
2011-01-07 02:54:45 +00:00
Duncan Sands
06444485ee Fix the other problem reported in PR8582. Testcase and patch by
Nadav Rotem.

llvm-svn: 122983
2011-01-06 23:45:22 +00:00
Eric Christopher
16127008fd Add some fairly duplicated code to let type legalization split illegal
typed atomics. This will lower exclusively to libcalls at the moment.

llvm-svn: 122979
2011-01-06 22:28:56 +00:00
Devang Patel
7cb0e7c2ef Emit 128 bit constant.
This fixes PR 8913 crash.

llvm-svn: 122971
2011-01-06 21:39:25 +00:00
Evan Cheng
cb39cc2164 Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy
etc. takes an option OptSize. If OptSize is true, it would return
the inline limit for functions with attribute OptSize.

llvm-svn: 122952
2011-01-06 06:52:41 +00:00
Evan Cheng
70711ea54d Revert r122936. I'll re-implement the change.
llvm-svn: 122949
2011-01-06 06:17:53 +00:00
Jakob Stoklund Olesen
b3e7b27c1f Zap the last two -Wself-assign warnings in llvm.
Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there.

llvm-svn: 122940
2011-01-06 01:33:22 +00:00
Jakob Stoklund Olesen
7b1480ff12 Add the SpillPlacement analysis pass.
This pass precomputes CFG block frequency information that can be used by the
register allocator to find optimal spill code placement.

Given an interference pattern, placeSpills() will compute which basic blocks
should have the current variable enter or exit in a register, and which blocks
prefer the stack.

The algorithm is ready to consume block frequencies from profiling data, but for
now it gets by with the static estimates used for spill weights.

This is a work in progress and still not hooked up to RegAllocGreedy.

llvm-svn: 122938
2011-01-06 01:21:53 +00:00
Evan Cheng
d425aa5d2a r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing
up freebsd bootloader. However, this doesn't make much sense for Darwin, whose
-Os is meant to optimize for size only if it doesn't hurt performance.
rdar://8821501

llvm-svn: 122936
2011-01-06 01:04:47 +00:00
Evan Cheng
2af40ae781 Avoid zero extend bit test operands to pointer type if all the masks fit in
the original type of the switch statement key.
rdar://8781238

llvm-svn: 122935
2011-01-06 01:02:44 +00:00
Evan Cheng
bf92316fab Optimize:
r1025 = s/zext r1024, 4
  r1026 = extract_subreg r1025, 4
to:
  r1026 = copy r1024

llvm-svn: 122925
2011-01-05 23:06:49 +00:00
Jakob Stoklund Olesen
ce25984bae Add a hidden command line option to display edge bundle graphs as they are
calculated.

llvm-svn: 122912
2011-01-05 21:50:24 +00:00
Eric Christopher
651810d717 80-cols.
llvm-svn: 122909
2011-01-05 21:45:56 +00:00
Eric Christopher
be2382f9a6 Remove TODO, these appear to be implemented.
llvm-svn: 122849
2011-01-04 22:31:50 +00:00
Jakob Stoklund Olesen
abf8941a60 Turn the EdgeBundles class into a stand-alone machine CFG analysis pass.
The analysis will be needed by both the greedy register allocator and the
X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't
change.

This pass is very fast, usually showing up as 0.0% wall time.

llvm-svn: 122832
2011-01-04 21:10:05 +00:00
Cameron Zwarich
fce4db4cbe Switch to path halving from path compression for a small speedup. This also
makes getLeader() nonrecursive.

llvm-svn: 122811
2011-01-04 16:24:51 +00:00
Cameron Zwarich
2975ee7cc6 Eliminate repeated allocation of a per-BB DenseMap for a 4.6% reduction of time
spent in StrongPHIElimination on 403.gcc.

llvm-svn: 122803
2011-01-04 06:42:27 +00:00
Owen Anderson
9eeb0d483e Clean up a funky pass registration that got passed over when I got rid of static constructors.
llvm-svn: 122795
2011-01-04 00:55:21 +00:00
Cameron Zwarich
60ec113434 Use a RecyclingAllocator to allocate values for MachineCSE's ScopedHashTable for
a 28% speedup of MachineCSE time on 403.gcc.

llvm-svn: 122735
2011-01-03 04:07:46 +00:00
Chris Lattner
e396e846b4 split dom frontier handling stuff out to its own DominanceFrontier header,
so that Dominators.h is *just* domtree.  Also prune #includes a bit.

llvm-svn: 122714
2011-01-02 22:09:33 +00:00
Benjamin Kramer
a58b69aa9d Try to reuse the value when lowering memset.
This allows us to compile:
  void test(char *s, int a) {
    __builtin_memset(s, a, 15);
  }
into 1 mul + 3 stores instead of 3 muls + 3 stores.

llvm-svn: 122710
2011-01-02 19:57:05 +00:00
Benjamin Kramer
38491f47ce Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors.
We could implement a DAGCombine to turn x * 0x0101 back into logic operations
on targets that doesn't support the multiply or it is slow (p4) if someone cares
enough.

Example code:
  void test(char *s, int a) {
      __builtin_memset(s, a, 4);
  }
before:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    movl  %eax, %ecx
    shll  $8, %ecx
    orl %eax, %ecx
    movl  %ecx, %eax
    shll  $16, %eax
    orl %ecx, %eax
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret
after:
  _test:                                  ## @test
    movzbl  8(%esp), %eax
    imull $16843009, %eax, %eax   ## imm = 0x1010101
    movl  4(%esp), %ecx
    movl  %eax, 4(%ecx)
    movl  %eax, (%ecx)
    ret

llvm-svn: 122707
2011-01-02 19:44:58 +00:00
Cameron Zwarich
ae468579bb Use getVRegDef() instead of def_iterator. This leads to fewer defs being added
with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on
403.gcc.

llvm-svn: 122635
2010-12-30 00:42:23 +00:00
Cameron Zwarich
1e7124e6fa None of the other pass names in CodeGen have terminating periods.
llvm-svn: 122628
2010-12-29 11:49:10 +00:00
Cameron Zwarich
a7052a3c06 Instead of processing every instruction when splitting interferences, only
process those instructions that define phi sources. This is a 47% speedup of
StrongPHIElimination compile time on 403.gcc.

llvm-svn: 122627
2010-12-29 11:00:09 +00:00
Cameron Zwarich
292870da06 Add a missing word to a comment.
llvm-svn: 122625
2010-12-29 04:42:39 +00:00
Cameron Zwarich
6fc15ba38b Add text explaining an assertion.
llvm-svn: 122617
2010-12-29 03:52:51 +00:00
Cameron Zwarich
0fa638e27c Simplify some code in MachineVerifier that was doing the correct thing, but not
in the most obvious way.

llvm-svn: 122610
2010-12-28 23:45:38 +00:00
Cameron Zwarich
3eacb7fff8 Revert the optimization in r122596. It is correct for all current targets, but
it relies on assumptions that may not be true in the future.

llvm-svn: 122608
2010-12-28 23:02:56 +00:00
Cameron Zwarich
c9c7488542 Avoid iterating every operand of an instruction in StrongPHIElimination, since
we are only interested in the defs when discovering interferences.

This is a 28% speedup running StrongPHIElimination on 403.gcc.

llvm-svn: 122596
2010-12-28 10:49:33 +00:00
Duncan Sands
cc5a4497fd Pacify the compiler. BestWeight cannot in fact be used uninitialized
in this function, but the compiler was warning that it might be when
doing a release build.

llvm-svn: 122595
2010-12-28 10:07:15 +00:00
Cameron Zwarich
30f2239301 Change an assertion to assert what the code actually relies upon.
llvm-svn: 122586
2010-12-27 22:08:42 +00:00
Cameron Zwarich
cfdb10a1bb Land a first cut at StrongPHIElimination. There are only 5 new test failures
when running without the verifier, and I have not yet checked them to see if
the new results are still correct. There are more verifier failures, but they
all seem to be additional occurrences of verifier failures that occur with the
existing PHIElimination pass. There are a few obvious issues with the code:

1) It doesn't properly update the register equivalence classes during copy
insertion, and instead recomputes them before merging live intervals and
renaming registers. I wanted to keep this first patch simple for debugging
purposes, but it shouldn't be very hard to do this.

2) It doesn't mix the renaming and live interval merging with the copy insertion
process, which leads to a lot of virtual register churn. Virtual registers and
live intervals are created, only to later be merged into others. The code should
be smarter and only create a new virtual register if there is no existing
register in the same congruence class.

3) In one place the code uses a DenseMap per basic block, which is unnecessary
heap allocation. There should be an inline storage version of DenseMap.

I did a quick compile-time test of running llc on 403.gcc with and without
StrongPHIElimination. It is slightly slower with StrongPHIElimination, because
the small decrease in the coalescer runtime can't beat the increase in phi
elimination runtime. Perhaps fixing the above performance issues will narrow
the gap.

I also haven't yet run any tests of the quality of the generated code.

llvm-svn: 122582
2010-12-27 10:08:19 +00:00
Cameron Zwarich
66289e34e1 Add knowledge of phi-def and phi-kill valnos to MachineVerifier's predecessor
valno verification. The "Different value live out of predecessor" check is
incorrect in the case of phi-def valnos, so just skip that check for phi-def
valnos and instead check that all of the valnos for predecessors have phi-kill.
Fixes PR8863.

llvm-svn: 122581
2010-12-27 05:17:23 +00:00
Andrew Trick
dfa31b1cf9 Minor cleanup related to my latest scheduler changes.
llvm-svn: 122545
2010-12-24 07:10:19 +00:00
Andrew Trick
c926e98fc7 Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck.
llvm-svn: 122544
2010-12-24 06:46:50 +00:00
Andrew Trick
134b2a5907 Various bits of framework needed for precise machine-level selection
DAG scheduling during isel. Most new functionality is currently
guarded by -enable-sched-cycles and -enable-sched-hazard.

Added InstrItineraryData::IssueWidth field, currently derived from
ARM itineraries, but could be initialized differently on other targets.

Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is
active, and if so how many cycles of state it holds.

Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry
into the scheduler's available queue.

ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to
get information about it's SUnits, provides RecedeCycle for bottom-up
scheduling, correctly computes scoreboard depth, tracks IssueCount, and
considers potential stall cycles when checking for hazards.

ScheduleDAGRRList now models machine cycles and hazards (under
flags). It tracks MinAvailableCycle, drives the hazard recognizer and
priority queue's ready filter, manages a new PendingQueue, properly
accounts for stall cycles, etc.

llvm-svn: 122541
2010-12-24 05:03:26 +00:00
Andrew Trick
53f4556c64 whitespace
llvm-svn: 122539
2010-12-24 04:28:06 +00:00
Cameron Zwarich
a7ad357a13 Simplify a check for implicit defs and remove a FIXME.
llvm-svn: 122537
2010-12-24 03:09:36 +00:00
Chris Lattner
b607e7deda flags -> glue for selectiondag
llvm-svn: 122509
2010-12-23 17:24:32 +00:00
Chris Lattner
fb9ff7a4ff sdisel flag -> glue.
llvm-svn: 122507
2010-12-23 17:13:18 +00:00
Andrew Trick
ca2e267ddc Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue.
llvm-svn: 122491
2010-12-23 05:42:20 +00:00
Andrew Trick
c046a115d4 Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle.
llvm-svn: 122474
2010-12-23 04:16:14 +00:00
Andrew Trick
e48d5d8395 In CheckForLiveRegDef use TRI->getOverlaps.
llvm-svn: 122473
2010-12-23 03:43:21 +00:00
Andrew Trick
cc701bcfdc Fixes PR8823: add-with-overflow-128.ll
In the bottom-up selection DAG scheduling, handle two-address
instructions that read/write unspillable registers. Treat
the entire chain of two-address nodes as a single live range.

llvm-svn: 122472
2010-12-23 03:15:51 +00:00
Jeffrey Yasskin
a199652a3e Change all self assignments X=X to (void)X, so that we can turn on a
new gcc warning that complains on self-assignments and
self-initializations.

llvm-svn: 122458
2010-12-23 00:58:24 +00:00
Benjamin Kramer
49942a90b7 DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code.
example code:
unsigned foo(unsigned x, unsigned y) {
  if (x != 0) y--;
  return y;
}

before:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    sbbl  %eax, %eax              ## encoding: [0x19,0xc0]
    notl  %eax                    ## encoding: [0xf7,0xd0]
    addl  8(%esp), %eax           ## encoding: [0x03,0x44,0x24,0x08]
    ret                           ## encoding: [0xc3]

after:
  _foo:                           ## @foo
    cmpl  $1, 4(%esp)             ## encoding: [0x83,0x7c,0x24,0x04,0x01]
    movl  8(%esp), %eax           ## encoding: [0x8b,0x44,0x24,0x08]
    adcl  $-1, %eax               ## encoding: [0x83,0xd0,0xff]
    ret                           ## encoding: [0xc3]

llvm-svn: 122455
2010-12-22 23:17:45 +00:00
Jakob Stoklund Olesen
f761c75efb When RegAllocGreedy decides to spill the interferences of the current register,
pick the victim with the lowest total spill weight.

llvm-svn: 122445
2010-12-22 22:01:30 +00:00
Jakob Stoklund Olesen
71e527ef4b Include a shadow of the original CFG edges in the edge bundle graph.
llvm-svn: 122444
2010-12-22 22:01:28 +00:00
Chris Lattner
04ef853e23 Fix a bug in ReduceLoadWidth that wasn't handling extending
loads properly.  We miscompiled the testcase into:

_test:                                  ## @test
	movl	$128, (%rdi)
	movzbl	1(%rdi), %eax
	ret

Now we get a proper:

_test:                                  ## @test
	movl	$128, (%rdi)
	movsbl	(%rdi), %eax
	movzbl	%ah, %eax
	ret

This fixes PR8757.

llvm-svn: 122392
2010-12-22 08:02:57 +00:00
Chris Lattner
35fcc63498 more cleanups, move a check for "roundedness" earlier to reject
unhanded cases faster and simplify code.

llvm-svn: 122391
2010-12-22 08:01:44 +00:00
Chris Lattner
60dcb2b5c2 reduce indentation and improve comments, no functionality change.
llvm-svn: 122389
2010-12-22 07:36:50 +00:00
Andrew Trick
afec190a28 In DelayForLiveRegsBottomUp, handle instructions that read and write
the same physical register. Simplifies the fix from the previous
checkin r122211.

llvm-svn: 122370
2010-12-21 22:27:44 +00:00
Andrew Trick
1e3ad9f721 whitespace
llvm-svn: 122368
2010-12-21 22:25:04 +00:00
Dale Johannesen
e0fb87c3d7 Reapply 122353-122355 with fixes. 122354 was wrong;
the shift type was needed one place, the shift count
type another.  The transform in 123555 had the same
problem.

llvm-svn: 122366
2010-12-21 21:55:50 +00:00
Dale Johannesen
972aba543a Revert 122353-122355 for the moment, they broke stuff.
llvm-svn: 122360
2010-12-21 21:22:27 +00:00
Dale Johannesen
39186cfb0b Add a new transform to DAGCombiner.
llvm-svn: 122355
2010-12-21 20:10:51 +00:00
Dale Johannesen
5f3e7b08f6 Get the type of a shift from the shift, not from its shift
count operand.  These should be the same but apparently are
not always, and this is cleaner anyway.  This improves the
code in an existing test.

llvm-svn: 122354
2010-12-21 20:06:19 +00:00
Dale Johannesen
bad19334ee Shift by the word size is invalid IR; don't create it.
llvm-svn: 122353
2010-12-21 20:00:06 +00:00
Chris Lattner
8a3058137a fix some typos
llvm-svn: 122349
2010-12-21 18:05:22 +00:00
Stuart Hastings
fedc21e594 Fix indentation, add comment.
llvm-svn: 122345
2010-12-21 17:16:58 +00:00
Stuart Hastings
a1f786efa9 Missing logic for nested CALLSEQ_START/END.
llvm-svn: 122342
2010-12-21 17:07:24 +00:00
Cameron Zwarich
0243f1d21e Incremental progress towards a new implementation of StrongPHIElimination. Most
of the problems with my last attempt were in the updating of LiveIntervals
rather than the coalescing itself. Therefore, I decided to get that right first
by essentially reimplementing the existing PHIElimination using LiveIntervals.

It works correctly, with only a few tests failing (which may not be legitimate
failures) and no new verifier failures (at least as far as I can tell, I didn't
count the number per file).

llvm-svn: 122321
2010-12-21 06:54:43 +00:00
Chris Lattner
65c5243bd6 rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for
something that just glues two nodes together, even if it is
sometimes used for flags.

llvm-svn: 122310
2010-12-21 02:38:05 +00:00
Chris Lattner
b37e697277 improve "cannot yet select" errors a trivial amount: now
they are just as useless, but at least a bit more gramatical

llvm-svn: 122305
2010-12-21 02:07:03 +00:00
Jakob Stoklund Olesen
e9eb1be4dd Add EdgeBundles to SplitKit.
Edge bundles is an annotation on the CFG that turns it into a bipartite directed
graph where each basic block is connected to an outgoing and an ingoing bundle.
These bundles are useful for identifying regions of the CFG for live range
splitting.

llvm-svn: 122301
2010-12-21 01:50:21 +00:00
Jakob Stoklund Olesen
86786c46c2 Use IntEqClasses to compute connected components of live intervals.
llvm-svn: 122296
2010-12-21 00:48:17 +00:00
Dale Johannesen
036c3da142 Cosmetic changes.
llvm-svn: 122259
2010-12-20 20:10:50 +00:00
Cameron Zwarich
ad29bd5325 MachineVerifier should count landing pad successors as basic blocks rather than
out-edges. Fixes PR8824.

llvm-svn: 122228
2010-12-20 04:19:48 +00:00
Cameron Zwarich
6970ec515e Teach MachineVerifier that early clobber defs begin at USE slots and other defs
begin at DEF slots. Fixes the second half of PR8813.

llvm-svn: 122225
2010-12-20 03:15:20 +00:00
Cameron Zwarich
31af86ef44 Add a missing check from r122218.
llvm-svn: 122224
2010-12-20 02:59:51 +00:00
Chris Lattner
249e131f39 implement type legalization promotion support for SMULO and UMULO, giving
ARM (and other 32-bit-only) targets support for i8 and i16 overflow 
multiplies.  The generated code isn't great, but this at least fixes
CodeGen/Generic/overflow.ll when running on ARM hosts.

llvm-svn: 122221
2010-12-20 02:05:39 +00:00
Cameron Zwarich
bcd02fd9a4 Don't assume that an instruction ending a register's live range always reads
the register; it may be a dead def instead. Fixes PR8820.

llvm-svn: 122218
2010-12-20 01:22:37 +00:00
Chris Lattner
0f801998bf Fix a bug in the scheduler's handling of "unspillable" vregs.
Imagine we see:

EFLAGS = inst1
EFLAGS = inst2 FLAGS
gpr = inst3 EFLAGS

Previously, we would refuse to schedule inst2 because it clobbers
the EFLAGS of the predecessor.  However, it also uses the EFLAGS
of the predecessor, so it is safe to emit.  SDep edges ensure that
the right order happens already anyway.

This fixes 2 testsuite crashes with the X86 patch I'm going to
commit next.

llvm-svn: 122211
2010-12-20 00:55:43 +00:00
Chris Lattner
85875bf06b the result of CheckForLiveRegDef is dead, remove it.
llvm-svn: 122209
2010-12-20 00:51:56 +00:00
Chris Lattner
ee7fa0d706 reduce indentation, no functionality change.
llvm-svn: 122208
2010-12-20 00:50:16 +00:00
Cameron Zwarich
8c00d690f5 Ignore debug values when performing MachineVerifier liveness checks. Fixes
PR8822.

llvm-svn: 122207
2010-12-20 00:08:10 +00:00
Cameron Zwarich
c8dfbe7503 Early clobber operands are allowed to be defined at use indices. This fixes one
half of PR8813.

llvm-svn: 122205
2010-12-19 23:50:53 +00:00
Cameron Zwarich
6f5c1021ba Fix PR8815 by checking for an explicit clobber def tied to a use operand in
ConnectedVNInfoEqClasses::Classify().

llvm-svn: 122202
2010-12-19 22:12:45 +00:00
Cameron Zwarich
37aec9c35d Fix PR8811 by teaching MachineVerifier about optional defs.
llvm-svn: 122199
2010-12-19 21:37:23 +00:00
Cameron Zwarich
64fbc5e267 StrongPHIElimination will never run before TwoAddressInstructionPass.
llvm-svn: 122197
2010-12-19 21:32:29 +00:00
Nick Lewycky
c85935836b Add missing standard headers. Patch by Joerg Sonnenberger!
llvm-svn: 122193
2010-12-19 20:43:38 +00:00
Chris Lattner
92dcd2af36 teach MaskedValueIsZero how to analyze ADDE. This is
enough to teach it that ADDE(0,0) is known 0 except the 
low bit, for example.

llvm-svn: 122191
2010-12-19 20:38:28 +00:00
Cameron Zwarich
163792fb1f Remove some checks for StrongPHIElim. These checks make it impossible to use an
alternative register allocator that does not require LiveIntervals by specifying
it on the command-line for a target that has StrongPHIElimination enabled by
default.

These checks are pretty meaningless anyways, since StrongPHIElimination and
PHIElimination are never used at the same time.

llvm-svn: 122176
2010-12-19 18:03:27 +00:00
Chris Lattner
ac82ea26da fix PR8642: if a critical edge has a PHI value that can trap,
isel is *required* to split the edge.  PHI values get evaluated
on the edge, not in their predecessor block.

llvm-svn: 122170
2010-12-19 04:58:57 +00:00
Jakob Stoklund Olesen
bdf06d6c7b Apparently, operandices is not a word.
llvm-svn: 122135
2010-12-18 03:28:32 +00:00
Jakob Stoklund Olesen
e06ded7533 Teach the inline spiller to attempt folding a load instruction into its single
use before rematerializing the load.

This allows us to produce:

    addps	LCPI0_1(%rip), %xmm2

Instead of:

    movaps	LCPI0_1(%rip), %xmm3
    addps	%xmm3, %xmm2

Saving a register and an instruction. The standard spiller already knows how to
do this.

llvm-svn: 122133
2010-12-18 03:04:14 +00:00
Jakob Stoklund Olesen
485b7965b3 Tweak debug spew.
llvm-svn: 122132
2010-12-18 03:04:11 +00:00
Jakob Stoklund Olesen
a2f2eab8d4 Check that the register is live-in to the loop header before inserting copies in
the loop predecessors.

The register can be live-out from a predecessor without being live-in to the
loop header if there is a critical edge from the predecessor.

llvm-svn: 122123
2010-12-18 01:06:19 +00:00
Nick Lewycky
30eef45106 Fix GCC warning:
lib/CodeGen/RegAllocGreedy.cpp:311: error: unused variable 'PhysReg' [-Wunused-variable]

llvm-svn: 122122
2010-12-18 01:05:55 +00:00
Jakob Stoklund Olesen
2879da5e13 Pass a Banner argument to the machine code verifier both from
createMachineVerifierPass and MachineFunction::verify.

The banner is printed before the machine code dump, just like the printer pass.

llvm-svn: 122113
2010-12-18 00:06:56 +00:00
Jakob Stoklund Olesen
6498db2c8c Avoid dereferencing end() in collectInterferingVRegs() when there is no
interference.

llvm-svn: 122108
2010-12-17 23:16:38 +00:00
Jakob Stoklund Olesen
db4b62f32e Make the -verify-regalloc command line option available to base classes as
RegAllocBase::VerifyEnabled.

Run the machine code verifier in a few interesting places during RegAllocGreedy.

llvm-svn: 122107
2010-12-17 23:16:35 +00:00
Jakob Stoklund Olesen
df9e162423 Enable loop splitting in RegAllocGreedy.
The heuristics split around the largest loop where the current register may be
allocated without interference.

llvm-svn: 122106
2010-12-17 23:16:32 +00:00
Bill Wendling
c16f9b1ccc During local stack slot allocation, the materializeFrameBaseRegister function
may be called. If the entry block is empty, the insertion point iterator will be
the "end()" value. Calling ->getParent() on it (among others) causes problems.

Modify materializeFrameBaseRegister to take the machine basic block and insert
the frame base register at the beginning of that block. (It's very similar to
what the code does all ready. The only difference is that it will always insert
at the beginning of the entry block instead of after a previous materialization
of the frame base register. I doubt that that matters here.)

<rdar://problem/8782198>

llvm-svn: 122104
2010-12-17 23:09:14 +00:00
Bob Wilson
c57c2d755b Fix a DAGCombiner crash when folding binary vector operations with constant
BUILD_VECTOR operands where the element type is not legal.  I had previously
changed this code to insert TRUNCATE operations, but that was just wrong.

llvm-svn: 122102
2010-12-17 23:06:49 +00:00
Dale Johannesen
c2c6ebd82a Add a transform to DAG Combiner. This improves the
code for the case where 32-bit divide by constant is
turned into 64-bit multiply by constant.  8771012.

llvm-svn: 122090
2010-12-17 21:45:49 +00:00
Jakob Stoklund Olesen
0dc90e6b1b Allow missing kill flags on an untied operand of a two-address instruction when
the operand uses the same register as a tied operand:

  %r1 = add %r1, %r1

If add were a three-address instruction, kill flags would be required on at
least one of the uses. Since it is a two-address instruction, the tied use
operand must not have a kill flag.

This change makes the kill flag on the untied use operand optional.

llvm-svn: 122082
2010-12-17 19:18:41 +00:00
Jakob Stoklund Olesen
f4a0c81371 Add MachineLoopRange comparators for sorting loop lists by number and by area.
llvm-svn: 122073
2010-12-17 18:13:52 +00:00
Jakob Stoklund Olesen
40f23cd5ca Provide LiveIntervalUnion::Query::checkLoopInterference.
This is a three-way interval list intersection between a virtual register, a
live interval union, and a loop. It will be used to identify interference-free
loops for live range splitting.

llvm-svn: 122034
2010-12-17 04:09:47 +00:00
Bob Wilson
e06f6eabe7 Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation.
Radar 8776599

llvm-svn: 122018
2010-12-17 01:21:12 +00:00
Bob Wilson
f6fe49f7d2 Fix a comment typo.
llvm-svn: 122016
2010-12-17 01:21:05 +00:00
Daniel Dunbar
8ab9be2005 MC: Make TargetAsmBackend available to the AsmStreamer.
- Treaty talks on the non-proliferation of MC objects broke down.

llvm-svn: 121949
2010-12-16 03:05:59 +00:00
Jakob Stoklund Olesen
1811e4cb20 Start using SplitKit and MachineLoopRanges in RegAllocGreedy in preparation of
live range splitting around loops guided by register pressure.

So far, trySplit() simply prints a lot of debug output.

llvm-svn: 121918
2010-12-15 23:46:13 +00:00
Jakob Stoklund Olesen
d40af5ffbd Add MachineLoopRanges analysis.
A MachineLoopRange contains the intervals of slot indexes covered by the blocks
in a loop. This representation of the loop blocks is more efficient to compare
against interfering registers during register coalescing.

llvm-svn: 121917
2010-12-15 23:41:23 +00:00
Evan Cheng
68e1ed8752 Teach machine cse to commute instructions.
llvm-svn: 121903
2010-12-15 22:16:21 +00:00
Dan Gohman
295ba3ab26 Move Value::getUnderlyingObject to be a standalone
function so that it can live in Analysis instead of
VMCore.

llvm-svn: 121885
2010-12-15 20:02:24 +00:00
Jakob Stoklund Olesen
3cfea82733 Fix build.
llvm-svn: 121872
2010-12-15 18:07:48 +00:00
Jakob Stoklund Olesen
308656b955 Detect and enumerate bypass loops.
Bypass loops have the current live range live through, but contain no uses or
defs. Splitting around a bypass loop can free registers for other uses inside
the loop by spilling the split range.

llvm-svn: 121871
2010-12-15 17:49:52 +00:00
Jakob Stoklund Olesen
849388944e Separate SplitAnalysis::getSplitLoops().
This method returns the set of loops with uses that are candidates for
splitting.

llvm-svn: 121870
2010-12-15 17:41:19 +00:00
Chris Lattner
81815cd4db take care of some todos, transforming [us]mul_lohi into
a wider mul if the wider mul is legal.

llvm-svn: 121848
2010-12-15 06:04:19 +00:00
Chris Lattner
746d6a1f60 when transforming a MULHS into a wider MUL, there is no need to SRA the
result, the top bits are truncated off anyway, just use SRL.

llvm-svn: 121846
2010-12-15 05:51:39 +00:00
Jakob Stoklund Olesen
48800c9689 Simplify RegAllocGreedy's use of register aliases.
llvm-svn: 121807
2010-12-14 23:38:19 +00:00
Jakob Stoklund Olesen
7ee6f83da1 Simplify CCState's use of register aliases.
llvm-svn: 121806
2010-12-14 23:28:01 +00:00
Jakob Stoklund Olesen
03856151db Simplify AggressiveAntiDepBreaker's use of register aliases.
llvm-svn: 121805
2010-12-14 23:23:15 +00:00
Jakob Stoklund Olesen
870fbdd686 Simplyfy RegAllocBasic by using getOverlaps instead of getAliasSet.
llvm-svn: 121801
2010-12-14 23:10:48 +00:00
Evan Cheng
7e96e67d98 Fix a minor bug in two-address pass. It was missing a commute opportunity.
regB = move RCX
regA = op regB, regC
RAX  = move regA
where both regB and regC are killed. If regB is constrainted to non-compatible
physical registers but regC is not constrainted at all, then it's better to
commute the instruction.
       movl    %edi, %eax
       shlq    $32, %rcx
       leaq    (%rcx,%rax), %rax
=>
       movl    %edi, %eax
       shlq    $32, %rcx
       orq     %rcx, %rax
rdar://8762995

llvm-svn: 121793
2010-12-14 21:34:53 +00:00
Matt Beaumont-Gay
01264443a8 Move debugging code entirely within DEBUG(). Silences an unused variable
warning in the opt build.

llvm-svn: 121791
2010-12-14 21:14:55 +00:00
Jakob Stoklund Olesen
c13ce4748e Add LiveIntervalUnion print methods, RegAllocGreedy::trySplit debug spew.
llvm-svn: 121783
2010-12-14 19:38:49 +00:00
Jakob Stoklund Olesen
c5ad05ca30 Use TRI::printReg instead of AbstractRegisterDescription when printing
LiveIntervalUnions.

llvm-svn: 121781
2010-12-14 18:53:47 +00:00
Jakob Stoklund Olesen
74ba8b77e6 Q.seenAllInterferences() must be called after Q.collectInterferingVRegs().
llvm-svn: 121774
2010-12-14 17:47:36 +00:00
Jakob Stoklund Olesen
d4d3c5dd1e Remove unused vector.
llvm-svn: 121741
2010-12-14 00:58:47 +00:00
Jakob Stoklund Olesen
00d6ac22d0 Try reassigning all virtual register interferences, not just those with lower
spill weight. Filter out fixed registers instead.

Add support for reassigning an interference that was assigned to an alias.

llvm-svn: 121737
2010-12-14 00:37:49 +00:00
Jakob Stoklund Olesen
ffefd5bd4e Add stub for RAGreedy::trySplit.
llvm-svn: 121736
2010-12-14 00:37:44 +00:00
Chris Lattner
14810c808b Add a couple dag combines to transform mulhi/mullo into a wider multiply
when the wider type is legal.  This allows us to compile:

define zeroext i16 @test1(i16 zeroext %x) nounwind {
entry:
	%div = udiv i16 %x, 33
	ret i16 %div
}

into:

test1:                                  # @test1
	movzwl	4(%esp), %eax
	imull	$63551, %eax, %eax      # imm = 0xF83F
	shrl	$21, %eax
	ret

instead of:

test1:                                  # @test1
        movw    $-1985, %ax             # imm = 0xFFFFFFFFFFFFF83F
        mulw    4(%esp)
        andl    $65504, %edx            # imm = 0xFFE0
        movl    %edx, %eax
        shrl    $5, %eax
        ret

Implementing rdar://8760399 and example #4 from:
http://blog.regehr.org/archives/320

We should implement the same thing for [su]mul_hilo, but I don't
have immediate plans to do this.

llvm-svn: 121696
2010-12-13 08:39:01 +00:00
Chris Lattner
324f849088 remove the verbose-asm "constant pool double" comments that we were printing
for each constant pool entry.  Using WriteTypeSymbolic here takes time
proportional to the size of the module, for each constant pool entry.

This speeds up -verbose-asm llc on 252.eon (a random testcase at my disposal)
from 4.4s to 2.137s.  llc takes 2.11s with asm-verbose off, so this is now a
pretty reasonable cost for verbose comments.

llvm-svn: 121691
2010-12-13 07:35:47 +00:00
Chris Lattner
6df4d5d88e reduce indentation by using continue, no functionality change.
llvm-svn: 121662
2010-12-13 01:11:17 +00:00
Duncan Sands
47a4bbd31d Catch attempts to remove a deleted node from the CSE maps. Better to
catch this here rather than later after accessing uninitialized memory
etc.  Fires when compiling the testcase in PR8237.

llvm-svn: 121635
2010-12-12 13:22:50 +00:00
Jakob Stoklund Olesen
a523d5f048 Add named timer groups for the different stages of register allocation.
llvm-svn: 121604
2010-12-11 00:19:56 +00:00
Jakob Stoklund Olesen
ef80efea1d Move MRI into RegAllocBase. Clean up debug output a bit.
llvm-svn: 121599
2010-12-10 23:49:00 +00:00
Nick Lewycky
46a6ed1f0f Remove extraneous close parenthesis.
Fix build breakage.

llvm-svn: 121596
2010-12-10 23:14:35 +00:00
Nick Lewycky
9afbedbc48 Move variable that's unused in an NDEBUG build inside the DEBUG() macro, fixing
lib/CodeGen/RegAllocGreedy.cpp:233: error: unused variable 'TRC' [-Wunused-variable]

llvm-svn: 121594
2010-12-10 23:05:10 +00:00
Jakob Stoklund Olesen
6cd6e644e7 Force the greedy register allocator to always use the inline spiller.
Soon, RegAllocGreedy will start splitting live ranges, and then deferred
spilling won't work anyway.

llvm-svn: 121591
2010-12-10 22:54:44 +00:00
Jakob Stoklund Olesen
cbd4bac09d Rip out live range splitting support from the inline spiller.
The spiller should only spill. The register allocator will drive live range
splitting, it has the needed information about register pressure and
interferences.

llvm-svn: 121590
2010-12-10 22:54:40 +00:00
Jakob Stoklund Olesen
5ab6552845 Use AllocationOrder in RegAllocGreedy, fix a bug in the hint calculation.
llvm-svn: 121584
2010-12-10 22:21:05 +00:00
Jakob Stoklund Olesen
ea59381fc8 Fix miscompilation caused by trivial logic error in the reassignVReg()
interference check.

llvm-svn: 121519
2010-12-10 20:45:04 +00:00
Jakob Stoklund Olesen
e3924a3c85 Add an AllocationOrder class that can iterate over the allocatable physical
registers for a given virtual register.

Reserved registers are filtered from the allocation order, and any valid hint is
returned as the first suggestion.

For target dependent hints, a number of arcane target hooks are invoked.

llvm-svn: 121497
2010-12-10 18:36:02 +00:00
Rafael Espindola
0e665e502d Fixed version of 121434 with no new memory leaks.
llvm-svn: 121471
2010-12-10 07:39:47 +00:00
Rafael Espindola
011e168728 Revert my previous patch to make the valgrind bots happy.
llvm-svn: 121461
2010-12-10 04:01:09 +00:00
Rafael Espindola
03ad1e8f1f Initial support for the cfi directives. This is just enough to get
f:
        .cfi_startproc
        nop
        .cfi_endproc

assembled (on ELF).

llvm-svn: 121434
2010-12-09 23:48:29 +00:00
Stuart Hastings
f7bba0cfe3 Initial support for nested CALLSEQ_START/CALLSEQ_END constructs in LegalizeDAG.
Necessary for byval support on ARM.  Radar 7662569.

llvm-svn: 121412
2010-12-09 21:25:20 +00:00
Jakob Stoklund Olesen
fe4b9ee934 Remember to filter out reserved rergisters from the allocation order.
llvm-svn: 121411
2010-12-09 21:20:46 +00:00
Jakob Stoklund Olesen
5bb5c67227 Add a forgotten initializer for CheckedFirstInterference.
llvm-svn: 121410
2010-12-09 21:20:44 +00:00
Andrew Trick
ec37b93b07 Added register reassignment prototype to RAGreedy. It's a simple
heuristic to reshuffle register assignments when we can't find an
available reg.

llvm-svn: 121388
2010-12-09 18:15:21 +00:00
Eric Christopher
ebd7ab9857 80-col fixups.
llvm-svn: 121356
2010-12-09 04:48:06 +00:00
Jakob Stoklund Olesen
17b2e8c293 IntervalMap iterators are heavyweight, so avoid copying them around and use
references instead.

Similarly, IntervalMap::begin() is almost as expensive as find(), so use find(x)
instead of begin().advanceTo(x);

This makes RegAllocBasic run another 5% faster.

llvm-svn: 121344
2010-12-09 01:06:52 +00:00
Devang Patel
b3e0d80b1f DW_FORM_data1 may not provide sufficient room for vtable index, use _udata instead.
This fixes radar 8730409.

llvm-svn: 121323
2010-12-09 00:10:40 +00:00
Jakob Stoklund Olesen
ffc0f6586a Properly deal with empty intervals when checking for interference.
llvm-svn: 121319
2010-12-08 23:51:35 +00:00
Jakob Stoklund Olesen
37df3f04c2 Implement very primitive hinting support in RegAllocGreedy.
The hint is simply tried first and then forgotten if it couldn't be allocated
immediately.

llvm-svn: 121306
2010-12-08 22:57:16 +00:00
Jakob Stoklund Olesen
3c81b6a50b Store (priority,regnum) pairs in the priority queue instead of providing an
abstract priority queue interface in subclasses that want to override the
priority calculations.

Subclasses must provide a getPriority() implementation instead.

This approach requires less code as long as priorities are expressable as simple
floats, and it avoids the dangers of defining potentially expensive priority
comparison functions.

It also should speed up priority_queue operations since they no longer have to
chase pointers when comparing registers. This is not measurable, though.

Preferably, we shouldn't use floats to guide code generation. The use of floats
here is derived from the use of floats for spill weights. Spill weights have a
dynamic range that doesn't lend itself easily to a fixpoint implementation.

When someone invents a stable spill weight representation, it can be reused for
allocation priorities.

llvm-svn: 121294
2010-12-08 22:22:41 +00:00
Eric Christopher
d492f798d1 Reword comment slightly.
llvm-svn: 121293
2010-12-08 22:21:42 +00:00
Eric Christopher
77d3a7b3fb Fix comment.
llvm-svn: 121285
2010-12-08 21:35:09 +00:00
Jakob Stoklund Olesen
f04d283db1 Trim includes.
llvm-svn: 121283
2010-12-08 21:12:00 +00:00
Andrew Trick
fb72ca2129 Generalize PostRAHazardRecognizer so it can be used in any pass for
both forward and backward scheduling. Rename it to
ScoreboardHazardRecognizer (Scoreboard is one word). Remove integer
division from the scoreboard's critical path.

llvm-svn: 121274
2010-12-08 20:04:29 +00:00
Jakob Stoklund Olesen
d638b989f2 Stub out RegAllocGreedy.
This new register allocator is initially identical to RegAllocBasic, but it will
receive all of the tricks that RegAllocBasic won't get.

RegAllocGreedy will eventually replace linear scan.

llvm-svn: 121234
2010-12-08 03:26:16 +00:00
Jakob Stoklund Olesen
77e7ad803a Move RABasic::addMBBLiveIns to the base class, it is generally useful.
Minor optimization to the use of IntervalMap iterators. They are fairly
heavyweight, so prefer SI.valid() over SI != end().

llvm-svn: 121217
2010-12-08 01:06:06 +00:00
Jakob Stoklund Olesen
9d6472e894 Switch LiveIntervalUnion from std::set to IntervalMap.
This speeds up RegAllocBasic by 20%, not counting releaseMemory which becomes
way faster.

llvm-svn: 121201
2010-12-07 23:18:47 +00:00
Jakob Stoklund Olesen
39e22e19bf Simplify assertion.
llvm-svn: 121162
2010-12-07 18:51:27 +00:00
Jay Foad
79e18ed269 PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and
zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method
trunc(), to be const and to return a new value instead of modifying the
object in place.

llvm-svn: 121120
2010-12-07 08:25:19 +00:00
Jakob Stoklund Olesen
48ba44334f Remove unused member.
llvm-svn: 121098
2010-12-07 01:32:45 +00:00
Devang Patel
12459bc442 Undefined value in reg 0 may need a marker to identify end of source range.
This will be used to truncate live range of DBG_VALUE instruction by register allocator and friends.

llvm-svn: 121061
2010-12-06 22:48:22 +00:00
Devang Patel
6fe7fe8dd4 If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0.
llvm-svn: 121059
2010-12-06 22:39:26 +00:00
Rafael Espindola
3e954d16f4 Second try at making direct object emission produce the same results
as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc
bootstrap on darwin10 using darwin9's assembler and linker.

llvm-svn: 121006
2010-12-06 17:27:56 +00:00
Rafael Espindola
4ec917db9b Revert previous two patches while I try to find out how to make both
linux and darwin assemblers happy :-(

llvm-svn: 121004
2010-12-06 15:35:15 +00:00
Rafael Espindola
3dc2b4cba7 Add an EmitAbsValue helper method and use it in cases where we want to be sure
that no relocations are used (on MochO).
Fixes llc producing different output from llc + llvm-mc.

llvm-svn: 121000
2010-12-06 14:53:14 +00:00
Cameron Zwarich
f56ba80bb2 Some cleanup before I start committing some incremental progress on
StrongPHIElimination.

llvm-svn: 120961
2010-12-05 22:34:08 +00:00
Cameron Zwarich
f64c26bb9e Remove the PHIElimination.h header, as it is no longer needed.
llvm-svn: 120959
2010-12-05 21:39:42 +00:00
Cameron Zwarich
fbe9e91d97 I forgot to actually remove the FindCopyInsertPoint() declaration from
PHIElimination.h.

llvm-svn: 120953
2010-12-05 19:58:57 +00:00
Cameron Zwarich
cb613dcf69 Remove the SplitCriticalEdge() method declaration from PHIElimination.h. At one
time, this method existed, but now PHIElimination uses the method of the same
name on MachineBasicBlock.

llvm-svn: 120952
2010-12-05 19:54:23 +00:00
Cameron Zwarich
c680f44c1b Move the FindCopyInsertPoint method of PHIElimination to a new standalone
function so that it can be shared with StrongPHIElimination.

llvm-svn: 120951
2010-12-05 19:51:05 +00:00