llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-05 18:49:06 +00:00

Author	SHA1	Message	Date
Eric Christopher	f7579ff174	Expand invalid return values for umulo and smulo. Handle these similarly to add/sub by doing the normal operation and then checking for overflow afterwards. This generally relies on the DAG handling the later invalid operations as well. Fixes the 64-bit part of rdar://8622122 and rdar://8774702. llvm-svn: 123908	2011-01-20 08:54:28 +00:00
Evan Cheng	6dc21c7358	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevented CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look past some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allows machine LICM to hoist the set of instructions out of the loop and makes it possible to CSE them. It's a bit hacky, but it significantly improves code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperforms the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Andrew Trick	bf079d8831	Selection DAG scheduler register pressure heuristic fixes. Added a check for already live regs before claiming HighRegPressure. Fixed a few cases of checking the wrong number of successors. Added some tracing until these heuristics are better understood. llvm-svn: 123892	2011-01-20 06:21:59 +00:00
Jakob Stoklund Olesen	ea33059ff5	Check that a live range exists before shortening it. This fixes PR8989. The live range may have been deleted earlier because of rematerialization. llvm-svn: 123891	2011-01-20 06:20:02 +00:00
Jakob Stoklund Olesen	bb94da29b2	Add hidden -verify-coalescing to run the machine code verifier before and after register coalescing. llvm-svn: 123890	2011-01-20 06:20:00 +00:00
Jakob Stoklund Olesen	c387993232	Fix bug found by new clang warning. llvm-svn: 123872	2011-01-20 02:43:19 +00:00
Eric Christopher	58f8058502	Use only one API at a time. llvm-svn: 123866	2011-01-20 01:29:23 +00:00
Eric Christopher	1b0e5debb4	If we can, lower the multiply part of a umulo/smulo call to a libcall with an invalid type then split the result and perform the overflow check normally. Fixes the 32-bit parts of rdar://8622122 and rdar://8774702. llvm-svn: 123864	2011-01-20 00:29:24 +00:00
Devang Patel	729c5e59af	Fix debug info for merged global. llvm-svn: 123862	2011-01-20 00:02:16 +00:00
Jakob Stoklund Olesen	69294ae8d7	Divert Hopfield network debug output. It is very noisy. llvm-svn: 123859	2011-01-19 23:14:59 +00:00
Jakob Stoklund Olesen	c47bd85657	Don't accidentally leave small gaps in the live ranges when leaving the active interval after an instruction. The leaveIntvAfter() method only adds liveness from the instruction's boundary index to the inserted copy. Ideally, SplitKit should be smarter about this, perhaps by combining useIntv() and leaveIntvAfter() into one method that guarantees continuity. llvm-svn: 123858	2011-01-19 23:14:56 +00:00
Devang Patel	574e10fa1e	Fix register address expression. Patch by Ken Dyck. llvm-svn: 123856	2011-01-19 23:04:47 +00:00
Jakob Stoklund Olesen	77738dd84e	Implement RAGreedy::splitAroundRegion and remove loop splitting. Region splitting includes loop splitting as a subset, and it is more generic. The splitting heuristics for variables that are live in more than one block are now: 1. Try to create a region that covers multiple basic blocks. 2. Try to create a new live range for each block with multiple uses. 3. Spill. Steps 2 and 3 are similar to what the standard spiller is doing. llvm-svn: 123853	2011-01-19 22:11:48 +00:00
Jakob Stoklund Olesen	c0ff5356d4	Add RAGreedy methods for splitting live ranges around regions. Analyze the live range's behavior entering and leaving basic blocks. Compute an interference pattern for each allocation candidate, and use SpillPlacement to find an optimal region where that register can be live. This code is still not enabled. llvm-svn: 123774	2011-01-18 21:13:27 +00:00
Jeffrey Yasskin	5f5e1f5ef1	Remove unused variables found by gcc-4.6's -Wunused-but-set-variable. llvm-svn: 123707	2011-01-18 00:51:23 +00:00
Stuart Hastings	f5f8318eb6	Remove checking that prevented overlapping CALLSEQ_START/CALLSEQ_END ranges, add legalizer support for nested calls. Necessary for ARM byval support. Radar 7662569. llvm-svn: 123704	2011-01-18 00:09:27 +00:00
Benjamin Kramer	869dc645f1	Fix an off-by-one error in ctpop combining. llvm-svn: 123664	2011-01-17 18:00:28 +00:00
Benjamin Kramer	e9488ed8eb	Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0. This shaves off 4 popcounts from the hacked 186.crafty source. This is enabled even when a native popcount instruction is available. The combined code is one operation longer but it should be faster nevertheless. llvm-svn: 123621	2011-01-17 12:04:57 +00:00
Chris Lattner	c4d1d86d3e	reapply my fix for PR8961 with a tweak to properly handle multi-instruction sequences like calls. Many thanks to Jakob for finding a testcase. llvm-svn: 123559	2011-01-16 02:27:38 +00:00
Benjamin Kramer	2e7ead5bb5	Add an assert so we don't silently miscompile ctpop for bit widths > 128. llvm-svn: 123549	2011-01-15 21:19:37 +00:00
Benjamin Kramer	b48a048de6	Reimplement CTPOP legalization with the "best" algorithm from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter, especially when counting 64 bit population on a 32 bit target. I hope this is fast enough to replace Kernighan-style counting loops even when the input is rather sparse. llvm-svn: 123547	2011-01-15 20:30:30 +00:00
Ted Kremenek	c9d2425c5a	Update CMake build. llvm-svn: 123491	2011-01-14 22:58:11 +00:00
Dan Gohman	a4f2631ea9	Delete an assignment to ThisBB which isn't needed, and tidy up some comments. llvm-svn: 123479	2011-01-14 22:26:16 +00:00
Anton Korobeynikov	1f9df99db1	Add a possibility to switch between CFI directives- and table-based frame description emission. Currently all the backends use table-based stuff. llvm-svn: 123476	2011-01-14 21:58:08 +00:00
Anton Korobeynikov	ef11a77938	Add CFI directives-based frame information emission. Not hooked yet. llvm-svn: 123474	2011-01-14 21:57:53 +00:00
Anton Korobeynikov	e53322ef91	Split stuff as a preparation for CFI directives-based frame information emission llvm-svn: 123473	2011-01-14 21:57:45 +00:00
Andrew Trick	a0e69757d1	Support for precise scheduling of the instruction selection DAG, disabled in this checkin. Sorry for the large diffs due to refactoring. New functionality is all guarded by EnableSchedCycles. Scheduling the isel DAG is inherently imprecise, but we give it a best effort: - Added MayReduceRegPressure to allow stalled nodes in the queue only if there is a regpressure need. - Added BUHasStall to allow checking for either dependence stalls due to latency or resource stalls due to pipeline hazards. - Added BUCompareLatency to encapsulate and standardize the heuristics for minimizing stall cycles (vs. reducing register pressure). - Modified the bottom-up heuristic (now in BUCompareLatency) to prioritize nodes by their depth rather than height. As long as it doesn't stall, height is irrelevant. Depth represents the critical path to the DAG root. - Added hybrid_ls_rr_sort::isReady to filter stalled nodes before adding them to the available queue. Related Cleanup: most of the register reduction routines do not need to be templates. llvm-svn: 123468	2011-01-14 21:11:41 +00:00
Jakob Stoklund Olesen	9f5e00f957	Try for the third time to teach getFirstTerminator() about debug values. This time let's rephrase to trick gcc-4.3 into not miscompiling. llvm-svn: 123432	2011-01-14 06:33:45 +00:00
Jakob Stoklund Olesen	99ad62ed9e	Revert r123419. It still breaks llvm-gcc-i386-linux-selfhost. llvm-svn: 123423	2011-01-14 02:12:54 +00:00
Chris Lattner	a0074ca5fc	Set the insertion point correctly for instructions generated by load folding: they should go before the new instruction not after it. llvm-svn: 123420	2011-01-14 01:33:40 +00:00
Jakob Stoklund Olesen	3d8deb13ee	Try again to teach getFirstTerminator() about debug values. Fix some callers to better deal with debug values. llvm-svn: 123419	2011-01-14 01:17:53 +00:00
Jakob Stoklund Olesen	b5e12bb37c	Better terminator avoidance. This approach also works when the terminator doesn't have a slot index. (Which can happen??) llvm-svn: 123413	2011-01-13 23:35:53 +00:00
Jakob Stoklund Olesen	d63287ff98	Temporary workaround for an i386 crash in LiveDebugVariables. llvm-svn: 123400	2011-01-13 21:28:55 +00:00
Jakob Stoklund Olesen	0f2b9d9dc4	Teach frame lowering to ignore debug values after the terminators. llvm-svn: 123399	2011-01-13 21:28:52 +00:00
Devang Patel	8e59113036	Speculatively revert r123384 to make llvm-gcc-i386-linux-selfhost buildbot happy. llvm-svn: 123389	2011-01-13 19:27:50 +00:00
Jakob Stoklund Olesen	6aa35206e7	Teach MachineBasicBlock::getFirstTerminator to ignore debug values. It will still return an iterator that points to the first terminator or end(), but there may be DBG_VALUE instructions following the first terminator. llvm-svn: 123384	2011-01-13 18:41:05 +00:00
Dan Gohman	f4ec824435	Fix r123346 to handle scalar types too. llvm-svn: 123352	2011-01-13 01:06:51 +00:00
Jakob Stoklund Olesen	6cdcc6287b	Add missing space in debug output llvm-svn: 123351	2011-01-13 00:57:35 +00:00
Dan Gohman	5bbd766a7b	Apply the patch from PR8958, which allows llc to get slightly further on the associated testcase before aborting. llvm-svn: 123346	2011-01-12 23:56:26 +00:00
Jakob Stoklund Olesen	3987889b61	Try again enabling LiveDebugVariables. llvm-svn: 123342	2011-01-12 23:36:21 +00:00
Jakob Stoklund Olesen	953b1b115d	Don't emit a DBG_VALUE for a spill slot that the rewriter decided not to use after all. llvm-svn: 123339	2011-01-12 23:14:07 +00:00
Jakob Stoklund Olesen	48c7a5cf7e	Fix braino in dominator tree walk. llvm-svn: 123338	2011-01-12 23:14:04 +00:00
Jakob Stoklund Olesen	7a13190a2e	Sometimes, old virtual registers can linger on DBG_VALUE instructions. Make sure we don't crash in that case, but simply turn them into %noreg instead. llvm-svn: 123335	2011-01-12 22:37:49 +00:00
Jakob Stoklund Olesen	59d3b89873	Teach VirtRegRewriter to update slot indexes when erasing instructions. It was leaving dangling pointers in the slot index maps. llvm-svn: 123334	2011-01-12 22:28:51 +00:00
Jakob Stoklund Olesen	8c5c268f05	Annotate VirtRegRewriter debug output with slot indexes. llvm-svn: 123333	2011-01-12 22:28:48 +00:00
Jakob Stoklund Olesen	c1a042a528	Verify slot index ordering. The slot indexes must be monotonically increasing through the function. llvm-svn: 123324	2011-01-12 21:27:48 +00:00
Jakob Stoklund Olesen	764cce86f0	Verify that machine instruction parent pointers are consistent. llvm-svn: 123322	2011-01-12 21:27:41 +00:00
Jakob Stoklund Olesen	1f7052b53b	The world is not ready for LiveDebugVariables yet. llvm-svn: 123290	2011-01-11 23:20:33 +00:00
Jakob Stoklund Olesen	d7a523358c	Enable LiveDebugVariables by default. llvm-svn: 123282	2011-01-11 22:45:28 +00:00
Jakob Stoklund Olesen	1cd577b435	Don't insert DBG_VALUE instructions after the first terminator. For one, MachineBasicBlock::getFirstTerminator() doesn't understand what is happening, and it also makes sense to have all control flow run through the DBG_VALUE. llvm-svn: 123277	2011-01-11 22:11:16 +00:00
Devang Patel	7b5cf4eafc	Appropriately truncate debug info range in dwarf output. This is not yet completely enabled. llvm-svn: 123274	2011-01-11 21:42:10 +00:00
Eric Christopher	5a4d64216f	Move ExpandAtomic into the integer expansion routines - it's only used there. llvm-svn: 123202	2011-01-11 00:36:08 +00:00
Dale Johannesen	cd78621861	Fix PR 8916 (qv for analysis), at least the immediate problem. There's an inherent tension in DAGCombine between assuming that things will be put in canonical form, and the Depth mechanism that disables transformations when recursion gets too deep. It would not surprise me if there's a lot of little bugs like this one waiting to be discovered. The mechanism seems fragile and I'd suggest looking at it from a design viewpoint. llvm-svn: 123191	2011-01-10 21:53:07 +00:00
Anton Korobeynikov	cf5967630b	Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there. llvm-svn: 123170	2011-01-10 12:39:04 +00:00
Chris Lattner	0e49a35bd2	fit in 80 cols and use MBB::isSuccessor instead of a hand rolled std::find. llvm-svn: 123164	2011-01-10 07:51:31 +00:00
Jakob Stoklund Olesen	32f1783ca1	Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic. These functions no longer assert when passed 0, but simply return false instead. No functional change intended. llvm-svn: 123155	2011-01-10 02:58:51 +00:00
Jakob Stoklund Olesen	785d31a2d2	Remove MachineRegisterInfo::getLastVirtReg(), it was giving wrong results when no virtual registers have been allocated. It was only used to resize IndexedMaps, so provide an IndexedMap::resize() method such that Map.grow(MRI.getLastVirtReg()); can be replaced with the simpler Map.resize(MRI.getNumVirtRegs()); This works correctly when no virtuals are allocated, and it bypasses the to/from index conversions. llvm-svn: 123130	2011-01-09 21:58:20 +00:00
Chris Lattner	f26e71fa4c	sort this. llvm-svn: 123129	2011-01-09 21:31:39 +00:00
Jakob Stoklund Olesen	957748e7ac	Teach TargetRegisterInfo how to cram stack slot indexes in with the virtual and physical register numbers. This makes the hack used in LiveInterval official, and lets LiveInterval be oblivious of stack slots. The isPhysicalRegister() and isVirtualRegister() predicates don't know about this, so when a variable may contain a stack slot, isStackSlot() should always be tested first. llvm-svn: 123128	2011-01-09 21:17:37 +00:00
Jakob Stoklund Olesen	0088b6ffb6	Add a forgotten VirtReg2IndexFunctor. llvm-svn: 123123	2011-01-09 18:58:33 +00:00
Cameron Zwarich	3e060bd398	Eliminate some extra hash table lookups. llvm-svn: 123115	2011-01-09 10:54:21 +00:00
Cameron Zwarich	4625675112	Add an informative comment. llvm-svn: 123114	2011-01-09 10:32:30 +00:00
Jakob Stoklund Olesen	d4dcf22b65	Simplify LiveDebugVariables by storing MachineOperand copies locations instead of using a Location class with the same information. When making a copy of a MachineOperand that was already stored in a MachineInstr, it is necessary to clear the parent pointer on the copy. Otherwise the register use-def lists become inconsistent. Add MachineOperand::clearParent() to do that. An alternative would be a custom MachineOperand copy constructor that cleared ParentMI. I didn't want to do that because of the performance impact. llvm-svn: 123109	2011-01-09 05:33:21 +00:00
Jakob Stoklund Olesen	c20baa8f1d	Shrink a BitVector that didn't mean to store bits for all physical registers. llvm-svn: 123108	2011-01-09 03:45:44 +00:00
Jakob Stoklund Olesen	ed53ab1635	Replace TargetRegisterInfo::printReg with a PrintReg class that also works without a TRI instance. Print virtual registers numbered from 0 instead of the arbitrary FirstVirtualRegister. The first virtual register is printed as %vreg0. TRI::NoRegister is printed as %noreg. llvm-svn: 123107	2011-01-09 03:05:53 +00:00
Jakob Stoklund Olesen	9a7e67d141	Use IndexedMap for MachineRegisterInfo as well. No functional change. llvm-svn: 123106	2011-01-09 03:05:46 +00:00
Jakob Stoklund Olesen	f43442c9f7	Fix VirtRegMap to use TRI::index2VirtReg and TRI::virtReg2Index instead of depending on TRI::FirstVirtualRegister. Also use TRI::printReg instead of printing virtual registers directly. llvm-svn: 123101	2011-01-08 23:11:07 +00:00
Jakob Stoklund Olesen	b04c78d5ea	Fix a MachineVerifier loop that probably didn't mean to skip the last two virtual registers. llvm-svn: 123100	2011-01-08 23:11:02 +00:00
Jakob Stoklund Olesen	fb2b53c0de	Use an IndexedMap for LiveVariables::VirtRegInfo. Provide MRI::getNumVirtRegs() and TRI::index2VirtReg() functions to allow iteration over virtual registers without depending on the representation of virtual register numbers. llvm-svn: 123098	2011-01-08 23:10:57 +00:00
Jakob Stoklund Olesen	b3820cdc22	Use an IndexedMap for LiveOutRegInfo to hide its dependence on TargetRegisterInfo::FirstVirtualRegister. llvm-svn: 123096	2011-01-08 23:10:50 +00:00
Cameron Zwarich	33c137a88b	Fix coding style. llvm-svn: 123093	2011-01-08 22:36:53 +00:00
Cameron Zwarich	a40df277f1	Make more passes preserve dominators (or state that they preserve dominators if they already do). This removes two dominator recomputations prior to isel, which is a 1% improvement in total llc time for 403.gcc. The only potentially suspect thing is making GCStrategy recompute dominators if it used a custom lowering strategy. llvm-svn: 123064	2011-01-08 17:01:52 +00:00
Evan Cheng	1afd04fc59	Recognize inline asm 'rev $0, $1' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Evan Cheng	aa16fd02ad	Do not model all INLINEASM instructions as having unmodelled side effects. Instead encode llvm IR level property "HasSideEffects" in an operand (shared with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check the operand when the instruction is an INLINEASM. This allows memory instructions to be moved around INLINEASM instructions. llvm-svn: 123044	2011-01-07 23:50:32 +00:00
Devang Patel	d3ba97949a	Speculatively revert r123032. llvm-svn: 123039	2011-01-07 22:33:41 +00:00
Devang Patel	a52d6c216d	Appropriately truncate debug info range in dwarf output. Enable live debug variables pass. llvm-svn: 123032	2011-01-07 21:30:41 +00:00
Evan Cheng	8b58b77d06	DBG_VALUE does not have any side effects; it also makes no sense to mark it cheap as a copy. llvm-svn: 123031	2011-01-07 21:08:26 +00:00
Bob Wilson	22f18a7e94	Add ARM patterns to match EXTRACT_SUBVECTOR nodes. Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle vectors from being translated to EXTRACT_SUBVECTOR. Patch by Tim Northover. The test changes are needed to keep those spill-q tests from testing aligned spills and restores. If the only aligned stack objects are spill slots, we no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR was legalized by loading from the stack, which created an aligned frame index. Now, however, there is nothing except the spill slot in the stack frame, so I added an aligned alloca. llvm-svn: 122995	2011-01-07 04:59:04 +00:00
Bob Wilson	d9a324ac11	Fix a comment typo. llvm-svn: 122994	2011-01-07 04:58:58 +00:00
Bob Wilson	bcbb3375dd	Change EXTRACT_SUBVECTOR to require a constant index. We were never generating any of these nodes with variable indices, and there was one legalizer function asserting on a non-constant index. If we ever have a need to support variable indices, we can add this back again. llvm-svn: 122993	2011-01-07 04:58:56 +00:00
Bill Wendling	0bf94c2188	Early exit if we don't have invokes. The 'Unwinds' vector isn't modified unless we have invokes, so there is no functionality change here. llvm-svn: 122990	2011-01-07 02:54:45 +00:00
Duncan Sands	06444485ee	Fix the other problem reported in PR8582. Testcase and patch by Nadav Rotem. llvm-svn: 122983	2011-01-06 23:45:22 +00:00
Eric Christopher	16127008fd	Add some fairly duplicated code to let type legalization split illegal typed atomics. This will lower exclusively to libcalls at the moment. llvm-svn: 122979	2011-01-06 22:28:56 +00:00
Devang Patel	7cb0e7c2ef	Emit 128 bit constant. This fixes PR 8913 crash. llvm-svn: 122971	2011-01-06 21:39:25 +00:00
Evan Cheng	cb39cc2164	Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy etc. takes an option OptSize. If OptSize is true, it would return the inline limit for functions with attribute OptSize. llvm-svn: 122952	2011-01-06 06:52:41 +00:00
Evan Cheng	70711ea54d	Revert r122936. I'll re-implement the change. llvm-svn: 122949	2011-01-06 06:17:53 +00:00
Jakob Stoklund Olesen	b3e7b27c1f	Zap the last two -Wself-assign warnings in llvm. Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there. llvm-svn: 122940	2011-01-06 01:33:22 +00:00
Jakob Stoklund Olesen	7b1480ff12	Add the SpillPlacement analysis pass. This pass precomputes CFG block frequency information that can be used by the register allocator to find optimal spill code placement. Given an interference pattern, placeSpills() will compute which basic blocks should have the current variable enter or exit in a register, and which blocks prefer the stack. The algorithm is ready to consume block frequencies from profiling data, but for now it gets by with the static estimates used for spill weights. This is a work in progress and still not hooked up to RegAllocGreedy. llvm-svn: 122938	2011-01-06 01:21:53 +00:00
Evan Cheng	d425aa5d2a	r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing up freebsd bootloader. However, this doesn't make much sense for Darwin, whose -Os is meant to optimize for size only if it doesn't hurt performance. rdar://8821501 llvm-svn: 122936	2011-01-06 01:04:47 +00:00
Evan Cheng	2af40ae781	Avoid zero extend bit test operands to pointer type if all the masks fit in the original type of the switch statement key. rdar://8781238 llvm-svn: 122935	2011-01-06 01:02:44 +00:00
Evan Cheng	bf92316fab	Optimize: r1025 = s/zext r1024, 4 r1026 = extract_subreg r1025, 4 to: r1026 = copy r1024 llvm-svn: 122925	2011-01-05 23:06:49 +00:00
Jakob Stoklund Olesen	ce25984bae	Add a hidden command line option to display edge bundle graphs as they are calculated. llvm-svn: 122912	2011-01-05 21:50:24 +00:00
Eric Christopher	651810d717	80-cols. llvm-svn: 122909	2011-01-05 21:45:56 +00:00
Eric Christopher	be2382f9a6	Remove TODO, these appear to be implemented. llvm-svn: 122849	2011-01-04 22:31:50 +00:00
Jakob Stoklund Olesen	abf8941a60	Turn the EdgeBundles class into a stand-alone machine CFG analysis pass. The analysis will be needed by both the greedy register allocator and the X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't change. This pass is very fast, usually showing up as 0.0% wall time. llvm-svn: 122832	2011-01-04 21:10:05 +00:00
Cameron Zwarich	fce4db4cbe	Switch to path halving from path compression for a small speedup. This also makes getLeader() nonrecursive. llvm-svn: 122811	2011-01-04 16:24:51 +00:00
Cameron Zwarich	2975ee7cc6	Eliminate repeated allocation of a per-BB DenseMap for a 4.6% reduction of time spent in StrongPHIElimination on 403.gcc. llvm-svn: 122803	2011-01-04 06:42:27 +00:00
Owen Anderson	9eeb0d483e	Clean up a funky pass registration that got passed over when I got rid of static constructors. llvm-svn: 122795	2011-01-04 00:55:21 +00:00
Cameron Zwarich	60ec113434	Use a RecyclingAllocator to allocate values for MachineCSE's ScopedHashTable for a 28% speedup of MachineCSE time on 403.gcc. llvm-svn: 122735	2011-01-03 04:07:46 +00:00
Chris Lattner	e396e846b4	split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is just domtree. Also prune #includes a bit. llvm-svn: 122714	2011-01-02 22:09:33 +00:00
Benjamin Kramer	a58b69aa9d	Try to reuse the value when lowering memset. This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710	2011-01-02 19:57:05 +00:00
Benjamin Kramer	38491f47ce	Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that don't support the multiply or where it is slow (p4) if someone cares enough. Example code: void test(char *s, int a) { __builtin_memset(s, a, 4); } before: _test: ## @test movzbl 8(%esp), %eax movl %eax, %ecx shll $8, %ecx orl %eax, %ecx movl %ecx, %eax shll $16, %eax orl %ecx, %eax movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret after: _test: ## @test movzbl 8(%esp), %eax imull $16843009, %eax, %eax ## imm = 0x1010101 movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret llvm-svn: 122707	2011-01-02 19:44:58 +00:00
Cameron Zwarich	ae468579bb	Use getVRegDef() instead of def_iterator. This leads to fewer defs being added with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on 403.gcc. llvm-svn: 122635	2010-12-30 00:42:23 +00:00
Cameron Zwarich	1e7124e6fa	None of the other pass names in CodeGen have terminating periods. llvm-svn: 122628	2010-12-29 11:49:10 +00:00
Cameron Zwarich	a7052a3c06	Instead of processing every instruction when splitting interferences, only process those instructions that define phi sources. This is a 47% speedup of StrongPHIElimination compile time on 403.gcc. llvm-svn: 122627	2010-12-29 11:00:09 +00:00
Cameron Zwarich	292870da06	Add a missing word to a comment. llvm-svn: 122625	2010-12-29 04:42:39 +00:00
Cameron Zwarich	6fc15ba38b	Add text explaining an assertion. llvm-svn: 122617	2010-12-29 03:52:51 +00:00
Cameron Zwarich	0fa638e27c	Simplify some code in MachineVerifier that was doing the correct thing, but not in the most obvious way. llvm-svn: 122610	2010-12-28 23:45:38 +00:00
Cameron Zwarich	3eacb7fff8	Revert the optimization in r122596. It is correct for all current targets, but it relies on assumptions that may not be true in the future. llvm-svn: 122608	2010-12-28 23:02:56 +00:00
Cameron Zwarich	c9c7488542	Avoid iterating every operand of an instruction in StrongPHIElimination, since we are only interested in the defs when discovering interferences. This is a 28% speedup running StrongPHIElimination on 403.gcc. llvm-svn: 122596	2010-12-28 10:49:33 +00:00
Duncan Sands	cc5a4497fd	Pacify the compiler. BestWeight cannot in fact be used uninitialized in this function, but the compiler was warning that it might be when doing a release build. llvm-svn: 122595	2010-12-28 10:07:15 +00:00
Cameron Zwarich	30f2239301	Change an assertion to assert what the code actually relies upon. llvm-svn: 122586	2010-12-27 22:08:42 +00:00
Cameron Zwarich	cfdb10a1bb	Land a first cut at StrongPHIElimination. There are only 5 new test failures when running without the verifier, and I have not yet checked them to see if the new results are still correct. There are more verifier failures, but they all seem to be additional occurrences of verifier failures that occur with the existing PHIElimination pass. There are a few obvious issues with the code: 1) It doesn't properly update the register equivalence classes during copy insertion, and instead recomputes them before merging live intervals and renaming registers. I wanted to keep this first patch simple for debugging purposes, but it shouldn't be very hard to do this. 2) It doesn't mix the renaming and live interval merging with the copy insertion process, which leads to a lot of virtual register churn. Virtual registers and live intervals are created, only to later be merged into others. The code should be smarter and only create a new virtual register if there is no existing register in the same congruence class. 3) In one place the code uses a DenseMap per basic block, which is unnecessary heap allocation. There should be an inline storage version of DenseMap. I did a quick compile-time test of running llc on 403.gcc with and without StrongPHIElimination. It is slightly slower with StrongPHIElimination, because the small decrease in the coalescer runtime can't beat the increase in phi elimination runtime. Perhaps fixing the above performance issues will narrow the gap. I also haven't yet run any tests of the quality of the generated code. llvm-svn: 122582	2010-12-27 10:08:19 +00:00
Cameron Zwarich	66289e34e1	Add knowledge of phi-def and phi-kill valnos to MachineVerifier's predecessor valno verification. The "Different value live out of predecessor" check is incorrect in the case of phi-def valnos, so just skip that check for phi-def valnos and instead check that all of the valnos for predecessors have phi-kill. Fixes PR8863. llvm-svn: 122581	2010-12-27 05:17:23 +00:00
Andrew Trick	dfa31b1cf9	Minor cleanup related to my latest scheduler changes. llvm-svn: 122545	2010-12-24 07:10:19 +00:00
Andrew Trick	c926e98fc7	Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck. llvm-svn: 122544	2010-12-24 06:46:50 +00:00
Andrew Trick	134b2a5907	Various bits of framework needed for precise machine-level selection DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allow gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about its SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541	2010-12-24 05:03:26 +00:00
Andrew Trick	53f4556c64	whitespace llvm-svn: 122539	2010-12-24 04:28:06 +00:00
Cameron Zwarich	a7ad357a13	Simplify a check for implicit defs and remove a FIXME. llvm-svn: 122537	2010-12-24 03:09:36 +00:00
Chris Lattner	b607e7deda	flags -> glue for selectiondag llvm-svn: 122509	2010-12-23 17:24:32 +00:00
Chris Lattner	fb9ff7a4ff	sdisel flag -> glue. llvm-svn: 122507	2010-12-23 17:13:18 +00:00
Andrew Trick	ca2e267ddc	Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue. llvm-svn: 122491	2010-12-23 05:42:20 +00:00
Andrew Trick	c046a115d4	Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle. llvm-svn: 122474	2010-12-23 04:16:14 +00:00
Andrew Trick	e48d5d8395	In CheckForLiveRegDef use TRI->getOverlaps. llvm-svn: 122473	2010-12-23 03:43:21 +00:00
Andrew Trick	cc701bcfdc	Fixes PR8823: add-with-overflow-128.ll In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range. llvm-svn: 122472	2010-12-23 03:15:51 +00:00
Jeffrey Yasskin	a199652a3e	Change all self assignments X=X to (void)X, so that we can turn on a new gcc warning that complains on self-assignments and self-initializations. llvm-svn: 122458	2010-12-23 00:58:24 +00:00
Benjamin Kramer	49942a90b7	DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455	2010-12-22 23:17:45 +00:00
Jakob Stoklund Olesen	f761c75efb	When RegAllocGreedy decides to spill the interferences of the current register, pick the victim with the lowest total spill weight. llvm-svn: 122445	2010-12-22 22:01:30 +00:00
Jakob Stoklund Olesen	71e527ef4b	Include a shadow of the original CFG edges in the edge bundle graph. llvm-svn: 122444	2010-12-22 22:01:28 +00:00
Chris Lattner	04ef853e23	Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392	2010-12-22 08:02:57 +00:00
Chris Lattner	35fcc63498	more cleanups, move a check for "roundedness" earlier to reject unhandled cases faster and simplify code. llvm-svn: 122391	2010-12-22 08:01:44 +00:00
Chris Lattner	60dcb2b5c2	reduce indentation and improve comments, no functionality change. llvm-svn: 122389	2010-12-22 07:36:50 +00:00
Andrew Trick	afec190a28	In DelayForLiveRegsBottomUp, handle instructions that read and write the same physical register. Simplifies the fix from the previous checkin r122211. llvm-svn: 122370	2010-12-21 22:27:44 +00:00
Andrew Trick	1e3ad9f721	whitespace llvm-svn: 122368	2010-12-21 22:25:04 +00:00
Dale Johannesen	e0fb87c3d7	Reapply 122353-122355 with fixes. 122354 was wrong; the shift type was needed one place, the shift count type another. The transform in 123555 had the same problem. llvm-svn: 122366	2010-12-21 21:55:50 +00:00
Dale Johannesen	972aba543a	Revert 122353-122355 for the moment, they broke stuff. llvm-svn: 122360	2010-12-21 21:22:27 +00:00
Dale Johannesen	39186cfb0b	Add a new transform to DAGCombiner. llvm-svn: 122355	2010-12-21 20:10:51 +00:00
Dale Johannesen	5f3e7b08f6	Get the type of a shift from the shift, not from its shift count operand. These should be the same but apparently are not always, and this is cleaner anyway. This improves the code in an existing test. llvm-svn: 122354	2010-12-21 20:06:19 +00:00
Dale Johannesen	bad19334ee	Shift by the word size is invalid IR; don't create it. llvm-svn: 122353	2010-12-21 20:00:06 +00:00
Chris Lattner	8a3058137a	fix some typos llvm-svn: 122349	2010-12-21 18:05:22 +00:00
Stuart Hastings	fedc21e594	Fix indentation, add comment. llvm-svn: 122345	2010-12-21 17:16:58 +00:00
Stuart Hastings	a1f786efa9	Missing logic for nested CALLSEQ_START/END. llvm-svn: 122342	2010-12-21 17:07:24 +00:00
Cameron Zwarich	0243f1d21e	Incremental progress towards a new implementation of StrongPHIElimination. Most of the problems with my last attempt were in the updating of LiveIntervals rather than the coalescing itself. Therefore, I decided to get that right first by essentially reimplementing the existing PHIElimination using LiveIntervals. It works correctly, with only a few tests failing (which may not be legitimate failures) and no new verifier failures (at least as far as I can tell, I didn't count the number per file). llvm-svn: 122321	2010-12-21 06:54:43 +00:00
Chris Lattner	65c5243bd6	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Chris Lattner	b37e697277	improve "cannot yet select" errors a trivial amount: now they are just as useless, but at least a bit more grammatical llvm-svn: 122305	2010-12-21 02:07:03 +00:00
Jakob Stoklund Olesen	e9eb1be4dd	Add EdgeBundles to SplitKit. Edge bundles is an annotation on the CFG that turns it into a bipartite directed graph where each basic block is connected to an outgoing and an ingoing bundle. These bundles are useful for identifying regions of the CFG for live range splitting. llvm-svn: 122301	2010-12-21 01:50:21 +00:00
Jakob Stoklund Olesen	86786c46c2	Use IntEqClasses to compute connected components of live intervals. llvm-svn: 122296	2010-12-21 00:48:17 +00:00
Dale Johannesen	036c3da142	Cosmetic changes. llvm-svn: 122259	2010-12-20 20:10:50 +00:00
Cameron Zwarich	ad29bd5325	MachineVerifier should count landing pad successors as basic blocks rather than out-edges. Fixes PR8824. llvm-svn: 122228	2010-12-20 04:19:48 +00:00
Cameron Zwarich	6970ec515e	Teach MachineVerifier that early clobber defs begin at USE slots and other defs begin at DEF slots. Fixes the second half of PR8813. llvm-svn: 122225	2010-12-20 03:15:20 +00:00
Cameron Zwarich	31af86ef44	Add a missing check from r122218. llvm-svn: 122224	2010-12-20 02:59:51 +00:00
Chris Lattner	249e131f39	implement type legalization promotion support for SMULO and UMULO, giving ARM (and other 32-bit-only) targets support for i8 and i16 overflow multiplies. The generated code isn't great, but this at least fixes CodeGen/Generic/overflow.ll when running on ARM hosts. llvm-svn: 122221	2010-12-20 02:05:39 +00:00
Cameron Zwarich	bcd02fd9a4	Don't assume that an instruction ending a register's live range always reads the register; it may be a dead def instead. Fixes PR8820. llvm-svn: 122218	2010-12-20 01:22:37 +00:00
Chris Lattner	0f801998bf	Fix a bug in the scheduler's handling of "unspillable" vregs. Imagine we see: EFLAGS = inst1 EFLAGS = inst2 FLAGS gpr = inst3 EFLAGS Previously, we would refuse to schedule inst2 because it clobbers the EFLAGS of the predecessor. However, it also uses the EFLAGS of the predecessor, so it is safe to emit. SDep edges ensure that the right order happens already anyway. This fixes 2 testsuite crashes with the X86 patch I'm going to commit next. llvm-svn: 122211	2010-12-20 00:55:43 +00:00
Chris Lattner	85875bf06b	the result of CheckForLiveRegDef is dead, remove it. llvm-svn: 122209	2010-12-20 00:51:56 +00:00
Chris Lattner	ee7fa0d706	reduce indentation, no functionality change. llvm-svn: 122208	2010-12-20 00:50:16 +00:00
Cameron Zwarich	8c00d690f5	Ignore debug values when performing MachineVerifier liveness checks. Fixes PR8822. llvm-svn: 122207	2010-12-20 00:08:10 +00:00
Cameron Zwarich	c8dfbe7503	Early clobber operands are allowed to be defined at use indices. This fixes one half of PR8813. llvm-svn: 122205	2010-12-19 23:50:53 +00:00
Cameron Zwarich	6f5c1021ba	Fix PR8815 by checking for an explicit clobber def tied to a use operand in ConnectedVNInfoEqClasses::Classify(). llvm-svn: 122202	2010-12-19 22:12:45 +00:00
Cameron Zwarich	37aec9c35d	Fix PR8811 by teaching MachineVerifier about optional defs. llvm-svn: 122199	2010-12-19 21:37:23 +00:00
Cameron Zwarich	64fbc5e267	StrongPHIElimination will never run before TwoAddressInstructionPass. llvm-svn: 122197	2010-12-19 21:32:29 +00:00
Nick Lewycky	c85935836b	Add missing standard headers. Patch by Joerg Sonnenberger! llvm-svn: 122193	2010-12-19 20:43:38 +00:00
Chris Lattner	92dcd2af36	teach MaskedValueIsZero how to analyze ADDE. This is enough to teach it that ADDE(0,0) is known 0 except the low bit, for example. llvm-svn: 122191	2010-12-19 20:38:28 +00:00
Cameron Zwarich	163792fb1f	Remove some checks for StrongPHIElim. These checks make it impossible to use an alternative register allocator that does not require LiveIntervals by specifying it on the command-line for a target that has StrongPHIElimination enabled by default. These checks are pretty meaningless anyways, since StrongPHIElimination and PHIElimination are never used at the same time. llvm-svn: 122176	2010-12-19 18:03:27 +00:00
Chris Lattner	ac82ea26da	fix PR8642: if a critical edge has a PHI value that can trap, isel is required to split the edge. PHI values get evaluated on the edge, not in their predecessor block. llvm-svn: 122170	2010-12-19 04:58:57 +00:00
Jakob Stoklund Olesen	bdf06d6c7b	Apparently, operandices is not a word. llvm-svn: 122135	2010-12-18 03:28:32 +00:00
Jakob Stoklund Olesen	e06ded7533	Teach the inline spiller to attempt folding a load instruction into its single use before rematerializing the load. This allows us to produce: addps LCPI0_1(%rip), %xmm2 Instead of: movaps LCPI0_1(%rip), %xmm3 addps %xmm3, %xmm2 Saving a register and an instruction. The standard spiller already knows how to do this. llvm-svn: 122133	2010-12-18 03:04:14 +00:00
Jakob Stoklund Olesen	485b7965b3	Tweak debug spew. llvm-svn: 122132	2010-12-18 03:04:11 +00:00
Jakob Stoklund Olesen	a2f2eab8d4	Check that the register is live-in to the loop header before inserting copies in the loop predecessors. The register can be live-out from a predecessor without being live-in to the loop header if there is a critical edge from the predecessor. llvm-svn: 122123	2010-12-18 01:06:19 +00:00
Nick Lewycky	30eef45106	Fix GCC warning: lib/CodeGen/RegAllocGreedy.cpp:311: error: unused variable 'PhysReg' [-Wunused-variable] llvm-svn: 122122	2010-12-18 01:05:55 +00:00
Jakob Stoklund Olesen	2879da5e13	Pass a Banner argument to the machine code verifier both from createMachineVerifierPass and MachineFunction::verify. The banner is printed before the machine code dump, just like the printer pass. llvm-svn: 122113	2010-12-18 00:06:56 +00:00
Jakob Stoklund Olesen	6498db2c8c	Avoid dereferencing end() in collectInterferingVRegs() when there is no interference. llvm-svn: 122108	2010-12-17 23:16:38 +00:00
Jakob Stoklund Olesen	db4b62f32e	Make the -verify-regalloc command line option available to base classes as RegAllocBase::VerifyEnabled. Run the machine code verifier in a few interesting places during RegAllocGreedy. llvm-svn: 122107	2010-12-17 23:16:35 +00:00
Jakob Stoklund Olesen	df9e162423	Enable loop splitting in RegAllocGreedy. The heuristics split around the largest loop where the current register may be allocated without interference. llvm-svn: 122106	2010-12-17 23:16:32 +00:00
Bill Wendling	c16f9b1ccc	During local stack slot allocation, the materializeFrameBaseRegister function may be called. If the entry block is empty, the insertion point iterator will be the "end()" value. Calling ->getParent() on it (among others) causes problems. Modify materializeFrameBaseRegister to take the machine basic block and insert the frame base register at the beginning of that block. (It's very similar to what the code does already. The only difference is that it will always insert at the beginning of the entry block instead of after a previous materialization of the frame base register. I doubt that that matters here.) <rdar://problem/8782198> llvm-svn: 122104	2010-12-17 23:09:14 +00:00
Bob Wilson	c57c2d755b	Fix a DAGCombiner crash when folding binary vector operations with constant BUILD_VECTOR operands where the element type is not legal. I had previously changed this code to insert TRUNCATE operations, but that was just wrong. llvm-svn: 122102	2010-12-17 23:06:49 +00:00
Dale Johannesen	c2c6ebd82a	Add a transform to DAG Combiner. This improves the code for the case where 32-bit divide by constant is turned into 64-bit multiply by constant. 8771012. llvm-svn: 122090	2010-12-17 21:45:49 +00:00
Jakob Stoklund Olesen	0dc90e6b1b	Allow missing kill flags on an untied operand of a two-address instruction when the operand uses the same register as a tied operand: %r1 = add %r1, %r1 If add were a three-address instruction, kill flags would be required on at least one of the uses. Since it is a two-address instruction, the tied use operand must not have a kill flag. This change makes the kill flag on the untied use operand optional. llvm-svn: 122082	2010-12-17 19:18:41 +00:00
Jakob Stoklund Olesen	f4a0c81371	Add MachineLoopRange comparators for sorting loop lists by number and by area. llvm-svn: 122073	2010-12-17 18:13:52 +00:00
Jakob Stoklund Olesen	40f23cd5ca	Provide LiveIntervalUnion::Query::checkLoopInterference. This is a three-way interval list intersection between a virtual register, a live interval union, and a loop. It will be used to identify interference-free loops for live range splitting. llvm-svn: 122034	2010-12-17 04:09:47 +00:00
Bob Wilson	e06f6eabe7	Fix crash compiling a QQQQ REG_SEQUENCE for a Neon vld3_lane operation. Radar 8776599 llvm-svn: 122018	2010-12-17 01:21:12 +00:00
Bob Wilson	f6fe49f7d2	Fix a comment typo. llvm-svn: 122016	2010-12-17 01:21:05 +00:00
Daniel Dunbar	8ab9be2005	MC: Make TargetAsmBackend available to the AsmStreamer. - Treaty talks on the non-proliferation of MC objects broke down. llvm-svn: 121949	2010-12-16 03:05:59 +00:00
Jakob Stoklund Olesen	1811e4cb20	Start using SplitKit and MachineLoopRanges in RegAllocGreedy in preparation of live range splitting around loops guided by register pressure. So far, trySplit() simply prints a lot of debug output. llvm-svn: 121918	2010-12-15 23:46:13 +00:00
Jakob Stoklund Olesen	d40af5ffbd	Add MachineLoopRanges analysis. A MachineLoopRange contains the intervals of slot indexes covered by the blocks in a loop. This representation of the loop blocks is more efficient to compare against interfering registers during register coalescing. llvm-svn: 121917	2010-12-15 23:41:23 +00:00
Evan Cheng	68e1ed8752	Teach machine cse to commute instructions. llvm-svn: 121903	2010-12-15 22:16:21 +00:00
Dan Gohman	295ba3ab26	Move Value::getUnderlyingObject to be a standalone function so that it can live in Analysis instead of VMCore. llvm-svn: 121885	2010-12-15 20:02:24 +00:00
Jakob Stoklund Olesen	3cfea82733	Fix build. llvm-svn: 121872	2010-12-15 18:07:48 +00:00
Jakob Stoklund Olesen	308656b955	Detect and enumerate bypass loops. Bypass loops have the current live range live through, but contain no uses or defs. Splitting around a bypass loop can free registers for other uses inside the loop by spilling the split range. llvm-svn: 121871	2010-12-15 17:49:52 +00:00
Jakob Stoklund Olesen	849388944e	Separate SplitAnalysis::getSplitLoops(). This method returns the set of loops with uses that are candidates for splitting. llvm-svn: 121870	2010-12-15 17:41:19 +00:00
Chris Lattner	81815cd4db	take care of some todos, transforming [us]mul_lohi into a wider mul if the wider mul is legal. llvm-svn: 121848	2010-12-15 06:04:19 +00:00
Chris Lattner	746d6a1f60	when transforming a MULHS into a wider MUL, there is no need to SRA the result, the top bits are truncated off anyway, just use SRL. llvm-svn: 121846	2010-12-15 05:51:39 +00:00
Jakob Stoklund Olesen	48800c9689	Simplify RegAllocGreedy's use of register aliases. llvm-svn: 121807	2010-12-14 23:38:19 +00:00
Jakob Stoklund Olesen	7ee6f83da1	Simplify CCState's use of register aliases. llvm-svn: 121806	2010-12-14 23:28:01 +00:00
Jakob Stoklund Olesen	03856151db	Simplify AggressiveAntiDepBreaker's use of register aliases. llvm-svn: 121805	2010-12-14 23:23:15 +00:00
Jakob Stoklund Olesen	870fbdd686	Simplify RegAllocBasic by using getOverlaps instead of getAliasSet. llvm-svn: 121801	2010-12-14 23:10:48 +00:00
Evan Cheng	7e96e67d98	Fix a minor bug in two-address pass. It was missing a commute opportunity. regB = move RCX regA = op regB, regC RAX = move regA where both regB and regC are killed. If regB is constrained to non-compatible physical registers but regC is not constrained at all, then it's better to commute the instruction. movl %edi, %eax shlq $32, %rcx leaq (%rcx,%rax), %rax => movl %edi, %eax shlq $32, %rcx orq %rcx, %rax rdar://8762995 llvm-svn: 121793	2010-12-14 21:34:53 +00:00
Matt Beaumont-Gay	01264443a8	Move debugging code entirely within DEBUG(). Silences an unused variable warning in the opt build. llvm-svn: 121791	2010-12-14 21:14:55 +00:00
Jakob Stoklund Olesen	c13ce4748e	Add LiveIntervalUnion print methods, RegAllocGreedy::trySplit debug spew. llvm-svn: 121783	2010-12-14 19:38:49 +00:00
Jakob Stoklund Olesen	c5ad05ca30	Use TRI::printReg instead of AbstractRegisterDescription when printing LiveIntervalUnions. llvm-svn: 121781	2010-12-14 18:53:47 +00:00
Jakob Stoklund Olesen	74ba8b77e6	Q.seenAllInterferences() must be called after Q.collectInterferingVRegs(). llvm-svn: 121774	2010-12-14 17:47:36 +00:00
Jakob Stoklund Olesen	d4d3c5dd1e	Remove unused vector. llvm-svn: 121741	2010-12-14 00:58:47 +00:00
Jakob Stoklund Olesen	00d6ac22d0	Try reassigning all virtual register interferences, not just those with lower spill weight. Filter out fixed registers instead. Add support for reassigning an interference that was assigned to an alias. llvm-svn: 121737	2010-12-14 00:37:49 +00:00
Jakob Stoklund Olesen	ffefd5bd4e	Add stub for RAGreedy::trySplit. llvm-svn: 121736	2010-12-14 00:37:44 +00:00
Chris Lattner	14810c808b	Add a couple dag combines to transform mulhi/mullo into a wider multiply when the wider type is legal. This allows us to compile: define zeroext i16 @test1(i16 zeroext %x) nounwind { entry: %div = udiv i16 %x, 33 ret i16 %div } into: test1: # @test1 movzwl 4(%esp), %eax imull $63551, %eax, %eax # imm = 0xF83F shrl $21, %eax ret instead of: test1: # @test1 movw $-1985, %ax # imm = 0xFFFFFFFFFFFFF83F mulw 4(%esp) andl $65504, %edx # imm = 0xFFE0 movl %edx, %eax shrl $5, %eax ret Implementing rdar://8760399 and example #4 from: http://blog.regehr.org/archives/320 We should implement the same thing for [su]mul_hilo, but I don't have immediate plans to do this. llvm-svn: 121696	2010-12-13 08:39:01 +00:00
Chris Lattner	324f849088	remove the verbose-asm "constant pool double" comments that we were printing for each constant pool entry. Using WriteTypeSymbolic here takes time proportional to the size of the module, for each constant pool entry. This speeds up -verbose-asm llc on 252.eon (a random testcase at my disposal) from 4.4s to 2.137s. llc takes 2.11s with asm-verbose off, so this is now a pretty reasonable cost for verbose comments. llvm-svn: 121691	2010-12-13 07:35:47 +00:00
Chris Lattner	6df4d5d88e	reduce indentation by using continue, no functionality change. llvm-svn: 121662	2010-12-13 01:11:17 +00:00
Duncan Sands	47a4bbd31d	Catch attempts to remove a deleted node from the CSE maps. Better to catch this here rather than later after accessing uninitialized memory etc. Fires when compiling the testcase in PR8237. llvm-svn: 121635	2010-12-12 13:22:50 +00:00
Jakob Stoklund Olesen	a523d5f048	Add named timer groups for the different stages of register allocation. llvm-svn: 121604	2010-12-11 00:19:56 +00:00
Jakob Stoklund Olesen	ef80efea1d	Move MRI into RegAllocBase. Clean up debug output a bit. llvm-svn: 121599	2010-12-10 23:49:00 +00:00
Nick Lewycky	46a6ed1f0f	Remove extraneous close parenthesis. Fix build breakage. llvm-svn: 121596	2010-12-10 23:14:35 +00:00
Nick Lewycky	9afbedbc48	Move variable that's unused in an NDEBUG build inside the DEBUG() macro, fixing lib/CodeGen/RegAllocGreedy.cpp:233: error: unused variable 'TRC' [-Wunused-variable] llvm-svn: 121594	2010-12-10 23:05:10 +00:00
Jakob Stoklund Olesen	6cd6e644e7	Force the greedy register allocator to always use the inline spiller. Soon, RegAllocGreedy will start splitting live ranges, and then deferred spilling won't work anyway. llvm-svn: 121591	2010-12-10 22:54:44 +00:00
Jakob Stoklund Olesen	cbd4bac09d	Rip out live range splitting support from the inline spiller. The spiller should only spill. The register allocator will drive live range splitting, it has the needed information about register pressure and interferences. llvm-svn: 121590	2010-12-10 22:54:40 +00:00
Jakob Stoklund Olesen	5ab6552845	Use AllocationOrder in RegAllocGreedy, fix a bug in the hint calculation. llvm-svn: 121584	2010-12-10 22:21:05 +00:00
Jakob Stoklund Olesen	ea59381fc8	Fix miscompilation caused by trivial logic error in the reassignVReg() interference check. llvm-svn: 121519	2010-12-10 20:45:04 +00:00
Jakob Stoklund Olesen	e3924a3c85	Add an AllocationOrder class that can iterate over the allocatable physical registers for a given virtual register. Reserved registers are filtered from the allocation order, and any valid hint is returned as the first suggestion. For target dependent hints, a number of arcane target hooks are invoked. llvm-svn: 121497	2010-12-10 18:36:02 +00:00
Rafael Espindola	0e665e502d	Fixed version of 121434 with no new memory leaks. llvm-svn: 121471	2010-12-10 07:39:47 +00:00
Rafael Espindola	011e168728	Revert my previous patch to make the valgrind bots happy. llvm-svn: 121461	2010-12-10 04:01:09 +00:00
Rafael Espindola	03ad1e8f1f	Initial support for the cfi directives. This is just enough to get f: .cfi_startproc nop .cfi_endproc assembled (on ELF). llvm-svn: 121434	2010-12-09 23:48:29 +00:00
Stuart Hastings	f7bba0cfe3	Initial support for nested CALLSEQ_START/CALLSEQ_END constructs in LegalizeDAG. Necessary for byval support on ARM. Radar 7662569. llvm-svn: 121412	2010-12-09 21:25:20 +00:00
Jakob Stoklund Olesen	fe4b9ee934	Remember to filter out reserved registers from the allocation order. llvm-svn: 121411	2010-12-09 21:20:46 +00:00
Jakob Stoklund Olesen	5bb5c67227	Add a forgotten initializer for CheckedFirstInterference. llvm-svn: 121410	2010-12-09 21:20:44 +00:00
Andrew Trick	ec37b93b07	Added register reassignment prototype to RAGreedy. It's a simple heuristic to reshuffle register assignments when we can't find an available reg. llvm-svn: 121388	2010-12-09 18:15:21 +00:00
Eric Christopher	ebd7ab9857	80-col fixups. llvm-svn: 121356	2010-12-09 04:48:06 +00:00
Jakob Stoklund Olesen	17b2e8c293	IntervalMap iterators are heavyweight, so avoid copying them around and use references instead. Similarly, IntervalMap::begin() is almost as expensive as find(), so use find(x) instead of begin().advanceTo(x); This makes RegAllocBasic run another 5% faster. llvm-svn: 121344	2010-12-09 01:06:52 +00:00
Devang Patel	b3e0d80b1f	DW_FORM_data1 may not provide sufficient room for vtable index, use _udata instead. This fixes radar 8730409. llvm-svn: 121323	2010-12-09 00:10:40 +00:00
Jakob Stoklund Olesen	ffc0f6586a	Properly deal with empty intervals when checking for interference. llvm-svn: 121319	2010-12-08 23:51:35 +00:00
Jakob Stoklund Olesen	37df3f04c2	Implement very primitive hinting support in RegAllocGreedy. The hint is simply tried first and then forgotten if it couldn't be allocated immediately. llvm-svn: 121306	2010-12-08 22:57:16 +00:00
Jakob Stoklund Olesen	3c81b6a50b	Store (priority,regnum) pairs in the priority queue instead of providing an abstract priority queue interface in subclasses that want to override the priority calculations. Subclasses must provide a getPriority() implementation instead. This approach requires less code as long as priorities are expressable as simple floats, and it avoids the dangers of defining potentially expensive priority comparison functions. It also should speed up priority_queue operations since they no longer have to chase pointers when comparing registers. This is not measurable, though. Preferably, we shouldn't use floats to guide code generation. The use of floats here is derived from the use of floats for spill weights. Spill weights have a dynamic range that doesn't lend itself easily to a fixpoint implementation. When someone invents a stable spill weight representation, it can be reused for allocation priorities. llvm-svn: 121294	2010-12-08 22:22:41 +00:00
Eric Christopher	d492f798d1	Reword comment slightly. llvm-svn: 121293	2010-12-08 22:21:42 +00:00
Eric Christopher	77d3a7b3fb	Fix comment. llvm-svn: 121285	2010-12-08 21:35:09 +00:00
Jakob Stoklund Olesen	f04d283db1	Trim includes. llvm-svn: 121283	2010-12-08 21:12:00 +00:00
Andrew Trick	fb72ca2129	Generalize PostRAHazardRecognizer so it can be used in any pass for both forward and backward scheduling. Rename it to ScoreboardHazardRecognizer (Scoreboard is one word). Remove integer division from the scoreboard's critical path. llvm-svn: 121274	2010-12-08 20:04:29 +00:00
Jakob Stoklund Olesen	d638b989f2	Stub out RegAllocGreedy. This new register allocator is initially identical to RegAllocBasic, but it will receive all of the tricks that RegAllocBasic won't get. RegAllocGreedy will eventually replace linear scan. llvm-svn: 121234	2010-12-08 03:26:16 +00:00
Jakob Stoklund Olesen	77e7ad803a	Move RABasic::addMBBLiveIns to the base class, it is generally useful. Minor optimization to the use of IntervalMap iterators. They are fairly heavyweight, so prefer SI.valid() over SI != end(). llvm-svn: 121217	2010-12-08 01:06:06 +00:00
Jakob Stoklund Olesen	9d6472e894	Switch LiveIntervalUnion from std::set to IntervalMap. This speeds up RegAllocBasic by 20%, not counting releaseMemory which becomes way faster. llvm-svn: 121201	2010-12-07 23:18:47 +00:00
Jakob Stoklund Olesen	39e22e19bf	Simplify assertion. llvm-svn: 121162	2010-12-07 18:51:27 +00:00
Jay Foad	79e18ed269	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Jakob Stoklund Olesen	48ba44334f	Remove unused member. llvm-svn: 121098	2010-12-07 01:32:45 +00:00
Devang Patel	12459bc442	Undefined value in reg 0 may need a marker to identify end of source range. This will be used to truncate live range of DBG_VALUE instruction by register allocator and friends. llvm-svn: 121061	2010-12-06 22:48:22 +00:00
Devang Patel	6fe7fe8dd4	If dbg_declare() or dbg_value() is not lowered by isel then emit DEBUG message instead of creating DBG_VALUE for undefined value in reg0. llvm-svn: 121059	2010-12-06 22:39:26 +00:00
Rafael Espindola	3e954d16f4	Second try at making direct object emission produce the same results as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc bootstrap on darwin10 using darwin9's assembler and linker. llvm-svn: 121006	2010-12-06 17:27:56 +00:00
Rafael Espindola	4ec917db9b	Revert previous two patches while I try to find out how to make both linux and darwin assemblers happy :-( llvm-svn: 121004	2010-12-06 15:35:15 +00:00
Rafael Espindola	3dc2b4cba7	Add an EmitAbsValue helper method and use it in cases where we want to be sure that no relocations are used (on MachO). Fixes llc producing different output from llc + llvm-mc. llvm-svn: 121000	2010-12-06 14:53:14 +00:00
Cameron Zwarich	f56ba80bb2	Some cleanup before I start committing some incremental progress on StrongPHIElimination. llvm-svn: 120961	2010-12-05 22:34:08 +00:00
Cameron Zwarich	f64c26bb9e	Remove the PHIElimination.h header, as it is no longer needed. llvm-svn: 120959	2010-12-05 21:39:42 +00:00
Cameron Zwarich	fbe9e91d97	I forgot to actually remove the FindCopyInsertPoint() declaration from PHIElimination.h. llvm-svn: 120953	2010-12-05 19:58:57 +00:00
Cameron Zwarich	cb613dcf69	Remove the SplitCriticalEdge() method declaration from PHIElimination.h. At one time, this method existed, but now PHIElimination uses the method of the same name on MachineBasicBlock. llvm-svn: 120952	2010-12-05 19:54:23 +00:00
Cameron Zwarich	c680f44c1b	Move the FindCopyInsertPoint method of PHIElimination to a new standalone function so that it can be shared with StrongPHIElimination. llvm-svn: 120951	2010-12-05 19:51:05 +00:00

11355 Commits