llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-07 11:51:13 +00:00

Author	SHA1	Message	Date
Evan Cheng	70711ea54d	Revert r122936. I'll re-implement the change. llvm-svn: 122949	2011-01-06 06:17:53 +00:00
Jakob Stoklund Olesen	b3e7b27c1f	Zap the last two -Wself-assign warnings in llvm. Simplify RALinScan::DowngradeRegister with TRI::getOverlaps while we are there. llvm-svn: 122940	2011-01-06 01:33:22 +00:00
Jakob Stoklund Olesen	7b1480ff12	Add the SpillPlacement analysis pass. This pass precomputes CFG block frequency information that can be used by the register allocator to find optimal spill code placement. Given an interference pattern, placeSpills() will compute which basic blocks should have the current variable enter or exit in a register, and which blocks prefer the stack. The algorithm is ready to consume block frequencies from profiling data, but for now it gets by with the static estimates used for spill weights. This is a work in progress and still not hooked up to RegAllocGreedy. llvm-svn: 122938	2011-01-06 01:21:53 +00:00
Evan Cheng	d425aa5d2a	r105228 reduced the memcpy / memset inline limit to 4 with -Os to avoid blowing up freebsd bootloader. However, this doesn't make much sense for Darwin, whose -Os is meant to optimize for size only if it doesn't hurt performance. rdar://8821501 llvm-svn: 122936	2011-01-06 01:04:47 +00:00
Evan Cheng	2af40ae781	Avoid zero extend bit test operands to pointer type if all the masks fit in the original type of the switch statement key. rdar://8781238 llvm-svn: 122935	2011-01-06 01:02:44 +00:00
Evan Cheng	bf92316fab	Optimize: r1025 = s/zext r1024, 4 r1026 = extract_subreg r1025, 4 to: r1026 = copy r1024 llvm-svn: 122925	2011-01-05 23:06:49 +00:00
Jakob Stoklund Olesen	ce25984bae	Add a hidden command line option to display edge bundle graphs as they are calculated. llvm-svn: 122912	2011-01-05 21:50:24 +00:00
Eric Christopher	651810d717	80-cols. llvm-svn: 122909	2011-01-05 21:45:56 +00:00
Eric Christopher	be2382f9a6	Remove TODO, these appear to be implemented. llvm-svn: 122849	2011-01-04 22:31:50 +00:00
Jakob Stoklund Olesen	abf8941a60	Turn the EdgeBundles class into a stand-alone machine CFG analysis pass. The analysis will be needed by both the greedy register allocator and the X86FloatingPoint pass. It only needs to be computed once when the CFG doesn't change. This pass is very fast, usually showing up as 0.0% wall time. llvm-svn: 122832	2011-01-04 21:10:05 +00:00
Cameron Zwarich	fce4db4cbe	Switch to path halving from path compression for a small speedup. This also makes getLeader() nonrecursive. llvm-svn: 122811	2011-01-04 16:24:51 +00:00
Cameron Zwarich	2975ee7cc6	Eliminate repeated allocation of a per-BB DenseMap for a 4.6% reduction of time spent in StrongPHIElimination on 403.gcc. llvm-svn: 122803	2011-01-04 06:42:27 +00:00
Owen Anderson	9eeb0d483e	Clean up a funky pass registration that got passed over when I got rid of static constructors. llvm-svn: 122795	2011-01-04 00:55:21 +00:00
Cameron Zwarich	60ec113434	Use a RecyclingAllocator to allocate values for MachineCSE's ScopedHashTable for a 28% speedup of MachineCSE time on 403.gcc. llvm-svn: 122735	2011-01-03 04:07:46 +00:00
Chris Lattner	e396e846b4	split dom frontier handling stuff out to its own DominanceFrontier header, so that Dominators.h is just domtree. Also prune #includes a bit. llvm-svn: 122714	2011-01-02 22:09:33 +00:00
Benjamin Kramer	a58b69aa9d	Try to reuse the value when lowering memset. This allows us to compile: void test(char *s, int a) { __builtin_memset(s, a, 15); } into 1 mul + 3 stores instead of 3 muls + 3 stores. llvm-svn: 122710	2011-01-02 19:57:05 +00:00
Benjamin Kramer	38491f47ce	Lower the i8 extension in memset to a multiply instead of a potentially long series of shifts and ors. We could implement a DAGCombine to turn x * 0x0101 back into logic operations on targets that doesn't support the multiply or it is slow (p4) if someone cares enough. Example code: void test(char *s, int a) { __builtin_memset(s, a, 4); } before: _test: ## @test movzbl 8(%esp), %eax movl %eax, %ecx shll $8, %ecx orl %eax, %ecx movl %ecx, %eax shll $16, %eax orl %ecx, %eax movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret after: _test: ## @test movzbl 8(%esp), %eax imull $16843009, %eax, %eax ## imm = 0x1010101 movl 4(%esp), %ecx movl %eax, 4(%ecx) movl %eax, (%ecx) ret llvm-svn: 122707	2011-01-02 19:44:58 +00:00
Cameron Zwarich	ae468579bb	Use getVRegDef() instead of def_iterator. This leads to fewer defs being added with 2-address instructions, for about a 3.5% speedup of StrongPHIElimination on 403.gcc. llvm-svn: 122635	2010-12-30 00:42:23 +00:00
Cameron Zwarich	1e7124e6fa	None of the other pass names in CodeGen have terminating periods. llvm-svn: 122628	2010-12-29 11:49:10 +00:00
Cameron Zwarich	a7052a3c06	Instead of processing every instruction when splitting interferences, only process those instructions that define phi sources. This is a 47% speedup of StrongPHIElimination compile time on 403.gcc. llvm-svn: 122627	2010-12-29 11:00:09 +00:00
Cameron Zwarich	292870da06	Add a missing word to a comment. llvm-svn: 122625	2010-12-29 04:42:39 +00:00
Cameron Zwarich	6fc15ba38b	Add text explaining an assertion. llvm-svn: 122617	2010-12-29 03:52:51 +00:00
Cameron Zwarich	0fa638e27c	Simplify some code in MachineVerifier that was doing the correct thing, but not in the most obvious way. llvm-svn: 122610	2010-12-28 23:45:38 +00:00
Cameron Zwarich	3eacb7fff8	Revert the optimization in r122596. It is correct for all current targets, but it relies on assumptions that may not be true in the future. llvm-svn: 122608	2010-12-28 23:02:56 +00:00
Cameron Zwarich	c9c7488542	Avoid iterating every operand of an instruction in StrongPHIElimination, since we are only interested in the defs when discovering interferences. This is a 28% speedup running StrongPHIElimination on 403.gcc. llvm-svn: 122596	2010-12-28 10:49:33 +00:00
Duncan Sands	cc5a4497fd	Pacify the compiler. BestWeight cannot in fact be used uninitialized in this function, but the compiler was warning that it might be when doing a release build. llvm-svn: 122595	2010-12-28 10:07:15 +00:00
Cameron Zwarich	30f2239301	Change an assertion to assert what the code actually relies upon. llvm-svn: 122586	2010-12-27 22:08:42 +00:00
Cameron Zwarich	cfdb10a1bb	Land a first cut at StrongPHIElimination. There are only 5 new test failures when running without the verifier, and I have not yet checked them to see if the new results are still correct. There are more verifier failures, but they all seem to be additional occurrences of verifier failures that occur with the existing PHIElimination pass. There are a few obvious issues with the code: 1) It doesn't properly update the register equivalence classes during copy insertion, and instead recomputes them before merging live intervals and renaming registers. I wanted to keep this first patch simple for debugging purposes, but it shouldn't be very hard to do this. 2) It doesn't mix the renaming and live interval merging with the copy insertion process, which leads to a lot of virtual register churn. Virtual registers and live intervals are created, only to later be merged into others. The code should be smarter and only create a new virtual register if there is no existing register in the same congruence class. 3) In one place the code uses a DenseMap per basic block, which is unnecessary heap allocation. There should be an inline storage version of DenseMap. I did a quick compile-time test of running llc on 403.gcc with and without StrongPHIElimination. It is slightly slower with StrongPHIElimination, because the small decrease in the coalescer runtime can't beat the increase in phi elimination runtime. Perhaps fixing the above performance issues will narrow the gap. I also haven't yet run any tests of the quality of the generated code. llvm-svn: 122582	2010-12-27 10:08:19 +00:00
Cameron Zwarich	66289e34e1	Add knowledge of phi-def and phi-kill valnos to MachineVerifier's predecessor valno verification. The "Different value live out of predecessor" check is incorrect in the case of phi-def valnos, so just skip that check for phi-def valnos and instead check that all of the valnos for predecessors have phi-kill. Fixes PR8863. llvm-svn: 122581	2010-12-27 05:17:23 +00:00
Andrew Trick	dfa31b1cf9	Minor cleanup related to my latest scheduler changes. llvm-svn: 122545	2010-12-24 07:10:19 +00:00
Andrew Trick	c926e98fc7	Fix a few cases where the scheduler is not checking for phys reg copies. The scheduling node may have a NULL DAG node, yuck. llvm-svn: 122544	2010-12-24 06:46:50 +00:00
Andrew Trick	134b2a5907	Various bits of framework needed for precise machine-level selection DAG scheduling during isel. Most new functionality is currently guarded by -enable-sched-cycles and -enable-sched-hazard. Added InstrItineraryData::IssueWidth field, currently derived from ARM itineraries, but could be initialized differently on other targets. Added ScheduleHazardRecognizer::MaxLookAhead to indicate whether it is active, and if so how many cycles of state it holds. Added SchedulingPriorityQueue::HasReadyFilter to allowing gating entry into the scheduler's available queue. ScoreboardHazardRecognizer now accesses the ScheduleDAG in order to get information about it's SUnits, provides RecedeCycle for bottom-up scheduling, correctly computes scoreboard depth, tracks IssueCount, and considers potential stall cycles when checking for hazards. ScheduleDAGRRList now models machine cycles and hazards (under flags). It tracks MinAvailableCycle, drives the hazard recognizer and priority queue's ready filter, manages a new PendingQueue, properly accounts for stall cycles, etc. llvm-svn: 122541	2010-12-24 05:03:26 +00:00
Andrew Trick	53f4556c64	whitespace llvm-svn: 122539	2010-12-24 04:28:06 +00:00
Cameron Zwarich	a7ad357a13	Simplify a check for implicit defs and remove a FIXME. llvm-svn: 122537	2010-12-24 03:09:36 +00:00
Chris Lattner	b607e7deda	flags -> glue for selectiondag llvm-svn: 122509	2010-12-23 17:24:32 +00:00
Chris Lattner	fb9ff7a4ff	sdisel flag -> glue. llvm-svn: 122507	2010-12-23 17:13:18 +00:00
Andrew Trick	ca2e267ddc	Reorganize ListScheduleBottomUp in preparation for modeling machine cycles and instruction issue. llvm-svn: 122491	2010-12-23 05:42:20 +00:00
Andrew Trick	c046a115d4	Converted LiveRegCycles to LiveRegGens. It's easier to work with and allows multiple nodes per cycle. llvm-svn: 122474	2010-12-23 04:16:14 +00:00
Andrew Trick	e48d5d8395	In CheckForLiveRegDef use TRI->getOverlaps. llvm-svn: 122473	2010-12-23 03:43:21 +00:00
Andrew Trick	cc701bcfdc	Fixes PR8823: add-with-overflow-128.ll In the bottom-up selection DAG scheduling, handle two-address instructions that read/write unspillable registers. Treat the entire chain of two-address nodes as a single live range. llvm-svn: 122472	2010-12-23 03:15:51 +00:00
Jeffrey Yasskin	a199652a3e	Change all self assignments X=X to (void)X, so that we can turn on a new gcc warning that complains on self-assignments and self-initializations. llvm-svn: 122458	2010-12-23 00:58:24 +00:00
Benjamin Kramer	49942a90b7	DAGCombine add (sext i1), X into sub X, (zext i1) if sext from i1 is illegal. The latter usually compiles into smaller code. example code: unsigned foo(unsigned x, unsigned y) { if (x != 0) y--; return y; } before: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] sbbl %eax, %eax ## encoding: [0x19,0xc0] notl %eax ## encoding: [0xf7,0xd0] addl 8(%esp), %eax ## encoding: [0x03,0x44,0x24,0x08] ret ## encoding: [0xc3] after: _foo: ## @foo cmpl $1, 4(%esp) ## encoding: [0x83,0x7c,0x24,0x04,0x01] movl 8(%esp), %eax ## encoding: [0x8b,0x44,0x24,0x08] adcl $-1, %eax ## encoding: [0x83,0xd0,0xff] ret ## encoding: [0xc3] llvm-svn: 122455	2010-12-22 23:17:45 +00:00
Jakob Stoklund Olesen	f761c75efb	When RegAllocGreedy decides to spill the interferences of the current register, pick the victim with the lowest total spill weight. llvm-svn: 122445	2010-12-22 22:01:30 +00:00
Jakob Stoklund Olesen	71e527ef4b	Include a shadow of the original CFG edges in the edge bundle graph. llvm-svn: 122444	2010-12-22 22:01:28 +00:00
Chris Lattner	04ef853e23	Fix a bug in ReduceLoadWidth that wasn't handling extending loads properly. We miscompiled the testcase into: _test: ## @test movl $128, (%rdi) movzbl 1(%rdi), %eax ret Now we get a proper: _test: ## @test movl $128, (%rdi) movsbl (%rdi), %eax movzbl %ah, %eax ret This fixes PR8757. llvm-svn: 122392	2010-12-22 08:02:57 +00:00
Chris Lattner	35fcc63498	more cleanups, move a check for "roundedness" earlier to reject unhanded cases faster and simplify code. llvm-svn: 122391	2010-12-22 08:01:44 +00:00
Chris Lattner	60dcb2b5c2	reduce indentation and improve comments, no functionality change. llvm-svn: 122389	2010-12-22 07:36:50 +00:00
Andrew Trick	afec190a28	In DelayForLiveRegsBottomUp, handle instructions that read and write the same physical register. Simplifies the fix from the previous checkin r122211. llvm-svn: 122370	2010-12-21 22:27:44 +00:00
Andrew Trick	1e3ad9f721	whitespace llvm-svn: 122368	2010-12-21 22:25:04 +00:00
Dale Johannesen	e0fb87c3d7	Reapply 122353-122355 with fixes. 122354 was wrong; the shift type was needed one place, the shift count type another. The transform in 123555 had the same problem. llvm-svn: 122366	2010-12-21 21:55:50 +00:00

1 2 3 4 5 ...

11070 Commits