llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-12-30 15:45:26 +00:00

Author	SHA1	Message	Date
Ted Kremenek	de82fd5282	Update CMake build. llvm-svn: 154622	2012-04-12 22:15:23 +00:00
Evandro Menezes	dcd4bebf98	Hexagon: fix CMake error. llvm-svn: 154620	2012-04-12 21:44:58 +00:00
Sirish Pande	ff74c0b4e8	HexagonPacketizer patch. llvm-svn: 154616	2012-04-12 21:06:38 +00:00
Preston Gurd	6e9bcca355	This patch improves the MCJIT runtime dynamic loader by adding new handling of zero-initialized sections, virtual sections and common symbols and preventing the loading of sections which are not required for execution such as debug information. Patch by Andy Kaylor! llvm-svn: 154610	2012-04-12 20:13:57 +00:00
Evan Cheng	d9958dcd91	Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106 llvm-svn: 154604	2012-04-12 19:14:21 +00:00
Evandro Menezes	f199e6b61f	Hexagon: enable assembler output through the MC layer. llvm-svn: 154597	2012-04-12 17:55:53 +00:00
Benjamin Kramer	c672ae3ee2	Remove README entry obsoleted by register masks. llvm-svn: 154588	2012-04-12 12:47:29 +00:00
Craig Topper	448790d566	Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions. llvm-svn: 154580	2012-04-12 07:23:00 +00:00
Jim Grosbach	ceb845983c	ARM 'adr' fixups don't need the interworking addend tweaking. They reference the PC directly, so things work properly that way. rdar://11231229 llvm-svn: 154576	2012-04-12 01:19:35 +00:00
Akira Hatanaka	48dbb62cb1	Emit neg.s or neg.d only if -enable-no-nans-fp-math is supplied by user, otherwise expand FNEG during legalization. llvm-svn: 154546	2012-04-11 22:59:08 +00:00
Akira Hatanaka	11a442d515	Emit abs.s or abs.d only if -enable-no-nans-fp-math is supplied by user. Invalid operation is signaled if the operand of these instructions is NaN. llvm-svn: 154545	2012-04-11 22:49:04 +00:00
Kevin Enderby	64c95fb56a	Fixed a case of ARM disassembly getting an assert on a bad encoding of a VST instruction. llvm-svn: 154544	2012-04-11 22:40:17 +00:00
Akira Hatanaka	6636922675	Fix bugs in lowering of FCOPYSIGN nodes. - FCOPYSIGN nodes that have operands of different types were not handled. - Different code was generated depending on the endianness of the target. Additionally, code is added that emits INS and EXT instructions, if they are supported by target (they are R2 instructions). llvm-svn: 154540	2012-04-11 22:13:04 +00:00
Chad Rosier	b41586c8e1	Typo. llvm-svn: 154522	2012-04-11 19:21:58 +00:00
Jim Grosbach	86b5cd7421	ARM 'vuzp.32 Dd, Dm' is a pseudo-instruction. While there is an encoding for it in VUZP, the result of that is undefined, so we should avoid it. Define the instruction as a pseudo for VTRN.32 instead, as the ARM ARM indicates. rdar://11222366 llvm-svn: 154511	2012-04-11 17:40:18 +00:00
Jim Grosbach	e54b48cd74	ARM 'vzip.32 Dd, Dm' is a pseudo-instruction. While there is an encoding for it in VZIP, the result of that is undefined, so we should avoid it. Define the instruction as a pseudo for VTRN.32 instead, as the ARM ARM indicates. rdar://11221911 llvm-svn: 154505	2012-04-11 16:53:25 +00:00
Sylvestre Ledru	40d3066f8b	Fix the build under Debian GNU/Hurd. Thanks to Pino Toscano for the patch llvm-svn: 154500	2012-04-11 15:35:36 +00:00
Benjamin Kramer	eba5ed591b	Cache the hash value of the operands in the MDNode. FoldingSet is implemented as a chained hash table. When there is a hash collision during insertion, which is common as we fill the table until a load factor of 2.0 is hit, we walk the chained elements, comparing every operand with the new element's operands. This can be very expensive if the MDNode has many operands. We sacrifice a word of space in MDNode to cache the full hash value, reducing compares on collision to a minimum. MDNode grows from 28 to 32 bytes + operands on x86. On x86_64 the new bits fit nicely into existing padding, not growing the struct at all. The actual speedup depends a lot on the test case and is typically between 1% and 2% for C++ code with clang -c -O0 -g. llvm-svn: 154497	2012-04-11 14:06:54 +00:00
Benjamin Kramer	c1e98c85e2	FoldingSet: Push the hash through FoldingSetTraits::Equals, so clients can use it. llvm-svn: 154496	2012-04-11 14:06:47 +00:00
Benjamin Kramer	3a0f5a0df3	Compute hashes directly with hash_combine instead of taking a detour through FoldingSetNodeID. llvm-svn: 154495	2012-04-11 14:06:39 +00:00
Nadav Rotem	210f92b306	remove unused argument llvm-svn: 154494	2012-04-11 11:05:21 +00:00
Duncan Sands	da21cc27c0	Add a C binding to the Target and TargetMachine classes to allow for emitting binary and assembly. Patch by Carlo Kok. Emitting was inspired by but not based on the D llvm bindings. llvm-svn: 154493	2012-04-11 10:25:24 +00:00
Chandler Carruth	80c3e3bbba	Add two statistics to help track how we are computing the inline cost. Yea, 'NumCallerCallersAnalyzed' isn't a great name, suggestions welcome. llvm-svn: 154492	2012-04-11 10:15:10 +00:00
Nadav Rotem	b05ea8c9af	Reapply 154397. Original message: Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154490	2012-04-11 08:26:11 +00:00
Evan Cheng	f138fb4599	Add more fused mul+add/sub patterns. rdar://10139676 llvm-svn: 154484	2012-04-11 06:59:47 +00:00
Nadav Rotem	c922b4f2a3	Reapply 154396 after fixing a test. Original message: Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendV uses a register for the selection while Vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154483	2012-04-11 06:40:27 +00:00
Evan Cheng	f9baff015d	Clean up ARM fused multiply + add/sub support some more: rename some isel predicates. Also remove NEON2 since it's not really useful and it is confusing. If NEON + VFP4 implies NEON2 but NEON2 doesn't imply NEON + VFP4, what does it really mean? rdar://10139676 llvm-svn: 154480	2012-04-11 05:33:07 +00:00
Craig Topper	28df4bf296	Fix an overly indented line. Remove an 'else' after an 'if' that returns. llvm-svn: 154479	2012-04-11 04:55:51 +00:00
Craig Topper	82772b86d6	Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to a nodetype. llvm-svn: 154478	2012-04-11 04:34:11 +00:00
Craig Topper	0590d2cdea	Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit. llvm-svn: 154473	2012-04-11 03:06:35 +00:00
Evan Cheng	b5291aea18	Match (fneg (fma) to vfnma. rdar://10139676 llvm-svn: 154469	2012-04-11 01:21:25 +00:00
Charles Davis	a5e1970cd0	Add retw and lretw instructions. Also, fix Intel syntax parsing for all ret instructions. llvm-svn: 154468	2012-04-11 01:10:53 +00:00
Kevin Enderby	304e4812bc	Fix ARM disassembly of VLD instructions with writebacks, and add a test case for all opcodes handled by DecodeVLDInstruction() in ARMDisassembler.cpp. llvm-svn: 154459	2012-04-11 00:25:40 +00:00
Jim Grosbach	b10b1b22cb	ARM add missing Thumb1 two-operand aliases for shift-by-immediate. rdar://11222742 llvm-svn: 154457	2012-04-11 00:15:16 +00:00
Evan Cheng	12bfe1150d	Fix a number of problems with ARM fused multiply add/subtract instructions. 1. The new instruction itinerary entries are not properly described. 2. The asm parser can't handle vfms and vfnms. 3. There were no assembler, disassembler test cases. 4. HasNEON2 has the wrong assembler predicate. rdar://10139676 llvm-svn: 154456	2012-04-11 00:13:00 +00:00
Jakob Stoklund Olesen	4bfc07ceb5	Tweak MachineLICM heuristics for cheap instructions. Allow cheap instructions to be hoisted if they are register pressure neutral or better. This happens if the instruction is the last loop use of another virtual register. Only expensive instructions are allowed to increase loop register pressure. llvm-svn: 154455	2012-04-11 00:00:28 +00:00
Jakob Stoklund Olesen	b1ec8d8548	Only check for PHI uses inside the current loop. Hoisting a value that is used by a PHI in the loop will introduce a copy because the live range is extended to cross the PHI. The same applies to PHIs in exit blocks. Also use this opportunity to make HasLoopPHIUse() non-recursive. llvm-svn: 154454	2012-04-11 00:00:26 +00:00
Owen Anderson	a8319713a4	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447	2012-04-10 22:46:53 +00:00
Kostya Serebryany	3047a70ed9	[tsan] two more compile-time optimizations: - don't instrument reads from constant globals. Saves ~1.5% of instrumented instructions on CPU2006 (counting static instructions, not their execution). - don't instrument reads from vtable (which is a global constant too). Saves ~5%. I did not measure the run-time impact of this, but it is certainly non-negative. llvm-svn: 154444	2012-04-10 22:29:17 +00:00
Evan Cheng	f9617f7f54	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Duncan Sands	6d360055c5	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Bill Wendling	16712e549c	The MDString class stored a StringRef to the string which was already in a StringMap. This was redundant and unnecessarily bloated the MDString class. Because the MDString class is a "Value" and will never have a "name", and because the Name field in the Value class is a pointer to a StringMap entry, we repurpose the Name field for an MDString. It stores the StringMap entry in the Name field, and uses the normal methods to get the string (name) back. PR12474 llvm-svn: 154429	2012-04-10 20:12:16 +00:00
Chad Rosier	b2ebb93f3c	Whitespace. llvm-svn: 154427	2012-04-10 19:42:07 +00:00
Chad Rosier	f3b2588ea8	Revert r154396, which looks to be the real culprit behind the bot failures. llvm-svn: 154426	2012-04-10 19:39:18 +00:00
Eric Christopher	f8886e8f48	Temporarily revert this patch to see if it brings the buildbots back. llvm-svn: 154425	2012-04-10 19:33:16 +00:00
Kostya Serebryany	01d463472d	[tsan] compile-time instrumentation: do not instrument a read if a write to the same temp follows in the same BB. Also add stats printing. On Spec CPU2006 this optimization saves roughly 4% of instrumented reads (which is 3% of all instrumented accesses): Writes : 161216 Reads : 446458 Reads-before-write: 18295 llvm-svn: 154418	2012-04-10 18:18:56 +00:00
Eric Christopher	ec1405e930	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417	2012-04-10 18:18:10 +00:00
Owen Anderson	540d48ddb5	Revert r154397, which was causing make check failures on the buildbots. llvm-svn: 154414	2012-04-10 18:02:12 +00:00
Jim Grosbach	d32f050f68	ARM fix cc_out operand handling for t2SUBrr instructions. We were incorrectly conflating some add variants which don't have a cc_out operand with the mirroring sub encodings, which do. Part of the awesome non-orthogonality legacy of thumb1. Similarly, handling of add/sub of an immediate was sometimes incorrectly removing the cc_out operand for add/sub register variants. rdar://11216577 llvm-svn: 154411	2012-04-10 17:31:55 +00:00
David Blaikie	cf463882c0	Remove unused variable. llvm-svn: 154398	2012-04-10 15:23:13 +00:00
Nadav Rotem	e5008bb774	Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154397	2012-04-10 14:58:31 +00:00
Nadav Rotem	74f87a6bd8	Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendv uses a register for the selection while vblend uses an immediate. On sandybridge they still have the same latency and execute on the same execution ports. llvm-svn: 154396	2012-04-10 14:33:13 +00:00
Chandler Carruth	3c9796d9b0	Make a somewhat subtle change in the logic of block placement. Sometimes the loop header has a non-loop predecessor which has been pre-fused into its chain due to unanalyzable branches. In this case, rotating the header into the body of the loop in order to place a loop exit at the bottom of the loop is a Very Bad Idea as it makes the loop non-contiguous. I'm working on a good test case for this, but it's a bit annoying to craft. I should get one shortly, but I'm submitting this now so I can begin the (lengthy) performance analysis process. An initial run of LNT looks really, really good, but there is too much noise there for me to trust it much. llvm-svn: 154395	2012-04-10 13:35:57 +00:00
Anton Korobeynikov	0fc5fe0430	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (workarounded) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
David Chisnall	a098752b13	Use the correct section types on Solaris for unwind data on both x86 and x86-64. Patch by Dmitri Shubin! llvm-svn: 154391	2012-04-10 11:44:33 +00:00
Duncan Sands	f25460b85f	Express the number of ULPs in fpaccuracy metadata as a real rather than a rational number, eg as 2.5 rather than 5, 2. OK'd by Peter Collingbourne. llvm-svn: 154387	2012-04-10 08:22:43 +00:00
Andrew Trick	7230fee696	Fix 12513: Loop unrolling breaks with indirect branches. Take this opportunity to generalize the indirectbr bailout logic for loop transformations. CFG transformations will never get indirectbr right, and there's no point trying. llvm-svn: 154386	2012-04-10 05:14:42 +00:00
Andrew Trick	83a330c1b9	whitespace llvm-svn: 154385	2012-04-10 05:14:37 +00:00
Evan Cheng	460634e917	Make the code slightly more palatable. llvm-svn: 154378	2012-04-10 03:15:18 +00:00
Danil Malyshev	6db4fe8581	Add a constructor for DataRefImpl and remove excess initialization. llvm-svn: 154371	2012-04-10 01:54:44 +00:00
Evan Cheng	5825e9dbf5	Fix a long standing tail call optimization bug. When a libcall is emitted, the legalizer always uses the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Rafael Espindola	9febd1fbf7	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Jim Grosbach	3c0465899e	ARM LDR/LDRT has the same encoding collision as STR/STRT. Generalized logic of r154141. llvm-svn: 154362	2012-04-10 00:13:07 +00:00
Akira Hatanaka	1b46e841a2	Have TargetLowering::getPICJumpTableRelocBase return a node that points to the GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341	2012-04-09 20:32:12 +00:00
Chad Rosier	a588421976	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather than a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Lang Hames	751eb83306	Patch r153892 for PR11861 apparently broke an external project (see PR12493). This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when rescheduling instructions in TryInstructionTransform. Hopefully this will fix PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after the copy that unties the operands is emitted (this seems to be a more appropriate fix for that issue anyway). llvm-svn: 154338	2012-04-09 20:17:30 +00:00
Chad Rosier	b7c56882e4	Update comments and remove unnecessary isVolatile() check. llvm-svn: 154336	2012-04-09 19:38:15 +00:00
David Blaikie	0f75c2a359	Fix accidentally constant conditions found by uncommitted improvements to -Wconstant-conversion. A couple of cases where we were accidentally creating constant conditions by something like "x == a || b" instead of "x == a || x == b". In one case a conditional & then unreachable was used - I transformed this into a direct assert instead. llvm-svn: 154324	2012-04-09 16:29:35 +00:00
Rafael Espindola	6b7bf4d0aa	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Preston Gurd	c758aebf45	This patch adds X86 instruction itineraries, which were missed by the original patch to add itineraries, to X86InstrArithmetc.td. llvm-svn: 154320	2012-04-09 15:32:22 +00:00
Nadav Rotem	9f7f17826e	Lower some x86 shuffle sequences to the vblend family of instructions. llvm-svn: 154313	2012-04-09 08:33:21 +00:00
Nadav Rotem	4499fb1d50	Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type. Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering. llvm-svn: 154310	2012-04-09 07:45:58 +00:00
Craig Topper	b06257c64d	Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309	2012-04-09 07:19:09 +00:00
Craig Topper	a248e92058	Remove unnecessary 'else' on an 'if' that always returns llvm-svn: 154308	2012-04-09 05:59:53 +00:00
Craig Topper	ee38217fe4	Optimize code slightly. No functionality change. llvm-svn: 154307	2012-04-09 05:55:33 +00:00
Craig Topper	24c4646a77	Replace some explicit checks with asserts for conditions that should never happen. llvm-svn: 154305	2012-04-09 05:16:56 +00:00
Chandler Carruth	bb1db0e66a	Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes. This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed. To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases: 1) 32-bit is trivially correct, and unmodified here. 2) 64-bit non-small mode is unchanged and never matches. 3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch. 4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged. 5) 64-bit small PIC code which is not using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering). I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it. llvm-svn: 154304	2012-04-09 02:13:06 +00:00
Craig Topper	1960db33c0	Optimize code a bit. No functional change intended. llvm-svn: 154299	2012-04-08 23:15:04 +00:00
Benjamin Kramer	e99c184047	Silence sign-compare warning. llvm-svn: 154297	2012-04-08 19:04:45 +00:00
Duncan Sands	28b9aa998e	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Craig Topper	e0c286243b	Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently. llvm-svn: 154295	2012-04-08 17:53:33 +00:00
Chandler Carruth	11c412fd2c	Teach LLVM about a PIE option which, when enabled on top of PIC, makes optimizations which are valid for position independent code being linked into a single executable, but not for such code being linked into a shared library. I discussed the design of this with Eric Christopher, and the decision was to support an optional bit rather than a completely separate relocation model. Fundamentally, this is still PIC relocation, it's just that certain optimizations are only valid under a PIC relocation model when the resulting code won't be in a shared library. The simplest path to here is to expose a single bit option in the TargetOptions. If folks have different/better designs, I'm all ears. =] I've included the first optimization based upon this: changing TLS models to the *Exec models when PIE is enabled. This is the LLVM component of PR12380 and is all of the hard work. llvm-svn: 154294	2012-04-08 17:51:45 +00:00
Chandler Carruth	233e7232ae	Move the TLSModel information into the TargetMachine rather than hiding in TargetLowering. There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE. llvm-svn: 154292	2012-04-08 17:20:55 +00:00
Benjamin Kramer	2f6ec0fcc0	EngineBuilder::create is expected to take ownership of the TargetMachine passed to it. Delete it on error or when we create an interpreter that doesn't need it. llvm-svn: 154288	2012-04-08 14:53:14 +00:00
Chandler Carruth	5ec9b9fd94	Remove an over zealous assert. The assert was trying to catch places where a chain outside of the loop block-set ended up in the worklist for scheduling as part of the contiguous loop. However, asserting the first block in the chain is in the loop-set isn't a valid check -- we may be forced to drag a chain into the worklist due to one block in the chain being part of the loop even though the first block is not in the loop. This occurs when we have been forced to form a chain early due to un-analyzable branches. No test case here as I have no idea how to even begin reducing one, and it will be hopelessly fragile. We have to somehow end up with a loop header of an inner loop which is a successor of a basic block with an unanalyzable pair of branch instructions. Ow. Self-host triggers it so it is unlikely it will regress. This at least gets block placement back to passing selfhost and the test suite. There are still a lot of slowdown that I don't like coming out of block placement, although there are now also a lot of speedups. =[ I'm seeing swings in both directions up to 10%. I'm going to try to find time to dig into this and see if we can turn this on for 3.1 as it does a really good job of cleaning up after some loops that degraded with the inliner changes. llvm-svn: 154287	2012-04-08 14:37:02 +00:00
Chandler Carruth	2fe7a17703	Add a debug-only 'dump' method to the BlockChain structure to ease debugging. llvm-svn: 154286	2012-04-08 14:37:01 +00:00
Chandler Carruth	b3fb4be360	Teach InstCombine to nuke a common alloca pattern -- an alloca which has GEPs, bit casts, and stores reaching it but no other instructions. These often show up during the iterative processing of the inliner, SROA, and DCE. Once we hit this point, we can completely remove the alloca. These were actually showing up in the final, fully optimized code in a bunch of inliner tests I've been working on, and notably they show up after LLVM finishes optimizing away all function calls involved in hash_combine(a, b). llvm-svn: 154285	2012-04-08 14:36:56 +00:00
Nadav Rotem	8957364ae5	AVX2: Build splat vectors by broadcasting a scalar from the constant pool. Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool. llvm-svn: 154284	2012-04-08 12:54:54 +00:00
Bill Wendling	15f23d4018	Remove the 'Parent' pointer from the MDNodeOperand class. An MDNode has a list of MDNodeOperands allocated directly after it as part of its allocation. Therefore, the Parent of the MDNodeOperands can be found by walking back through the operands to the beginning of that list. Mark the first operand's value pointer as being the 'first' operand so that we know where the beginning of said list is. This saves a lot of space during LTO with -O0 -g flags. llvm-svn: 154280	2012-04-08 10:20:49 +00:00
Bill Wendling	d13bec8fa9	Allow subclasses of the ValueHandleBase to store information as part of the value pointer by making the value pointer into a pointer-int pair with 2 bits available for flags. llvm-svn: 154279	2012-04-08 10:16:43 +00:00
Craig Topper	a6412fb8c0	Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1. llvm-svn: 154272	2012-04-07 22:32:29 +00:00
Craig Topper	1ddf62dc2c	Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns. llvm-svn: 154268	2012-04-07 21:57:43 +00:00
Craig Topper	d40e2513b2	Remove 'else' after 'if' that ends in return. llvm-svn: 154267	2012-04-07 21:23:41 +00:00
Nadav Rotem	37734277f0	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	cd52f3d447	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
Chandler Carruth	2817fc1e53	Fix ValueTracking to conclude that debug intrinsics are safe to speculate. Without this, loop rotate (among many other places) would suddenly stop working in the presence of debug info. I found this looking at loop rotate, and have augmented its tests with a reduction out of a very hot loop in yacr2 where failing to do this rotation costs sometimes more than 10% in runtime performance, perturbing numerous downstream optimizations. This should have no impact on performance without debug info, but the change in performance when debug info is enabled can be extreme. As a consequence (and this is how I got to this yak) any profiling of performance problems should be treated with deep suspicion -- they may have been wildly inaccurate if debug info was enabled for profiling. =/ Just a heads up. llvm-svn: 154263	2012-04-07 19:22:18 +00:00
Benjamin Kramer	a690750db9	SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW. Found by inspection. llvm-svn: 154262	2012-04-07 17:19:26 +00:00
Bob Wilson	059dbb715f	Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543> The tLDRr instruction with the last register operand set to the zero register prints in assembly as if no register was specified, and the assembler encodes it as a tLDRi instruction with a zero immediate. With the integrated assembler, that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which is broken. Emit the instruction as tLDRi with a zero immediate. I don't know if there's a good way to write a testcase for this. Suggestions welcome. Opportunities for follow-up work: 1) The asm printer should complain if a non-optional register operand is set to the zero register, instead of silently dropping it. 2) The integrated assembler should complain in the same situation, instead of silently emitting the operand as "r0". llvm-svn: 154261	2012-04-07 16:51:59 +00:00
Hongbin Zheng	48758c581f	Refactor: Use positive field names in VectorizeConfig. llvm-svn: 154249	2012-04-07 03:56:23 +00:00
NAKAMURA Takumi	1bfc716b7d	Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming. Cygwin-1.7 supports dw2. Some recent mingw distros support one, too. I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin. llvm-svn: 154247	2012-04-07 02:24:20 +00:00
Alexis Hunt	5c14769849	Output UTF-8-encoded characters as identifier characters into assembly by default. This is a behaviour configurable in the MCAsmInfo. I've decided to turn it on by default in (possibly optimistic) hopes that most assemblers are reasonably sane. If this proves a problem, switching to default seems reasonable. I'm not sure if this is the opportune place to test, but it seemed good to make sure it was tested somewhere. llvm-svn: 154235	2012-04-07 00:37:53 +00:00
Jim Grosbach	249356cbf3	Tidy up. 80 columns. llvm-svn: 154226	2012-04-06 23:43:50 +00:00
Jakob Stoklund Olesen	446611ae2a	ARMPat is equivalent to Requires<[IsARM]>. llvm-svn: 154210	2012-04-06 21:21:59 +00:00
Jakob Stoklund Olesen	ce15da8935	Eliminate iOS-specific tail call instructions. After register masks were introduced to represent the call clobbers, it is no longer necessary to have duplicate instructions for iOS. llvm-svn: 154209	2012-04-06 21:17:42 +00:00
Chandler Carruth	55fe352a8c	There is no portable std::abs overload for int64_t, use the llvm::abs64 which exists for this purpose. llvm-svn: 154199	2012-04-06 20:10:52 +00:00
Sean Callanan	f467eceb2a	Fixed two leaks in the MC disassembler. The MC disassembler requires a MCSubtargetInfo and a MCInstrInfo to exist in order to initialize the instruction printer and disassembler; however, although the printer and disassembler keep references to these objects they do not own them. Previously, the MCSubtargetInfo and MCInstrInfo objects were just leaked. I have extended LLVMDisasmContext to own these objects and delete them when it is destroyed. llvm-svn: 154192	2012-04-06 18:21:09 +00:00
Jakob Stoklund Olesen	bb7b631def	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
David Chisnall	599e54b905	Reintroduce InlineCostAnalyzer::getInlineCost() variant with explicit callee parameter until we have a more sensible API for doing the same thing. Reviewed by Chandler. llvm-svn: 154180	2012-04-06 17:27:41 +00:00
Chandler Carruth	020a15db9d	Sink the collection of return instructions until after all simplification has been performed. This is a bit less efficient (requires another ilist walk of the basic blocks) but shouldn't matter in practice. More importantly, it's just too much work to keep track of all the various ways the return instructions can be mutated while simplifying them. This fixes yet another crasher, reported by Daniel Dunbar. llvm-svn: 154179	2012-04-06 17:21:31 +00:00
Duncan Sands	c7d0fdb71f	Make GVN's propagateEquality non-recursive. No intended functionality change. The modifications are a lot more trivial than they appear to be in the diff! llvm-svn: 154174	2012-04-06 15:31:09 +00:00
Benjamin Kramer	103f74e9f8	Fix narrowing conversion. llvm-svn: 154171	2012-04-06 13:33:52 +00:00
Craig Topper	ffae2f8986	Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413. llvm-svn: 154166	2012-04-06 07:45:23 +00:00
Chandler Carruth	dc52b30dac	Sink the return instruction collection until after we're done deleting dead code, including dead return instructions in some cases. Otherwise, we end up having a bogus pointer to a return instruction that blows up much further down the road. It turns out that this pattern is both simpler to code, easier to update in the face of enhancements to the inliner cleanup, and likely cheaper given that it won't add dead instructions to the list. Thanks to John Regehr's numerous test cases for teasing this out. llvm-svn: 154157	2012-04-06 01:11:52 +00:00
Jakob Stoklund Olesen	96c573a6c4	Deduplicate ARM call-related instructions. We had special instructions for iOS because r9 is call-clobbered, but that is represented dynamically by the register mask operands now, so there is no need for the pseudo-instructions. llvm-svn: 154144	2012-04-06 00:04:58 +00:00
Jim Grosbach	e1c687cc0a	ARM: Don't form a t2LDRi8 or t2STRi8 with an offset of zero. The load/store optimizer splits LDRD/STRD into two instructions when the register pairing doesn't work out. For negative offsets in Thumb2, it uses t2STRi8 to do that. That's fine, except for the case when the offset is in the range [-4,-1]. In that case, we'll also form a second t2STRi8 with the original offset plus 4, resulting in a t2STRi8 with a non-negative offset, which ends up as if it were an STRT, which is completely bogus. Similarly for loads. No testcase, unfortunately, as any I've been able to construct is both large and extremely fragile. rdar://11193937 llvm-svn: 154141	2012-04-05 23:51:24 +00:00
Jim Grosbach	2169e1d55c	ARM assembly aliases for add negative immediates using sub. 'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out. Thumb1 aliases for adding a negative immediate to the stack pointer, also. rdar://11192734 llvm-svn: 154123	2012-04-05 20:57:13 +00:00
Eric Christopher	2e17b32e69	Patch to set is_stmt a little better for prologue lines in a function. This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function. rdar://9852092 llvm-svn: 154120	2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen	28edb011c4	Don't break the IV update in TLI::SimplifySetCC(). LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IC update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> llvm-svn: 154119	2012-04-05 20:30:20 +00:00
Dan Gohman	a5e2200b2a	Fix accidentally inverted logic from r152803, and make the testcase slightly less trivial. This fixes rdar://11171718. llvm-svn: 154118	2012-04-05 20:27:21 +00:00
Owen Anderson	b21312019c	Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection. llvm-svn: 154113	2012-04-05 18:50:32 +00:00
Silviu Baranga	f376e00699	Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions. llvm-svn: 154101	2012-04-05 16:19:29 +00:00
Silviu Baranga	1c2668f700	Added support for handling unpredictable arithmetic instructions on ARM. llvm-svn: 154100	2012-04-05 16:13:15 +00:00
Hongbin Zheng	4da3f9fa46	BBVectorize: Add the const modifier to the VectorizeConfig because we won't modify it. llvm-svn: 154098	2012-04-05 16:07:49 +00:00
Hongbin Zheng	7a4e40f87f	Introduce the VectorizeConfig class, with which we can control the behavior of the BBVectorizePass without using command line options. As pointed out by Hal, we can ask the TargetLoweringInfo for the architecture specific VectorizeConfig to perform vectorizing with architecture specific information. llvm-svn: 154096	2012-04-05 15:46:55 +00:00
Hongbin Zheng	8d380b332d	Add the function "vectorizeBasicBlock", which allows users to vectorize a BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the loop unroll pass right after the loop is unrolled. llvm-svn: 154089	2012-04-05 08:05:16 +00:00
Jim Grosbach	5d11d38750	ARM assembly aliases for two-operand V[R]SHR instructions. rdar://11189467 llvm-svn: 154087	2012-04-05 07:23:53 +00:00
Argyrios Kyrtzidis	f5736f87f2	In MemoryBuffer::getOpenFile() make sure that the buffer is null-terminated if the caller requested a null-terminated one. When mapping the file there could be a racing issue that resulted in the file being larger than the FileSize passed by the caller. We already have an assertion for this in MemoryBuffer::init() but have a runtime guarantee that the buffer will be null-terminated, so do a copy that adds a null-terminator. Protects against crash of rdar://11161822. llvm-svn: 154082	2012-04-05 04:23:56 +00:00
Jim Grosbach	64f4e8d5b3	ARM assembly parsing for 'msr' plain 'cpsr' operand. Plain 'cpsr' is an alias for 'cpsr_fc'. rdar://11153753 llvm-svn: 154080	2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen	e1ae4f161c	Pass the right sign to TLI->isLegalICmpImmediate. LSR can fold three addressing modes into its ICmpZero node: ICmpZero BaseReg + Offset => ICmp BaseReg, -Offset ICmpZero -1ScaleReg + Offset => ICmp ScaleReg, Offset ICmpZero BaseReg + -1ScaleReg => ICmp BaseReg, ScaleReg The first two cases are only used if TLI->isLegalICmpImmediate() likes the offset. Make sure the right Offset sign is passed to this method in the second case. The ARM version is not symmetric. <rdar://problem/11184260> llvm-svn: 154079	2012-04-05 03:10:56 +00:00
Akira Hatanaka	e5ea70212f	Reapply 154038 without the failing test. llvm-svn: 154062	2012-04-04 22:16:36 +00:00
Owen Anderson	f6f930a990	Revert r154038. It was causing make check failures. llvm-svn: 154054	2012-04-04 21:18:58 +00:00
Pete Cooper	4f727ef169	REG_SEQUENCE expansion to COPY instructions wasn't taking account of sub register indices on the source registers. No simple test case llvm-svn: 154051	2012-04-04 21:03:25 +00:00
Benjamin Kramer	270e886395	Fix a C++11 UDL conflict. Still not fixed in the standard ;) llvm-svn: 154044	2012-04-04 20:33:56 +00:00
Pete Cooper	8d002ed0bb	f16 FREM can now be legalized by promoting to f32 llvm-svn: 154039	2012-04-04 19:36:31 +00:00
Akira Hatanaka	4df2267566	Fix LowerGlobalAddress to produce instructions with the correct relocation types for N32 ABI. Add new test case and update existing ones. llvm-svn: 154038	2012-04-04 19:02:38 +00:00
Akira Hatanaka	f9e02ac6e1	Fix LowerJumpTable to produce instructions with the correct relocation types for N32 ABI. Test case will be updated after the patch that fixes TargetLowering::getPICJumpTableRelocBase is checked in. llvm-svn: 154036	2012-04-04 18:31:32 +00:00
Akira Hatanaka	c8028e2551	Fix LowerConstantPool to produce instructions with the correct relocation types for N32 ABI and update test case. llvm-svn: 154034	2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen	0419ed395c	Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr. A MOVCCr instruction can be commuted by inverting the condition. This can help reduce register pressure and remove unnecessary copies in some cases. <rdar://problem/11182914> llvm-svn: 154033	2012-04-04 18:23:42 +00:00
Jakob Stoklund Olesen	f0c39f0a1e	Remove spurious debug output. llvm-svn: 154032	2012-04-04 18:23:38 +00:00
Akira Hatanaka	913d78a99c	Fix LowerBlockAddress to produce instructions with the correct relocation types for N32 ABI and update test case. llvm-svn: 154031	2012-04-04 18:22:53 +00:00
Rafael Espindola	88a1aeb123	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Hongbin Zheng	c50b4781ab	LoopUnrollPass: Use variable "Threshold" instead of "CurrentThreshold" when reducing unroll count, otherwise the reduced unroll count is not taking the "OptimizeForSize" attribute into account. llvm-svn: 154007	2012-04-04 11:44:08 +00:00
Benjamin Kramer	a323a34d00	Move yaml::Stream's dtor out of line so it can see Scanner's dtor. llvm-svn: 154004	2012-04-04 08:53:34 +00:00
Craig Topper	98fc96208f	Remove default case from switch that was already covering all cases. llvm-svn: 153996	2012-04-04 04:42:42 +00:00
Pete Cooper	702973d27c	Removed useless switch for default case when switch was covering all the enum values llvm-svn: 153984	2012-04-04 00:53:04 +00:00
Michael J. Spencer	84f354e368	Sorry about that. MSVC seems to accept just about any random string you give it ;/ llvm-svn: 153979	2012-04-03 23:36:44 +00:00
Michael J. Spencer	2f9beb2374	Add YAML parser to Support. llvm-svn: 153977	2012-04-03 23:09:22 +00:00
Pete Cooper	4164f86b8a	Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELECT as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095> llvm-svn: 153976	2012-04-03 22:57:55 +00:00
Pete Cooper	983fc686b4	Removed one last bad continue statement meant to be removed in r153914. llvm-svn: 153975	2012-04-03 22:18:49 +00:00
Chad Rosier	e7870c71eb	Fix an issue in SimplifySetCC() specific to vector comparisons. When folding X == X we need to check getBooleanContents() to determine if the result is a vector of ones or a vector of negative ones. I tried creating a test case, but the problem seems to only be exposed on a much older version of clang (around r144500). rdar://10923049 llvm-svn: 153966	2012-04-03 20:11:24 +00:00
Eric Christopher	53ef0cf4a5	Fix thinko check for number of operands to be the one that actually might have more than 19 operands. Add a testcase to make sure I never screw that up again. Part of rdar://11026482 llvm-svn: 153961	2012-04-03 17:55:42 +00:00
Dylan Noblesmith	8ab4926be7	ARMDisassembler: drop bogus dependency on ARMCodeGen And indirectly, a dependency on most of the core LLVM optimization libraries. llvm-svn: 153957	2012-04-03 15:48:14 +00:00
Dylan Noblesmith	2431e625b9	Object: drop bogus VMCore dependency llvm-svn: 153956	2012-04-03 15:48:10 +00:00
Bill Wendling	e3c2c36927	The speedup doesn't appear to have been from this, but was an anomaly of my testing machine. llvm-svn: 153951	2012-04-03 11:19:21 +00:00
Bill Wendling	3f12fbd290	Reserve space for the eventual filling of the vector. This gives a small speedup. llvm-svn: 153949	2012-04-03 10:50:09 +00:00
Anton Korobeynikov	e70c37c738	Make the PPCCompilationCallbackC function static, so there is no need to issue the call via the PLT when LLVM is built as a shared library. This mirrors the approach taken by the X86 backend. llvm-svn: 153938	2012-04-03 06:59:28 +00:00
Craig Topper	ce6c05e0df	Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo. llvm-svn: 153935	2012-04-03 05:20:24 +00:00
Akira Hatanaka	c5bbe0b434	Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler. llvm-svn: 153926	2012-04-03 03:01:13 +00:00
Akira Hatanaka	cecb440c11	Revert r153924. There were buildbot failures. llvm-svn: 153925	2012-04-03 02:51:09 +00:00
Akira Hatanaka	058b0cfb55	MIPS disassembler support. Patch by Vladimir Medic. llvm-svn: 153924	2012-04-03 02:20:58 +00:00
Eric Christopher	ba40985484	Add a line number for the scope of the function (starting at the first brace) so that we get more accurate line number information about the declaration of a given function and the line where the function first starts. Part of rdar://11026482 llvm-svn: 153916	2012-04-03 00:43:49 +00:00
Pete Cooper	fb86d3b6bc	Fixes to r153903. Added missing explanation of behaviour when the VirtRegMap is NULL. Also changed it in this case to just avoid updating the map, but live ranges or intervals will still get updated and created llvm-svn: 153914	2012-04-03 00:28:46 +00:00
Pete Cooper	426b167bc5	Moved LiveRangeEdit.h so that it can be called from other parts of the backend, not just libCodeGen llvm-svn: 153906	2012-04-02 22:44:18 +00:00
Jakob Stoklund Olesen	97f47c37b6	Allocate virtual registers in ascending order. This is just the fallback tie-breaker ordering, the main allocation order is still descending size. Patch by Shamil Kurmangaleev! llvm-svn: 153904	2012-04-02 22:30:39 +00:00
Pete Cooper	a76a82ef6f	Refactored the LiveRangeEdit interface so that MachineFunction, TargetInstrInfo, MachineRegisterInfo, LiveIntervals, and VirtRegMap are all passed into the constructor and stored as members instead of passed in to each method. llvm-svn: 153903	2012-04-02 22:22:53 +00:00
Bill Wendling	1db4186413	Add an option to turn off the expensive GVN load PRE part of GVN. llvm-svn: 153902	2012-04-02 22:16:50 +00:00
Owen Anderson	157487e7c5	Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901	2012-04-02 22:10:29 +00:00
Lang Hames	dbc3175c89	During two-address lowering, rescheduling an instruction does not untie operands. Make TryInstructionTransform return false to reflect this. Fixes PR11861. llvm-svn: 153892	2012-04-02 19:58:43 +00:00
Akira Hatanaka	f37a1c4323	Initial 64 bit direct object support. This patch allows llvm to recognize that a 64 bit object file is being produced and that the subsequently generated ELF header has the correct information. The test case checks for both big and little endian flavors. Patch by Jack Carter. llvm-svn: 153889	2012-04-02 19:25:22 +00:00
Hal Finkel	63edfabaaf	The binutils for the IBM BG/P are too old to support CFI. llvm-svn: 153886	2012-04-02 19:09:04 +00:00
Hal Finkel	d6e526ae11	Add triple support for the IBM BG/P and BG/Q supercomputers. llvm-svn: 153882	2012-04-02 18:31:33 +00:00
Eric Christopher	6c4e6016b5	Turn on the accelerator tables for Darwin. llvm-svn: 153880	2012-04-02 17:58:52 +00:00
Stepan Dyatkovskiy	0ddc03ebad	Fast fix for PR12343: http://llvm.org/bugs/show_bug.cgi?id=12343 We have no trivial way of splitting edges that go from an indirect branch. We can do it with some tricks, but that should be discussed further. And it is still dangerous because indirect branches are difficult to control. The fix forbids this case for unswitching. llvm-svn: 153879	2012-04-02 17:16:45 +00:00
Roman Divacky	2460282f66	Implement the SVR4 byval alignment for aggregates. Fixing a FIXME. llvm-svn: 153876	2012-04-02 15:49:30 +00:00
Benjamin Kramer	2f6189e2a5	Move getOpcodeName from the various target InstPrinters into the superclass MCInstPrinter. All implementations used the same code. llvm-svn: 153866	2012-04-02 08:32:38 +00:00
Nadav Rotem	a9ec0e024f	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Craig Topper	fe02cb5e8b	Remove getInstructionName from MCInstPrinter implementations in favor of using the instruction name table from MCInstrInfo. Reduces static data in the InstPrinter implementations. llvm-svn: 153863	2012-04-02 07:01:04 +00:00
Craig Topper	dbc259a436	Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo. llvm-svn: 153860	2012-04-02 06:09:36 +00:00
Hal Finkel	e54b93886a	Fix some 80-col. violations I introduced with the A2 PPC64 core. llvm-svn: 153852	2012-04-01 21:20:14 +00:00
Hal Finkel	1c045f6845	Enable prefetch generation on PPC64. llvm-svn: 153851	2012-04-01 20:08:17 +00:00
Hal Finkel	415234aaa4	Add LdStSTD* itin. for the PPC64 A2 core. llvm-svn: 153850	2012-04-01 20:08:08 +00:00
Nadav Rotem	2729f54295	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Lang Hames	44174d3b7a	Fix typo. llvm-svn: 153846	2012-04-01 19:27:25 +00:00
Hal Finkel	ff17f29a1f	Set the default PPC node scheduling preference to ILP (for the embedded cores). The 440 and A2 cores have detailed itineraries, and this allows them to be fully used to maximize throughput. llvm-svn: 153845	2012-04-01 19:23:08 +00:00
Hal Finkel	71772b9747	Add ppc440 itin. entries for LdStSTD* llvm-svn: 153844	2012-04-01 19:23:04 +00:00
Hal Finkel	f74994d731	Use full anti-dep. breaking with post-ra sched. on the embedded ppc cores. Post-RA scheduling gives a significant performance improvement on the embedded cores, so turn it on. Using full anti-dep. breaking is important for FP-intensive blocks, so turn it on (just on the embedded cores for now; this should also be good on the 970s because post-ra scheduling is all that we have for now, but that should have more testing first). llvm-svn: 153843	2012-04-01 19:22:57 +00:00
Hal Finkel	fd26145bc6	Add instruction itinerary for the PPC64 A2 core. This adds a full itinerary for IBM's PPC64 A2 embedded core. These cores form the basis for the CPUs in the new IBM BG/Q supercomputer. llvm-svn: 153842	2012-04-01 19:22:40 +00:00
Chandler Carruth	8b30ff4d0f	Belatedly address some code review from Chris. As a side note, I really dislike array_pod_sort... Do we really still care about any STL implementations that get this so wrong? Does libc++? llvm-svn: 153834	2012-04-01 10:41:24 +00:00
Chandler Carruth	1a2234d527	Fix a pretty scary bug I introduced into the always inliner with a single missing character. Somehow, this had gone untested. I've added tests for returns-twice logic specifically with the always-inliner that would have caught this, and fixed the bug. Thanks to Matt for the careful review and spotting this!!! =D llvm-svn: 153832	2012-04-01 10:21:05 +00:00
Andrew Trick	31337c9d64	misched: Add finalizeScheduler to complete the target interface. llvm-svn: 153827	2012-04-01 07:24:23 +00:00
Eli Bendersky	f79279db2f	Removing a file that's no longer being used after the recent refactorings llvm-svn: 153825	2012-04-01 06:50:01 +00:00
Hal Finkel	42a487282a	Split the LdStGeneral PPC itin. class into LdStLoad and LdStStore. Loads and stores can have different pipeline behavior, especially on embedded chips. This change allows those differences to be expressed. Except for the 440 scheduler, there are no functionality changes. On the 440, the latency adjustment is only by one cycle, and so this probably does not affect much. Nevertheless, it will make a larger difference in the future and this removes a FIXME from the 440 itin. llvm-svn: 153821	2012-04-01 04:44:16 +00:00
Rafael Espindola	2da83dbcd4	Teach CodeGen's version of computeMaskedBits to understand the range metadata. This is the CodeGen equivalent of r153747. I tested that there is no noticeable performance difference with any combination of -O0/-O2/-g when compiling gcc as a single compilation unit. llvm-svn: 153817	2012-03-31 18:14:00 +00:00
Hal Finkel	548d6f1ad0	Fix dynamic linking on PPC64. Dynamic linking on PPC64 has had problems since we had to move the top-down hazard-detection logic post-ra. For dynamic linking to work there needs to be a nop placed after every call. It turns out that it is really hard to guarantee that nothing will be placed in between the call (bl) and the nop during post-ra scheduling. Previous attempts at fixing this by placing logic inside the hazard detector only partially worked. This is now fixed in a different way: call+nop codegen-only instructions. As far as CodeGen is concerned the pair is now a single instruction and cannot be split. This solution works much better than previous attempts. The scoreboard hazard detector is also renamed to be more generic, there is currently no cpu-specific logic in it. llvm-svn: 153816	2012-03-31 14:45:15 +00:00
Chandler Carruth	1c12d40cce	Fix a typo reported in IRC by someone reviewing this code. llvm-svn: 153815	2012-03-31 13:18:09 +00:00
Chandler Carruth	ad48424b6b	Give the always-inliner its own custom filter. It shouldn't have to pay the very high overhead of the complex inline cost analysis when all it wants to do is detect three patterns which must not be inlined. Comment the code, clean it up, and leave some hints about possible performance improvements if this ever shows up on a profile. Moving this off of the (now more expensive) inline cost analysis is particularly important because we have to run this inliner even at -O0. llvm-svn: 153814	2012-03-31 13:17:18 +00:00
Chandler Carruth	61d0735f00	Remove a bunch of empty, dead, and no-op methods from all of these interfaces. These methods were used in the old inline cost system where there was a persistent cache that had to be updated, invalidated, and cleared. We're now doing more direct computations that don't require this intricate dance. Even if we resume some level of caching, it would almost certainly have a simpler and more narrow interface than this. llvm-svn: 153813	2012-03-31 12:48:08 +00:00
Chandler Carruth	8cacff57bf	Initial commit for the rewrite of the inline cost analysis to operate on a per-callsite walk of the called function's instructions, in breadth-first order over the potentially reachable set of basic blocks. This is a major shift in how inline cost analysis works to improve the accuracy and rationality of inlining decisions. A brief outline of the algorithm this moves to: - Build a simplification mapping based on the callsite arguments to the function arguments. - Push the entry block onto a worklist of potentially-live basic blocks. - Pop the first block off of the front of the worklist (for breadth-first ordering) and walk its instructions using a custom InstVisitor. - For each instruction's operands, re-map them based on the simplification mappings available for the given callsite. - Compute any simplification possible of the instruction after re-mapping, and store that back into the simplification mapping. - Compute any bonuses, costs, or other impacts of the instruction on the cost metric. - When the terminator is reached, replace any conditional value in the terminator with any simplifications from the mapping we have, and add any successors which are not proven to be dead from these simplifications to the worklist. - Pop the next block off of the front of the worklist, and repeat. - As soon as the cost of inlining exceeds the threshold for the callsite, stop analyzing the function in order to bound cost. The primary goal of this algorithm is to perfectly handle dead code paths. We do not want any code in trivially dead code paths to impact inlining decisions. The previous metric was extremely flawed here, and would always subtract the average cost of two successors of a conditional branch when it was proven to become an unconditional branch at the callsite. 
There was no handling of wildly different costs between the two successors, which would cause inlining when the path actually taken was too large, and no inlining when the path actually taken was trivially simple. There was also no handling of the code path, only the immediate successors. These problems vanish completely now. See the added regression tests for the shiny new features -- we skip recursive function calls, SROA-killing instructions, and high cost complex CFG structures when dead at the callsite being analyzed. Switching to this algorithm required refactoring the inline cost interface to accept the actual threshold rather than simply returning a single cost. The resulting interface is pretty bad, and I'm planning to do lots of interface cleanup after this patch. Several other refactorings fell out of this, but I've tried to minimize them for this patch. =/ There is still more cleanup that can be done here. Please point out anything that you see in review. I've worked really hard to try to mirror at least the spirit of all of the previous heuristics in the new model. It's not clear that they are all correct any more, but I wanted to minimize the change in this single patch, it's already a bit ridiculous. One heuristic that is not yet mirrored is to allow inlining of functions with a dynamic alloca if the caller has a dynamic alloca. I will add this back, but I think the most reasonable way requires changes to the inliner itself rather than just the cost metric, and so I've deferred this for a subsequent patch. The test case is XFAIL-ed until then. As mentioned in the review mail, this seems to make Clang run about 1% to 2% faster in -O0, but makes its binary size grow by just under 4%. I've looked into the 4% growth, and it can be fixed, but requires changes to other parts of the inliner. llvm-svn: 153812	2012-03-31 12:42:41 +00:00
Benjamin Kramer	b7ee22e3f3	Internalize: Remove reference of @llvm.noinline, it was replaced with the noinline attribute a long time ago. llvm-svn: 153806	2012-03-31 11:03:47 +00:00
Duncan Sands	5165327295	I noticed in passing that the Metadata getIfExists method was creating a new node and returning it if one didn't exist. llvm-svn: 153798	2012-03-31 08:20:11 +00:00
Hal Finkel	45ad90afac	Correctly vectorize powi. The powi intrinsic requires special handling because it always takes a single integer power regardless of the result type. As a result, we can vectorize only if the powers are equal. Fixes PR12364. llvm-svn: 153797	2012-03-31 03:38:40 +00:00
Akira Hatanaka	4ef4aae332	Select static relocation model if it is jitting. llvm-svn: 153795	2012-03-31 02:38:36 +00:00
Jakob Stoklund Olesen	728984c476	Add a 2 byte safety margin in offset computations. ARMConstantIslandPass still has bugs where jump table compression can cause constant pool entries to go out of range. Add a safety margin of 2 bytes when placing constant islands, but use the real max displacement for verification. <rdar://problem/11156595> llvm-svn: 153789	2012-03-31 00:06:44 +00:00
Jakob Stoklund Olesen	91f86a31e7	Add more debugging output to ARMConstantIslandPass. llvm-svn: 153788	2012-03-31 00:06:42 +00:00
Benjamin Kramer	dbd6a33c45	Rip out emission of the regIsInRegClass function for the asm printer. It's slow, bloated and completely redundant with MCRegisterClass::contains. llvm-svn: 153782	2012-03-30 23:13:40 +00:00
Jim Grosbach	ab2d3b5529	ARM fix encoding fixup resolution for ldrd and friends. The 8-bit payload is not contiguous in the opcode. Move the upper nibble over 4 bits into the correct place. rdar://11158641 llvm-svn: 153780	2012-03-30 21:54:22 +00:00
Jim Grosbach	37853d6216	ARM assembler should prefer non-aliases encoding of cmp. When an immediate is both a value [t2_]so_imm and a [t2_]so_imm_neg, we want to use the non-negated form to make sure we prefer the normal encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'. llvm-svn: 153770	2012-03-30 19:59:02 +00:00
Jim Grosbach	92ee2a8454	ARM encoding for VSWP got the second operand incorrect. Make the non-tied register operand names line up with what the base class encoding handler expects. rdar://11157236 llvm-svn: 153766	2012-03-30 18:53:01 +00:00
Jim Grosbach	472cefe371	ARM can only use narrow encoding for low regs. llvm-svn: 153765	2012-03-30 18:39:43 +00:00
Jim Grosbach	2536615bab	ARM integrated assembler should encoding choice for add/sub imm. For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2 can be used for this syntax. Prefer the narrow encoding when possible. rdar://11156277 llvm-svn: 153759	2012-03-30 17:20:40 +00:00
Rafael Espindola	151b420718	Handle unreachable code in the dominates functions. This changes users when needed for correctness, but still doesn't clean up code that now unnecessarily checks for reachability. llvm-svn: 153755	2012-03-30 16:46:21 +00:00
Danil Malyshev	df8df843d9	Re-factored RuntimeDyLd: 1. The main work is now done in RuntimeDyLdImpl, which uses the ObjectFile class. RuntimeDyLdMachO and RuntimeDyLdELF now only parse relocations and resolve them. This makes it easier to improve RuntimeDyLd. In addition, support for COFF can be easily added. 2. Added ARM relocations to RuntimeDyLdELF. 3. Added support for stub functions for ARM, allowing long branches. 4. Added support for external functions that are not loaded from the object files, but can be loaded from external libraries. Now MCJIT can correctly execute code containing printf, putc, etc. 5. Sections are emitted instead of functions, thanks Jim Grosbach. MemoryManager.startFunctionBody() and MemoryManager.endFunctionBody() have been removed. 6. MCJITMemoryManager.allocateDataSection() and MCJITMemoryManager.allocateCodeSection() use JMM->allocateSpace() instead of JMM->allocateCodeSection() and JMM->allocateDataSection(), because I got an error: "Cannot allocate an allocated block!" when an object file contains more than one code or data section. llvm-svn: 153754	2012-03-30 16:45:19 +00:00
Jim Grosbach	9b185a753c	ARM assembly parsing needs to be paranoid about negative immediates. Make sure to treat immediates as unsigned when doing relative comparisons. rdar://11153621 llvm-svn: 153753	2012-03-30 16:31:31 +00:00
Rafael Espindola	4578d5e45a	Add computeMaskedBitsLoad back, as it was the change to instsimplify that caused the slowdown last time. llvm-svn: 153747	2012-03-30 15:52:11 +00:00
Benjamin Kramer	0365dc97a8	Add a note about a missed cmov -> sbb opportunity. llvm-svn: 153741	2012-03-30 13:02:58 +00:00
James Molloy	70a6f5ebc7	Ensure conditional BL instructions for ARM are given the fixup fixup_arm_condbranch. Patch by Tim Northover! llvm-svn: 153737	2012-03-30 09:15:32 +00:00
Evan Cheng	f3c23907f5	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Bill Wendling	c6f065c054	If we have a VLA that has a "use" in a metadata node that's then used here but it has no other uses, then we have a problem. E.g., int foo (const int x) { char a[x]; return 0; } If we assign 'a' a vreg and fast isel later on has to use the selection DAG isel, it will want to copy the value to the vreg. However, there are no uses, which goes counter to what selection DAG isel expects. <rdar://problem/11134152> llvm-svn: 153705	2012-03-30 00:02:55 +00:00
Bill Wendling	86e08bb6de	Revert r153694. It was causing failures in the buildbots. llvm-svn: 153701	2012-03-29 23:23:59 +00:00
Jakob Stoklund Olesen	8fe088c0ee	Invalidate liveness in ARMConstantIslandPass. This pass splits basic blocks to insert constant islands, and it doesn't recompute the live-in lists. No later passes depend on accurate liveness information. This fixes PR12410 where the machine code verifier was complaining. llvm-svn: 153700	2012-03-29 23:14:26 +00:00
Jakob Stoklund Olesen	d9c6469e9a	Prefer even-odd D-register pairs. We are sometimes allocating from the DPair register class which contains odd-even pairs in addition to the Q registers. Place the Q registers first in the DPair allocation order as they can be copied with a single instruction. The odd-even pairs should only be allocated as a last resort. llvm-svn: 153699	2012-03-29 22:54:32 +00:00
Lang Hames	1a0d0ec699	Try using vmov.i32 to materialize FP32 constants that can't be materialized by vmov.f32. llvm-svn: 153696	2012-03-29 21:56:11 +00:00
Danil Malyshev	d66f6a3b28	Re-factored RuntimeDyld. Added ExecutionEngine/MCJIT tests. llvm-svn: 153694	2012-03-29 21:46:18 +00:00
Eric Christopher	330add6489	Lowercase the tag name to match the rest of dwarf. llvm-svn: 153691	2012-03-29 21:35:05 +00:00
Jim Grosbach	ab639b8c36	ARM assembly 'cmp lr, #0 ' should not encode using 'cmn'. The CMP->CMN alias was matching for an immediate of zero when it should only match for negative values. rdar://11129224 llvm-svn: 153689	2012-03-29 21:19:52 +00:00
Jakob Stoklund Olesen	2cbfc41270	Handle register copies for the new ARM register classes. ARM recently gained DPair, DTriple, and DQuad register classes. Update copyPhysReg() to handle copies in these register classes. No test case, it is difficult to make the register allocator emit the odd copies reliably. The missing DPair copy caused a failure on partialsums in the nightly test suite. <rdar://problem/11147997> llvm-svn: 153686	2012-03-29 21:10:40 +00:00
Lang Hames	94d892c492	Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode. llvm-svn: 153680	2012-03-29 19:54:28 +00:00
Akira Hatanaka	fa2f5577e9	Expand FREM. llvm-svn: 153671	2012-03-29 18:43:11 +00:00
Jakob Stoklund Olesen	9571cb56c5	Don't PRE compares. CodeGenPrepare sinks compare instructions down to their uses to prevent live flags and predicate registers across basic blocks. PRE of a compare instruction prevents that, forcing the i1 compare result into a general purpose register. That is usually more expensive than the redundant compare PRE was trying to eliminate in the first place. llvm-svn: 153657	2012-03-29 17:22:39 +00:00
Benjamin Kramer	e3b0c81c27	Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds. llvm-svn: 153643	2012-03-29 12:37:26 +00:00
Eric Christopher	469ec18341	Add support for objc property decls according to the page at: http://llvm.org/docs/SourceLevelDebugging.html#objcproperty including type and DECL. Expand the metadata needed accordingly. rdar://11144023 llvm-svn: 153639	2012-03-29 08:42:56 +00:00
Craig Topper	9a00ba461c	Only allow symbolic names for (v)cmpss/sd/ps/pd encodings 8-31 to be used with 'v' version of instructions. llvm-svn: 153636	2012-03-29 07:11:23 +00:00
Joel Jones	486c38b0cf	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153635	2012-03-29 05:45:48 +00:00
Joel Jones	32f97db4b2	Reverted to revision 153616 to unblock build llvm-svn: 153623	2012-03-29 01:20:56 +00:00
Joel Jones	b4477ee31f	For X86, change load/dec-or-inc/store into dec-or-inc, respectively. This is a code change to add support for changing instruction sequences of the form: load inc/dec of 8/16/32/64 bits store into the appropriate X86 inc/dec through memory instruction: inc[qlwb] / dec[qlwb] The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded. llvm-svn: 153617	2012-03-29 00:37:47 +00:00
Jakob Stoklund Olesen	753b1e33e0	Enable machine code verification in the entire code generator. Some targets still mess up the liveness information, but that isn't verified after MRI->invalidateLiveness(). The verifier can still check other useful things like register classes and CFG, so it should be enabled after all passes. llvm-svn: 153615	2012-03-28 23:54:28 +00:00
Jakob Stoklund Olesen	4b4ee58c4c	Enable machine code verification after PreSched2 passes. The late scheduler depends on accurate liveness information if it is breaking anti-dependencies, so we should be able to verify it. Relax the terminator checking in the machine code verifier so it can handle the basic blocks created by if conversion. llvm-svn: 153614	2012-03-28 23:31:15 +00:00
Jakob Stoklund Olesen	e6574db283	Don't kill the base register when expanding strd. When an strd instruction doesn't get the registers it wants, it can be expanded into two str instructions. Make sure the first str doesn't kill the base register in the case where the base and data registers are identical: t2STRi12 %R0<kill>, %R0, 4, pred:14, pred:%noreg t2STRi12 %R2<kill>, %R0, 8, pred:14, pred:%noreg <rdar://problem/11101911> llvm-svn: 153611	2012-03-28 23:07:03 +00:00
Jakob Stoklund Olesen	ebee7e5cff	Preserve implicit defs in ARMLoadStoreOptimizer. When a number of sub-register VLDRS instructions are combined into a VLDM, preserve any super-register implicit defs. This is required to keep the register scavenger and machine code verifier happy. Enable machine code verification after ARMLoadStoreOptimizer. ARM/2012-01-26-CopyPropKills.ll was failing because of this. llvm-svn: 153610	2012-03-28 22:50:56 +00:00
Danil Malyshev	aba46febe1	Move getPointerToNamedFunction() from JIT/MCJIT to JITMemoryManager. llvm-svn: 153607	2012-03-28 21:46:36 +00:00
Rafael Espindola	1b3dfa1ce2	Handle intrinsics in GlobalsModRef. Fixes pr12351. llvm-svn: 153604	2012-03-28 21:31:24 +00:00
Jakob Stoklund Olesen	7623979dd6	Spill DPair registers, not just QPR. The arm_neon intrinsics can create virtual registers from the DPair register class which allows both even-odd and odd-even D-register pairs. This fixes PR12389. llvm-svn: 153603	2012-03-28 21:20:32 +00:00
Jakob Stoklund Olesen	6ce4ff3ff7	Also verify after ExpandPostRAPseudos. llvm-svn: 153599	2012-03-28 20:49:30 +00:00
Jakob Stoklund Olesen	f5df00f0fb	Enable machine code verification after the late machine optimization passes. Branch folding invalidates liveness and disables liveness verification on some targets. llvm-svn: 153597	2012-03-28 20:47:37 +00:00
Jakob Stoklund Olesen	37927fe83c	Skip liveness verification when MRI->tracksLiveness() is false. Extract the liveness verification into its own method. This makes it possible to run the machine code verifier after liveness information is no longer required to be valid. llvm-svn: 153596	2012-03-28 20:47:35 +00:00
Jakob Stoklund Olesen	2c29e5d7f9	Revert r153516: "Invalidate liveness in Thumb2ITBlockPass." Revert r153519: "ARMLoadStoreOptimizer invalidates register liveness." These patches caused miscompilations in povray by turning off branch folding's updating of live-in lists. It turns out that the late scheduler depends on the live-in lists, even if it doesn't need correct kill flags. <rdar://problem/11139228> llvm-svn: 153593	2012-03-28 20:11:44 +00:00
Jakob Stoklund Olesen	7c56ad07a6	Allow removeLiveIn to be called with a register that isn't live-in. This avoids the silly double search: if (isLiveIn(Reg)) removeLiveIn(Reg); llvm-svn: 153592	2012-03-28 20:11:42 +00:00
Chad Rosier	3f0e43807e	Revert r153521 as it's causing large regressions on the nightly testers. Original commit message for r153521 (aka r153423): Use the new range metadata in computeMaskedBits and add a new optimization to instruction simplify that lets us remove an and when loading a boolean value. llvm-svn: 153587	2012-03-28 18:42:50 +00:00
Pete Cooper	3bf62a9db3	Fixed commuteInstructions bug where, if it's called pre-regalloc, the subreg indices weren't commuted. llvm-svn: 153579	2012-03-28 17:02:22 +00:00
Benjamin Kramer	b6baea7014	GlobalOpt: If we have an inbounds GEP from a ConstantAggregateZero global that we just determined to be constant, replace all loads from it with a zero value. llvm-svn: 153576	2012-03-28 14:50:09 +00:00

54073 Commits