llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-03-07 11:59:09 +00:00

Author	SHA1	Message	Date
Nadav Rotem	d01a7b5942	Reapply r162160 with a fix: Optimize Arith->Trunc->SETCC sequence to allow better compare/branch code. llvm-svn: 162172	2012-08-18 17:53:03 +00:00
Nadav Rotem	e9cdefa762	Revert r162160 because it made a few buildbots fail. llvm-svn: 162164	2012-08-18 05:02:36 +00:00
Nadav Rotem	76f1b84f58	The X86 backend has a number of optimizations for SETCC nodes which use arithmetic instructions. However, when small data types are used, a truncate node appears between the SETCC node and the arithmetic operation. This patch adds support for this pattern. Before: xorl %esi, %edi testb %dil, %dil setne %al ret After: xorb %dil, %sil setne %al ret rdar://12081007 llvm-svn: 162160	2012-08-18 02:43:28 +00:00
Eli Friedman	925738bb5c	Make atomic load and store of pointers work. Tighten verification of atomic operations so other unexpected operations don't slip through. Based on patch by Logan Chien. PR11786/PR13186. llvm-svn: 162146	2012-08-17 23:24:29 +00:00
Benjamin Kramer	ba78a8432b	TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type. llvm-svn: 162101	2012-08-17 15:54:21 +00:00
Benjamin Kramer	b42939c43b	Fix broken check lines. I really need to find a way to automate this, but I can't come up with a regex that has no false positives while handling tricky cases like custom check prefixes. llvm-svn: 162097	2012-08-17 12:28:26 +00:00
Bill Wendling	a7d745827b	Remove invalid test. This test requires that dead basic blocks be kept around. That's not how we do things. Besides, the commit message tells us that it is covered by the GCC test suite. ------------------------------------------------------------------------ r127497 \| zwarich \| 2011-03-11 13:51:56 -0800 (Fri, 11 Mar 2011) \| 3 lines Fix the GCC test suite issue exposed by r127477, which was caused by stack protector insertion not working correctly with unreachable code. Since that revision was rolled out, this test doesn't actual fail before this fix. ------------------------------------------------------------------------ llvm-svn: 161985	2012-08-15 20:54:09 +00:00
Michael Liao	daebe04c2f	fix PR11334 - FP_EXTEND only support extending from vectors with matching elements. This results in the scalarization of extending to v2f64 from v2f32, which will be legalized to v4f32 not matching with v2f64. - add X86-specific VFPEXT supproting extending from v4f32 to v2f64. - add BUILD_VECTOR lowering helper to recover back the original extending from v4f32 to v2f64. - test case is enhanced to include different vector width. llvm-svn: 161894	2012-08-14 21:24:47 +00:00
Nadav Rotem	eb22b069bb	During the CodeGenPrepare we often lower intrinsics (such as objsize) and allow some optimizations to turn conditional branches into unconditional. This commit adds a simple control-flow optimization which merges two consecutive basic blocks which are connected by a single edge. This allows the codegen to operate on larger basic blocks. rdar://11973998 llvm-svn: 161852	2012-08-14 05:19:07 +00:00
Bill Wendling	9cd2a5dbb9	Rename test since it's not linux-specific. llvm-svn: 161792	2012-08-13 21:32:42 +00:00
Jakob Stoklund Olesen	8e6595639e	Handle extra Tail predecessors in if-conversion. It is still possible to if-convert if the tail block has extra predecessors, but the tail phis must be rewritten instead of being removed. llvm-svn: 161781	2012-08-13 20:49:04 +00:00
Manman Ren	ead7af5ec0	Fix failure on Atom bot due to r161769 llvm-svn: 161777	2012-08-13 19:34:29 +00:00
Manman Ren	cb05c49c64	X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed to a memory operand. PR13576 llvm-svn: 161769	2012-08-13 18:29:41 +00:00
Michael Liao	4b95cb463a	fix PR13577, an issue introduced by r161687 - FCMOV only supports a subset of X86 conditions. Skip boolean simplification if X86 condition is not valid for FCMOV. - add a minimal test case for PR13577. llvm-svn: 161732	2012-08-11 23:47:06 +00:00
Benjamin Kramer	b338b22e72	PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely. This is common e.g. when doing rip-relative addressing on x86_64. llvm-svn: 161728	2012-08-11 19:05:13 +00:00
Manman Ren	9bd686f936	X86: when we are auto-detecting the subtarget features, make sure we turn on FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge. FeatureFastUAMem is already on if we pass in nehalem or westmere as a command argument. rdar: 7252306 llvm-svn: 161717	2012-08-10 23:43:32 +00:00
Michael Liao	97334a5c5f	add X86-specific DAG optimization to simplify boolean test - if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value generated from X86ISD::SETCC, try to simplify the boolean value generation and checking by reusing the original EFLAGS with proper condition code - add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places consuming EFLAGS part of patches fixing PR12312 llvm-svn: 161687	2012-08-10 19:58:13 +00:00
Jakob Stoklund Olesen	52532aa712	Update edge weights correctly in replaceSuccessor(). When replacing Old with New, it can happen that New is already a successor. Add the old and new edge weights instead of creating a duplicate edge. llvm-svn: 161653	2012-08-10 03:23:27 +00:00
Jakob Stoklund Olesen	5c0e68ae5a	Reapply r161633-161634 "Partition use lists so defs always come before uses."" No changes to these patches, MRI needed to be notified when changing uses into defs and vice versa. llvm-svn: 161644	2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen	7680d4d42c	Revert r161633-161634 "Partition use lists so defs always come before uses." These commits broke a number of buildbots. llvm-svn: 161640	2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen	9594629bf1	Partition use lists so defs always come before uses. This makes it possible to speed up def_iterator by stopping at the first use. This makes def_empty() and getUniqueVRegDef() much faster when there are many uses. In a +Asserts build, LiveVariables is 100x faster in one case because getVRegDef() has an assertion that would scan to the end of a def_iterator chain. Spill weight calculation is significantly faster (300x in one case) because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP) which calls def_empty(%RIP). llvm-svn: 161634	2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen	58b6c3edfa	Don't use pointer-pointers for the register use lists. Use a more conventional doubly linked list where the Prev pointers form a cycle. This means it is no longer necessary to adjust the Prev pointers when reallocating the VRegInfo array. The test changes are required because the register allocation hint is using the use-list order to break ties. llvm-svn: 161633	2012-08-09 22:49:42 +00:00
Bob Wilson	b173ddc53e	Add test triples to fix win32 failures. Revert workaround from r161292. I don't have a win32 system to test, so hopefully I got them all fixed here. llvm-svn: 161519	2012-08-08 20:31:37 +00:00
Manman Ren	967804ad0a	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Evan Cheng	96c6741fad	X86 cmp lowering is looking past truncate on the condition node. It should only do so when the high bits are known zero. This caused a subtle miscompilation. rdar://12027825 llvm-svn: 161451	2012-08-07 22:21:00 +00:00
Chandler Carruth	49d4e3f282	Add a much more conservative strategy for aligning branch targets. Previously, MBP essentially aligned every branch target it could. This bloats code quite a bit, especially non-looping code which has no real reason to prefer aligned branch targets so heavily. As Andy said in review, it's still a bit odd to do this without a real cost model, but this at least has much more plausible heuristics. Fixes PR13265. llvm-svn: 161409	2012-08-07 09:45:24 +00:00
Manman Ren	5d43c19d9e	MachineCSE: Update the heuristics for isProfitableToCSE. If the result of a common subexpression is used at all uses of the candidate expression, CSE should not increase the live range of the common subexpression. rdar://11393714 and rdar://11819721 llvm-svn: 161396	2012-08-07 06:16:46 +00:00
Craig Topper	dc1b95e7de	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Craig Topper	7a2d94e7d0	Update test to check for r161305 llvm-svn: 161307	2012-08-05 09:06:28 +00:00
Bob Wilson	9f6e25017a	Refactor and check "onlyReadsMemory" before optimizing builtins. This patch is mostly just refactoring a bunch of copy-and-pasted code, but it also adds a check that the call instructions are readnone or readonly. That check was already present for sin, cos, sqrt, log2, and exp2 calls, but it was missing for the rest of the builtins being handled in this code. llvm-svn: 161282	2012-08-03 23:29:17 +00:00
Bob Wilson	7d92f01d57	Fix memcmp code-gen to honor -fno-builtin. I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp in TargetLibraryInfo, so that it would use custom code for memcmp calls even with -fno-builtin. I also had to add a new -disable-simplify-libcalls option to llc so that I could write a test for this. llvm-svn: 161262	2012-08-03 21:26:18 +00:00
Bob Wilson	d1eefbeac2	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Manman Ren	2c236a30f6	X86 Peephole: fold loads to the source register operand if possible. Add more comments and use early returns to reduce nesting in isLoadFoldable. Also disable folding for V_SET0 to avoid introducing a const pool entry and a const pool load. rdar://10554090 and rdar://11873276 llvm-svn: 161207	2012-08-02 19:37:32 +00:00
NAKAMURA Takumi	44d438eb55	llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Make sure this is testing without +avx. FIXME: Could +avx be checked here too? llvm-svn: 161156	2012-08-02 06:36:56 +00:00
NAKAMURA Takumi	d591c838db	llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Rewrite expressions to pass regardless of PR11031. - Relax to match even if epilogue (pop %ebp) were emitted. - Assume the return value is stored to %xmm0. llvm-svn: 161155	2012-08-02 06:33:58 +00:00
Manman Ren	78b8d454cc	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152	2012-08-02 00:56:42 +00:00
Matt Beaumont-Gay	3e9c5fc8d3	Line endings. llvm-svn: 161117	2012-08-01 16:42:35 +00:00
Elena Demikhovsky	0fec7026d9	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Chad Rosier	9a4ff99710	[x86 frame lowering] In 32-bit mode, use ESI as the base pointer. Previously, we were using EBX, but PIC requires the GOT to be in EBX before function calls via PLT GOT pointer. llvm-svn: 161066	2012-07-31 18:29:21 +00:00
Manman Ren	3d7e85d5b8	MachineSink: Sort the successors before trying to find SuccToSinkTo. One motivating example is to sink an instruction from a basic block which has two successors: one outside the loop, the other inside the loop. We should try to sink the instruction outside the loop. rdar://11980766 llvm-svn: 161062	2012-07-31 18:10:39 +00:00
Manman Ren	3769ac64a6	Reverse order of the two branches at end of a basic block if it is profitable. We branch to the successor with higher edge weight first. Convert from je LBB4_8 --> to outer loop jmp LBB4_14 --> to inner loop to jne LBB4_14 jmp LBB4_8 PR12750 rdar: 11393714 llvm-svn: 161018	2012-07-31 01:11:07 +00:00
Pete Cooper	e45da564cf	Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd llvm-svn: 160987	2012-07-30 20:23:19 +00:00
Manman Ren	ceef7c4d9b	Revert r160920 and r160919 due to dragonegg and clang selfhost failure llvm-svn: 160927	2012-07-29 02:44:09 +00:00
Manman Ren	9494d0a2c2	X86 Peephole: fold loads to the source register operand if possible. Trying to fix the bot by specifying a triple in the failing testing cases. llvm-svn: 160920	2012-07-28 17:51:24 +00:00
Manman Ren	ea77f9076b	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919	2012-07-28 16:48:01 +00:00
Manman Ren	fbc9fcdbf2	X86 Peephole: fix PR13475 in optimizeCompare. It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912	2012-07-28 03:15:46 +00:00
Evan Cheng	f318924b28	Teach CodeGenPrep to look past bitcast when it's duplicating return instruction into predecessor blocks to enable tail call optimization. rdar://11958338 llvm-svn: 160894	2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen	03a59af504	Add <imp-def> of super-register when lowering SUBREG_TO_REG. Patch by Tyler Nowicki! llvm-svn: 160888	2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen	44868927b9	Eliminate a batch of uses of sub_ss and sub_sd in the X86 target. These idempotent sub-register indices don't do anything --- They simply map XMM registers to themselves. They no longer affect register classes either since the SubRegClasses field has been removed from Target.td. This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns with COPY_TO_REGCLASS patterns which simply become COPY instructions. The number of IMPLICIT_DEF instructions before register allocation is reduced, and that is the cause of the test case changes. llvm-svn: 160816	2012-07-26 21:40:42 +00:00
Manman Ren	3a016b55c3	Update testing case for Atom when disabling rematerialization in TwoAddressInstructionPass. The generated code for Atom has a different code sequence. This is realted to commit r160749. llvm-svn: 160755	2012-07-25 20:17:14 +00:00

1 2 3 4 5 ...

3504 Commits