llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-05 10:39:21 +00:00

Author	SHA1	Message	Date
Jakob Stoklund Olesen	7f7f8a2e77	Consider unknown alignment caused by OptimizeThumb2Instructions(). This function runs after all constant islands have been placed, and may shrink some instructions to their 2-byte forms. This can actually cause some constant pool entries to move out of range because of growing alignment padding. Treat instructions that may be shrunk the same as inline asm - they erode the known alignment bits. Also reinstate an old assertion in verify(). It is correct now that basic block offsets include alignments. Add a single large test case that will hopefully exercise many parts of the constant island pass. <rdar://problem/10670199> llvm-svn: 147885	2012-01-10 22:32:14 +00:00
Jim Grosbach	59537e1ce3	ARM updating VST2 pseudo-lowering fixed vs. register update. rdar://10663487 llvm-svn: 147876	2012-01-10 21:11:12 +00:00
Kevin Enderby	75f4b470f9	Various crash reporting tools have a problem with the dwarf generated for assembly source when it generates the TAG_subprogram dwarf debug info for the labels that have nothing between them as in this bit of assembly source: % cat ZeroLength.s _func1: _func2: nop One solution would be to not emit the subsequent labels with the same address and use the next label with a different address or the end of the section for the AT_high_pc value of the TAG_subprogram. Turns out in llvm-mc it is not possible in all cases to determine if two symbols have the same value at the point we put out the TAG_subprogram dwarf debug info. So we will have llvm-mc instead of putting out TAG_subprogram's put out DW_TAG_label's. And the DW_TAG_label does not have an AT_high_pc value, which avoids the problem. This commit is only the functional change to make the diffs clear as to what is really being changed. The next commit will be to clean up the names of such things like MCGenDwarfSubprogramEntry to something like MCGenDwarfLabelEntry. rdar://10666925 llvm-svn: 147860	2012-01-10 17:52:29 +00:00
Nadav Rotem	969b8a6903	Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back. llvm-svn: 147851	2012-01-10 14:28:46 +00:00
Craig Topper	01eba20904	Fix a crash in AVX2 when trying to broadcast a double into a 128-bit vector. There is no vbroadcastsd xmm, but we do need to support 64-bit integers broadcasted into xmm. Also factor the AVX check into the isVectorBroadcast function. This makes more sense since the AVX2 check was already inside. llvm-svn: 147844	2012-01-10 08:23:59 +00:00
Evan Cheng	7855c5d08f	Allow machine-cse to look across MBB boundary when cse'ing instructions that define physical registers. It's currently very restrictive, only catching cases where the CE is in an immediate (and only) predecessor. But it catches a surprisingly large number of cases. rdar://10660865 llvm-svn: 147827	2012-01-10 02:02:58 +00:00
Andrew Trick	db66631fb3	Enable LSR IV Chains with sufficient heuristics. These heuristics are sufficient for enabling IV chains by default. Performance analysis has been done for i386, x86_64, and thumbv7. The optimization is rarely important, but can significantly speed up certain cases by eliminating spill code within the loop. Unrolled loops are prime candidates for IV chains. In many cases, the final code could still be improved with more target specific optimization following LSR. The goal of this feature is for LSR to make the best choice of induction variables. Instruction selection may not completely take advantage of this feature yet. As a result, there could be cases of slight code size increase. Code size can be worse on x86 because it doesn't support postincrement addressing. In fact, when chains are formed, you may see redundant address plus stride addition in the addressing mode. GenerateIVChains tries to compensate for the common cases. On ARM, code size increase can be mitigated by using postincrement addressing, but downstream codegen currently misses some opportunities. llvm-svn: 147826	2012-01-10 01:45:08 +00:00
Andrew Trick	09d73ea35b	Adding IV chain generation to LSR. After collecting chains, check if any should be materialized. If so, hide the chained IV users from the LSR solver. LSR will only solve for the head of the chain. GenerateIVChains will then materialize the chained IV users by computing the IV relative to its previous value in the chain. In theory, chained IV users could be exposed to LSR's solver. This would be considerably complicated to implement and I'm not aware of a case where we need it. In practice it's more important to intelligently prune the search space of nontrivial loops before running the solver, otherwise the solver is often forced to prune the most optimal solutions. Hiding the chained users does this well, so that LSR is more likely to find the best IV for the chain as a whole. llvm-svn: 147801	2012-01-09 21:18:52 +00:00
Benjamin Kramer	f9cefbfed0	InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests. This subsumes several other transforms while enabling us to catch more cases. llvm-svn: 147777	2012-01-09 17:23:27 +00:00
Chandler Carruth	9a418a2713	Cleanup and FileCheck-ize a test. llvm-svn: 147772	2012-01-09 09:44:26 +00:00
Craig Topper	ee2dabebe3	Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing. llvm-svn: 147767	2012-01-09 06:52:46 +00:00
Rafael Espindola	7618aa1c64	Don't print an unused label before .cfi_endproc. llvm-svn: 147763	2012-01-09 00:17:29 +00:00
Craig Topper	1fdcf7071d	Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level. llvm-svn: 147762	2012-01-09 00:11:29 +00:00
Benjamin Kramer	e1321329f4	Tweak my last commit to be less conservative about uses. We still save an instruction when just the "and" part is replaced. Also change the code to match comments more closely. llvm-svn: 147753	2012-01-08 21:12:51 +00:00
Benjamin Kramer	e94856c8c4	InstCombine: If we have a bit test and a sign test anded/ored together, merge the sign bit into the bit test. This is common in bit field code, e.g. checking if the first or the last bit of a bit field is set. llvm-svn: 147749	2012-01-08 18:32:24 +00:00
Victor Umansky	5d24f5f51a	Reverted commit #147601 upon Evan's request. llvm-svn: 147748	2012-01-08 17:20:33 +00:00
Rafael Espindola	19a13321f8	Don't print a label before .cfi_startproc when we don't need to. This makes the produced assembly when using CFI just a bit more readable. llvm-svn: 147743	2012-01-07 22:42:19 +00:00
Jakob Stoklund Olesen	71f92061aa	Use getRegForValue() to materialize the address of ARM globals. This enables basic local CSE, giving us 20% smaller code for consumer-typeset in -O0 builds. <rdar://problem/10658692> llvm-svn: 147720	2012-01-07 04:07:22 +00:00
Andrew Trick	d9eb9c8780	LSR: Don't optimize loops if an outer loop has no preheader. LoopSimplify may not run on some outer loops, e.g. because of indirect branches. SCEVExpander simply cannot handle outer loops with no preheaders. Fixes rdar://10655343 SCEVExpander segfault. llvm-svn: 147718	2012-01-07 03:16:50 +00:00
Rafael Espindola	2d545fa143	Split Finish into Finish and FinishImpl to have a common place to do end of file error checking. Use that to error on an unfinished cfi_startproc. The error is not nice, but is already better than a segmentation fault. llvm-svn: 147717	2012-01-07 03:13:18 +00:00
Evan Cheng	8af07ba749	Added a late machine instruction copy propagation pass. This catches opportunities that only present themselves after late optimizations such as tail duplication, e.g. ## BB#1: movl %eax, %ecx movl %ecx, %eax ret The register allocator also leaves some of them around (due to false dep between copies from phi-elimination, etc.) This required some changes in codegen passes. Post-ra scheduler and the pseudo-instruction expansion passes have been moved after branch folding and tail merging. They were before branch folding because it did not always update block livein's. That's fixed now. The pass change makes sense independently since we want to properly schedule instructions after branch folding / tail duplication. rdar://10428165 rdar://10640363 llvm-svn: 147716	2012-01-07 03:02:36 +00:00
Jakob Stoklund Olesen	536b4e24d8	Use movw+movt in ARMFastISel::ARMMaterializeGV. This eliminates a lot of constant pool entries for -O0 builds of code with many global variable accesses. This speeds up -O0 codegen of consumer-typeset by 2x because the constant island pass no longer has to look at thousands of constant pool entries. <rdar://problem/10629774> llvm-svn: 147712	2012-01-07 01:47:05 +00:00
Andrew Trick	8a5a1e603e	Extended replaceCongruentPhis to handle mixed phi types. llvm-svn: 147707	2012-01-07 01:12:09 +00:00
Eric Christopher	09a41e8939	Make the 'x' constraint work for AVX registers as well. Fixes rdar://10614894 llvm-svn: 147704	2012-01-07 01:02:09 +00:00
Andrew Trick	0ec80535d9	comment typo llvm-svn: 147701	2012-01-07 00:29:20 +00:00
Jakob Stoklund Olesen	39a3fa2c29	Enable aligned NEON spilling by default. Experiments show this to be a small speedup for modern ARM cores. llvm-svn: 147689	2012-01-06 22:19:37 +00:00
Dan Gohman	a4fde8485d	Fix SpeculativelyExecuteBB to either speculate all or none of the phis present in the bottom of the CFG triangle, as the transformation isn't ever valuable if the branch can't be eliminated. Also, unify some heuristics between SimplifyCFG's multiple if-converters, for consistency. This fixes rdar://10627242. llvm-svn: 147630	2012-01-05 23:58:56 +00:00
Eli Friedman	5af9c3cbbb	PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation. llvm-svn: 147625	2012-01-05 23:03:32 +00:00
Rafael Espindola	aaed8dbd04	Link symbols with different visibilities according to the rules in the System V Application Binary Interface. This lets us use -fvisibility-inlines-hidden with LTO. Fixes PR11697. llvm-svn: 147624	2012-01-05 23:02:01 +00:00
Dan Gohman	4fc691d9ef	Revert r56315. When the instruction to speculate is a load, this code can incorrectly move the load across a store. This never happens in practice today, but only because the current heuristics accidentally preclude it. llvm-svn: 147623	2012-01-05 22:54:35 +00:00
Chandler Carruth	90f124f420	Prevent a DAGCombine from firing where there are two uses of a combined-away node and the result of the combine isn't substantially smaller than the input, it's just canonicalized. This is the first part of a significant (7%) performance gain for Snappy's hot decompression loop. llvm-svn: 147604	2012-01-05 11:05:55 +00:00
Chandler Carruth	75ff869d6a	Cleanup and FileCheck-ize a test. llvm-svn: 147603	2012-01-05 11:05:47 +00:00
Victor Umansky	87d5ada510	Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX. Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX) Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov llvm-svn: 147601	2012-01-05 08:46:19 +00:00
Benjamin Kramer	e5589bccdd	FileCheck hygiene. llvm-svn: 147580	2012-01-05 00:43:34 +00:00
Jakob Stoklund Olesen	23eeb1f7b5	Reapply r146997, "Heed spill slot alignment on ARM." Now that canRealignStack() understands frozen reserved registers, it is safe to use it for aligned spill instructions. It will only return true if the registers reserved at the beginning of register allocation allow for dynamic stack realignment. <rdar://problem/10625436> llvm-svn: 147579	2012-01-05 00:26:57 +00:00
Nick Lewycky	d6260dc3cb	Teach instcombine all sorts of great stuff about shifts that have exact, nuw or nsw bits on them. llvm-svn: 147528	2012-01-04 09:28:29 +00:00
NAKAMURA Takumi	6ebbc05c9d	test/CodeGen/X86/jump_sign.ll: Add -mcpu=pentiumpro for non-x86 hosts. It uses "cmov". llvm-svn: 147521	2012-01-04 03:52:23 +00:00
Akira Hatanaka	fdcba196ca	Have getRegForInlineAsmConstraint return the correct register class when target is Mips64. llvm-svn: 147516	2012-01-04 02:45:01 +00:00
Evan Cheng	4e60b65bc6	Fix more places which should be checking for iOS, not darwin. llvm-svn: 147513	2012-01-04 01:55:04 +00:00
Evan Cheng	caba5d2fc2	For x86, canonicalize max (x > y) ? x : y => (x >= y) ? x : y. So something like (x - y) > 0 ? (x - y) : 0 becomes (x - y) >= 0 ? (x - y) : 0. This makes it possible to test the sign bit and eliminate a comparison against zero, e.g. subl %esi, %edi testl %edi, %edi movl $0, %eax cmovgl %edi, %eax => xorl %eax, %eax subl %esi, %edi cmovsl %eax, %edi rdar://10633221 llvm-svn: 147512	2012-01-04 01:41:39 +00:00
Kostya Serebryany	c69557e758	[asan] one more test for asan instrumentation: (*a)++ should be instrumented only once. llvm-svn: 147509	2012-01-04 01:02:14 +00:00
Jakob Stoklund Olesen	993997b659	Revert r146997, "Heed spill slot alignment on ARM." This patch caused a miscompilation of oggenc because a frame pointer was suddenly needed halfway through register allocation. <rdar://problem/10625436> llvm-svn: 147487	2012-01-03 22:34:35 +00:00
Nadav Rotem	79f1692fe0	Revert 147426 because it caused pr11696. llvm-svn: 147485	2012-01-03 22:19:42 +00:00
Nadav Rotem	cc49c4d74d	Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted. llvm-svn: 147484	2012-01-03 22:12:28 +00:00
Chad Rosier	afcaa8f38a	Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather than a vxorps + vinsertf128 pair if the original vector came from a load. rdar://10594409 llvm-svn: 147481	2012-01-03 21:05:52 +00:00
Elena Demikhovsky	40f2c3077f	Fixed a bug in SelectionDAG.cpp. The failure is seen on win32, when the i64 type is illegal. It happens at the stage of converting VECTOR_SHUFFLE to BUILD_VECTOR. The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. llvm-svn: 147445	2012-01-03 11:59:04 +00:00
Andrew Trick	6839d66ab3	Fix SCEVExpander to handle loops with no preheader when LSR gives it a "phony" insertion point. Fixes rdar://10619599: "SelectionDAGBuilder shouldn't visit PHI nodes!" assert llvm-svn: 147439	2012-01-02 21:25:10 +00:00
Nadav Rotem	6929a8868b	Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit. llvm-svn: 147426	2012-01-02 08:05:46 +00:00
Craig Topper	f7c9bf17dd	Allow CRC32 instructions to be selected when AVX is enabled. llvm-svn: 147411	2012-01-01 19:51:58 +00:00
Craig Topper	d8ae2d9f27	Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers. llvm-svn: 147409	2012-01-01 19:40:22 +00:00
Rafael Espindola	1cb17796db	Revert 147399. It broke CodeGen/ARM/vext.ll. llvm-svn: 147400	2012-01-01 17:36:23 +00:00
Elena Demikhovsky	9b74049783	Fixed a bug in SelectionDAG.cpp. The failure is seen on win32, when the i64 type is illegal. It happens at the stage of converting VECTOR_SHUFFLE to BUILD_VECTOR. The failure message is: llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed. I added a special test that checks vector shuffle on win32. llvm-svn: 147399	2012-01-01 16:22:47 +00:00
Craig Topper	0311c45aed	Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load. llvm-svn: 147393	2011-12-31 23:24:49 +00:00
Craig Topper	c01ce759d7	Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected. llvm-svn: 147392	2011-12-31 23:15:11 +00:00
Nick Lewycky	c7e12f7dbf	Make use of the exact bit when optimizing '(X >>exact 3) << 1' to eliminate the 'and' that would zero out the trailing bits, and to produce an exact shift ourselves. llvm-svn: 147391	2011-12-31 21:30:22 +00:00
Craig Topper	b4db8689ee	Add disassembler support for VPERMIL2PD and VPERMIL2PS. llvm-svn: 147368	2011-12-30 06:23:39 +00:00
Craig Topper	089be4fefa	Add FMA4 instructions to disassembler. llvm-svn: 147367	2011-12-30 05:20:36 +00:00
Craig Topper	33091db89a	Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms. llvm-svn: 147361	2011-12-30 02:18:36 +00:00
Craig Topper	e066262284	Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere. llvm-svn: 147360	2011-12-30 01:49:53 +00:00
Hal Finkel	4a09216dfb	Cleanup stack/frame register define/kill states. This fixes two bugs: 1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test). 2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this. llvm-svn: 147359	2011-12-30 00:34:00 +00:00
Rafael Espindola	db7319d272	Implement cfi_restore. Patch by Brian Anderson! llvm-svn: 147356	2011-12-29 21:43:03 +00:00
Craig Topper	97e84c23a1	Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms of FMA3 instructions. llvm-svn: 147353	2011-12-29 20:43:40 +00:00
Rafael Espindola	27298c6f33	Implement .cfi_escape. Patch by Brian Anderson! llvm-svn: 147352	2011-12-29 20:24:47 +00:00
Craig Topper	bcfd070378	Expose FMA3 instructions to the disassembler. llvm-svn: 147351	2011-12-29 20:03:14 +00:00
Nick Lewycky	7425820374	Change CaptureTracking to pass a Use* instead of a Value* when a value is captured. This allows the tracker to look at the specific use, which may be especially interesting for function calls. Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does not iterate until a fixpoint and does not guarantee that it produces the same result regardless of iteration order. The new implementation builds up a graph of how arguments are passed from function to function, and uses a bottom-up walk on the argument-SCCs to assign nocapture. This gets us nocapture more often, and does so rather efficiently and independent of iteration order. llvm-svn: 147327	2011-12-28 23:24:21 +00:00
Eli Friedman	db54b4b68f	Fix type-checking for load transformation which is not legal on floating-point types. PR11674. llvm-svn: 147323	2011-12-28 21:24:44 +00:00
Nadav Rotem	d8c4880903	PR11662. Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage. llvm-svn: 147309	2011-12-28 13:08:20 +00:00
Elena Demikhovsky	9b4613ff14	Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR. Matching MOVLP mask for AVX (256-bit vectors) was wrong. The failure was detected by conformance tests. llvm-svn: 147308	2011-12-28 08:14:01 +00:00
Nick Lewycky	f4c21901a3	Turn cos(-x) into cos(x). Patch by Alexander Malyshev! llvm-svn: 147291	2011-12-27 18:25:50 +00:00
Nick Lewycky	295e397220	Teach simplifycfg to recompute branch weights when merging some branches, and to discard weights when appropriate. Still more to do (and a new TODO), but it's a start! llvm-svn: 147286	2011-12-27 04:31:52 +00:00
Eli Friedman	064187912e	Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2. llvm-svn: 147283	2011-12-26 22:49:32 +00:00
Nick Lewycky	56e04db381	Update the branch weight metadata when reversing the order of a branch. llvm-svn: 147280	2011-12-26 20:54:14 +00:00
Chandler Carruth	a012c64ced	Add an explicit test that we now fold cttz.i32(..., true) >> 5 -> 0. This is a result of Benjamin's work on ValueTracking. llvm-svn: 147259	2011-12-24 22:34:15 +00:00
Benjamin Kramer	94f07f8c2c	InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add. This was intended to undo the sub canonicalization in cases where it's not profitable, but it also finds some cases on its own. llvm-svn: 147256	2011-12-24 17:31:53 +00:00
Benjamin Kramer	b5e584392b	ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero. unsigned foo(unsigned x) { return 31 - __builtin_clz(x); } now compiles into a single "bsrl" instruction on x86. llvm-svn: 147255	2011-12-24 17:31:46 +00:00
Benjamin Kramer	0b4d2e3d2a	InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n. This has the obvious advantage of being commutable and is always a win on x86 because const - x wastes a register there. On less weird architectures this may lead to a regression because other arithmetic doesn't fuse with it anymore. I'll address that problem in a followup. llvm-svn: 147254	2011-12-24 17:31:38 +00:00
Chandler Carruth	7a5c52fadf	Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the LZCNT instructions are available. Force promotion to i32 to get a smaller encoding since the fix-ups necessary are just as complex for either promoted type. We can't do standard promotion for CTLZ when lowering through BSR because it results in poor code surrounding the 'xor' at the end of this instruction. Essentially, if we promote the entire CTLZ node to i32, we end up doing the xor on a 32-bit CTLZ implementation, and then subtracting appropriately to get back to an i8 value. Instead, our custom logic just uses the knowledge of the incoming size to compute a perfect xor. I'd love to know of a way to fix this, but so far I'm drawing a blank. I suspect the legalizer could be more clever and/or it could collude with the DAG combiner, but how... ;] llvm-svn: 147251	2011-12-24 12:12:34 +00:00
Chandler Carruth	82b7a7478b	Add systematic testing for cttz as well, and fix the bug I spotted by inspection earlier. llvm-svn: 147250	2011-12-24 11:46:10 +00:00
Chandler Carruth	b52ba33d0a	Add i8 and i64 testing for ctlz on x86. Also simplify the i16 test. llvm-svn: 147249	2011-12-24 11:26:59 +00:00
Chandler Carruth	800a803717	Tidy up this rather crufty test. Put the declarations at the top to make my C-brain happy. Remove the unnecessary bits of pedantic IR fluff like nounwind. Remove stray uses comments. Name things semantically rather than tN so that adding a new test in the middle doesn't cause pain, and so that new tests can be grouped semantically. This exposes how little systematic testing is going on here. I noticed this by finding several bugs via inspection and wondering why this test wasn't catching any of them. =[ llvm-svn: 147248	2011-12-24 11:26:57 +00:00
Chandler Carruth	48f5be6ce0	Expand more when we have a nice 'tzcnt' instruction, to avoid generating 'bsf' instructions here. This one is actually debatable to my eyes. It's not clear that any chip implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding. Still, this restores the old behavior with 'tzcnt' enabled for now. llvm-svn: 147246	2011-12-24 11:11:38 +00:00
Chandler Carruth	1846086903	Tidy up some of these tests. llvm-svn: 147245	2011-12-24 11:11:36 +00:00
Chandler Carruth	9ef50ef1f7	Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the X86ISelLowering C++ code. Because this is lowered via an xor wrapped around a bsr, we want the dagcombine which runs after isel lowering to have a chance to clean things up. In particular, it is very common to see code which looks like: (sizeof(x)*8 - 1) ^ __builtin_clz(x) Which is trying to compute the most significant bit of 'x'. That's actually the value computed directly by the 'bsr' instruction, but if we match it too late, we'll get completely redundant xor instructions. The more naive code for the above (subtracting rather than using an xor) still isn't handled correctly due to the dagcombine getting confused. Also, while here fix an issue spotted by inspection: we should have been expanding the zero-undef variants to the normal variants when there is an 'lzcnt' instruction. Do so, and test for this. We don't want to generate unnecessary 'bsr' instructions. These two changes fix some regressions in encoding and decoding benchmarks. However, there is still a *lot* to be improved on in this type of code. llvm-svn: 147244	2011-12-24 10:55:54 +00:00
Chandler Carruth	514920d53b	Cleanup this test a bit, sorting things and grouping them more clearly. llvm-svn: 147243	2011-12-24 10:55:42 +00:00
Akira Hatanaka	72c5800ed2	Test case for r147232. llvm-svn: 147233	2011-12-24 03:05:43 +00:00
Nick Lewycky	0c92d31b61	Move this test from date-name to feature-name, and port it to FileCheck. llvm-svn: 147223	2011-12-23 18:41:31 +00:00
Jakob Stoklund Olesen	c97d7d26bd	Experimental support for aligned NEON spills. ARM targets with NEON units have access to aligned vector loads and stores that are potentially faster than unaligned operations. Add support for spilling the callee-saved NEON registers to an aligned stack area using 16-byte aligned NEON loads and stores. This feature is off by default, controlled by an -align-neon-spills command line option. llvm-svn: 147211	2011-12-23 00:36:18 +00:00
Jim Grosbach	a678ad9ecc	ARM VFP assembly parsing and encoding for VCVT(float <--> fixed point). rdar://10558523 llvm-svn: 147189	2011-12-22 22:19:05 +00:00
Rafael Espindola	eba1c0eb00	Fix incorrect relocation generation. Patch by Kristof Beyls. Fixes PR11214. llvm-svn: 147180	2011-12-22 21:36:43 +00:00
Chad Rosier	d16131e35c	Reinstate r146578; it doesn't appear to be the cause of some recent execution-time regressions. In general, it is beneficial to compile-time. Original commit message: Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics. llvm-svn: 147175	2011-12-22 21:06:36 +00:00
Jim Grosbach	100e3aaffa	ARM assembler should accept shift-by-zero for any shifted-immediate operand. Just treat it as-if the shift wasn't there at all. 'as' compatibility. rdar://10604767 llvm-svn: 147153	2011-12-22 18:04:04 +00:00
Benjamin Kramer	5d07d63540	Give string constants generated by IRBuilder private linkage. Fixes PR11640. llvm-svn: 147144	2011-12-22 14:22:14 +00:00
Chandler Carruth	acdf352b77	Make the unreachable probability much much heavier. The previous probability wouldn't be considered "hot" in some weird loop structures or other compounding probability patterns. This makes it much harder to confuse, but isn't really a principled fix. I'd actually like it if we could model a zero probability, as it would make this much easier to reason about. Suggestions for how to do this better are welcome. llvm-svn: 147142	2011-12-22 09:26:37 +00:00
Chad Rosier	4ab165f664	Speculatively revert r146578 to determine if it is the cause of a number of performance regressions (both execution-time and compile-time) on our nightly testers. Original commit message: Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics. llvm-svn: 147131	2011-12-22 02:40:57 +00:00
Akira Hatanaka	e7bcf63d98	Local dynamic TLS model for direct object output. Create the correct TLS MIPS ELF relocations. Patch by Jack Carter. llvm-svn: 147118	2011-12-22 01:05:17 +00:00
Jim Grosbach	7d31680e2d	ARM VFP optional data type on VMOV GPR<-->SPR. llvm-svn: 147104	2011-12-21 23:24:15 +00:00
Jim Grosbach	2bbc41fa26	Thumb2 assembly parsing of 'mov rd, rn, rrx'. Maps to the RRX instruction. Missed this case earlier. rdar://10615373 llvm-svn: 147096	2011-12-21 21:04:19 +00:00
Jim Grosbach	91faf5d15f	Thumb2 assembly parsing of 'mov(register shifted register)' aliases. These map to the ASR, LSR, LSL, ROR instruction definitions. rdar://10615373 llvm-svn: 147094	2011-12-21 20:54:00 +00:00
Jim Grosbach	f7236d1084	ARM NEON assembly parsing for VLD2 to all lanes instructions. llvm-svn: 147069	2011-12-21 19:40:55 +00:00
Chad Rosier	c2f31859cc	Fix a couple of copy-n-paste bugs. Noticed by George Russell! llvm-svn: 147064	2011-12-21 18:56:22 +00:00
Nick Lewycky	9adbd36737	Make some intrinsics safe to speculatively execute. llvm-svn: 147036	2011-12-21 05:52:02 +00:00
Evan Cheng	fb22f64814	Fix a couple of copy-n-paste bugs. Noticed by George Russell. llvm-svn: 147032	2011-12-21 03:04:10 +00:00
Jim Grosbach	6bd1044b03	ARM NEON VLD2 assembly parsing for structure to all lanes, non-writeback. llvm-svn: 147025	2011-12-21 00:38:54 +00:00
Akira Hatanaka	4ab17eaca0	Fix bug in zero-store peephole pattern reported in pr11615. The patch and test case were originally written by Mans Rullgard. llvm-svn: 147024	2011-12-21 00:31:10 +00:00
Akira Hatanaka	0af792d12b	Expand 64-bit CTLZ nodes if target architecture does not support it. Add test case for DCLO and DCLZ. llvm-svn: 147022	2011-12-21 00:20:27 +00:00
Akira Hatanaka	fb94688c7a	Test case for r147017. llvm-svn: 147018	2011-12-20 23:58:36 +00:00
Jim Grosbach	0768f2c420	Enable and fix a test. llvm-svn: 147011	2011-12-20 23:20:00 +00:00
Akira Hatanaka	2e4f1786b1	Add function MipsDAGToDAGISel::SelectMULT and factor out code that generates nodes needed for multiplication. Add code for selecting 64-bit MULHS and MULHU nodes. llvm-svn: 147008	2011-12-20 23:10:57 +00:00
Akira Hatanaka	f728a1b2c5	64-bit data directive. llvm-svn: 147005	2011-12-20 22:52:19 +00:00
Akira Hatanaka	ad193d95ae	32-to-64-bit sext_inreg pattern. llvm-svn: 147004	2011-12-20 22:40:40 +00:00
Akira Hatanaka	8728f4ed69	Add code in MipsDAGToDAGISel for selecting constant +0.0. MIPS64 can generate constant +0.0 with a single DMTC1 instruction. llvm-svn: 146999	2011-12-20 22:25:50 +00:00
Jakob Stoklund Olesen	2b24e1eac4	Heed spill slot alignment on ARM. Use the spill slot alignment as well as the local variable alignment to determine when the stack needs to be realigned. This works now that the ARM target can always realign the stack by using a base pointer. Still respect the ARMBaseRegisterInfo::canRealignStack() function vetoing a realigned stack. Don't use aligned spill code in that case. llvm-svn: 146997	2011-12-20 22:15:04 +00:00
Jim Grosbach	8978194025	ARM assembly parsing and encoding for VST2 single-element, double spaced. llvm-svn: 146990	2011-12-20 20:46:29 +00:00
Jim Grosbach	3f48367a1b	ARM enable a few more tests. llvm-svn: 146985	2011-12-20 20:03:00 +00:00
Jim Grosbach	8156a5dcee	ARM assembly parsing and encoding for VLD2 single-element, double spaced. llvm-svn: 146983	2011-12-20 19:21:26 +00:00
Evan Cheng	46b085721a	ARM target code clean up. Check for iOS, not Darwin where it makes sense. llvm-svn: 146981	2011-12-20 18:26:50 +00:00
Elena Demikhovsky	b37883fe87	This is the second fix related to VZEXT_MOVL node. The failure that I see in the current version is: LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14] 0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13] 0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12] 0x18b9870: v4i64 = undef [ID=4] 0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10] 0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9770: i32 = TargetConstant<0> [ID=6] 0x18b9970: i32 = Constant<0> [ID=3] 0x18b9170: v2i64 = undef [ORD=1] [ID=1] 0x18b9570: i32 = Constant<2> [ID=5] llvm-svn: 146975	2011-12-20 13:34:28 +00:00
Chandler Carruth	7564e8371a	Begin teaching the X86 target how to efficiently codegen patterns that use the zero-undefined variants of CTTZ and CTLZ. These are just simple patterns for now, there is more to be done to make real world code using these constructs be optimized and codegen'ed properly on X86. The existing tests are spiffed up to check that we no longer generate unnecessary cmov instructions, and that we generate the very important 'xor' to transform bsr which counts the index of the most significant one bit to the number of leading (most significant) zero bits. Also they now check that when the variant with defined zero result is used, the cmov is still produced. llvm-svn: 146974	2011-12-20 11:19:37 +00:00
Andrew Trick	a1c4f73f87	Unit test for r146950: LSR postinc expansion, PR11571. llvm-svn: 146951	2011-12-20 01:43:20 +00:00
Bob Wilson	8439df9506	Mark ARM eh_sjlj_dispatchsetup as clobbering all registers. Radar 10567930. We used to rely on the *eh_sjlj_setjmp instructions to mark that a function with setjmp/longjmp exception handling clobbers all the registers. But with the recent reorganization of ARM EH, those eh_sjlj_setjmp instructions are expanded away earlier, before PEI can see them to determine what registers to save and restore. Mark the dispatchsetup instruction in the same way, since that instruction cannot be expanded early. This also more accurately reflects when the registers are clobbered. llvm-svn: 146949	2011-12-20 01:29:27 +00:00
Jim Grosbach	3f5493c136	ARM assembly shifts by zero should be plain 'mov' instructions. "mov r1, r2, lsl #0" should assemble as "mov r1, r2" even though it's not strictly legal UAL syntax. It's a common extension and the friendly thing to do. rdar://10604663 llvm-svn: 146937	2011-12-20 00:59:38 +00:00
Chris Lattner	c1d9c0a2a3	Now that PR11464 is fixed, reapply the patch to fix PR11464, merging types by name when we can. We still don't guarantee type name linkage but we do it when obviously the right thing to do. This makes LTO type names easier to read, for example. llvm-svn: 146932	2011-12-20 00:12:26 +00:00
Chris Lattner	998998b3e7	fix PR11464 by preventing the linker from mapping two different struct types from the source module onto the same opaque destination type. An opaque type can only be resolved to one thing or another after all. llvm-svn: 146929	2011-12-20 00:03:52 +00:00
Evan Cheng	9362ee62bc	Move tests to FileCheck. llvm-svn: 146923	2011-12-19 23:26:44 +00:00
Jim Grosbach	343f270350	ARM assembly parsing and encoding support for LDRD(label). rdar://9932658 llvm-svn: 146921	2011-12-19 23:06:24 +00:00
Akira Hatanaka	7ef923c1f0	Add a test case for r146900. llvm-svn: 146901	2011-12-19 20:24:28 +00:00
Akira Hatanaka	e54da3bfa2	Add patterns for matching immediates whose lower 16-bit is cleared. These patterns emit a single LUi instruction instead of a pair of LUi and ORi. llvm-svn: 146900	2011-12-19 20:21:18 +00:00
Jim Grosbach	797a88284c	ARM NEON two-operand aliases for VPADD. rdar://10602276 llvm-svn: 146895	2011-12-19 19:51:03 +00:00
Akira Hatanaka	804863071f	Remove definitions of double word shift plus 32 instructions. Assembler or direct-object emitter should emit the appropriate shift instruction depending on the shift amount. llvm-svn: 146893	2011-12-19 19:44:09 +00:00
Akira Hatanaka	b7ebcb2ded	Remove the restriction on the first operand of the add node in SelectAddr. This change reduces the number of instructions generated. For example, (load (add (sub $n0, $n1), (MipsLo got(s)))) results in the following sequence of instructions: 1. sub $n2, $n0, $n1 2. lw got(s)($n2) Previously, three instructions were needed. 1. sub $n2, $n0, $n1 2. addiu $n3, $n2, got(s) 3. lw 0($n3) llvm-svn: 146888	2011-12-19 19:28:37 +00:00
Jim Grosbach	520db82971	ARM NEON implied destination aliases for VMAX/VMIN. llvm-svn: 146885	2011-12-19 18:57:38 +00:00
Jim Grosbach	f4ca84a7ab	ARM NEON relax parse time diagnostics for alignment specifiers. There's more variation that we need to handle. Error checking will need to be on operand predicates. llvm-svn: 146884	2011-12-19 18:31:43 +00:00
Joerg Sonnenberger	8cf8d64d19	Allow inlining of functions with returns_twice calls, if they have the attribute themselves. llvm-svn: 146851	2011-12-18 20:35:43 +00:00
Chad Rosier	b870a13cd8	Revert 146728 as it's causing failures on some of the external bots as well as internal nightly testers. Original commit message: By popular demand, link up types by name if they are isomorphic and one is an autorenamed version of the other. This makes the IR easier to read, because we don't end up with random renamed versions of the types after LTO'ing a large app. llvm-svn: 146838	2011-12-17 22:19:53 +00:00
Kevin Enderby	42fffe915a	Revert r146822 at Pete Cooper's request as it broke clang self hosting. Hope I did this correctly :) llvm-svn: 146834	2011-12-17 19:48:52 +00:00
Pete Cooper	0ec73f6e98	SimplifyCFG now predicts some conditional branches to true or false depending on previous branch on same comparison operands. For example, if (a == b) { if (a > b) // this is false Fixes some of the issues on <rdar://problem/10554090> llvm-svn: 146822	2011-12-17 06:32:38 +00:00
Manuel Klimek	09f4d148b8	Deleting the json-bench-test until I understand why it is flaky. llvm-svn: 146821	2011-12-17 06:29:32 +00:00
Evan Cheng	23574ec02a	Fix a CPSR liveness tracking bug introduced when I converted IT block to bundle. llvm-svn: 146805	2011-12-17 01:25:34 +00:00
Rafael Espindola	549d0683b1	Add back the MC bits of 126425. Original patch by Nathan Jeffords. I added the asm parsing and testcase. llvm-svn: 146801	2011-12-17 01:14:52 +00:00
Lang Hames	e32ef23ba8	Make sure that the lower bits on the VSELECT condition are properly set. llvm-svn: 146800	2011-12-17 01:08:46 +00:00
Dan Gohman	9c8c9a8f62	The powers that be have decided that LLVM IR should now support 16-bit "half precision" floating-point with a first-class type. This patch adds basic IR support (but not codegen support). llvm-svn: 146786	2011-12-17 00:04:22 +00:00
Eric Christopher	38b0b94ef2	When recursing for the original size of a type, stop if we are at a pointer or a reference type - we actually just want the size of the pointer then for that. Fixes rdar://10335756 llvm-svn: 146785	2011-12-16 23:42:45 +00:00
Jakob Stoklund Olesen	445cdbb987	Fix off-by-one error in bucket sort. The bad sorting caused a misaligned basic block when building 176.vpr in ARM mode. <rdar://problem/10594653> llvm-svn: 146767	2011-12-16 23:00:05 +00:00
Benjamin Kramer	04d6a4c456	Hexagon: Fix a nasty order-of-initialization bug. Reenable the tests. llvm-svn: 146750	2011-12-16 19:08:59 +00:00
Manuel Klimek	2f7cf4e64b	Adds a JSON parser and a benchmark (json-bench) to catch performance regressions. llvm-svn: 146735	2011-12-16 13:09:10 +00:00
Chris Lattner	26f06c927f	By popular demand, link up types by name if they are isomorphic and one is an autorenamed version of the other. This makes the IR easier to read, because we don't end up with random renamed versions of the types after LTO'ing a large app. llvm-svn: 146728	2011-12-16 08:36:07 +00:00
Craig Topper	88e2bfef0a	Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes. llvm-svn: 146726	2011-12-16 08:06:31 +00:00
Kostya Serebryany	c78b00cab4	[asan] add a test for instrumenting globals llvm-svn: 146718	2011-12-16 01:28:19 +00:00
Eli Friedman	f626b19bda	Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586. llvm-svn: 146709	2011-12-15 23:46:18 +00:00
Jim Grosbach	30f4b285a6	ARM NEON VCLE is an alias for VCGE w/ the source operands reversed. llvm-svn: 146699	2011-12-15 22:56:33 +00:00
Jim Grosbach	b79d2a8f50	ARM NEON VTBL/VTBX assembly parsing and encoding. llvm-svn: 146691	2011-12-15 22:27:11 +00:00
Chad Rosier	62ebee9859	Add missing zmovl AVX patterns which were causing crashes. Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>! llvm-svn: 146689	2011-12-15 22:11:31 +00:00
Chad Rosier	e74b3b1469	Fix assert in LowerBUILD_VECTOR for v16i16 type on AVX. Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>! llvm-svn: 146684	2011-12-15 21:34:44 +00:00
Lang Hames	d5cee672a7	Set specific target cpu for testcase. llvm-svn: 146678	2011-12-15 20:22:34 +00:00
Lang Hames	0e361e816d	Added test case for r146671. llvm-svn: 146675	2011-12-15 19:56:07 +00:00
Hal Finkel	e8220d9927	Add a test case to make sure that the nop really does follow the bl on ppc64 elf llvm-svn: 146666	2011-12-15 17:59:23 +00:00
Eli Friedman	09abc453ac	Fix test. llvm-svn: 146642	2011-12-15 04:52:47 +00:00
Eli Friedman	f6ae3a7caf	Make constant folding for GEPs a bit more aggressive. llvm-svn: 146639	2011-12-15 04:33:48 +00:00
Eli Friedman	71c0914b64	Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570. llvm-svn: 146630	2011-12-15 02:07:20 +00:00
Chad Rosier	b93733686c	Add support for lowering fneg when AVX is enabled. rdar://10566486 llvm-svn: 146625	2011-12-15 01:02:25 +00:00
Pete Cooper	550b96ab46	Added InstCombine for "select cond, ~cond, x" type patterns. These can be reduced to "~cond & x" or "~cond | x" llvm-svn: 146624	2011-12-15 00:56:45 +00:00
Eli Friedman	5dd57bb40a	Make loop preheader insertion in LoopSimplify handle the case where the loop header is a landing pad correctly (by splitting the landingpad out of the loop header). Make some adjustments to the rest of LoopSimplify to make it clear that the rest of LoopSimplify isn't making bad assumptions about the presence of landing pads. PR11575. llvm-svn: 146621	2011-12-15 00:50:34 +00:00
Dan Gohman	1add31cc93	Move Instruction::isSafeToSpeculativelyExecute out of VMCore and into Analysis as a standalone function, since there's no need for it to be in VMCore. Also, update it to use isKnownNonZero and other goodies available in Analysis, making it more precise, enabling more aggressive optimization. llvm-svn: 146610	2011-12-14 23:49:11 +00:00
Jim Grosbach	75db252aee	ARM NEON VLD2/VST2 lane indexed assembly parsing and encoding. llvm-svn: 146605	2011-12-14 23:25:46 +00:00
Devang Patel	0db1ed1a48	Do not sink instruction, if it is not profitable. On ARM, peephole optimization for ABS creates a trivial cfg triangle which tempts machine sink to sink instructions in code which is really straight line code. Sometimes this sinking may alter register allocator input such that use and def of a reg is divided by a branch in between, which may result in extra spills. Now machine sink avoids sinking if final sink destination is post dominator. Radar 10266272. llvm-svn: 146604	2011-12-14 23:20:38 +00:00
Kevin Enderby	bc6d6388c2	Improve the implementation of .incbin directive by replacing a loop by using getStreamer().EmitBytes. Suggestion by Benjamin Kramer! llvm-svn: 146599	2011-12-14 22:34:45 +00:00
Andrew Trick	9c88f32f94	LSR: Fold redundant bitcasts on-the-fly. llvm-svn: 146597	2011-12-14 22:07:19 +00:00
Jim Grosbach	83520a5b70	ARM NEON fix alignment encoding for VST2 w/ writeback. Add tests for w/ writeback instruction parsing and encoding. llvm-svn: 146594	2011-12-14 21:49:24 +00:00
Kevin Enderby	b0b669eb26	Add the .incbin directive which takes the binary data from a file and emits it to the streamer. rdar://10383898 llvm-svn: 146592	2011-12-14 21:47:48 +00:00
Jim Grosbach	44829ab9d2	ARM NEON VST2 assembly parsing and encoding. Work in progress. Parsing for non-writeback, single spaced register lists works now. The rest have the representations better factored, but still need more to be able to parse properly. llvm-svn: 146579	2011-12-14 19:35:22 +00:00
Stepan Dyatkovskiy	14cb78c6fb	Fix for bug #11429 : Wrong behaviour for switches. Small improvement for code size heuristics. llvm-svn: 146578	2011-12-14 19:19:17 +00:00
Dan Gohman	e9572aa680	It turns out that clang does use pointer-to-function types to point to ARC-managed pointers sometimes. This fixes rdar://10551239. llvm-svn: 146577	2011-12-14 19:10:53 +00:00
Akira Hatanaka	3fca32d88e	Add support for local dynamic TLS model in LowerGlobalTLSAddress. Direct object emission is not supported yet, but a patch that adds the support should follow soon. llvm-svn: 146572	2011-12-14 18:26:41 +00:00
Jim Grosbach	54372eef76	ARM/Thumb2 'cmp rn, #imm' alias to cmn. When 'cmp rn #imm' doesn't match due to the immediate not being representable, but 'cmn rn, #-imm' does match, use the latter in place of the former, as it's equivalent. rdar://10552389 llvm-svn: 146567	2011-12-14 17:30:24 +00:00
Jim Grosbach	628ae663ef	ARM assembler support for the target-specific .req directive. rdar://10549683 llvm-svn: 146543	2011-12-14 02:16:11 +00:00
Evan Cheng	68ba5536f3	- Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function to finalize MI bundles (i.e. add BUNDLE instruction and computing register def and use lists of the BUNDLE instruction) and a pass to unpack bundles. - Teach more of MachineBasic and MachineInstr methods to be bundle aware. - Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to prevent IT blocks from being broken apart. llvm-svn: 146542	2011-12-14 02:11:42 +00:00
Chad Rosier	33f40b2c25	Add newline at EOF. llvm-svn: 146538	2011-12-14 01:34:39 +00:00
Jim Grosbach	089ad574d8	Thumb2 assembler aliases for "mov(shifted register)" rdar://10549767 llvm-svn: 146520	2011-12-13 22:45:11 +00:00
Jim Grosbach	bd33fc6efd	ARM LDM/STM system instruction variants. rdar://10550269 llvm-svn: 146519	2011-12-13 21:48:29 +00:00
Jim Grosbach	7db50010cc	Test for 146516 llvm-svn: 146517	2011-12-13 21:06:59 +00:00
Jim Grosbach	13d3509445	ARM thumb2 parsing of "rsb rd, rn, #0 ". rdar://10549741 llvm-svn: 146515	2011-12-13 20:50:38 +00:00
Jim Grosbach	dfec87fe2f	ARM NEON two-operand aliases for VQDMULH. llvm-svn: 146514	2011-12-13 20:40:37 +00:00
Jim Grosbach	0ba5ba4535	ARM pre-UAL NEG mnemonic for convenience when porting old code. llvm-svn: 146511	2011-12-13 20:23:22 +00:00
Chad Rosier	8af97606a9	[fast-isel] Unaligned loads of floats are not supported. Therefore, convert to a regular load and then move the result from a GPR to a FPR. llvm-svn: 146502	2011-12-13 19:22:14 +00:00
Akira Hatanaka	23f439aca1	Add test/MC/Mips/dg.exp. llvm-svn: 146472	2011-12-13 04:12:49 +00:00
Akira Hatanaka	a9290d5ab9	Move direct object emitter test to directory test/MC/Mips. Rename it to elf-relsym.ll. llvm-svn: 146470	2011-12-13 03:50:34 +00:00
Akira Hatanaka	28140f744a	Relocation against a symbol, instead of against section. We had some extreme test cases where there were a lot of relocations applied relative to a large rodata section. Gas would create a symbol for each of these whereas we would be relative to the beginning of the rodata section. This change mimics what gas does. Patch by Jack Carter. llvm-svn: 146468	2011-12-13 02:27:40 +00:00
Nick Lewycky	90a4c39a28	Don't rely on a particular version string for llvm. llvm-svn: 146456	2011-12-13 00:34:14 +00:00
Tony Linthicum	da0dd81cf1	Temporarily disable Hexagon tests. They are failing on OS X llvm-svn: 146455	2011-12-13 00:33:45 +00:00
Akira Hatanaka	46dd9e66a6	Test case for r146432 by Jack Carter. llvm-svn: 146433	2011-12-12 22:41:39 +00:00
Bob Wilson	70f6f24d68	Implement 'e' and 'f' modifiers for Neon inline asm. <rdar://problem/10551006> These modifiers simply select either the low or high D subregister of a Neon Q register. I've also removed the unimplemented 'p' modifier, which turns out to be a bit different than the comment here suggests and as far as I can tell was only intended for internal use in Apple's version of gcc. llvm-svn: 146417	2011-12-12 21:45:15 +00:00
Tony Linthicum	61adbf8dc5	Hexagon backend support llvm-svn: 146412	2011-12-12 21:14:40 +00:00
Joerg Sonnenberger	5b25b4d437	Only replace fwrite with fputc, if the return value is unused. llvm-svn: 146411	2011-12-12 20:18:31 +00:00
Jan Sjödin	b9e2da0d9a	XOP instructions and encoding tests. llvm-svn: 146407	2011-12-12 19:37:49 +00:00
Roman Divacky	a450b8b2c8	Add support for gnu_indirect_function. llvm-svn: 146377	2011-12-12 17:34:04 +00:00
Chandler Carruth	2bedf185c9	Manually upgrade the test suite to specify the flag to cttz and ctlz. I followed three heuristics for deciding whether to set 'true' or 'false': - Everything target independent got 'true' as that is the expected common output of the GCC builtins. - If the target arch only has one way of implementing this operation, set the flag in the way that exercises the most of codegen. For most architectures this is also the likely path from a GCC builtin, with 'true' being set. It will (eventually) require lowering away that difference, and then lowering to the architecture's operation. - Otherwise, set the flag differently depending on which target operation should be tested. Let me know if anyone has any issue with this pattern or would like specific tests of another form. This should allow the x86 codegen to just iteratively improve as I teach the backend how to differentiate between the two forms, and everything else should remain exactly the same. llvm-svn: 146370	2011-12-12 11:59:10 +00:00
Chandler Carruth	36be5dd1e8	Add an explicit test of the auto-upgrade functionality for the new intrinsic syntax. Now that this is explicitly covered, I plan to upgrade the existing test suite to use an explicit immediate. Note that I plan to specify 'true' in most places rather than the auto-upgraded value as that is the far more common value to end up here as that is the value coming from GCC's builtins. The only place I'm likely to put a 'false' in is when testing x86 which actually has different instructions for the two variants. llvm-svn: 146369	2011-12-12 11:23:11 +00:00
Chandler Carruth	d733f059d0	Teach the verifier to reject all non-constant arguments to the second argument of the cttz and ctlz intrinsics. llvm-svn: 146360	2011-12-12 04:36:02 +00:00
Stepan Dyatkovskiy	bf1423bdcd	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Third attempt: simplified checks in test for armv7-apple-darwin11. llvm-svn: 146341	2011-12-11 14:35:48 +00:00
Chandler Carruth	afb8199f38	Don't assume things about the exact details of the LLVM version number, such as what VCS information is attached. llvm-svn: 146333	2011-12-10 21:40:31 +00:00
Chad Rosier	fa74c25947	Revert associated SelectInsertValue test as well. llvm-svn: 146332	2011-12-10 21:34:28 +00:00
Chad Rosier	d8a265c838	Revert r146322 to appease buildbots. Original commit message: Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146328	2011-12-10 19:55:03 +00:00
Stepan Dyatkovskiy	5b2b42e8c9	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146322	2011-12-10 08:42:24 +00:00
Hal Finkel	d591c94df7	Make CR spill and restore use a reserved register. These operations cannot use the register scavenger because the scavenger can only scavenge one register and frame-index elimination may have already grabbed it. llvm-svn: 146318	2011-12-10 04:50:53 +00:00
Rafael Espindola	9b9d35cc05	Handle expressions of the form _GLOBAL_OFFSET_TABLE_-symbol the same way gas does. The _GLOBAL_OFFSET_TABLE_ is still magical in that we get a R_386_GOTPC, but it doesn't change the immediate in the same way as when the expression has no right hand side symbol. llvm-svn: 146311	2011-12-10 02:28:43 +00:00
Eli Friedman	ca06c3a2bd	Splats can contain undef's; make sure to handle them correctly. PR11526. llvm-svn: 146299	2011-12-09 23:54:42 +00:00
Jim Grosbach	356ad6d232	ARM assembly aliases for BIC<-->AND (immediate). When the immediate operand of an AND or BIC instruction isn't representable in the immediate field of the instruction, but the bitwise negation of the immediate is, assemble the instruction as the inverse operation instead with the inverted immediate as the operand. rdar://10550057 llvm-svn: 146283	2011-12-09 22:02:17 +00:00
Evan Cheng	77f0fb0296	Update test to something more sensible. llvm-svn: 146282	2011-12-09 21:54:10 +00:00
Jim Grosbach	489e81da30	ARM assembly parsing and encoding for VLD2 with writeback. Refactor the instructions into fixed writeback and register-stride writeback variants to simplify the offset operand (no more optional register operand using reg0). This is a simpler representation and allows the assembly parser to more easily handle these instructions. Add tests for the instruction variants now supported. llvm-svn: 146278	2011-12-09 21:28:25 +00:00
Chad Rosier	7e0dc23863	[fast-isel] Add support for selecting insertvalue. rdar://10530851 llvm-svn: 146276	2011-12-09 20:09:54 +00:00
Rafael Espindola	b5c511f7b7	Handle reloc_signed_4byte in here. Not doing so was a regression from my previous commit. It is strange that we see it in 32 bits. We already have a fixme about it. llvm-svn: 146273	2011-12-09 19:57:29 +00:00
Kevin Enderby	63cf89d532	The second part of support for generating dwarf for assembly source files. This generates the dwarf Compile Unit DIE and a dwarf subprogram DIE for each non-temporary label. The next part will be to get the clang driver to enable this when assembling a .s file. rdar://9275556 llvm-svn: 146262	2011-12-09 18:09:40 +00:00
Benjamin Kramer	06cd66b1d7	X86: Add patterns for the various rounding ops for SSE4.1 and AVX. llvm-svn: 146257	2011-12-09 15:44:03 +00:00
Andrew Trick	4f0b3bb42b	Add -unroll-runtime for unrolling loops with run-time trip counts. Patch by Brendon Cahoon! This extends the existing LoopUnroll and LoopUnrollPass. Brendon measured no regressions in the llvm test suite with -unroll-runtime enabled. This implementation works by using the existing loop unrolling code to unroll the loop by a power-of-two (default 8). It generates an if-then-else sequence of code prior to the loop to execute the extra iterations before entering the unrolled loop. llvm-svn: 146245	2011-12-09 06:19:40 +00:00
Evan Cheng	2c8bac6b4c	Forgot setting -march. llvm-svn: 146244	2011-12-09 06:15:00 +00:00
Rafael Espindola	82e22767cf	Handle the case of the magical _GLOBAL_OFFSET_TABLE_ showing up in a symbol difference. This matches gas behavior and fixes PR11513. We still don't handle _GLOBAL_OFFSET_TABLE_ in data sections. llvm-svn: 146238	2011-12-09 03:03:58 +00:00
Akira Hatanaka	ce89ae9f84	jalr should use t9 ($25) for indirect calls regardless of the relocation model specified. llvm-svn: 146229	2011-12-09 01:45:12 +00:00
Eli Friedman	8f3db3867c	Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514. llvm-svn: 146219	2011-12-09 01:16:26 +00:00
Nick Lewycky	d2c1661e9f	Fix infinite loop in DSE when deleting a free in a reachable loop that's also trivially infinite. llvm-svn: 146197	2011-12-08 22:36:35 +00:00
Evan Cheng	ad8debd736	Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417 llvm-svn: 146196	2011-12-08 22:30:45 +00:00
Jim Grosbach	62873cae5f	ARM 64-bit VEXT assembly uses a .64 suffix, not .32, amazingly enough. llvm-svn: 146194	2011-12-08 22:19:04 +00:00
Jim Grosbach	a33fa8aa88	ARM VSHR implied destination operand form aliases. llvm-svn: 146192	2011-12-08 22:06:06 +00:00
Evan Cheng	d8a73b8918	Add various missing AVX patterns which were causing crashes. Sadly, the generated code looks pretty bad compared to SSE. rdar://10538793 llvm-svn: 146191	2011-12-08 22:05:28 +00:00
Jim Grosbach	af9cc198cf	Tidy up a bit. llvm-svn: 146190	2011-12-08 22:04:40 +00:00
Jim Grosbach	78020c4642	ARM VSUB implied destination operand form aliases. llvm-svn: 146182	2011-12-08 20:56:26 +00:00
Jim Grosbach	957be45ccf	Tidy up a bit. llvm-svn: 146181	2011-12-08 20:53:19 +00:00
Jim Grosbach	a33af36947	ARM VQADD implied destination operand form aliases. llvm-svn: 146179	2011-12-08 20:49:43 +00:00
Jim Grosbach	405e213008	ARM a few more VMUL implied destination operand form aliases. llvm-svn: 146177	2011-12-08 20:42:35 +00:00
Owen Anderson	d003a613e7	Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise. llvm-svn: 146171	2011-12-08 19:32:14 +00:00
Evan Cheng	0e0e920975	Add test for r146163. llvm-svn: 146167	2011-12-08 19:21:39 +00:00
Daniel Dunbar	c192ce505d	Revert r146143, "Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2).", it is failing tests. llvm-svn: 146157	2011-12-08 17:32:18 +00:00
NAKAMURA Takumi	671c1da473	test/CodeGen/X86/vec_compare-2.ll: Add explicit -mtriple=i686-linux. llvm-svn: 146152	2011-12-08 15:24:09 +00:00
Nadav Rotem	341b30a457	Fix a bug in the integer-promotion of bitcast operations on vector types. We must not issue a bitcast operation for integer-promotion of vector types, because the location of the values in the vector may be different. llvm-svn: 146150	2011-12-08 13:10:01 +00:00
Stepan Dyatkovskiy	8fde5b6eb4	Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). llvm-svn: 146143	2011-12-08 07:55:03 +00:00
Jim Grosbach	e1fe053f6e	ARM NEON two-operand aliases for VSHL(immediate). llvm-svn: 146125	2011-12-08 01:30:04 +00:00
Jim Grosbach	3e9384b103	ARM NEON two-operand aliases for VSHL(register). llvm-svn: 146123	2011-12-08 01:12:35 +00:00
Jim Grosbach	3b4d5c0510	ARM optional destination operand variants for VEXT instructions. llvm-svn: 146114	2011-12-08 00:43:47 +00:00
Jim Grosbach	0c64182f7c	Tidy up. llvm-svn: 146113	2011-12-08 00:41:54 +00:00
Jim Grosbach	c1cf417595	ARM assembler aliases for "add Rd, #-imm" to "sub Rd, #imm". llvm-svn: 146111	2011-12-08 00:31:07 +00:00
Jim Grosbach	6146f79b7d	ARM assembly, allow 'asl' as a synonym for 'lsl' in shifted-register operands. For 'gas' compatibility. llvm-svn: 146106	2011-12-07 23:40:58 +00:00
Akira Hatanaka	7db0038ac0	32 to 64-bit zext pattern. llvm-svn: 146096	2011-12-07 23:14:41 +00:00
Jim Grosbach	dd3788b044	ARM two-operand aliases for VAND/VEOR/VORR instructions. llvm-svn: 146095	2011-12-07 23:08:12 +00:00
Jim Grosbach	da0a3e310a	ARM two-operand aliases for VADDW instructions. llvm-svn: 146093	2011-12-07 23:01:10 +00:00
Jim Grosbach	ecf9c2bb21	ARM two-operand aliases for VADD instructions. llvm-svn: 146091	2011-12-07 22:52:54 +00:00
Akira Hatanaka	b8e63b4c07	64-bit WrapperPICPat patterns. llvm-svn: 146086	2011-12-07 22:11:43 +00:00
Akira Hatanaka	2b45547782	Modify LowerFCOPYSIGN to handle Mips64. llvm-svn: 146080	2011-12-07 21:48:50 +00:00
Akira Hatanaka	19d6cd4d0e	Fix 64-bit immediate patterns. llvm-svn: 146059	2011-12-07 20:10:24 +00:00
Jim Grosbach	2f57374e32	Darwin assembler improved relocs when w/o subsections_via_symbols. When the file isn't being built with subsections-via-symbols, symbol differences involving non-local symbols can be resolved more aggressively. Needed for gas compatibility. llvm-svn: 146054	2011-12-07 19:46:59 +00:00
Jim Grosbach	1ccae84fa7	Thumb2 alias for long-form pop and friends. rdar://10542474 llvm-svn: 146046	2011-12-07 18:32:28 +00:00
Jim Grosbach	81cb9952c9	ARM support the .arm and .thumb directives for assembly mode switching. llvm-svn: 146042	2011-12-07 18:04:19 +00:00

15592 Commits