llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-08 12:21:04 +00:00

Author	SHA1	Message	Date
Craig Topper	e878c775cf	Add code for lowering v32i8 shifts by a splat to AVX2 immediate shift instructions. Remove 256-bit splat handling from LowerShift as it was already handled by PerformShiftCombine. llvm-svn: 145005	2011-11-20 00:12:05 +00:00
Craig Topper	6ed413c495	Use 256-bit vcmpeqd for creating an all ones vector when AVX2 is enabled. llvm-svn: 145004	2011-11-19 22:34:59 +00:00
Chandler Carruth	f24d3f8fc7	Move the handling of unanalyzable branches out of the loop-driven chain formation phase and into the initial walk of the basic blocks. We essentially pre-merge all blocks where unanalyzable fallthrough exists, as we won't be able to update the terminators effectively after any reorderings. This is quite a bit more principled as there may be CFGs where the second half of the unanalyzable pair has some analyzable predecessor that gets placed first. Then it may get placed next, implicitly breaking the unanalyzable branch even though we never even looked at the part that isn't analyzable. I've included a test case that triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize some more general ones as I dig into related issues. Also, to make this new scheme work we have to be able to handle branches into the middle of a chain, so add this check. We always fallback on the incoming ordering. Finally, this starts to really underscore a known limitation of the current implementation -- we don't consider broken predecessors when merging successors. This can cause major missed opportunities, and is something I'm planning on looking at next (modulo more bug reports). llvm-svn: 144994	2011-11-19 10:26:02 +00:00
Craig Topper	44b06b0096	Test cases for SSSE3/AVX integer horizontal add/sub. llvm-svn: 144990	2011-11-19 09:03:33 +00:00
Craig Topper	536f9d9434	Extend VPBLENDVB and VPSIGN lowering to work for AVX2. llvm-svn: 144987	2011-11-19 07:07:26 +00:00
Andrew Trick	fe5f7fc3b8	Fix a corner case in updating LoopInfo after fully unrolling an outer loop. The loop tree's inclusive block lists are painful and expensive to update. (I have no idea why they're inclusive). The design was supposed to handle this case but the implementation missed it and my unit tests weren't thorough enough. Fixes PR11335: loop unroll update. llvm-svn: 144970	2011-11-18 03:42:41 +00:00
Nadav Rotem	08f8a75c2c	Add AVX2 vpbroadcast support llvm-svn: 144967	2011-11-18 02:49:55 +00:00
Kostya Serebryany	3a83736893	[asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler llvm-svn: 144962	2011-11-18 01:41:06 +00:00
Devang Patel	a0973b0c53	DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange. llvm-svn: 144937	2011-11-17 23:43:15 +00:00
Andrew Trick	7dc21d8c0e	Fix an overly general check in SimplifyIndvar to handle useless phi cycles. The right way to check for a binary operation is cast<BinaryOperator>. The original check: cast<Instruction> && numOperands() == 2 would match phi "instructions", leading to an infinite loop in extreme corner case: a useless phi with operands [self, constant] that prior optimization passes failed to remove, being used in the loop by another useless phi, in turn being used by an lshr or udiv. Fixes PR11350: runaway iteration assertion. llvm-svn: 144935	2011-11-17 23:36:35 +00:00
Kostya Serebryany	3b8d362511	fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr llvm-svn: 144933	2011-11-17 23:14:59 +00:00
Chad Rosier	2673f8862f	When fast iseling a GEP, accumulate the offset rather than emitting a series of ADDs. MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs being: (1) If we can't materialize the large constant then we'll cause fast-isel to bail. (2) Too large of an offset can't be directly encoded in the ADD resulting in a MOV+ADD. Generally not a bad thing because otherwise we would have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix for that. (3) Conversely, too low of a threshold we'll miss opportunities to coalesce ADDs. rdar://10412592 llvm-svn: 144886	2011-11-17 07:15:58 +00:00
Eli Friedman	d02d82d355	Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom names for fwrite and fputs. Fixes <rdar://problem/9815881>. llvm-svn: 144876	2011-11-17 01:27:36 +00:00
Daniel Dunbar	4affd889a2	build/make/test: Get rid of unused BUGPOINT_TOPTS variable. llvm-svn: 144864	2011-11-16 23:56:03 +00:00
Eli Friedman	51adc2ea5a	Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393. llvm-svn: 144863	2011-11-16 23:50:22 +00:00
Jim Grosbach	1b837af2bd	Remove obsolete test. The PLD encoding is checked via the .s file now. llvm-svn: 144853	2011-11-16 22:50:38 +00:00
Jim Grosbach	fe5f0cfa29	Generalize the fixup info for ARM mode. We don't (yet) have the granularity in the fixups to be specific about which bitranges are affected. That's a future cleanup, but we're not there yet. llvm-svn: 144852	2011-11-16 22:48:37 +00:00
Jim Grosbach	8fae277866	Update test for r144842. llvm-svn: 144851	2011-11-16 22:46:27 +00:00
Evan Cheng	5bae2333cb	Another missing X86ISD::MOVLPD pattern. rdar://10450317 llvm-svn: 144839	2011-11-16 22:24:44 +00:00
Evan Cheng	5242b6aaa1	Disable expensive two-address optimizations at -O0. rdar://10453055 llvm-svn: 144806	2011-11-16 18:44:48 +00:00
Nick Lewycky	29efc8f15d	Fix typo in test. llvm-svn: 144774	2011-11-16 03:56:38 +00:00
Nick Lewycky	ff690249a9	Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when looking at the size of the pointee. Fixes PR11390! llvm-svn: 144773	2011-11-16 03:49:48 +00:00
Eli Friedman	c2a46f1a90	Fix testcase. llvm-svn: 144769	2011-11-16 03:03:52 +00:00
Eli Friedman	1f3d774ba4	CONCAT_VECTORS can have more than two operands. PR11389. llvm-svn: 144768	2011-11-16 02:52:39 +00:00
Kostya Serebryany	4105068ea9	AddressSanitizer, first commit (compiler module only) llvm-svn: 144758	2011-11-16 01:35:23 +00:00
Andrew Trick	fe618116fc	Fix SCEV overly optimistic back edge taken count for multi-exit loops. Fixes PR11375: Different results for 'clang++ huh.cpp'... llvm-svn: 144746	2011-11-16 00:52:40 +00:00
Jim Grosbach	044acb8bee	ARM assembly parsing for register range syntax for VLD/VST register lists. For example, vld1.f64 {d2-d5}, [r2,:128]! should be equivalent to: vld1.f64 {d2,d3,d4,d5}, [r2,:128]! It's not documented syntax in the ARM ARM, but it is consistent with what's accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to support. rdar://10451128 llvm-svn: 144727	2011-11-15 23:19:15 +00:00
Nadav Rotem	63be4a26a9	AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code. llvm-svn: 144720	2011-11-15 22:50:37 +00:00
NAKAMURA Takumi	f99c0f0fcd	test/CodeGen/X86/dec-eflags-lower.ll: Relax expression for win32 x64. llvm-svn: 144714	2011-11-15 22:30:37 +00:00
Jim Grosbach	b8ebc386df	ARM assembly parsing two operand forms for shift instructions. llvm-svn: 144713	2011-11-15 22:27:54 +00:00
Pete Cooper	8441c08e0b	Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used by later instructions. Only done for DEC64m right now. Fixes <rdar://problem/6172640> llvm-svn: 144705	2011-11-15 21:57:53 +00:00
Jim Grosbach	4d0ad5a4e0	ARM alternate size suffixes for VTRN instructions. rdar://10435076 llvm-svn: 144694	2011-11-15 20:49:46 +00:00
Jim Grosbach	8987b277cb	ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns. Yet more of rdar://10435076. llvm-svn: 144691	2011-11-15 20:29:42 +00:00
Jim Grosbach	f0690cd90c	ARM assembly parsing for two-operand form of 'mul' instruction. rdar://10449856. llvm-svn: 144689	2011-11-15 20:14:51 +00:00
Jim Grosbach	8b1d4c989c	ARM assembly parsing for two-operand form of 'mul' instruction. Ongoing rdar://10435114. llvm-svn: 144688	2011-11-15 20:02:06 +00:00
Jim Grosbach	e4933acaa7	Testcase for r144684. llvm-svn: 144685	2011-11-15 19:56:17 +00:00
Owen Anderson	35f049f1fb	Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32. llvm-svn: 144683	2011-11-15 19:55:00 +00:00
Jim Grosbach	df951fa128	Thumb2 assembly parsing for mul.w in IT block fix. When the 3rd operand is not a low-register, and the first two operands are the same low register, the parser was incorrectly trying to use the 16-bit instruction encoding. rdar://10449281 llvm-svn: 144679	2011-11-15 19:29:45 +00:00
Rafael Espindola	95f4e0c409	We currently use a callback to handle an IL pass deleting a BB that still has a reference to it. Unfortunately, that doesn't work for codegen passes since we don't get notified of MBB's being deleted (the original BB stays). Use that fact to our advantage and after printing a function, check if any of the IL BBs corresponds to a symbol that was not printed. This fixes pr11202. llvm-svn: 144674	2011-11-15 19:08:46 +00:00
Jakob Stoklund Olesen	606d4e999b	Revert r144611 and r144613. These tests are actually correct, clang was miscompiling ExeDepsFix::processUses. Evan fixed the miscompilation in r144628. llvm-svn: 144630	2011-11-15 07:13:03 +00:00
Chandler Carruth	fdcba17bec	Rather than trying to use the loop block sequence or the function block sequence when recovering from unanalyzable control flow constructs, always use the function sequence. I'm not sure why I ever went down the path of trying to use the loop sequence, it is fundamentally not the correct sequence to use. We're trying to preserve the incoming layout in the cases of unreasonable control flow, and that is only encoded at the function level. We already have a filter to select exactly the sub-set of blocks within the function that we're trying to form into a chain. The resulting code layout is also significantly better because of this. In several places we were ending up with completely unreasonable control flow constructs due to the ordering chosen by the loop structure for its internal storage. This change removes a completely wasteful vector of basic blocks, saving memory allocation in the common case even though it costs us CPU in the fairly rare case of unnatural loops. Finally, it fixes the latest crasher reduced out of GCC's single source. Thanks again to Benjamin Kramer for the reduction, my bugpoint skills failed at it. llvm-svn: 144627	2011-11-15 06:26:43 +00:00
Craig Topper	a584521daa	Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370. llvm-svn: 144622	2011-11-15 05:55:35 +00:00
Jakob Stoklund Olesen	a0ceee932a	Really fix test. llvm-svn: 144613	2011-11-15 03:17:01 +00:00
Jakob Stoklund Olesen	a4ad7080ff	Allow for dependency-breaking instructions before cvt*. This should unbreak clang-x86_64-darwin10-RA, but I can't actually reproduce the failure. llvm-svn: 144611	2011-11-15 02:29:48 +00:00
Evan Cheng	47d8f8af84	Add vmov.f32 to materialize f32 immediate splats which cannot be handled by integer variants. rdar://10437054 llvm-svn: 144608	2011-11-15 02:12:34 +00:00
Jakob Stoklund Olesen	2709f65821	Break false dependencies before partial register updates. Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix about instructions with partial register updates causing false unwanted dependencies. The ExecutionDepsFix pass will break the false dependencies if the updated register was written in the previous N instructions. The small loop added to sse-domains.ll runs twice as fast with dependency-breaking instructions inserted. llvm-svn: 144602	2011-11-15 01:15:30 +00:00
Jim Grosbach	6846540505	ARM parsing datatype suffix variants for non-writeback VST1 instructions. rdar://10435076 llvm-svn: 144593	2011-11-14 23:43:46 +00:00
Jim Grosbach	a1a28df278	ARM parsing datatype suffix variants for non-writeback VLD1 instructions. rdar://10435076 llvm-svn: 144592	2011-11-14 23:32:59 +00:00
Jim Grosbach	00283a5c8e	ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions. rdar://10435076 llvm-svn: 144587	2011-11-14 23:11:19 +00:00
Jim Grosbach	4a2f107b04	ARM VLDR/VSTR instructions don't need a size suffix. Canonicalize on the non-suffixed form, but continue to accept assembly that has any correctly sized type suffix. llvm-svn: 144583	2011-11-14 23:03:21 +00:00
Nick Lewycky	a0b2f7ca1d	Refactor capture tracking (which already had a couple flags for whether returns and stores capture) to permit the caller to see each capture point and decide whether to continue looking. Use this inside memdep to do an analysis that basicaa won't do. This lets us solve another devirtualization case, fixing PR8908! llvm-svn: 144580	2011-11-14 22:49:42 +00:00
Chad Rosier	b107c825eb	Add newline to end of file. Thanks, Eli. llvm-svn: 144579	2011-11-14 22:48:33 +00:00
Chad Rosier	48b92815e0	Add support for inlining small memcpys. rdar://10412592 llvm-svn: 144578	2011-11-14 22:46:17 +00:00
Chad Rosier	8aa8f14940	Fix a performance regression from r144565. Positive offsets were being lowered into registers, rather than encoded directly in the load/store. llvm-svn: 144576	2011-11-14 22:34:48 +00:00
Evan Cheng	2034ff3b0b	Add a missing pattern for X86ISD::MOVLPD. rdar://10436044 llvm-svn: 144566	2011-11-14 20:35:52 +00:00
Chad Rosier	65395ac4d0	Add support for Thumb load/stores with negative offsets. rdar://10412592 llvm-svn: 144565	2011-11-14 20:22:27 +00:00
Evan Cheng	f19d257488	Teach two-address pass to re-schedule two-address instructions (or the kill instructions of the two-address operands) in order to avoid inserting copies. This fixes the few regressions introduced when the two-address hack was disabled (without regressing the improvements). rdar://10422688 llvm-svn: 144559	2011-11-14 19:48:55 +00:00
Pete Cooper	c9d6834f38	Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered. Constant idx case is still done in tablegen but other cases are then expanded. Fixes <rdar://problem/10435460> llvm-svn: 144557	2011-11-14 19:38:42 +00:00
Jakob Stoklund Olesen	6035535c96	Fix early-clobber handling in shrinkToUses. I broke this in r144515, it affected most ARM testers. <rdar://problem/10441389> llvm-svn: 144547	2011-11-14 18:45:38 +00:00
Jakob Stoklund Olesen	83d6dda738	Delete stale comment. llvm-svn: 144542	2011-11-14 18:03:05 +00:00
Chandler Carruth	462bb16130	Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on the sum of the edge weights not overflowing uint32, and crashed when they did. This is generally safe as BranchProbabilityInfo tries to provide this guarantee. However, the CFG can get modified during codegen in a way that grows the sum of the edge weights. This doesn't seem unreasonable (imagine just adding more blocks all with the default weight of 16), but it is hard to come up with a case that actually triggers 32-bit overflow. Fortunately, the single-source GCC build is good at this. The solution isn't very pretty, but it's no worse than the previous code. We're already summing all of the edge weights on each query, we can sum them, check for an overflow, compute a scale, and sum them again. I've included a greatly reduced test case out of the GCC source that triggers it. It's a pretty lame test, as it clearly is just barely triggering the overflow. I'd like to have something that is much more definitive, but I don't understand the fundamental pattern that triggers an explosion in the edge weight sums. The buggy code is duplicated within this file. I'll collapse them into a single implementation in a subsequent commit. llvm-svn: 144526	2011-11-14 08:50:16 +00:00
Chad Rosier	0e5094ca87	Add support for ARM halfword load/stores and signed byte loads with negative offsets. rdar://10412592 llvm-svn: 144518	2011-11-14 04:09:28 +00:00
Chandler Carruth	b7f21af176	Teach machine block placement to cope with unnatural loops. These don't get loop info structures associated with them, and so we need some way to make forward progress selecting and placing basic blocks. The technique used here is pretty brutal -- it just scans the list of blocks looking for the first unplaced candidate. It keeps placing blocks like this until the CFG becomes tractable. The cost is somewhat unfortunate, it requires allocating a vector of all basic block pointers eagerly. I have some ideas about how to simplify and optimize this, but I'm trying to get the logic correct first. Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly there are other bugs that GCC is tickling that I'm reducing and working on now. llvm-svn: 144516	2011-11-14 00:00:35 +00:00
Chandler Carruth	e67c92282f	Rewrite #3 of machine block placement. This is based somewhat on the second algorithm, but only loosely. It is more heavily based on the last discussion I had with Andy. It continues to walk from the inner-most loop outward, but there is a key difference. With this algorithm we ensure that as we visit each loop, the entire loop is merged into a single chain. At the end, the entire function is treated as a "loop", and merged into a single chain. This chain forms the desired sequence of blocks within the function. Switching to a single algorithm removes my biggest problem with the previous approaches -- they had different behavior depending on which system triggered the layout. Now there is exactly one algorithm and one basis for the decision making. The other key difference is how the chain is formed. This is based heavily on the idea Andy mentioned of keeping a worklist of blocks that are viable layout successors based on the CFG. Having this set allows us to consistently select the best layout successor for each block. It is expensive though. The code here remains very rough. There is a lot that needs to be done to clean up the code, and to make the runtime cost of this pass much lower. Very much WIP, but this was a giant chunk of code and I'd rather folks see it sooner than later. Everything remains behind a flag of course. I've added a couple of tests to exercise the issues that this iteration was motivated by: loop structure preservation. I've also fixed one test that was exhibiting the broken behavior of the previous version. llvm-svn: 144495	2011-11-13 11:20:44 +00:00
Chad Rosier	58ab241006	The order in which the predicate is added differs between Thumb and ARM mode. Fix predicate when in ARM mode and restore SelectIntrinsicCall. llvm-svn: 144494	2011-11-13 09:44:21 +00:00
Chad Rosier	8cfccc356e	Temporarily disable SelectIntrinsicCall when in ARM mode. This is causing failures. llvm-svn: 144492	2011-11-13 05:14:43 +00:00
Chad Rosier	acd199b5a4	Add support for emitting both signed- and zero-extend loads. Fix SimplifyAddress to handle either a 12-bit unsigned offset or the ARM +/-imm8 offsets (addressing mode 3). This enables a load followed by an integer extend to be folded into a single load. For example: ldrb r1, [r0]; uxtb r2, r1; mov r3, r2 => ldrb r1, [r0]; mov r3, r1 llvm-svn: 144488	2011-11-13 02:23:59 +00:00
Jakob Stoklund Olesen	3eaaa93104	Remove the -color-ss-with-regs option. It was off by default. The new register allocators don't have the problems that made it necessary to reallocate registers during stack slot coloring. llvm-svn: 144481	2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen	d0ddec5771	Delete the 'standard' spiller which used the old spilling framework. The current register allocators all use the inline spiller. llvm-svn: 144477	2011-11-12 23:29:02 +00:00
Jakob Stoklund Olesen	bb527a67c0	Remove histogram tests. Counting the number of occurrences of each opcode is not a useful test. llvm-svn: 144474	2011-11-12 22:39:40 +00:00
Jakob Stoklund Olesen	9195bec6e7	RAGreedy is better about hinting now. Or maybe we are just getting lucky. llvm-svn: 144473	2011-11-12 22:39:37 +00:00
Jakob Stoklund Olesen	4aa9c6888f	Linear scan is going away. llvm-svn: 144472	2011-11-12 22:39:34 +00:00
Jakob Stoklund Olesen	e1b1bbb882	XFAIL test that depends on linear scan to remove dead code. Filed PR11364 to track the problem. Should the register allocator eliminate dead code? llvm-svn: 144471	2011-11-12 22:39:30 +00:00
Jakob Stoklund Olesen	43b7a3871b	Remove obsolete test. This test was committed with a bugfix to RemoveCopyByCommutingDef, but that optimization is no longer triggered by this test. llvm-svn: 144470	2011-11-12 22:39:27 +00:00
Jakob Stoklund Olesen	6a290484cb	Remove obsolete test. This test is for a very specific LocalRewriter bug. LocalRewriter is going away. llvm-svn: 144469	2011-11-12 22:39:24 +00:00
Jakob Stoklund Olesen	005eabf28a	Remove obsolete test. I don't think this test does what it was supposed to do, and LocalRewriter is going away anyway. llvm-svn: 144463	2011-11-12 20:37:57 +00:00
Jakob Stoklund Olesen	c11d7a9b4d	Eliminate more linear scan tests. llvm-svn: 144462	2011-11-12 20:35:26 +00:00
Jakob Stoklund Olesen	0fe59856fd	Switch a couple -O0 tests to RABasic. llvm-svn: 144461	2011-11-12 20:11:04 +00:00
Jakob Stoklund Olesen	94ce588b20	Switch a few tests off linearscan. llvm-svn: 144460	2011-11-12 19:53:52 +00:00
Jakob Stoklund Olesen	f8fed2a3a7	Delete old test of a VirtRegRewriter feature. This test doesn't expose the issue with RAGreedy. I filed PR11363 to track the missing InlineSpiller feature. llvm-svn: 144459	2011-11-12 19:53:48 +00:00
Jakob Stoklund Olesen	49118cf9a5	Remove old test that doesn't make sense. The test is checking that the output doesn't contain any 'mov ' strings. It does contain movl, though. llvm-svn: 144458	2011-11-12 19:53:45 +00:00
Craig Topper	0458cdf64a	Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code. llvm-svn: 144457	2011-11-12 09:58:49 +00:00
Nick Lewycky	772024a00d	Don't try to loop on iterators that are potentially invalidated inside the loop. Fixes PR11361! llvm-svn: 144454	2011-11-12 03:09:12 +00:00
Eli Friedman	a83fbaff5f	Make sure scalarrepl picks the correct alloca when it rewrites a bitcast. Fixes PR11353. llvm-svn: 144442	2011-11-12 02:07:50 +00:00
Rafael Espindola	5f6b14719f	The dwarf standard says that the only differences between an out-of-line instance and a concrete inlined instance are the use of DW_TAG_subprogram instead of DW_TAG_inlined_subroutine and who owns the tree. We were also omitting DW_AT_inline from the abstract roots. To fix this, make sure we mark abstract instance roots with DW_AT_inline even when we have only out-of-line instances referring to them with DW_AT_abstract_origin. FileCheck is not a very good tool for tests like this, maybe we should add a -verify mode to llvm-dwarfdump. llvm-svn: 144441	2011-11-12 01:57:54 +00:00
Eli Friedman	8563e57e38	Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029. llvm-svn: 144438	2011-11-12 00:35:34 +00:00
Jim Grosbach	13b7ab7527	ARM optional size suffix for VLDR/VSTR syntax. llvm-svn: 144427	2011-11-11 23:34:43 +00:00
Chad Rosier	a2a0fbeded	Add support in fast-isel for selecting memset/memcpy/memmove intrinsics. llvm-svn: 144426	2011-11-11 23:31:03 +00:00
Chad Rosier	88ab27405f	Loosen test by using REs. Approved by Devang. llvm-svn: 144425	2011-11-11 23:25:38 +00:00
Andrew Trick	6ff75a5d8d	Preserve MachineMemOperands in ARMLoadStoreOptimizer. Fixes PR8113. llvm-svn: 144409	2011-11-11 22:18:09 +00:00
Jim Grosbach	1d581ecb00	ARM allow Q registers in vldm/vstm register lists. rdar://9672822 llvm-svn: 144407	2011-11-11 21:27:40 +00:00
Devang Patel	09f4f9890c	Move X86 specific test in X86 directory. llvm-svn: 144395	2011-11-11 18:13:19 +00:00
Devang Patel	a804f1a297	Move X86 specific test in X86 directory. llvm-svn: 144394	2011-11-11 18:10:38 +00:00
Dan Bailey	ad6c209a79	allow non-device function calls in PTX when natively handling device-side printf llvm-svn: 144388	2011-11-11 14:45:12 +00:00
Craig Topper	50df7c3842	Add lowering for AVX2 shift instructions. llvm-svn: 144380	2011-11-11 07:39:23 +00:00
Chad Rosier	feb72bfc08	Add support for using immediates with select instructions. rdar://10412592 llvm-svn: 144376	2011-11-11 06:20:39 +00:00
Eli Friedman	285b451941	Make sure to expand SIGN_EXTEND_INREG for NEON vectors. PR11319, round 3. llvm-svn: 144361	2011-11-11 03:16:38 +00:00
Eli Friedman	127d98ab35	Get rid of an optimization in SCCP which appears to have many issues. Specifically, it doesn't handle many cases involving undef correctly, and it is missing other checks which lead to it trying to re-mark a value marked as a constant with a different value. It also appears to trigger very rarely. Fixes PR11357. llvm-svn: 144352	2011-11-11 01:16:15 +00:00
Chad Rosier	ac92994773	Add support for using MVN to materialize negative constants. rdar://10412592 llvm-svn: 144348	2011-11-11 00:36:21 +00:00
Jim Grosbach	bd7df609b7	Thumb2 parsing for push/pop w/ hi registers in the reglist. rdar://10130228. llvm-svn: 144331	2011-11-10 23:17:11 +00:00
Rafael Espindola	e7024f983a	Check in getOrCreateSubprogramDIE if a declaration exists and if so output it first. This is a more general fix to pr11300. llvm-svn: 144324	2011-11-10 22:34:29 +00:00
Jim Grosbach	c3651cb620	Thumb MUL assembly parsing for 3-operand form. Get the source register that isn't tied to the destination register correct, even when the assembly source operand order is backwards. rdar://10428630 llvm-svn: 144322	2011-11-10 22:10:12 +00:00
Chad Rosier	7b7dced006	When in ARM mode, LDRH/STRH require special handling of negative offsets. For correctness, disable this for now. rdar://10418009 llvm-svn: 144316	2011-11-10 21:09:49 +00:00
Jim Grosbach	f5943e4c5e	ARM assembly parsing for LSR/LSL/ROR(immediate). More of rdar://9704684 llvm-svn: 144301	2011-11-10 19:18:01 +00:00
Jim Grosbach	b66dfc2999	ARM assembly parsing for ASR(immediate). Start of rdar://9704684 llvm-svn: 144293	2011-11-10 16:44:55 +00:00
NAKAMURA Takumi	ea14fd81c6	test/CodeGen/X86/lsr-loop-exit-cond.ll: Try to appease linux and freebsd bots to specify explicit -mtriple=x86_64-darwin. I guess it expects -relocation-model=pic. llvm-svn: 144290	2011-11-10 14:18:59 +00:00
Evan Cheng	4760ff0763	Use a bigger hammer to fix PR11314 by disabling the "forcing two-address instruction lower optimization" in the pre-RA scheduler. The optimization, rather the hack, was done before MI use-list was available. Now we should be able to implement it in a better way, perhaps in the two-address pass until a MI scheduler is available. Now that the scheduler has to backtrack to handle call sequences, adding artificial scheduling constraints is just not safe. Furthermore, the hack is not taking all the other scheduling decisions into consideration so it's just as likely to pessimize code. So I view disabling this optimization as goodness regardless of PR11314. llvm-svn: 144267	2011-11-10 07:43:16 +00:00
Chad Rosier	69cdae5eb9	For immediate encodings of icmp, zero or sign extend first. Then determine if the value is negative and flip the sign accordingly. rdar://10422026 llvm-svn: 144258	2011-11-10 01:30:39 +00:00
Jakob Stoklund Olesen	bc48cd34b6	Strip old implicit operands after foldMemoryOperand. The TII.foldMemoryOperand hook preserves implicit operands from the original instruction. This is not what we want when those implicit operands refer to the register being spilled. Implicit operands referring to other registers are preserved. This fixes PR11347. llvm-svn: 144247	2011-11-10 00:17:03 +00:00
Jim Grosbach	8591bd2bab	Thumb2 assembly parsing STMDB w/ optional .w suffix. rdar://10422955 llvm-svn: 144242	2011-11-09 23:44:23 +00:00
Eli Friedman	c93f8aa514	Make sure we correctly unroll conversions between v2f64 and v2i32 on ARM. llvm-svn: 144241	2011-11-09 23:36:02 +00:00
Pete Cooper	38700a1201	DeadStoreElimination can now trim the size of a store if the end of the store is dead. Currently checks alignment and killing stores on a power of 2 boundary as this is likely to trim the size of the earlier store without breaking large vector stores into scalar ones. Fixes <rdar://problem/10140300> llvm-svn: 144239	2011-11-09 23:07:35 +00:00
Eli Friedman	b01f15653c	Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319. llvm-svn: 144216	2011-11-09 22:25:12 +00:00
Nadav Rotem	ddc6bfa543	AVX2: Add patterns for variable shift operations llvm-svn: 144212	2011-11-09 21:22:13 +00:00
Chad Rosier	228dc76221	Use REs to remove dependencies on the register allocation order. llvm-svn: 144209	2011-11-09 20:06:13 +00:00
Duncan Sands	2934a0eaeb	Speculatively revert commit 144124 (djg) in the hope that the 32 bit dragonegg self-host buildbot will recover (it is complaining about object files differing between different build stages). Original commit message: Add a hack to the scheduler to disable pseudo-two-address dependencies in basic blocks containing calls. This works around a problem in which these artificial dependencies can get tied up in calling sequence scheduling in a way that makes the graph unschedulable with the current approach of using artificial physical register dependencies for calling sequences. This fixes PR11314. llvm-svn: 144188	2011-11-09 14:20:48 +00:00
Nadav Rotem	e66a72a2c4	Add AVX2 support for vselect of v32i8 llvm-svn: 144187	2011-11-09 13:21:28 +00:00
Craig Topper	432dd8d623	Enable execution dependency fix pass for YMM registers when AVX2 is enabled. Add AVX2 logical operations to list of replaceable instructions. llvm-svn: 144179	2011-11-09 09:37:21 +00:00
Craig Topper	7ff77dc2b1	Add instruction selection for AVX2 integer comparisons. llvm-svn: 144176	2011-11-09 08:06:13 +00:00
Craig Topper	d82abb7156	Add AVX2 instruction lowering for add, sub, and mul. llvm-svn: 144174	2011-11-09 07:28:55 +00:00
Nick Lewycky	c08fa4916a	Don't forget to check FlagNW when determining whether an AddRecExpr will wrap or not. Patch by Brendon Cahoon! llvm-svn: 144173	2011-11-09 07:11:37 +00:00
Chad Rosier	e32fed6868	Add support for encoding immediates in icmp and fcmp. Hopefully, this will remove a fair number of unnecessary materialized constants. rdar://10412592 llvm-svn: 144163	2011-11-09 03:22:02 +00:00
Jakob Stoklund Olesen	1239fed1e2	Collapse DomainValues across loop back-edges. During the initial RPO traversal of the basic blocks, remember the ones that are incomplete because of back-edges from predecessors that haven't been visited yet. After the initial RPO, revisit all those loop headers so the incoming DomainValues on the back-edges can be properly collapsed. This will properly fix execution domains on software pipelined code, like the included test case. llvm-svn: 144151	2011-11-09 01:06:56 +00:00
Dan Gohman	b6cf7c4e94	Add a hack to the scheduler to disable pseudo-two-address dependencies in basic blocks containing calls. This works around a problem in which these artificial dependencies can get tied up in calling sequence scheduling in a way that makes the graph unschedulable with the current approach of using artificial physical register dependencies for calling sequences. This fixes PR11314. llvm-svn: 144124	2011-11-08 21:29:06 +00:00
Evan Cheng	08e61752f2	Add workaround for Cortex-M3 errata 602117 by replacing ldrd x, y, [x] with ldm or ldr pairs. llvm-svn: 144123	2011-11-08 21:21:09 +00:00
Eli Friedman	6bda990650	Fix code to match comment. Fixes PR11340, a regression from r143209. llvm-svn: 144121	2011-11-08 21:08:02 +00:00
Pete Cooper	a85aa24d64	LICM pass now understands invariant load metadata. Nothing generates this yet so it will currently never get used in real tests llvm-svn: 144107	2011-11-08 19:30:00 +00:00
Pete Cooper	a1c151814a	Adding test for machine-licm operating on invariant load instructions llvm-svn: 144104	2011-11-08 19:06:53 +00:00
Lang Hames	ee7de1cff0	Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported. Add support for trimming constants to GetDemandedBits. This fixes some funky constant generation that occurs when stores are expanded for targets that don't support unaligned stores natively. llvm-svn: 144102	2011-11-08 18:56:23 +00:00
NAKAMURA Takumi	e7c7964113	test/CodeGen/X86/vec_shuffle-39.ll: Add explicit -mtriple=x86_64-linux. Passing packed value is not compatible on Win32 x64. llvm-svn: 144068	2011-11-08 03:46:39 +00:00
NAKAMURA Takumi	7094a0d830	test/CodeGen/X86/vec_shuffle-38.ll: Relax expression for Win32 x64. llvm-svn: 144067	2011-11-08 03:46:32 +00:00
NAKAMURA Takumi	8bc13fe0b2	test/CodeGen/X86/vec_shuffle.ll: Add explicit -mtriple=i686-linux. We may see some suboptimal frame (%ebp) emission on certain hosts. Possible [PR11031] llvm-svn: 144066	2011-11-08 03:46:25 +00:00
Eli Friedman	d5ba38a3d2	Make sure to mark vector extload's as expand on ARM. Fixes PR11319. llvm-svn: 144057	2011-11-08 01:43:53 +00:00
Eli Friedman	741d364aa9	Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318. Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up. llvm-svn: 144055	2011-11-08 01:25:24 +00:00
Evan Cheng	4a63100fe3	Add x86 isel logic and patterns to match movlps from clang generated IR for _mm_loadl_pi(). rdar://10134392, rdar://10050222 llvm-svn: 144052	2011-11-08 00:31:58 +00:00
Bill Wendling	a855903bda	Convert to the new EH model. llvm-svn: 144050	2011-11-08 00:23:01 +00:00
Bill Wendling	788df1dca1	Convert to the new EH model. llvm-svn: 144049	2011-11-08 00:17:28 +00:00
Bill Wendling	16499170c2	Convert tests to the new EH model. llvm-svn: 144048	2011-11-08 00:09:27 +00:00
Chad Rosier	4b12a5b7fc	Enable support for returning i1, i8, and i16. Nothing special to do as it's the callee's responsibility to sign or zero-extend the return value. The additional test case just checks to make sure the calls are selected (i.e., -fast-isel-abort doesn't assert). llvm-svn: 144047	2011-11-08 00:03:32 +00:00
Pete Cooper	2f5c35ae89	Added missing newline llvm-svn: 144046	2011-11-08 00:03:24 +00:00
Eli Friedman	8d138bf571	Revert r144034 while I try to track down a crash. llvm-svn: 144044	2011-11-07 23:53:20 +00:00
Jakob Stoklund Olesen	1900a5f521	Fix test for Windows as well. llvm-svn: 144038	2011-11-07 23:10:43 +00:00
Jakob Stoklund Olesen	9380d5daff	Kill and collapse outstanding DomainValues. DomainValues that are only used by "don't care" instructions are now collapsed to the first possible execution domain after all basic blocks have been processed. This typically means the PS domain on x86. For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are completely collapsed to the PS domain instead of containing a mix of execution domains created by isel. llvm-svn: 144037	2011-11-07 23:08:21 +00:00
Pete Cooper	1d5d364e06	InstCombine now optimizes vector udiv by power of 2 to shifts Fixes r8429 llvm-svn: 144036	2011-11-07 23:04:49 +00:00
Eli Friedman	c1bb1b2b09	Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318. llvm-svn: 144034	2011-11-07 22:51:10 +00:00
Benjamin Kramer	89ebc7ab4b	Simplify some uses of utohexstr. As a side effect hex is printed lowercase instead of uppercase now. llvm-svn: 144013	2011-11-07 21:00:59 +00:00
Jakob Stoklund Olesen	d33a581d93	Fix test for Linux. llvm-svn: 144003	2011-11-07 20:47:23 +00:00
Jakob Stoklund Olesen	b53be3a67d	Expand V_SET0 to xorps by default. The xorps instruction is smaller than pxor, so prefer that encoding. The ExecutionDepsFix pass will switch the encoding to pxor and xorpd when appropriate. llvm-svn: 143996	2011-11-07 19:15:58 +00:00
Craig Topper	7eab73f510	Add AVX2 variable shift instructions and intrinsics. llvm-svn: 143915	2011-11-07 08:26:24 +00:00
Craig Topper	b1ef950217	Add AVX2 VPMOVMASK instructions and intrinsics. llvm-svn: 143904	2011-11-07 03:20:35 +00:00
Craig Topper	d422190c0f	Add AVX2 VEXTRACTI128 and VINSERTI128 instructions. Fix VPERM2I128 to be qualified with HasAVX2 instead of HasAVX. Mark VINSERTF128 and VEXTRACTF128 as never having side effects. llvm-svn: 143902	2011-11-07 02:00:04 +00:00
Craig Topper	01b852b95a	More AVX2 instructions and their intrinsics. llvm-svn: 143895	2011-11-06 23:04:08 +00:00
Craig Topper	31b1d79474	Add more AVX2 instructions and intrinsics. llvm-svn: 143861	2011-11-06 06:12:20 +00:00
Chad Rosier	806ffd8918	Add support for passing i1, i8, and i16 call parameters. Also, be sure to zero-extend the constant integer encoding. Test case provides testing for both call parameters and materialization of i1, i8, and i16 types. llvm-svn: 143821	2011-11-05 20:16:15 +00:00
Benjamin Kramer	fde45fcf3c	Update lit's list of tools. llvm-svn: 143815	2011-11-05 16:20:52 +00:00
Benjamin Kramer	4c8932e3b8	Add an option to pad an uleb128 to MCObjectWriter and remove the uleb128 encoding from the DWARF asm printer. As a side effect we now print dwarf ulebs with .ascii directives. llvm-svn: 143809	2011-11-05 11:52:44 +00:00
Nick Lewycky	7ea3dd8ae5	Do simple cross-block DSE when we encounter a free statement. Fixes PR11240. llvm-svn: 143808	2011-11-05 10:48:42 +00:00
Eli Friedman	1478b657c8	Enhanced vzeroupper insertion pass that avoids inserting vzeroupper where it is unnecessary through local analysis. Patch from Bruno Cardoso Lopes, with some additional changes. I'm going to wait for any review comments and perform some additional testing before turning this on by default. llvm-svn: 143750	2011-11-04 23:46:11 +00:00
Daniel Dunbar	e57462ccc0	build/cmake: Change to require Python be available. llvm-svn: 143742	2011-11-04 23:04:05 +00:00
Rafael Espindola	a13d4ca525	Add triple to test. llvm-svn: 143735	2011-11-04 20:20:34 +00:00
Rafael Espindola	a022f4813e	Emit declarations before definitions if they are available. This causes DW_AT_specification to point back in the file in the included testcase. Fixes PR11300. llvm-svn: 143726	2011-11-04 19:00:29 +00:00
Dan Gohman	e689158987	Add tests for existing InstSimplify features. llvm-svn: 143721	2011-11-04 18:39:16 +00:00
Dan Gohman	19a8523a2f	Teach instsimplify to simplify calls to undef. llvm-svn: 143719	2011-11-04 18:32:42 +00:00
Craig Topper	6ae8fe6fbe	Add intrinsics for X86 vcvtps2ph and vcvtph2ps instructions llvm-svn: 143682	2011-11-04 06:59:21 +00:00
Chad Rosier	21cd759234	Add fast-isel support for returning i1, i8, and i16. llvm-svn: 143669	2011-11-04 00:50:21 +00:00
Daniel Dunbar	0193e03f99	Speculatively revert "DeadStoreElimination can now trim the size of a store if the end of it is dead.", which appears to break bootstrapping LLVM. llvm-svn: 143668	2011-11-04 00:48:26 +00:00
Dan Gohman	a5f382da8b	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Pete Cooper	ad3d5b2eee	Reverted r143600 - selector reference change llvm-svn: 143646	2011-11-03 20:47:50 +00:00
Dan Bailey	986e6b02b8	fixed global array handling for ptx to use the correct bit widths llvm-svn: 143640	2011-11-03 19:24:46 +00:00
Pete Cooper	4902705b5f	DeadStoreElimination can now trim the size of a store if the end of it is dead. Only currently done if the later store is writing to a power of 2 address or has the same alignment as the earlier store, as then it's likely to not break up large stores into smaller ones. Fixes <rdar://problem/10140300> llvm-svn: 143630	2011-11-03 18:01:56 +00:00
Craig Topper	124b2fd08c	Add new X86 AVX2 VBROADCAST instructions. llvm-svn: 143612	2011-11-03 07:35:53 +00:00
Chad Rosier	74c4e2c2d9	Add support for sign-extending non-legal types in SelectSIToFP(). llvm-svn: 143603	2011-11-03 02:04:59 +00:00
Pete Cooper	c8a657a2b2	Treat objc selector reference globals as invariant so that MachineLICM can hoist them out of loops. Fixes <rdar://problem/6027699> llvm-svn: 143600	2011-11-03 00:56:36 +00:00
Lang Hames	ceec8ec67e	Try to lower memset/memcpy/memmove to vector instructions on ARM where the alignment permits. llvm-svn: 143582	2011-11-02 22:52:45 +00:00
Nick Lewycky	3c8d2be421	I added the first test to run llvm-dwarfdump. llvm-svn: 143571	2011-11-02 21:02:27 +00:00
Nick Lewycky	691d7f80c2	Don't emit a directory entry for the value in DW_AT_comp_dir, that is always implied by directory index zero. llvm-svn: 143570	2011-11-02 20:55:33 +00:00
Chad Rosier	8a613c5ec5	Add support for comparing integer non-legal types. llvm-svn: 143559	2011-11-02 18:08:25 +00:00
Owen Anderson	ac9fd95057	Fix the issue that r143552 was trying to address the _right_ way. One-register lists are legal on LDM/STM instructions, but we should not print the PUSH/POP aliases when they appear. This fixes round tripping on this instruction. llvm-svn: 143557	2011-11-02 18:03:14 +00:00
Daniel Dunbar	4169d2ddc9	tests: Clean up tests/CMakeLists.txt to drop some variable configuration we no longer need substitutions for. llvm-svn: 143555	2011-11-02 17:54:51 +00:00
Andrew Trick	3c1e831108	Rewrite LinearFunctionTestReplace to handle pointer-type IVs. We've been hitting asserts in this code due to the many supported combinations of modes (iv-rewrite/no-iv-rewrite) and IV types. This second rewrite of the code attempts to deal with these cases systematically. llvm-svn: 143546	2011-11-02 17:19:57 +00:00
Craig Topper	a2a55bd0b4	More AVX2 instructions and intrinsics. llvm-svn: 143536	2011-11-02 06:54:17 +00:00
Craig Topper	c5482eb697	Add a bunch more X86 AVX2 instructions and their corresponding intrinsics. llvm-svn: 143529	2011-11-02 04:42:13 +00:00
Andrew Trick	c9baf3a7a1	Broaden an assert to handle enable-iv-rewrite=true following r143183. Narrowest possible fix for PR11279. llvm-svn: 143522	2011-11-02 00:02:45 +00:00
Kevin Enderby	b5dc88b394	Fixed a bug in the code to create a dwarf file and directory table entries when it is separating the directory part from the basename of the FileName. Noticed that this: .file 1 "dir/foo" when assembled got the two parts switched. Using the Mac OS X dwarfdump tool it can be seen easily: % dwarfdump -a a.out include_directories[ 1] = 'foo' Dir Mod Time File Len File Name ---- ---------- ---------- --------------------------- file_names[ 1] 1 0x00000000 0x00000000 dir ... Which should be: ... include_directories[ 1] = 'dir' Dir Mod Time File Len File Name ---- ---------- ---------- --------------------------- file_names[ 1] 1 0x00000000 0x00000000 foo llvm-svn: 143521	2011-11-01 23:39:05 +00:00
Owen Anderson	0d69f6aa51	Fix disassembly of some VST1 instructions. llvm-svn: 143507	2011-11-01 22:18:13 +00:00
Eli Friedman	c60a0ad611	Teach the x86 backend a couple tricks for dealing with v16i8 sra by a constant splat value. Fixes PR11289. llvm-svn: 143498	2011-11-01 21:18:39 +00:00
Richard Osborne	5a9e575e81	Don't fold negative offsets into cp / dp accesses to avoid relocation errors. This can happen if the address + addend is less than the start of the cp / dp. llvm-svn: 143459	2011-11-01 11:31:53 +00:00
Richard Osborne	8175a9601d	Combine various XCore tests for floating point intrinsic support into a single test. llvm-svn: 143458	2011-11-01 10:51:48 +00:00
Richard Osborne	280d51dd14	Move various XCore tests to FileCheck llvm-svn: 143457	2011-11-01 10:41:28 +00:00
Craig Topper	361c873b52	Fix operand type for x86 pmadd_ub_sw intrinsic. llvm-svn: 143455	2011-11-01 07:25:22 +00:00
Eli Friedman	676558ae92	Make sure we use the right insertion point when instcombine replaces a PHI with another instruction. (Specifically, don't insert an arbitrary instruction before a PHI.) Fixes PR11275. llvm-svn: 143437	2011-11-01 04:49:29 +00:00
Eli Friedman	172ff3d328	Move x86-specific tests into X86 folder. llvm-svn: 143424	2011-11-01 03:21:48 +00:00
Eli Friedman	b32279f1fc	Move another test requiring x86 into X86 directory. llvm-svn: 143421	2011-11-01 03:12:47 +00:00
Eli Friedman	b97ce79891	Move test requiring x86 backend into X86 directory. llvm-svn: 143420	2011-11-01 03:11:41 +00:00
Matt Beaumont-Gay	6f16a87ae3	Change the actual tests to match the input directory rename (duh) llvm-svn: 143404	2011-10-31 23:56:52 +00:00
Matt Beaumont-Gay	a5dfba561b	Rename "TestObjectFiles" to "Inputs" (like the pattern for Clang tests) llvm-svn: 143400	2011-10-31 23:46:38 +00:00
Rafael Espindola	dd7a1f625b	Move test to the X86 directory, note the PR number and only run MC once. llvm-svn: 143352	2011-10-31 17:23:09 +00:00
Owen Anderson	d7700cb13f	More not-crashing NEON disassembly updates for the vld refactoring. llvm-svn: 143351	2011-10-31 17:17:32 +00:00
Craig Topper	dbf10927d7	Fix operand type for int_x86_ssse3_phadd_sw_128 intrinsic llvm-svn: 143336	2011-10-31 07:16:37 +00:00
Craig Topper	c0f93132bd	Test case for X86 FS/GS Base intrinsics llvm-svn: 143332	2011-10-31 02:15:47 +00:00

15167 Commits