Commit Graph

15592 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
7f7f8a2e77 Consider unknown alignment caused by OptimizeThumb2Instructions().
This function runs after all constant islands have been placed, and may
shrink some instructions to their 2-byte forms.  This can actually cause
some constant pool entries to move out of range because of growing
alignment padding.

Treat instructions that may be shrunk the same as inline asm - they
erode the known alignment bits.

Also reinstate an old assertion in verify(). It is correct now that
basic block offsets include alignments.

Add a single large test case that will hopefully exercise many parts of
the constant island pass.

<rdar://problem/10670199>

llvm-svn: 147885
2012-01-10 22:32:14 +00:00
Jim Grosbach
59537e1ce3 ARM updating VST2 pseudo-lowering fixed vs. register update.
rdar://10663487

llvm-svn: 147876
2012-01-10 21:11:12 +00:00
Kevin Enderby
75f4b470f9 Various crash reporting tools have a problem with the dwarf generated for
assembly source when it generates the TAG_subprogram dwarf debug info for
the labels that have nothing between them as in this bit of assembly source:

% cat ZeroLength.s 
_func1:
_func2:
 nop

One solution would be to not emit the subsequent labels with the same address
and use the next label with a different address or the end of the section for
the AT_high_pc value of the TAG_subprogram.

Turns out in llvm-mc it is not possible in all cases to determine of two
symbols have the same value at the point we put out the TAG_subprogram dwarf
debug info.

So we will have llvm-mc instead of putting out TAG_subprogram's put out
DW_TAG_label's.  And the DW_TAG_label does not have a AT_high_pc value which
avoids the problem.

This commit is only the functional change to make the diffs clear as to what is
really being changed.  The next commit will be to clean up the names of such
things like MCGenDwarfSubprogramEntry to something like MCGenDwarfLabelEntry.

rdar://10666925

llvm-svn: 147860
2012-01-10 17:52:29 +00:00
Nadav Rotem
969b8a6903 Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.
llvm-svn: 147851
2012-01-10 14:28:46 +00:00
Craig Topper
01eba20904 Fix a crash in AVX2 when trying to broadcast a double into a 128-bit vector. There is no vbroadcastsd xmm, but we do need to support 64-bit integers broadcasted into xmm. Also factor the AVX check into the isVectorBroadcast function. This makes more sense since the AVX2 check was already inside.
llvm-svn: 147844
2012-01-10 08:23:59 +00:00
Evan Cheng
7855c5d08f Allow machine-cse to look across MBB boundary when cse'ing instructions that
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.

rdar://10660865

llvm-svn: 147827
2012-01-10 02:02:58 +00:00
Andrew Trick
db66631fb3 Enable LSR IV Chains with sufficient heuristics.
These heuristics are sufficient for enabling IV chains by
default. Performance analysis has been done for i386, x86_64, and
thumbv7. The optimization is rarely important, but can significantly
speed up certain cases by eliminating spill code within the
loop. Unrolled loops are prime candidates for IV chains. In many
cases, the final code could still be improved with more target
specific optimization following LSR. The goal of this feature is for
LSR to make the best choice of induction variables.

Instruction selection may not completely take advantage of this
feature yet. As a result, there could be cases of slight code size
increase.

Code size can be worse on x86 because it doesn't support postincrement
addressing. In fact, when chains are formed, you may see redundant
address plus stride addition in the addressing mode. GenerateIVChains
tries to compensate for the common cases.

On ARM, code size increase can be mitigated by using postincrement
addressing, but downstream codegen currently misses some opportunities.

llvm-svn: 147826
2012-01-10 01:45:08 +00:00
Andrew Trick
09d73ea35b Adding IV chain generation to LSR.
After collecting chains, check if any should be materialized. If so,
hide the chained IV users from the LSR solver. LSR will only solve for
the head of the chain. GenerateIVChains will then materialize the
chained IV users by computing the IV relative to its previous value in
the chain.

In theory, chained IV users could be exposed to LSR's solver. This
would be considerably complicated to implement and I'm not aware of a
case where we need it. In practice it's more important to
intelligently prune the search space of nontrivial loops before
running the solver, otherwise the solver is often forced to prune the
most optimal solutions. Hiding the chained users does this well, so
that LSR is more likely to find the best IV for the chain as a whole.

llvm-svn: 147801
2012-01-09 21:18:52 +00:00
Benjamin Kramer
f9cefbfed0 InstCombine: Teach foldLogOpOfMaskedICmpsHelper that sign bit tests are bit tests.
This subsumes several other transforms while enabling us to catch more cases.

llvm-svn: 147777
2012-01-09 17:23:27 +00:00
Chandler Carruth
9a418a2713 Cleanup and FileCheck-ize a test.
llvm-svn: 147772
2012-01-09 09:44:26 +00:00
Craig Topper
ee2dabebe3 Clean up patterns for MOVNT*. Not sure why there were floating point types on MOVNTPS and MOVNTDQ. And v4i64 was completely missing.
llvm-svn: 147767
2012-01-09 06:52:46 +00:00
Rafael Espindola
7618aa1c64 Don't print an unused label before .cfi_endproc.
llvm-svn: 147763
2012-01-09 00:17:29 +00:00
Craig Topper
1fdcf7071d Don't disable MMX support when AVX is enabled. Fix predicates for MMX instructions that were added along with SSE instructions to check for AVX in addition to SSE level.
llvm-svn: 147762
2012-01-09 00:11:29 +00:00
Benjamin Kramer
e1321329f4 Tweak my last commit to be less conservative about uses.
We still save an instruction when just the "and" part is replaced.
Also change the code to match comments more closely.

llvm-svn: 147753
2012-01-08 21:12:51 +00:00
Benjamin Kramer
e94856c8c4 InstCombine: If we have a bit test and a sign test anded/ored together, merge the sign bit into the bit test.
This is common in bit field code, e.g. checking if the first or the last bit of a bit field is set.

llvm-svn: 147749
2012-01-08 18:32:24 +00:00
Victor Umansky
5d24f5f51a Reverted commit #147601 upon Evan's request.
llvm-svn: 147748
2012-01-08 17:20:33 +00:00
Rafael Espindola
19a13321f8 Don't print a label before .cfi_startproc when we don't need to. This makes
the produce assembly when using CFI just a bit more readable.

llvm-svn: 147743
2012-01-07 22:42:19 +00:00
Jakob Stoklund Olesen
71f92061aa Use getRegForValue() to materialize the address of ARM globals.
This enables basic local CSE, giving us 20% smaller code for
consumer-typeset in -O0 builds.

<rdar://problem/10658692>

llvm-svn: 147720
2012-01-07 04:07:22 +00:00
Andrew Trick
d9eb9c8780 LSR: Don't optimize loops if an outer loop has no preheader.
LoopSimplify may not run on some outer loops, e.g. because of indirect
branches. SCEVExpander simply cannot handle outer loops with no preheaders.
Fixes rdar://10655343 SCEVExpander segfault.

llvm-svn: 147718
2012-01-07 03:16:50 +00:00
Rafael Espindola
2d545fa143 Split Finish into Finish and FinishImpl to have a common place to do end of
file error checking. Use that to error on an unfinished cfi_startproc.

The error is not nice, but is already better than a segmentation fault.

llvm-svn: 147717
2012-01-07 03:13:18 +00:00
Evan Cheng
8af07ba749 Added a late machine instruction copy propagation pass. This catches
opportunities that only present themselves after late optimizations
such as tail duplication .e.g.
## BB#1:
        movl    %eax, %ecx
        movl    %ecx, %eax
        ret

The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)

This required some changes in codegen passes. Post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They were before branch folding before because it did
not always update block livein's. That's fixed now. The pass change makes
independently since we want to properly schedule instructions after
branch folding / tail duplication.

rdar://10428165
rdar://10640363

llvm-svn: 147716
2012-01-07 03:02:36 +00:00
Jakob Stoklund Olesen
536b4e24d8 Use movw+movt in ARMFastISel::ARMMaterializeGV.
This eliminates a lot of constant pool entries for -O0 builds of code
with many global variable accesses.

This speeds up -O0 codegen of consumer-typeset by 2x because the
constant island pass no longer has to look at thousands of constant pool
entries.

<rdar://problem/10629774>

llvm-svn: 147712
2012-01-07 01:47:05 +00:00
Andrew Trick
8a5a1e603e Extended replaceCongruentPhis to handle mixed phi types.
llvm-svn: 147707
2012-01-07 01:12:09 +00:00
Eric Christopher
09a41e8939 Make the 'x' constraint work for AVX registers as well.
Fixes rdar://10614894

llvm-svn: 147704
2012-01-07 01:02:09 +00:00
Andrew Trick
0ec80535d9 comment typo
llvm-svn: 147701
2012-01-07 00:29:20 +00:00
Jakob Stoklund Olesen
39a3fa2c29 Enable aligned NEON spilling by default.
Experiments show this to be a small speedup for modern ARM cores.

llvm-svn: 147689
2012-01-06 22:19:37 +00:00
Dan Gohman
a4fde8485d Fix SpeculativelyExecuteBB to either speculate all or none of the phis
present in the bottom of the CFG triangle, as the transformation isn't
ever valuable if the branch can't be eliminated.

Also, unify some heuristics between SimplifyCFG's multiple
if-converters, for consistency.

This fixes rdar://10627242.

llvm-svn: 147630
2012-01-05 23:58:56 +00:00
Eli Friedman
5af9c3cbbb PR11705, part 2: globalopt shouldn't put inttoptr/ptrtoint operations into global initializers if there's an implied extension or truncation.
llvm-svn: 147625
2012-01-05 23:03:32 +00:00
Rafael Espindola
aaed8dbd04 Link symbols with different visibilities according to the rules in the
System V Application Binary Interface. This lets us use
-fvisibility-inlines-hidden with LTO.
Fixes PR11697.

llvm-svn: 147624
2012-01-05 23:02:01 +00:00
Dan Gohman
4fc691d9ef Revert r56315. When the instruction to speculate is a load, this
code can incorrectly move the load across a store. This never
happens in practice today, but only because the current
heuristics accidentally preclude it.

llvm-svn: 147623
2012-01-05 22:54:35 +00:00
Chandler Carruth
90f124f420 Prevent a DAGCombine from firing where there are two uses of
a combined-away node and the result of the combine isn't substantially
smaller than the input, it's just canonicalized. This is the first part
of a significant (7%) performance gain for Snappy's hot decompression
loop.

llvm-svn: 147604
2012-01-05 11:05:55 +00:00
Chandler Carruth
75ff869d6a Cleanup and FileCheck-ize a test.
llvm-svn: 147603
2012-01-05 11:05:47 +00:00
Victor Umansky
87d5ada510 Peephole optimization of ptest-conditioned branch in X86 arch. Performs instruction combining of sequences generated by ptestz/ptestc intrinsics to ptest+jcc pair for SSE and AVX.
Testing: passed 'make check' including LIT tests for all sequences being handled (both SSE and AVX)

Reviewers: Evan Cheng, David Blaikie, Bruno Lopes, Elena Demikhovsky, Chad Rosier, Anton Korobeynikov
llvm-svn: 147601
2012-01-05 08:46:19 +00:00
Benjamin Kramer
e5589bccdd FileCheck hygiene.
llvm-svn: 147580
2012-01-05 00:43:34 +00:00
Jakob Stoklund Olesen
23eeb1f7b5 Reapply r146997, "Heed spill slot alignment on ARM."
Now that canRealignStack() understands frozen reserved registers, it is
safe to use it for aligned spill instructions.

It will only return true if the registers reserved at the beginning of
register allocation allow for dynamic stack realignment.

<rdar://problem/10625436>

llvm-svn: 147579
2012-01-05 00:26:57 +00:00
Nick Lewycky
d6260dc3cb Teach instcombine all sorts of great stuff about shifts that have exact, nuw or
nsw bits on them.

llvm-svn: 147528
2012-01-04 09:28:29 +00:00
NAKAMURA Takumi
6ebbc05c9d test/CodeGen/X86/jump_sign.ll: Add -mcpu=pentiumpro for non-x86 hosts. It uses "cmov".
llvm-svn: 147521
2012-01-04 03:52:23 +00:00
Akira Hatanaka
fdcba196ca Have getRegForInlineAsmConstraint return the correct register class when target
is Mips64.

llvm-svn: 147516
2012-01-04 02:45:01 +00:00
Evan Cheng
4e60b65bc6 Fix more places which should be checking for iOS, not darwin.
llvm-svn: 147513
2012-01-04 01:55:04 +00:00
Evan Cheng
caba5d2fc2 For x86, canonicalize max
(x > y) ? x : y
=>
(x >= y) ? x : y

So for something like
(x - y) > 0 : (x - y) ? 0
It will be
(x - y) >= 0 : (x - y) ? 0

This makes is possible to test sign-bit and eliminate a comparison against
zero. e.g.
subl   %esi, %edi
testl  %edi, %edi
movl   $0, %eax
cmovgl %edi, %eax
=>
xorl   %eax, %eax
subl   %esi, $edi
cmovsl %eax, %edi

rdar://10633221

llvm-svn: 147512
2012-01-04 01:41:39 +00:00
Kostya Serebryany
c69557e758 [asan] one more test for asan instrumentation: (*a)++ should be instrumented only once.
llvm-svn: 147509
2012-01-04 01:02:14 +00:00
Jakob Stoklund Olesen
993997b659 Revert r146997, "Heed spill slot alignment on ARM."
This patch caused a miscompilation of oggenc because a frame pointer was
suddenly needed halfway through register allocation.

<rdar://problem/10625436>

llvm-svn: 147487
2012-01-03 22:34:35 +00:00
Nadav Rotem
79f1692fe0 Revert 147426 because it caused pr11696.
llvm-svn: 147485
2012-01-03 22:19:42 +00:00
Nadav Rotem
cc49c4d74d Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted.
llvm-svn: 147484
2012-01-03 22:12:28 +00:00
Chad Rosier
afcaa8f38a Enhance DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.
rdar://10594409

llvm-svn: 147481
2012-01-03 21:05:52 +00:00
Elena Demikhovsky
40f2c3077f Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147445
2012-01-03 11:59:04 +00:00
Andrew Trick
6839d66ab3 Fix SCEVExpander to handle loops with no preheader when LSR gives it a
"phony" insertion point.

Fixes rdar://10619599: "SelectionDAGBuilder shouldn't visit PHI nodes!" assert

llvm-svn: 147439
2012-01-02 21:25:10 +00:00
Nadav Rotem
6929a8868b Optimize the sequence blend(sign_extend(x)) to blend(shl(x)) since SSE blend instructions only look at the highest bit.
llvm-svn: 147426
2012-01-02 08:05:46 +00:00
Craig Topper
f7c9bf17dd Allow CRC32 instructions to be selected when AVX is enabled.
llvm-svn: 147411
2012-01-01 19:51:58 +00:00
Craig Topper
d8ae2d9f27 Fix sfence, lfence, mfence, and clflush to be able to be selected when AVX is enabled. Fix monitor and mwait to require SSE3 or AVX, previously they worked even if SSE3 was disabled. Make prefetch instructions not set the execution domain since they don't use XMM registers.
llvm-svn: 147409
2012-01-01 19:40:22 +00:00
Rafael Espindola
1cb17796db Revert 147399. It broke CodeGen/ARM/vext.ll.
llvm-svn: 147400
2012-01-01 17:36:23 +00:00
Elena Demikhovsky
9b74049783 Fixed a bug in SelectionDAG.cpp.
The failure seen on win32, when i64 type is illegal.
It happens on stage of conversion VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147399
2012-01-01 16:22:47 +00:00
Craig Topper
0311c45aed Add patterns for integer forms of SHUFPD/VSHUFPD with a memory load.
llvm-svn: 147393
2011-12-31 23:24:49 +00:00
Craig Topper
c01ce759d7 Fix typo in a SHUFPD and VSHUFPD pattern that prevented SHUFPD/VSHUFPD with a load from being selected.
llvm-svn: 147392
2011-12-31 23:15:11 +00:00
Nick Lewycky
c7e12f7dbf Make use of the exact bit when optimizing '(X >>exact 3) << 1' to eliminate the
'and' that would zero out the trailing bits, and to produce an exact shift
ourselves.

llvm-svn: 147391
2011-12-31 21:30:22 +00:00
Craig Topper
b4db8689ee Add disassembler support for VPERMIL2PD and VPERMIL2PS.
llvm-svn: 147368
2011-12-30 06:23:39 +00:00
Craig Topper
089be4fefa Add FMA4 instructions to disassembler.
llvm-svn: 147367
2011-12-30 05:20:36 +00:00
Craig Topper
33091db89a Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.
llvm-svn: 147361
2011-12-30 02:18:36 +00:00
Craig Topper
e066262284 Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.
llvm-svn: 147360
2011-12-30 01:49:53 +00:00
Hal Finkel
4a09216dfb Cleanup stack/frame register define/kill states. This fixes two bugs:
1. The ST*UX instructions that store and update the stack pointer did not set define/kill on R1. This became a problem when I activated post-RA scheduling (and had incorrectly adjusted the Frames-large test).

2. eliminateFrameIndex did not kill its scavenged temporary register, and this could cause the scavenger to exhaust all available registers (and its emergency spill slot) when there were a lot of CR values to spill. The 2010-02-12-saveCR test has been adjusted to check for this.

llvm-svn: 147359
2011-12-30 00:34:00 +00:00
Rafael Espindola
db7319d272 Implement cfi_restore. Patch by Brian Anderson!
llvm-svn: 147356
2011-12-29 21:43:03 +00:00
Craig Topper
97e84c23a1 Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms o FMA3 instructions.
llvm-svn: 147353
2011-12-29 20:43:40 +00:00
Rafael Espindola
27298c6f33 Implement .cfi_escape. Patch by Brian Anderson!
llvm-svn: 147352
2011-12-29 20:24:47 +00:00
Craig Topper
bcfd070378 Expose FMA3 instructions to the disassembler.
llvm-svn: 147351
2011-12-29 20:03:14 +00:00
Nick Lewycky
7425820374 Change CaptureTracking to pass a Use* instead of a Value* when a value is
captured. This allows the tracker to look at the specific use, which may be
especially interesting for function calls.

Use this to fix 'nocapture' deduction in FunctionAttrs. The existing one does
not iterate until a fixpoint and does not guarantee that it produces the same
result regardless of iteration order. The new implementation builds up a graph
of how arguments are passed from function to function, and uses a bottom-up walk
on the argument-SCCs to assign nocapture. This gets us nocapture more often, and
does so rather efficiently and independent of iteration order.

llvm-svn: 147327
2011-12-28 23:24:21 +00:00
Eli Friedman
db54b4b68f Fix type-checking for load transformation which is not legal on floating-point types. PR11674.
llvm-svn: 147323
2011-12-28 21:24:44 +00:00
Nadav Rotem
d8c4880903 PR11662.
Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.

llvm-svn: 147309
2011-12-28 13:08:20 +00:00
Elena Demikhovsky
9b4613ff14 Fixed a bug in LowerVECTOR_SHUFFLE and LowerBUILD_VECTOR.
Matching MOVLP mask for AVX (265-bit vectors) was wrong.
The failure was detected by conformance tests.

llvm-svn: 147308
2011-12-28 08:14:01 +00:00
Nick Lewycky
f4c21901a3 Turn cos(-x) into cos(x). Patch by Alexander Malyshev!
llvm-svn: 147291
2011-12-27 18:25:50 +00:00
Nick Lewycky
295e397220 Teach simplifycfg to recompute branch weights when merging some branches, and
to discard weights when appropriate. Still more to do (and a new TODO), but
it's a start!

llvm-svn: 147286
2011-12-27 04:31:52 +00:00
Eli Friedman
064187912e Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
llvm-svn: 147283
2011-12-26 22:49:32 +00:00
Nick Lewycky
56e04db381 Update the branch weight metadata when reversing the order of a branch.
llvm-svn: 147280
2011-12-26 20:54:14 +00:00
Chandler Carruth
a012c64ced Add an explicit test that we now fold cttz.i32(..., true) >> 5 -> 0.
This is a result of Benjamin's work on ValueTracking.

llvm-svn: 147259
2011-12-24 22:34:15 +00:00
Benjamin Kramer
94f07f8c2c InstCombine: Add a combine that turns (2^n)-1 ^ x back into (2^n)-1 - x iff x is smaller than 2^n and it fuses with a following add.
This was intended to undo the sub canonicalization in cases where it's not profitable, but it also
finds some cases on it's own.

llvm-svn: 147256
2011-12-24 17:31:53 +00:00
Benjamin Kramer
b5e584392b ComputeMaskedBits: Make knownzero computation more aggressive for ctlz with undef zero.
unsigned foo(unsigned x) { return 31 - __builtin_clz(x); }
now compiles into a single "bsrl" instruction on x86.

llvm-svn: 147255
2011-12-24 17:31:46 +00:00
Benjamin Kramer
0b4d2e3d2a InstCombine: Canonicalize (2^n)-1 - x into (2^n)-1 ^ x iff x is known to be smaller than 2^n.
This has the obvious advantage of being commutable and is always a win on x86 because
const - x wastes a register there. On less weird architectures this may lead to
a regression because other arithmetic doesn't fuse with it anymore. I'll address that
problem in a followup.

llvm-svn: 147254
2011-12-24 17:31:38 +00:00
Chandler Carruth
7a5c52fadf Use standard promotion for i8 CTTZ nodes and i8 CTLZ nodes when the
LZCNT instructions are available. Force promotion to i32 to get
a smaller encoding since the fix-ups necessary are just as complex for
either promoted type

We can't do standard promotion for CTLZ when lowering through BSR
because it results in poor code surrounding the 'xor' at the end of this
instruction. Essentially, if we promote the entire CTLZ node to i32, we
end up doing the xor on a 32-bit CTLZ implementation, and then
subtracting appropriately to get back to an i8 value. Instead, our
custom logic just uses the knowledge of the incoming size to compute
a perfect xor. I'd love to know of a way to fix this, but so far I'm
drawing a blank. I suspect the legalizer could be more clever and/or it
could collude with the DAG combiner, but how... ;]

llvm-svn: 147251
2011-12-24 12:12:34 +00:00
Chandler Carruth
82b7a7478b Add systematic testing for cttz as well, and fix the bug I spotted by
inspection earlier.

llvm-svn: 147250
2011-12-24 11:46:10 +00:00
Chandler Carruth
b52ba33d0a Add i8 and i64 testing for ctlz on x86. Also simplify the i16 test.
llvm-svn: 147249
2011-12-24 11:26:59 +00:00
Chandler Carruth
800a803717 Tidy up this rather crufty test. Put the declarations at the top to make
my C-brain happy. Remove the unnecessary bits of pedantic IR fluff like
nounwind. Remove stray uses comments. Name things semantically rather
than tN so that adding a new test in the middle doesn't cause pain, and
so that new tests can be grouped semantically.

This exposes how little systematic testing is going on here. I noticed
this by finding several bugs via inspection and wondering why this test
wasn't catching any of them. =[

llvm-svn: 147248
2011-12-24 11:26:57 +00:00
Chandler Carruth
48f5be6ce0 Expand more when we have a nice 'tzcnt' instruction, to avoid generating
'bsf' instructions here.

This one is actually debatable to my eyes. It's not clear that any chip
implementing 'tzcnt' would have a slow 'bsf' for any reason, and unless
EFLAGS or a zero input matters, 'tzcnt' is just a longer encoding.
Still, this restores the old behavior with 'tzcnt' enabled for now.

llvm-svn: 147246
2011-12-24 11:11:38 +00:00
Chandler Carruth
1846086903 Tidy up some of these tests.
llvm-svn: 147245
2011-12-24 11:11:36 +00:00
Chandler Carruth
9ef50ef1f7 Switch the lowering of CTLZ_ZERO_UNDEF from a .td pattern back to the
X86ISelLowering C++ code. Because this is lowered via an xor wrapped
around a bsr, we want the dagcombine which runs after isel lowering to
have a chance to clean things up. In particular, it is very common to
see code which looks like:

  (sizeof(x)*8 - 1) ^ __builtin_clz(x)

Which is trying to compute the most significant bit of 'x'. That's
actually the value computed directly by the 'bsr' instruction, but if we
match it too late, we'll get completely redundant xor instructions.

The more naive code for the above (subtracting rather than using an xor)
still isn't handled correctly due to the dagcombine getting confused.

Also, while here fix an issue spotted by inspection: we should have been
expanding the zero-undef variants to the normal variants when there is
an 'lzcnt' instruction. Do so, and test for this. We don't want to
generate unnecessary 'bsr' instructions.

These two changes fix some regressions in encoding and decoding
benchmarks. However, there is still a *lot* to be improve on in this
type of code.

llvm-svn: 147244
2011-12-24 10:55:54 +00:00
Chandler Carruth
514920d53b Cleanup this test a bit, sorting things and grouping them more clearly.
llvm-svn: 147243
2011-12-24 10:55:42 +00:00
Akira Hatanaka
72c5800ed2 Test case for r147232.
llvm-svn: 147233
2011-12-24 03:05:43 +00:00
Nick Lewycky
0c92d31b61 Move this test from date-name to feature-name, and port it to FileCheck.
llvm-svn: 147223
2011-12-23 18:41:31 +00:00
Jakob Stoklund Olesen
c97d7d26bd Experimental support for aligned NEON spills.
ARM targets with NEON units have access to aligned vector loads and
stores that are potentially faster than unaligned operations.

Add support for spilling the callee-saved NEON registers to an aligned
stack area using 16-byte aligned NEON loads and store.

This feature is off by default, controlled by an -align-neon-spills
command line option.

llvm-svn: 147211
2011-12-23 00:36:18 +00:00
Jim Grosbach
a678ad9ecc ARM VFP assembly parsing and encoding for VCVT(float <--> fixed point).
rdar://10558523

llvm-svn: 147189
2011-12-22 22:19:05 +00:00
Rafael Espindola
eba1c0eb00 Fix incorrect relocation generation. Patch by Kristof Beyls.
Fixes PR11214.

llvm-svn: 147180
2011-12-22 21:36:43 +00:00
Chad Rosier
d16131e35c Reinstate r146578; it doesn't appear to be the cause of some recent execution-
time regressions.  In general, it is beneficial to compile-time.

Original commit message:
Fix for bug #11429: Wrong behaviour for switches. Small improvement for code
size heuristics.

llvm-svn: 147175
2011-12-22 21:06:36 +00:00
Jim Grosbach
100e3aaffa ARM assembler should accept shift-by-zero for any shifted-immediate operand.
Just treat it as-if the shift wasn't there at all. 'as' compatibility.

rdar://10604767

llvm-svn: 147153
2011-12-22 18:04:04 +00:00
Benjamin Kramer
5d07d63540 Give string constants generated by IRBuilder private linkage.
Fixes PR11640.

llvm-svn: 147144
2011-12-22 14:22:14 +00:00
Chandler Carruth
acdf352b77 Make the unreachable probability much much heavier. The previous
probability wouldn't be considered "hot" in some weird loop structures
or other compounding probability patterns. This makes it much harder to
confuse, but isn't really a principled fix. I'd actually like it if we
could model a zero probability, as it would make this much easier to
reason about. Suggestions for how to do this better are welcome.

llvm-svn: 147142
2011-12-22 09:26:37 +00:00
Chad Rosier
4ab165f664 Speculatively revert r146578 to determine if it is the cause of a number of
performance regressions (both execution-time and compile-time) on our
nightly testers.

Original commit message:
Fix for bug #11429: Wrong behaviour for switches. Small improvement for code
size heuristics.

llvm-svn: 147131
2011-12-22 02:40:57 +00:00
Akira Hatanaka
e7bcf63d98 Local dynamic TLS model for direct object output. Create the correct TLS MIPS
ELF relocations.

Patch by Jack Carter.

llvm-svn: 147118
2011-12-22 01:05:17 +00:00
Jim Grosbach
7d31680e2d ARM VFP optional data type on VMOV GPR<-->SPR.
llvm-svn: 147104
2011-12-21 23:24:15 +00:00
Jim Grosbach
2bbc41fa26 Thumb2 assembly parsing of 'mov rd, rn, rrx'.
Maps to the RRX instruction. Missed this case earlier.

rdar://10615373

llvm-svn: 147096
2011-12-21 21:04:19 +00:00
Jim Grosbach
91faf5d15f Thumb2 assembly parsing of 'mov(register shifted register)' aliases.
These map to the ASR, LSR, LSL, ROR instruction definitions.

rdar://10615373

llvm-svn: 147094
2011-12-21 20:54:00 +00:00
Jim Grosbach
f7236d1084 ARM NEON assmebly parsing for VLD2 to all lanes instructions.
llvm-svn: 147069
2011-12-21 19:40:55 +00:00
Chad Rosier
c2f31859cc Fix a couple of copy-n-paste bugs. Noticed by George Russell!
llvm-svn: 147064
2011-12-21 18:56:22 +00:00
Nick Lewycky
9adbd36737 Make some intrinsics safe to speculatively execute.
llvm-svn: 147036
2011-12-21 05:52:02 +00:00
Evan Cheng
fb22f64814 Fix a couple of copy-n-paste bugs. Noticed by George Russell.
llvm-svn: 147032
2011-12-21 03:04:10 +00:00
Jim Grosbach
6bd1044b03 ARM NEON VLD2 assembly parsing for structure to all lanes, non-writeback.
llvm-svn: 147025
2011-12-21 00:38:54 +00:00
Akira Hatanaka
4ab17eaca0 Fix bug in zero-store peephole pattern reported in pr11615.
The patch and test case were originally written by Mans Rullgard.

llvm-svn: 147024
2011-12-21 00:31:10 +00:00
Akira Hatanaka
0af792d12b Expand 64-bit CTLZ nodes if target architecture does not support it. Add test
case for DCLO and DCLZ.

llvm-svn: 147022
2011-12-21 00:20:27 +00:00
Akira Hatanaka
fb94688c7a Test case for r147017.
llvm-svn: 147018
2011-12-20 23:58:36 +00:00
Jim Grosbach
0768f2c420 Enable and fix a test.
llvm-svn: 147011
2011-12-20 23:20:00 +00:00
Akira Hatanaka
2e4f1786b1 Add function MipsDAGToDAGISel::SelectMULT and factor out code that generates
nodes needed for multiplication. Add code for selecting 64-bit MULHS and MULHU
nodes.

llvm-svn: 147008
2011-12-20 23:10:57 +00:00
Akira Hatanaka
f728a1b2c5 64-bit data directive.
llvm-svn: 147005
2011-12-20 22:52:19 +00:00
Akira Hatanaka
ad193d95ae 32-to-64-bit sext_inreg pattern.
llvm-svn: 147004
2011-12-20 22:40:40 +00:00
Akira Hatanaka
8728f4ed69 Add code in MipsDAGToDAGISel for selecting constant +0.0.
MIPS64 can generate constant +0.0 with a single DMTC1 instruction. 

llvm-svn: 146999
2011-12-20 22:25:50 +00:00
Jakob Stoklund Olesen
2b24e1eac4 Heed spill slot alignment on ARM.
Use the spill slot alignment as well as the local variable alignment to
determine when the stack needs to be realigned. This works now that the
ARM target can always realign the stack by using a base pointer.

Still respect the ARMBaseRegisterInfo::canRealignStack() function
vetoing a realigned stack.  Don't use aligned spill code in that case.

llvm-svn: 146997
2011-12-20 22:15:04 +00:00
Jim Grosbach
8978194025 ARM assembly parsing and encoding for VST2 single-element, double spaced.
llvm-svn: 146990
2011-12-20 20:46:29 +00:00
Jim Grosbach
3f48367a1b ARM enable a few more tests.
llvm-svn: 146985
2011-12-20 20:03:00 +00:00
Jim Grosbach
8156a5dcee ARM assembly parsing and encoding for VLD2 single-element, double spaced.
llvm-svn: 146983
2011-12-20 19:21:26 +00:00
Evan Cheng
46b085721a ARM target code clean up. Check for iOS, not Darwin where it makes sense.
llvm-svn: 146981
2011-12-20 18:26:50 +00:00
Elena Demikhovsky
b37883fe87 This is the second fix related to VZEXT_MOVL node.
The failure that I see in the current version is:

LLVM ERROR: Cannot select: 0x18b8f70: v4i64 = X86ISD::VZEXT_MOVL 0x18beee0 [ID=14]
  0x18beee0: v4i64 = insert_subvector 0x18b8c70, 0x18b9170, 0x18b9570 [ID=13]
    0x18b8c70: v4i64 = insert_subvector 0x18b9870, 0x18bf4e0, 0x18b9970 [ID=12]
      0x18b9870: v4i64 = undef [ID=4]
      0x18bf4e0: v2i64 = bitcast 0x18bf3e0 [ID=10]
        0x18bf3e0: v4i32 = BUILD_VECTOR 0x18b9770, 0x18b9770, 0x18b9770, 0x18b9770 [ID=8]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
          0x18b9770: i32 = TargetConstant<0> [ID=6]
      0x18b9970: i32 = Constant<0> [ID=3]
    0x18b9170: v2i64 = undef [ORD=1] [ID=1]
    0x18b9570: i32 = Constant<2> [ID=5]

llvm-svn: 146975
2011-12-20 13:34:28 +00:00
Chandler Carruth
7564e8371a Begin teaching the X86 target how to efficiently codegen patterns that
use the zero-undefined variants of CTTZ and CTLZ. These are just simple
patterns for now, there is more to be done to make real world code using
these constructs be optimized and codegen'ed properly on X86.

The existing tests are spiffed up to check that we no longer generate
unnecessary cmov instructions, and that we generate the very important
'xor' to transform bsr which counts the index of the most significant
one bit to the number of leading (most significant) zero bits. Also they
now check that when the variant with defined zero result is used, the
cmov is still produced.

llvm-svn: 146974
2011-12-20 11:19:37 +00:00
Andrew Trick
a1c4f73f87 Unit test for r146950: LSR postinc expansion, PR11571.
llvm-svn: 146951
2011-12-20 01:43:20 +00:00
Bob Wilson
8439df9506 Mark ARM eh_sjlj_dispatchsetup as clobbering all registers. Radar 10567930.
We used to rely on the *eh_sjlj_setjmp instructions to mark that a function
with setjmp/longjmp exception handling clobbers all the registers.  But with
the recent reorganization of ARM EH, those eh_sjlj_setjmp instructions are
expanded away earlier, before PEI can see them to determine what registers to
save and restore.  Mark the dispatchsetup instruction in the same way, since
that instruction cannot be expanded early.  This also more accurately reflects
when the registers are clobbered.

llvm-svn: 146949
2011-12-20 01:29:27 +00:00
Jim Grosbach
3f5493c136 ARM assembly shifts by zero should be plain 'mov' instructions.
"mov r1, r2, lsl #0" should assemble as "mov r1, r2" even though it's
not strictly legal UAL syntax. It's a common extension and the friendly
thing to do.

rdar://10604663

llvm-svn: 146937
2011-12-20 00:59:38 +00:00
Chris Lattner
c1d9c0a2a3 Now that PR11464 is fixed, reapply the patch to fix PR11464,
merging types by name when we can.  We still don't guarantee type name linkage
but we do it when obviously the right thing to do.  This makes LTO type names 
easier to read, for example.

llvm-svn: 146932
2011-12-20 00:12:26 +00:00
Chris Lattner
998998b3e7 fix PR11464 by preventing the linker from mapping two different struct types from the source module onto the same opaque destination type. An opaque type can only be resolved to one thing or another after all.
llvm-svn: 146929
2011-12-20 00:03:52 +00:00
Evan Cheng
9362ee62bc Move tests to FileCheck.
llvm-svn: 146923
2011-12-19 23:26:44 +00:00
Jim Grosbach
343f270350 ARM assembly parsing and encoding support for LDRD(label).
rdar://9932658

llvm-svn: 146921
2011-12-19 23:06:24 +00:00
Akira Hatanaka
7ef923c1f0 Add a test case for r146900.
llvm-svn: 146901
2011-12-19 20:24:28 +00:00
Akira Hatanaka
e54da3bfa2 Add patterns for matching immediates whose lower 16-bit is cleared. These
patterns emit a single LUi instruction instead of a pair of LUi and ORi.

llvm-svn: 146900
2011-12-19 20:21:18 +00:00
Jim Grosbach
797a88284c ARM NEON two-operand aliases for VPADD.
rdar://10602276

llvm-svn: 146895
2011-12-19 19:51:03 +00:00
Akira Hatanaka
804863071f Remove definitions of double word shift plus 32 instructions. Assembler or
direct-object emitter should emit the appropriate shift instruction depending
on the shift amount.

llvm-svn: 146893
2011-12-19 19:44:09 +00:00
Akira Hatanaka
b7ebcb2ded Remove the restriction on the first operand of the add node in SelectAddr.
This change reduces the number of instructions generated.

For example, 
(load (add (sub $n0, $n1), (MipsLo got(s))))

results in the following sequence of instructions:
1. sub $n2, $n0, $n1
2. lw got(s)($n2)

Previously, three instructions were needed.
1. sub $n2, $n0, $n1
2. addiu $n3, $n2, got(s)
3. lw 0($n3)

llvm-svn: 146888
2011-12-19 19:28:37 +00:00
Jim Grosbach
520db82971 ARM NEON implied destination aliases for VMAX/VMIN.
llvm-svn: 146885
2011-12-19 18:57:38 +00:00
Jim Grosbach
f4ca84a7ab ARM NEON relax parse time diagnostics for alignment specifiers.
There's more variation that we need to handle. Error checking will need
to be on operand predicates.

llvm-svn: 146884
2011-12-19 18:31:43 +00:00
Joerg Sonnenberger
8cf8d64d19 Allow inlining of functions with returns_twice calls, if they have the
attribute themselve.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Chad Rosier
b870a13cd8 Revert 146728 as it's causing failures on some of the external bots as well as
internal nightly testers.  Original commit message:

By popular demand, link up types by name if they are isomorphic and one is an
autorenamed version of the other.   This makes the IR easier to read, because
we don't end up with random renamed versions of the types after LTO'ing a large
app.

llvm-svn: 146838
2011-12-17 22:19:53 +00:00
Kevin Enderby
42fffe915a Revert r146822 at Pete Cooper's request as it broke clang self hosting.
Hope I did this correctly :)

llvm-svn: 146834
2011-12-17 19:48:52 +00:00
Pete Cooper
0ec73f6e98 SimplifyCFG now predicts some conditional branches to true or false depending on previous branch on same comparison operands.
For example, 

if (a == b) {
    if (a > b) // this is false
    
Fixes some of the issues on <rdar://problem/10554090>

llvm-svn: 146822
2011-12-17 06:32:38 +00:00
Manuel Klimek
09f4d148b8 Deleting the json-bench-test until I understand why it is flaky.
llvm-svn: 146821
2011-12-17 06:29:32 +00:00
Evan Cheng
23574ec02a Fix a CPSR liveness tracking bug introduced when I converted IT block to bundle.
llvm-svn: 146805
2011-12-17 01:25:34 +00:00
Rafael Espindola
549d0683b1 Add back the MC bits of 126425. Original patch by Nathan Jeffords. I added the
asm parsing and testcase.

llvm-svn: 146801
2011-12-17 01:14:52 +00:00
Lang Hames
e32ef23ba8 Make sure that the lower bits on the VSELECT condition are properly set.
llvm-svn: 146800
2011-12-17 01:08:46 +00:00
Dan Gohman
9c8c9a8f62 The powers that be have decided that LLVM IR should now support 16-bit
"half precision" floating-point with a first-class type.

This patch adds basic IR support (but not codegen support).

llvm-svn: 146786
2011-12-17 00:04:22 +00:00
Eric Christopher
38b0b94ef2 When recursing for the original size of a type, stop if we are at a
pointer or a reference type - we actually just want the size of the
pointer then for that.

Fixes rdar://10335756

llvm-svn: 146785
2011-12-16 23:42:45 +00:00
Jakob Stoklund Olesen
445cdbb987 Fix off-by-one error in bucket sort.
The bad sorting caused a misaligned basic block when building 176.vpr in
ARM mode.

<rdar://problem/10594653>

llvm-svn: 146767
2011-12-16 23:00:05 +00:00
Benjamin Kramer
04d6a4c456 Hexagon: Fix a nasty order-of-initialization bug.
Reenable the tests.

llvm-svn: 146750
2011-12-16 19:08:59 +00:00
Manuel Klimek
2f7cf4e64b Adds a JSON parser and a benchmark (json-bench) to catch performance regressions.
llvm-svn: 146735
2011-12-16 13:09:10 +00:00
Chris Lattner
26f06c927f By popular demand, link up types by name if they are isomorphic and one is an
autorenamed version of the other.   This makes the IR easier to read, because
we don't end up with random renamed versions of the types after LTO'ing a large app.

llvm-svn: 146728
2011-12-16 08:36:07 +00:00
Craig Topper
88e2bfef0a Don't try to match 'unpackl/h v, v' for 32xi8 and 16xi16 when only AVX1 is supported. Fix 'unpackh v, v' for 256-bit types to understand 128-bit lanes.
llvm-svn: 146726
2011-12-16 08:06:31 +00:00
Kostya Serebryany
c78b00cab4 [asan] add a test for instrumenting globals
llvm-svn: 146718
2011-12-16 01:28:19 +00:00
Eli Friedman
f626b19bda Make sure we correctly note the existence of an i8 immediate for vblendvps and friends, so we compute fixups correctly. PR11586.
llvm-svn: 146709
2011-12-15 23:46:18 +00:00
Jim Grosbach
30f4b285a6 ARM NEON VCLE is an alias for VCGE w/ the source operands reversed.
llvm-svn: 146699
2011-12-15 22:56:33 +00:00
Jim Grosbach
b79d2a8f50 ARM NEON VTBL/VTBX assembly parsing and encoding.
llvm-svn: 146691
2011-12-15 22:27:11 +00:00
Chad Rosier
62ebee9859 Add missing zmovl AVX patterns which were causing crashes.
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!

llvm-svn: 146689
2011-12-15 22:11:31 +00:00
Chad Rosier
e74b3b1469 Fix assert in LowerBUILD_VECTOR for v16i16 type on AVX.
Patch by Elena Demikhovsky <elena.demikhovsky@intel.com>!

llvm-svn: 146684
2011-12-15 21:34:44 +00:00
Lang Hames
d5cee672a7 Set specific target cpu for testcase.
llvm-svn: 146678
2011-12-15 20:22:34 +00:00
Lang Hames
0e361e816d Added test case for r146671.
llvm-svn: 146675
2011-12-15 19:56:07 +00:00
Hal Finkel
e8220d9927 Add a test case to make sure that the nop really does follow the bl on ppc64 elf
llvm-svn: 146666
2011-12-15 17:59:23 +00:00
Eli Friedman
09abc453ac Fix test.
llvm-svn: 146642
2011-12-15 04:52:47 +00:00
Eli Friedman
f6ae3a7caf Make constant folding for GEPs a bit more aggressive.
llvm-svn: 146639
2011-12-15 04:33:48 +00:00
Eli Friedman
71c0914b64 Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570.
llvm-svn: 146630
2011-12-15 02:07:20 +00:00
Chad Rosier
b93733686c Add support for lowering fneg when AVX is enabled.
rdar://10566486

llvm-svn: 146625
2011-12-15 01:02:25 +00:00
Pete Cooper
550b96ab46 Added InstCombine for "select cond, ~cond, x" type patterns
These can be reduced to "~cond & x" or "~cond | x"

llvm-svn: 146624
2011-12-15 00:56:45 +00:00
Eli Friedman
5dd57bb40a Make loop preheader insertion in LoopSimplify handle the case where the loop header is a landing pad correctly (by splitting the landingpad out of the loop header). Make some adjustments to the rest of LoopSimplify to make it clear that the rest of LoopSimplify isn't making bad assumptions about the presence of landing pads. PR11575.
llvm-svn: 146621
2011-12-15 00:50:34 +00:00
Dan Gohman
1add31cc93 Move Instruction::isSafeToSpeculativelyExecute out of VMCore and
into Analysis as a standalone function, since there's no need for
it to be in VMCore. Also, update it to use isKnownNonZero and
other goodies available in Analysis, making it more precise,
enabling more aggressive optimization.

llvm-svn: 146610
2011-12-14 23:49:11 +00:00
Jim Grosbach
75db252aee ARM NEON VLD2/VST2 lane indexed assembly parsing and encoding.
llvm-svn: 146605
2011-12-14 23:25:46 +00:00
Devang Patel
0db1ed1a48 Do not sink instruction, if it is not profitable.
On ARM, peephole optimization for ABS creates a trivial cfg triangle which tempts machine sink to sink instructions in code which is really straight line code. Sometimes this sinking may alter register allocator input such that use and def of a reg is divided by a branch in between, which may result in extra spills. Now mahine sink avoids sinking if final sink destination is post dominator.

Radar 10266272.

llvm-svn: 146604
2011-12-14 23:20:38 +00:00
Kevin Enderby
bc6d6388c2 Improve the implementation of .incbin directive by replacing a loop by using
getStreamer().EmitBytes.  Suggestion by Benjamin Kramer!

llvm-svn: 146599
2011-12-14 22:34:45 +00:00
Andrew Trick
9c88f32f94 LSR: Fold redundant bitcasts on-the-fly.
llvm-svn: 146597
2011-12-14 22:07:19 +00:00
Jim Grosbach
83520a5b70 ARM NEON fix alignment encoding for VST2 w/ writeback.
Add tests for w/ writeback instruction parsing and encoding.

llvm-svn: 146594
2011-12-14 21:49:24 +00:00
Kevin Enderby
b0b669eb26 Add the .incbin directive which takes the binary data from a file and emits
it to the streamer.  rdar://10383898

llvm-svn: 146592
2011-12-14 21:47:48 +00:00
Jim Grosbach
44829ab9d2 ARM NEON VST2 assembly parsing and encoding.
Work in progress. Parsing for non-writeback, single spaced register lists
works now. The rest have the representations better factored, but still
need more to be able to parse properly.

llvm-svn: 146579
2011-12-14 19:35:22 +00:00
Stepan Dyatkovskiy
14cb78c6fb Fix for bug #11429: Wrong behaviour for switches. Small improvement for code size heuristics.
llvm-svn: 146578
2011-12-14 19:19:17 +00:00
Dan Gohman
e9572aa680 It turns out that clang does use pointer-to-function types to
point to ARC-managed pointers sometimes. This fixes rdar://10551239.

llvm-svn: 146577
2011-12-14 19:10:53 +00:00
Akira Hatanaka
3fca32d88e Add support for local dynamic TLS model in LowerGlobalTLSAddress. Direct object
emission is not supported yet, but a patch that adds the support should follow
soon.

llvm-svn: 146572
2011-12-14 18:26:41 +00:00
Jim Grosbach
54372eef76 ARM/Thumb2 'cmp rn, #imm' alias to cmn.
When 'cmp rn #imm' doesn't match due to the immediate not being representable,
but 'cmn rn, #-imm' does match, use the latter in place of the former, as
it's equivalent.

rdar://10552389

llvm-svn: 146567
2011-12-14 17:30:24 +00:00
Jim Grosbach
628ae663ef ARM assembler support for the target-specific .req directive.
rdar://10549683

llvm-svn: 146543
2011-12-14 02:16:11 +00:00
Evan Cheng
68ba5536f3 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e. add BUNDLE instruction and computing register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of MachineBasic and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.

llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Chad Rosier
33f40b2c25 Add newline at EOF.
llvm-svn: 146538
2011-12-14 01:34:39 +00:00
Jim Grosbach
089ad574d8 Thumb2 assembler aliases for "mov(shifted register)"
rdar://10549767

llvm-svn: 146520
2011-12-13 22:45:11 +00:00
Jim Grosbach
bd33fc6efd ARM LDM/STM system instruction variants.
rdar://10550269

llvm-svn: 146519
2011-12-13 21:48:29 +00:00
Jim Grosbach
7db50010cc Test for 146516
llvm-svn: 146517
2011-12-13 21:06:59 +00:00
Jim Grosbach
13d3509445 ARM thumb2 parsing of "rsb rd, rn, #0".
rdar://10549741

llvm-svn: 146515
2011-12-13 20:50:38 +00:00
Jim Grosbach
dfec87fe2f ARM NEON two-operand aliases for VQDMULH.
llvm-svn: 146514
2011-12-13 20:40:37 +00:00
Jim Grosbach
0ba5ba4535 ARM pre-UAL NEG mnemonic for convenience when porting old code.
llvm-svn: 146511
2011-12-13 20:23:22 +00:00
Chad Rosier
8af97606a9 [fast-isel] Unaligned loads of floats are not supported. Therefore, convert to a regular
load and then move the result from a GPR to a FPR.

llvm-svn: 146502
2011-12-13 19:22:14 +00:00
Akira Hatanaka
23f439aca1 Add test/MC/Mips/dg.exp.
llvm-svn: 146472
2011-12-13 04:12:49 +00:00
Akira Hatanaka
a9290d5ab9 Move direct object emitter test to directory test/MC/Mips. Rename it to
elf-relsym.ll.

llvm-svn: 146470
2011-12-13 03:50:34 +00:00
Akira Hatanaka
28140f744a Relocation against a symbol, instead of against section. We had some extreme
test cases where there were a lot of relocations applied relative to a large
rodata section. Gas would create a symbol for each of these whereas we would
be relative to the beginning of the rodata section. This change mimics what
gas does.

Patch by Jack Carter.

llvm-svn: 146468
2011-12-13 02:27:40 +00:00
Nick Lewycky
90a4c39a28 Don't rely on a particular version string for llvm.
llvm-svn: 146456
2011-12-13 00:34:14 +00:00
Tony Linthicum
da0dd81cf1 Temporarily disable Hexagon tests. They are failing on OS X
llvm-svn: 146455
2011-12-13 00:33:45 +00:00
Akira Hatanaka
46dd9e66a6 Test case for r146432 by Jack Carter.
llvm-svn: 146433
2011-12-12 22:41:39 +00:00
Bob Wilson
70f6f24d68 Implement 'e' and 'f' modifiers for Neon inline asm. <rdar://problem/10551006>
These modifiers simply select either the low or high D subregister of a Neon
Q register.  I've also removed the unimplemented 'p' modifier, which turns out
to be a bit different than the comment here suggests and as far as I can tell
was only intended for internal use in Apple's version of gcc.

llvm-svn: 146417
2011-12-12 21:45:15 +00:00
Tony Linthicum
61adbf8dc5 Hexagon backend support
llvm-svn: 146412
2011-12-12 21:14:40 +00:00
Joerg Sonnenberger
5b25b4d437 Only replace fwrite with fputc, if the return value is unused.
llvm-svn: 146411
2011-12-12 20:18:31 +00:00
Jan Sjödin
b9e2da0d9a XOP instructions and encoding tests.
llvm-svn: 146407
2011-12-12 19:37:49 +00:00
Roman Divacky
a450b8b2c8 Add support for gnu_indirect_function.
llvm-svn: 146377
2011-12-12 17:34:04 +00:00
Chandler Carruth
2bedf185c9 Manually upgrade the test suite to specify the flag to cttz and ctlz.
I followed three heuristics for deciding whether to set 'true' or
'false':

- Everything target independent got 'true' as that is the expected
  common output of the GCC builtins.
- If the target arch only has one way of implementing this operation,
  set the flag in the way that exercises the most of codegen. For most
  architectures this is also the likely path from a GCC builtin, with
  'true' being set. It will (eventually) require lowering away that
  difference, and then lowering to the architecture's operation.
- Otherwise, set the flag differently dependending on which target
  operation should be tested.

Let me know if anyone has any issue with this pattern or would like
specific tests of another form. This should allow the x86 codegen to
just iteratively improve as I teach the backend how to differentiate
between the two forms, and everything else should remain exactly the
same.

llvm-svn: 146370
2011-12-12 11:59:10 +00:00
Chandler Carruth
36be5dd1e8 Add an explicit test of the auto-upgrade functionality for the new
intrinsic syntax.

Now that this is explicitly covered, I plan to upgrade the existing test
suite to use an explicit immediate. Note that I plan to specify 'true'
in most places rather than the auto-upgraded value as that is the far
more common value to end up here as that is the value coming from GCC's
builtins. The only place I'm likely to put a 'false' in is when testing
x86 which actually has different instructions for the two variants.

llvm-svn: 146369
2011-12-12 11:23:11 +00:00
Chandler Carruth
d733f059d0 Teach the verifier to reject all non-constant arguments to the second
argument of the cttz and ctlz intrinsics.

llvm-svn: 146360
2011-12-12 04:36:02 +00:00
Stepan Dyatkovskiy
bf1423bdcd Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Third attempt: simplified checks in test for armv7-apple-darwin11.
llvm-svn: 146341
2011-12-11 14:35:48 +00:00
Chandler Carruth
afb8199f38 Don't assume things about the exact details of the LLVM version number,
such as what VCS information is attached.

llvm-svn: 146333
2011-12-10 21:40:31 +00:00
Chad Rosier
fa74c25947 Revert associate SelectInsertValue test as well.
llvm-svn: 146332
2011-12-10 21:34:28 +00:00
Chad Rosier
d8a265c838 Revert r146322 to appease buildbots. Original commit message:
Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for
FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second
attempt.

llvm-svn: 146328
2011-12-10 19:55:03 +00:00
Stepan Dyatkovskiy
5b2b42e8c9 Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt.
llvm-svn: 146322
2011-12-10 08:42:24 +00:00
Hal Finkel
d591c94df7 Make CR spill and restore use a reserved register. These operations cannot use the register scavenger because the scavenger can only scavenge one register and frame-index elimination may have already grabbed it.
llvm-svn: 146318
2011-12-10 04:50:53 +00:00
Rafael Espindola
9b9d35cc05 Handle expressions of the form _GLOBAL_OFFSET_TABLE_-symbol the same way gas
does. The _GLOBAL_OFFSET_TABLE_ is still magical in that we get a R_386_GOTPC,
but it doesn't change the immediate in the same way as when the expression
has no right hand side symbol.

llvm-svn: 146311
2011-12-10 02:28:43 +00:00
Eli Friedman
ca06c3a2bd Splats can contain undef's; make sure to handle them correctly. PR11526.
llvm-svn: 146299
2011-12-09 23:54:42 +00:00
Jim Grosbach
356ad6d232 ARM assembly aliases for BIC<-->AND (immediate).
When the immediate operand of an AND or BIC instruction isn't representable
in the immediate field of the instruction, but the bitwise negation of the
immediate is, assemble the instruction as the inverse operation instead
with the inverted immediate as the operand.

rdar://10550057

llvm-svn: 146283
2011-12-09 22:02:17 +00:00
Evan Cheng
77f0fb0296 Update test to something more sensible.
llvm-svn: 146282
2011-12-09 21:54:10 +00:00
Jim Grosbach
489e81da30 ARM assembly parsing and encoding for VLD2 with writeback.
Refactor the instructions into fixed writeback and register-stride
writeback variants to simplify the offset operand (no more optional
register operand using reg0). This is a simpler representation and allows
the assembly parser to more easily handle these instructions.

Add tests for the instruction variants now supported.

llvm-svn: 146278
2011-12-09 21:28:25 +00:00
Chad Rosier
7e0dc23863 [fast-isel] Add support for selecting insertvalue.
rdar://10530851

llvm-svn: 146276
2011-12-09 20:09:54 +00:00
Rafael Espindola
b5c511f7b7 Handle reloc_signed_4byte in here. Not doing so was a regression from my
previous commit. It is strange that we see it in 32 bits. We already
have a fixme about it.

llvm-svn: 146273
2011-12-09 19:57:29 +00:00
Kevin Enderby
63cf89d532 The second part of support for generating dwarf for assembly source files. This
generates the dwarf Compile Unit DIE and a dwarf subprogram DIE for each
non-temporary label.

The next part will be to get the clang driver to enable this when assembling
a .s file.  rdar://9275556

llvm-svn: 146262
2011-12-09 18:09:40 +00:00
Benjamin Kramer
06cd66b1d7 X86: Add patterns for the various rounding ops for SSE4.1 and AVX.
llvm-svn: 146257
2011-12-09 15:44:03 +00:00
Andrew Trick
4f0b3bb42b Add -unroll-runtime for unrolling loops with run-time trip counts.
Patch by Brendon Cahoon!

This extends the existing LoopUnroll and LoopUnrollPass. Brendon
measured no regressions in the llvm test suite with -unroll-runtime
enabled. This implementation works by using the existing loop
unrolling code to unroll the loop by a power-of-two (default 8). It
generates an if-then-else sequence of code prior to the loop to
execute the extra iterations before entering the unrolled loop.

llvm-svn: 146245
2011-12-09 06:19:40 +00:00
Evan Cheng
2c8bac6b4c Forgot setting -march.
llvm-svn: 146244
2011-12-09 06:15:00 +00:00
Rafael Espindola
82e22767cf Handle the case of the magical _GLOBAL_OFFSET_TABLE_ showing up in a
symbol difference. This matches gas behavior and fixes PR11513.

We still don't handle _GLOBAL_OFFSET_TABLE_ in data sections.

llvm-svn: 146238
2011-12-09 03:03:58 +00:00
Akira Hatanaka
ce89ae9f84 jalr should use t9 ($25) for indirect calls regardless of the relocation model
specified.

llvm-svn: 146229
2011-12-09 01:45:12 +00:00
Eli Friedman
8f3db3867c Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514.
llvm-svn: 146219
2011-12-09 01:16:26 +00:00
Nick Lewycky
d2c1661e9f Fix infinite loop in DSE when deleting a free in a reachable loop that's also
trivially infinite.

llvm-svn: 146197
2011-12-08 22:36:35 +00:00
Evan Cheng
ad8debd736 Add 256-bit variant vmovss and vmovsd patterns. rdar://10538417
llvm-svn: 146196
2011-12-08 22:30:45 +00:00
Jim Grosbach
62873cae5f ARM 64-bit VEXT assembly uses a .64 suffix, not .32, amazingly enough.
llvm-svn: 146194
2011-12-08 22:19:04 +00:00
Jim Grosbach
a33fa8aa88 ARM VSHR implied destination operand form aliases.
llvm-svn: 146192
2011-12-08 22:06:06 +00:00
Evan Cheng
d8a73b8918 Add various missing AVX patterns which was causing crashes. Sadly, the generated
code looks pretty bad compared to SSE.

rdar://10538793

llvm-svn: 146191
2011-12-08 22:05:28 +00:00
Jim Grosbach
af9cc198cf Tidy up a bit.
llvm-svn: 146190
2011-12-08 22:04:40 +00:00
Jim Grosbach
78020c4642 ARM VSUB implied destination operand form aliases.
llvm-svn: 146182
2011-12-08 20:56:26 +00:00
Jim Grosbach
957be45ccf Tidy up a bit.
llvm-svn: 146181
2011-12-08 20:53:19 +00:00
Jim Grosbach
a33af36947 ARM VQADD implied destination operand form aliases.
llvm-svn: 146179
2011-12-08 20:49:43 +00:00
Jim Grosbach
405e213008 ARM a few more VMUL implied destination operand form aliases.
llvm-svn: 146177
2011-12-08 20:42:35 +00:00
Owen Anderson
d003a613e7 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Evan Cheng
0e0e920975 Add test for r146163.
llvm-svn: 146167
2011-12-08 19:21:39 +00:00
Daniel Dunbar
c192ce505d Revert r146143, "Fix bug 9905: Failure in code selection for llvm intrinsics
sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP,
FEXP2).", it is failing tests.

llvm-svn: 146157
2011-12-08 17:32:18 +00:00
NAKAMURA Takumi
671c1da473 test/CodeGen/X86/vec_compare-2.ll: Add explicit -mtriple=i686-linux.
llvm-svn: 146152
2011-12-08 15:24:09 +00:00
Nadav Rotem
341b30a457 Fix a bug in the integer-promotion of bitcast operations on vector types.
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.

llvm-svn: 146150
2011-12-08 13:10:01 +00:00
Stepan Dyatkovskiy
8fde5b6eb4 Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2).
llvm-svn: 146143
2011-12-08 07:55:03 +00:00
Jim Grosbach
e1fe053f6e ARM NEON two-operand aliases for VSHL(immediate).
llvm-svn: 146125
2011-12-08 01:30:04 +00:00
Jim Grosbach
3e9384b103 ARM NEON two-operand aliases for VSHL(register).
llvm-svn: 146123
2011-12-08 01:12:35 +00:00
Jim Grosbach
3b4d5c0510 ARM optional destination operand variants for VEXT instructions.
llvm-svn: 146114
2011-12-08 00:43:47 +00:00
Jim Grosbach
0c64182f7c Tidy up.
llvm-svn: 146113
2011-12-08 00:41:54 +00:00
Jim Grosbach
c1cf417595 ARM assembler aliases for "add Rd, #-imm" to "sub Rd, #imm".
llvm-svn: 146111
2011-12-08 00:31:07 +00:00
Jim Grosbach
6146f79b7d ARM assembly, allow 'asl' as a synonym for 'lsl' in shifted-register operands.
For 'gas' compatibility.

llvm-svn: 146106
2011-12-07 23:40:58 +00:00
Akira Hatanaka
7db0038ac0 32 to 64-bit zext pattern.
llvm-svn: 146096
2011-12-07 23:14:41 +00:00
Jim Grosbach
dd3788b044 ARM two-operand aliases for VAND/VEOR/VORR instructions.
llvm-svn: 146095
2011-12-07 23:08:12 +00:00
Jim Grosbach
da0a3e310a ARM two-operand aliases for VADDW instructions.
llvm-svn: 146093
2011-12-07 23:01:10 +00:00
Jim Grosbach
ecf9c2bb21 ARM two-operand aliases for VADD instructions.
llvm-svn: 146091
2011-12-07 22:52:54 +00:00
Akira Hatanaka
b8e63b4c07 64-bit WrapperPICPat patterns.
llvm-svn: 146086
2011-12-07 22:11:43 +00:00
Akira Hatanaka
2b45547782 Modify LowerFCOPYSIGN to handle Mips64.
llvm-svn: 146080
2011-12-07 21:48:50 +00:00
Akira Hatanaka
19d6cd4d0e Fix 64-bit immediate patterns.
llvm-svn: 146059
2011-12-07 20:10:24 +00:00
Jim Grosbach
2f57374e32 Darwin assembler improved relocs when w/o subsections_via_symbols.
When the file isn't being built with subsections-via-symbols, symbol
differences involving non-local symbols can be resolved more aggressively.
Needed for gas compatibility.

llvm-svn: 146054
2011-12-07 19:46:59 +00:00
Jim Grosbach
1ccae84fa7 Thumb2 alias for long-form pop and friends.
rdar://10542474

llvm-svn: 146046
2011-12-07 18:32:28 +00:00
Jim Grosbach
81cb9952c9 ARM support the .arm and .thumb directives for assembly mode switching.
llvm-svn: 146042
2011-12-07 18:04:19 +00:00