12256 Commits

Author SHA1 Message Date
Reid Kleckner
0e68b2ed88 Add encodings and mnemonics for FXSAVE64 and FXRSTOR64.
These are just FXSAVE and FXRSTOR with REX.W prefixes.  These versions use
64-bit pointer values instead of 32-bit pointer values in the memory map they
dump and restore.

llvm-svn: 125446
2011-02-12 23:24:13 +00:00
Venkatraman Govindaraju
3cc16c2b89 Prevent IMPLICIT_DEF/KILL to become a delay filler instruction in SPARC backend.
llvm-svn: 125444
2011-02-12 19:02:33 +00:00
Daniel Dunbar
74c3b94237 SimplifyLibCalls: Add missing legalize check on various printf to puts and
putchar transforms, their return values are not compatible.

llvm-svn: 125442
2011-02-12 18:19:57 +00:00
Daniel Dunbar
33aae18345 tests: FileCheckize
llvm-svn: 125441
2011-02-12 18:19:53 +00:00
Nadav Rotem
367d73ccb6 A fix for 9165.
The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.

llvm-svn: 125435
2011-02-12 14:40:33 +00:00
Benjamin Kramer
793cd269de Also fold (A+B) == A -> B == 0 when the add is commuted.
llvm-svn: 125411
2011-02-11 21:46:48 +00:00
Chris Lattner
fec8b6bd6d Per discussion with Dan G, inbounds geps *certainly* can have
unsigned overflow (e.g. "gep P, -1"), and while they can have
signed wrap in theoretical situations, modelling an AddRec as
not having signed wrap is going enough for any case we can 
think of today.  In the future if this isn't enough, we can
revisit this.  Modeling them as having NUW isn't causing any
known problems either FWIW.

llvm-svn: 125410
2011-02-11 21:43:33 +00:00
Nate Begeman
0a8f9ff53b Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types.
This avoids moving each element to the integer register file and calling __divsi3 etc. on it.

llvm-svn: 125402
2011-02-11 20:53:29 +00:00
Nadav Rotem
c2e51ee68a Fix 9173.
Add more folding patterns to constant expressions of vector selects and vector
bitcasts.

llvm-svn: 125393
2011-02-11 19:37:55 +00:00
Daniel Dunbar
5620b11961 Disable this test for now...
llvm-svn: 125361
2011-02-11 02:59:08 +00:00
Evan Cheng
7cfe7b71e6 Fix buggy fcopysign lowering.
This
define float @foo(float %x, float %y) nounwind readnone {
entry:
  %0 = tail call float @copysignf(float %x, float %y) nounwind readnone
  ret float %0
}

Was compiled to:
    vmov     s0, r1
    bic      r0, r0, #-2147483648
    vmov     s1, r0
    vcmpe.f32    s0, #0
    vmrs         apsr_nzcv, fpscr
    it           lt
    vneglt.f32   s1, s1
    vmov         r0, s1
    bx           lr

This fails to copy the sign of -0.0f because it's lost during the float to int
conversion. Also, it's sub-optimal when the inputs are in GPR registers.

Now it uses integer and + or operations when it's profitable. And it's correct!
    lsrs    r1, r1, #31
    bfi     r0, r1, #31, #1
    bx      lr
rdar://8984306

llvm-svn: 125357
2011-02-11 02:28:55 +00:00
Cameron Zwarich
898b10d36b Add a test for the LSR issue exposed by r125254.
llvm-svn: 125325
2011-02-11 00:49:27 +00:00
Nick Lewycky
6380885ba1 Tolerate degenerate phi nodes that can occur in the middle of optimization
passes. Fixes PR9112. Patch by Jakub Staszak!

llvm-svn: 125319
2011-02-10 23:54:10 +00:00
Cameron Zwarich
9d8ab7d0f7 Rename 'loopsimplify' to 'loop-simplify'.
llvm-svn: 125317
2011-02-10 23:38:10 +00:00
Bruno Cardoso Lopes
3792d82914 Add mips o32 tests again with the hope that the buildbot won't complaint again
llvm-svn: 125316
2011-02-10 23:37:20 +00:00
Bruno Cardoso Lopes
c9968d544b Remove the test to silence the buildbot, will check it in again with a proper fix soon
llvm-svn: 125305
2011-02-10 20:10:17 +00:00
Bruno Cardoso Lopes
7cc40009a8 Fix a lot of o32 CC issues and add a bunch of tests. Patch by Akira Hatanaka with some small modifications by me.
llvm-svn: 125292
2011-02-10 18:05:10 +00:00
Che-Liang Chiou
762ff2a943 ptx: add passing parameter to kernel functions
llvm-svn: 125279
2011-02-10 12:01:24 +00:00
Chris Lattner
d2c1936c14 implement the first part of PR8882: when lowering an inbounds
gep to explicit addressing, we know that none of the intermediate
computation overflows.

This could use review: it seems that the shifts certainly wouldn't
overflow, but could the intermediate adds overflow if there is a 
negative index?

Previously the testcase would instcombine to:

define i1 @test(i64 %i) {
  %p1.idx.mask = and i64 %i, 4611686018427387903
  %cmp = icmp eq i64 %p1.idx.mask, 1000
  ret i1 %cmp
}

now we get:

define i1 @test(i64 %i) {
  %cmp = icmp eq i64 %i, 1000
  ret i1 %cmp
}

llvm-svn: 125271
2011-02-10 07:11:16 +00:00
Chris Lattner
6e84f48cd8 Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.

Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck.  I believe that this takes care of a bunch of related
code quality issues attached to PR8862.

llvm-svn: 125267
2011-02-10 05:36:31 +00:00
Chris Lattner
72ac244f4e Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts.  For example, these (which
are the same except the first is 'exact' sdiv:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %A = sdiv exact i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %A = sdiv i64 %X, -5   ; X/-5 == 0 --> x == 0
  %B = icmp eq i64 %A, 0
  ret i1 %B
}

compile down to:

define i1 @sdiv_icmp4_exact(i64 %X) nounwind {
  %1 = icmp eq i64 %X, 0
  ret i1 %1
}

define i1 @sdiv_icmp4(i64 %X) nounwind {
  %X.off = add i64 %X, 4
  %1 = icmp ult i64 %X.off, 9
  ret i1 %1
}

This happens when you do something like:
  (ptr1-ptr2) == 42

where the pointers are pointers to non-unit types.

llvm-svn: 125266
2011-02-10 05:23:05 +00:00
Chris Lattner
0decae4bf7 more cleanups, notably bitcast isn't used for "signed to unsigned type
conversions". :)

llvm-svn: 125265
2011-02-10 05:17:27 +00:00
Evan Cheng
5a42a6a20f After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508.
llvm-svn: 125259
2011-02-10 02:20:55 +00:00
Jim Grosbach
48e8554772 Do AsmMatcher operand classification per-opcode.
When matching operands for a candidate opcode match in the auto-generated
AsmMatcher, check each operand against the expected operand match class.
Previously, operands were classified independently of the opcode being
handled, which led to difficulties when operand match classes were
more complicated than simple subclass relationships.

llvm-svn: 125245
2011-02-10 00:08:28 +00:00
Chris Lattner
02088f3ab8 Teach instsimplify some tricks about exact/nuw/nsw shifts.
improve interfaces to instsimplify to take this info.

llvm-svn: 125196
2011-02-09 17:15:04 +00:00
Chris Lattner
e29022d779 merge two tests.
llvm-svn: 125195
2011-02-09 17:06:41 +00:00
Chris Lattner
c2750a7e3b remove a small scattering of basically pointless tests. These are
all covered by llvm-test, which is what they were reduced from back
in 2003.

llvm-svn: 125189
2011-02-09 16:41:31 +00:00
Chris Lattner
8abd838838 remove a broken test, this is matching nounwind on intrinsics, not the old unwind instruction
llvm-svn: 125188
2011-02-09 16:40:56 +00:00
Richard Osborne
112cff2533 Add intrinsic for setc instruction on the XCore.
llvm-svn: 125186
2011-02-09 13:22:12 +00:00
Nick Lewycky
b162446cda When removing a function from the function set and adding it to deferred, we
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.

Also add some debugging statements.

llvm-svn: 125180
2011-02-09 06:32:02 +00:00
NAKAMURA Takumi
0c1790b8d2 test/lit.cfg: Seek sane tools(and bash) in directories and set to $PATH.
LitConfig.getBashPath() will not seek in $PATH after LitConfig.getToolsPath() was executed.

llvm-svn: 125176
2011-02-09 04:19:21 +00:00
NAKAMURA Takumi
81da9fa2a4 CMake: Add the new option LLVM_LIT_TOOLS_DIR. It can specify "Path to GnuWin32 tools".
llvm-svn: 125173
2011-02-09 04:18:58 +00:00
Owen Anderson
899a6d74bf Revert both r121082 (which broke a bunch of constant pool stuff) and r125074 (which worked around it). This should get us back to the old, correct behavior, though it will make the integrated assembler unhappy for the time being.
llvm-svn: 125127
2011-02-08 22:39:40 +00:00
Benjamin Kramer
8ff71a1384 Support for .ifdef / .ifndef in the assembler parser. Patch by Joerg Sonnenberger.
llvm-svn: 125120
2011-02-08 22:29:56 +00:00
Andrew Trick
4bf0d6a37a PostRA antidependence breaker unit test for PR8986.
llvm-svn: 125091
2011-02-08 17:42:05 +00:00
Andrew Trick
4c1ebd08ef PostRA antidependence breaker unit test for rdar://8959122.
llvm-svn: 125090
2011-02-08 17:41:12 +00:00
Benjamin Kramer
04249128ab SimplifyCFG: Track the number of used icmps when turning a icmp chain into a switch. If we used only one icmp, don't turn it into a switch.
Also prevent the switch-to-icmp transform from creating identity adds, noticed by Marius Wachtler.

llvm-svn: 125056
2011-02-07 22:37:28 +00:00
Bruno Cardoso Lopes
0ce5b0f4a8 Add support for parsing dmb/dsb instructions
llvm-svn: 125055
2011-02-07 22:09:15 +00:00
Evan Cheng
56b78e409e Fix an obvious typo which caused an isel assertion. rdar://8964854.
llvm-svn: 125023
2011-02-07 18:50:47 +00:00
Devang Patel
46db608b81 Reduce test case, smaller is better.
llvm-svn: 125019
2011-02-07 18:24:18 +00:00
Bob Wilson
65f4a70b82 Add codegen support for using post-increment NEON load/store instructions.
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.

llvm-svn: 125014
2011-02-07 17:43:21 +00:00
Chris Lattner
2fd09e3397 implement .ll and .bc support for nsw/nuw on shl and exact on lshr/ashr.
Factor some code better.

llvm-svn: 125006
2011-02-07 16:40:21 +00:00
Jason W Kim
7342155b4c Teach ARM/MC/ELF about gcc compatible reloc output to get past odd linkage
failures with relocations.

The code committed is a first cut at compatibility for emitted relocations in
ELF .o.

Why do this? because existing ARM tools like emitting relocs symbols as
explicit relocations, not as section-offset relocs.

Result is that with these changes,
1) relocs are now substantially identical what to gcc outputs.
2) larger apps (including many spec2k tests) compile, cross-link, and pass

Added reminder fixme to tests for future conversion to .s form.

llvm-svn: 124996
2011-02-07 01:11:15 +00:00
Jason W Kim
b0d4492aa1 Rework some .ARM.attribute work for improved gcc compatibility.
Unified EmitTextAttribute for both Asm and Obj emission (.cpu only)
Added necessary cortex-A8 related attrs for codegen compat tests.

llvm-svn: 124995
2011-02-07 00:49:53 +00:00
Chris Lattner
1c1b342a62 teach instsimplify to transform (X / Y) * Y to X
when the div is an exact udiv.

llvm-svn: 124994
2011-02-06 22:05:31 +00:00
Chris Lattner
8d427ed03c rename test.
llvm-svn: 124993
2011-02-06 21:59:10 +00:00
Chris Lattner
7b6a968f5d enhance vmcore to know that udiv's can be exact, and add a trivial
instcombine xform to exercise this.

Nothing forms exact udivs yet though.  This is progress on PR8862

llvm-svn: 124992
2011-02-06 21:44:57 +00:00
Anders Carlsson
1eeebf1c22 When loading from a constant, fold inttoptr if the integer type and the resulting pointer type both have the same size.
llvm-svn: 124987
2011-02-06 20:11:56 +00:00
NAKAMURA Takumi
07a84f5950 Target/X86: Tweak allocating shadow area (aka home) on Win64. It must be enough for caller to allocate one.
llvm-svn: 124949
2011-02-05 15:11:32 +00:00
Bob Wilson
d46874eb5d Move a test that ended up in the wrong place.
llvm-svn: 124933
2011-02-05 04:15:50 +00:00