10295 Commits

Author SHA1 Message Date
Dan Gohman
e9c4426bb0 Apply the SSE dependence idiom for SSE unary operations to
SD instructions too, in addition to SS instructions. And
add a comment about it.

llvm-svn: 108191
2010-07-12 20:46:04 +00:00
Bruno Cardoso Lopes
a4889e6f93 Add AVX 256-bit MOVMSK forms
llvm-svn: 108184
2010-07-12 20:06:32 +00:00
Daniel Dunbar
d1c2da9d0b MC/AsmParser: Move .tbss and .zerofill parsing to Darwin specific parser.
llvm-svn: 108180
2010-07-12 19:37:35 +00:00
Daniel Dunbar
a1e5852feb MC/AsmParser: Move .desc parsing to Darwin specific parser.
llvm-svn: 108179
2010-07-12 19:22:53 +00:00
Daniel Dunbar
50b931bbac MC/AsmParser: Move some misc. Darwin directive handling to DarwinAsmParser.
llvm-svn: 108174
2010-07-12 18:49:22 +00:00
Dan Gohman
0af129aa1b Add a lint check for mismatched return types, inspired by PR6944.
llvm-svn: 108162
2010-07-12 18:02:04 +00:00
Benjamin Kramer
cf8ad46899 Nope, still breaks the release selfhost bots :(
llvm-svn: 108153
2010-07-12 16:38:48 +00:00
Benjamin Kramer
e391789246 Reapply the "or" half of r108136, which seems to be less problematic.
llvm-svn: 108152
2010-07-12 16:15:48 +00:00
Benjamin Kramer
98c95e7743 Revert r108141 again, sigh.
llvm-svn: 108148
2010-07-12 14:42:04 +00:00
Benjamin Kramer
c4f46375d3 Reapply 108136 with an ugly pasto fixed.
llvm-svn: 108141
2010-07-12 13:44:00 +00:00
Benjamin Kramer
d9bf737e62 Revert r108136 until I figure out why it broke selfhost.
llvm-svn: 108139
2010-07-12 12:35:49 +00:00
Benjamin Kramer
f00a49ceff instcombine: fold (x & y) | (~x & z) and (x & y) ^ (~x & z) into ((y ^ z) & x) ^ z which is one instruction shorter. (PR6773)
before:
  %and = and i32 %y, %x
  %neg = xor i32 %x, -1
  %and4 = and i32 %z, %neg
  %xor = xor i32 %and4, %and

after:
  %xor1 = xor i32 %z, %y
  %and2 = and i32 %xor1, %x
  %xor = xor i32 %and2, %z

llvm-svn: 108136
2010-07-12 11:54:45 +00:00
Chris Lattner
59bffe35a1 fix PR7311 by avoiding breaking casts when a bitcast from scalar->vector
is involved.

llvm-svn: 108117
2010-07-12 01:19:22 +00:00
Chris Lattner
baef771d17 if jump threading is able to infer interesting values on both
the LHS and RHS of an and/or instruction, don't multiply add
known predecessor values.  This fixes the crash on testcase
from PR7498

llvm-svn: 108114
2010-07-12 00:47:34 +00:00
Chris Lattner
d8288040c3 fix PR7429, a crash turning a load from a string into a float.
llvm-svn: 108113
2010-07-12 00:22:51 +00:00
Chris Lattner
68f5ec0fa2 convert to filechecconvert to filecheckk
llvm-svn: 108112
2010-07-12 00:21:10 +00:00
Chris Lattner
64eeea9044 merge two tests.
llvm-svn: 108111
2010-07-12 00:19:47 +00:00
Jakob Stoklund Olesen
a28aa26057 Remove TargetInstrInfo::copyRegToReg entirely.
Targets must now implement TargetInstrInfo::copyPhysReg instead. There is no
longer a default implementation forwarding to copyRegToReg.

llvm-svn: 108095
2010-07-11 17:01:17 +00:00
Rafael Espindola
84716579d4 Fix va_arg for doubles. With this patch VAARG nodes always contain the
correct alignment information, which simplifies ExpandRes_VAARG a bit.

The patch introduces a new alignment information to TargetLoweringInfo. This is
needed since the two natural candidates cannot be used:

* The 's' in target data: If this is set to the minimal alignment of any
  argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for
  example.
* The getTransientStackAlignment method. It is possible for an architecture to
  have argument less aligned than what we maintain the stack pointer.

llvm-svn: 108072
2010-07-11 04:01:49 +00:00
Dan Gohman
7734fa156d Fix this test.
llvm-svn: 108059
2010-07-10 22:42:12 +00:00
Jakob Stoklund Olesen
36e769aa7b FileCheckize inline asm FP stack tests
llvm-svn: 108046
2010-07-10 16:30:25 +00:00
Dan Gohman
a05b7a2c34 Add an explicit triple to make this test behave consistently.
llvm-svn: 108041
2010-07-10 09:01:35 +00:00
Dan Gohman
bd6a232794 Fix this XTARGET so that this does doesn't XPASS on non-darwin hosts.
llvm-svn: 108040
2010-07-10 09:01:03 +00:00
Dan Gohman
fef30fcd5e Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Bruno Cardoso Lopes
f4180a9a7b Add AVX 256-bit packed MOVNT variants
llvm-svn: 108021
2010-07-09 21:42:42 +00:00
Bruno Cardoso Lopes
6ca8dc935c Add AVX 256-bit unpack and interleave
llvm-svn: 108017
2010-07-09 21:20:35 +00:00
Jakob Stoklund Olesen
53d777f3bd Fix a few tests
llvm-svn: 108011
2010-07-09 20:43:09 +00:00
Jim Grosbach
b591b3b48d In the presence of variable sized objects, allocate an emergency spill slot.
rdar://8131327

llvm-svn: 108008
2010-07-09 20:27:06 +00:00
Dan Gohman
e7a09d478d Add a target triple.
llvm-svn: 108003
2010-07-09 19:17:36 +00:00
Dan Gohman
0c7d7ee288 Fix MachineLICM to actually visit inner loops.
llvm-svn: 108001
2010-07-09 18:49:45 +00:00
Bruno Cardoso Lopes
3676e24b67 Start the support for AVX instructions with 256-bit %ymm registers. A couple of
notes:
- The instructions are being added with dummy placeholder patterns using some 256
  specifiers, this is not meant to work now, but since there are some multiclasses
  generic enough to accept them,  when we go for codegen, the stuff will be already
  there.
- Add VEX encoding bits to support YMM
- Add MOVUPS and MOVAPS in the first round
- Use "Y" as suffix for those Instructions: MOVUPSYrr, ...
- All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX
  file.

llvm-svn: 107996
2010-07-09 18:27:43 +00:00
Bob Wilson
9e8c9204ef --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 107987
2010-07-09 16:37:18 +00:00
Jakob Stoklund Olesen
b4c406a101 Fix test to be less sensitive of regalloc accidents
llvm-svn: 107951
2010-07-09 01:32:11 +00:00
Bob Wilson
f15e542bdc Print "dregpair" NEON operands with a space between them, for readability and
consistency with other instructions that have lists of register operands.

llvm-svn: 107944
2010-07-09 00:47:20 +00:00
Dan Gohman
7e6e4dd058 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.

llvm-svn: 107943
2010-07-09 00:39:23 +00:00
Bob Wilson
dd02fe62a2 Reenable DAG combining for vector shuffles. It looks like it was temporarily
disabled and then never turned back on again.  Adjust some tests, one because
this change avoids an unnecessary instruction, and the other to make it
continue testing what it was intended to test.

llvm-svn: 107941
2010-07-09 00:38:12 +00:00
Bill Wendling
8f0fcd1623 Extension of r107506. Make sure that we don't mark a function as having a call
if the inline ASM doesn't need a stack frame.

llvm-svn: 107922
2010-07-08 22:38:02 +00:00
Chris Lattner
012d7537ee Rework segment prefix emission code to handle segments
in memory operands at the same type as hard coded segments.
This fixes problems where we'd emit the segment override after
the REX prefix on instructions like:
mov %gs:(%rdi), %rax

This fixes rdar://8127102.  I have several cleanup patches coming
next.

llvm-svn: 107917
2010-07-08 22:28:12 +00:00
Stuart Hastings
6a15a8d86f Test case for r107843. Radar 8152866.
llvm-svn: 107907
2010-07-08 20:31:05 +00:00
Evan Cheng
5307ec12d7 Check for FiniteOnlyFPMath as well.
llvm-svn: 107904
2010-07-08 20:12:24 +00:00
Benjamin Kramer
27eb255a70 Teach instcombine to transform
(X >s -1) ? C1 : C2 and (X <s  0) ? C2 : C1
into ((X >>s 31) & (C2 - C1)) + C1, avoiding the conditional.

This optimization could be extended to take non-const C1 and C2 but we better
stay conservative to avoid code size bloat for now.

for
int sel(int n) {
     return n >= 0 ? 60 : 100;
}

we now generate
  sarl  $31, %edi
  andl  $40, %edi
  leal  60(%rdi), %eax

instead of
  testl %edi, %edi
  movl  $60, %ecx
  movl  $100, %eax
  cmovnsl %ecx, %eax

llvm-svn: 107866
2010-07-08 11:39:10 +00:00
Eric Christopher
091bf69467 A slight reworking of the custom patterns for x86-64 tpoff codegen and
correct the testcase for valid assembly.

Needs more tests.

llvm-svn: 107860
2010-07-08 07:36:46 +00:00
Evan Cheng
3e8530bf14 r107852 is only safe with -enable-unsafe-fp-math to account for +0.0 == -0.0.
llvm-svn: 107856
2010-07-08 06:01:49 +00:00
Evan Cheng
ed3f224f04 Optimize some vfp comparisons to integer ones. This patch implements the simplest case when the following conditions are met:
1. The arguments are f32.
2. The arguments are loads and they have no uses other than the comparison.
3. The comparison code is EQ or NE.

e.g.
        vldr.32 s0, [r1]
        vldr.32 s1, [r0]
        vcmpe.f32       s1, s0
        vmrs    apsr_nzcv, fpscr
	beq     LBB0_2
=>
        ldr     r1, [r1]
        ldr     r0, [r0]
        cmp     r0, r1
        beq     LBB0_2

More complicated cases will be implemented in subsequent patches.

llvm-svn: 107852
2010-07-08 02:08:50 +00:00
Dale Johannesen
2df647f882 Changes to ARM tail calls, mostly cosmetic.
Add explicit testcases for tail calls within the same module.
Duplicate some code to humor those who think .w doesn't apply on ARM.
Leave this disabled on Thumb1, and add some comments explaining why it's hard
and won't gain much.

llvm-svn: 107851
2010-07-08 01:18:23 +00:00
Dan Gohman
4dcc56a102 Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.

llvm-svn: 107850
2010-07-08 01:00:56 +00:00
Chris Lattner
bf009b527a Fix the second half of PR7437: scalarrepl wasn't preserving
address spaces when SRoA'ing memcpy's.

llvm-svn: 107846
2010-07-08 00:27:05 +00:00
Chris Lattner
6a5db9c9c9 Implement the major chunk of PR7195: support for 'callw'
in the integrated assembler.  Still some discussion to be
done.

llvm-svn: 107825
2010-07-07 22:27:31 +00:00
Bruno Cardoso Lopes
b92b51191e Add more assembly opcodes for SSE compare instructions
llvm-svn: 107823
2010-07-07 22:24:03 +00:00
Jakob Stoklund Olesen
34ec644313 Allow copies between GR8_ABCD_L and GR8_ABCD_H.
This fixes PR7540.

llvm-svn: 107809
2010-07-07 20:33:27 +00:00