Commit Graph

7302 Commits

Author SHA1 Message Date
Jush Lu
2946549a28 [arm-fast-isel] Add support for shl, lshr, and ashr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-03 02:37:48 +00:00
Manman Ren
127eea87d6 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161207 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 19:37:32 +00:00
Akira Hatanaka
bddf83614a Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161189 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 18:15:13 +00:00
NAKAMURA Takumi
ac89c0ddfd llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Make sure this is testing without +avx.
FIXME: Could +avx be checked here too?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161156 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 06:36:56 +00:00
NAKAMURA Takumi
25fa9a4890 llvm/test/CodeGen/X86/fold-pcmpeqd-1.ll: Rewrite expressions to pass regardless of PR11031.
- Relax to match even if epilogue (pop %ebp) were emitted.
  - Assume the return value is stored to %xmm0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161155 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 06:33:58 +00:00
Manman Ren
d7d003c2b7 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-02 00:56:42 +00:00
Matt Beaumont-Gay
b705e4a8b6 Line endings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 16:42:35 +00:00
Elena Demikhovsky
1503aba4a0 Added FMA functionality to X86 target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-01 12:06:00 +00:00
Akira Hatanaka
cdb3ba71ce Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 22:50:19 +00:00
Akira Hatanaka
1d53f1bbab Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161078 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 21:28:49 +00:00
Akira Hatanaka
1d165f1c25 Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161076 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 20:54:48 +00:00
Akira Hatanaka
e2d529ac11 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161068 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:46:41 +00:00
Chad Rosier
d97f3a5ab0 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161066 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:29:21 +00:00
Akira Hatanaka
36bcc11236 Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:16:49 +00:00
Manman Ren
53b59d1d97 MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161062 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 18:10:39 +00:00
Jakob Stoklund Olesen
34af6f597b Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 02:47:24 +00:00
Manman Ren
1123614317 Reverse order of the two branches at end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar: 11393714


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@161018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-31 01:11:07 +00:00
Pete Cooper
32ecfb4158 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160987 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-30 20:23:19 +00:00
Manman Ren
e8b4a4a9d1 Revert r160920 and r160919 due to dragonegg and clang selfhost failure
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160927 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-29 02:44:09 +00:00
Manman Ren
14148c41d9 X86 Peephole: fold loads to the source register operand if possible.
Trying to fix the bot by specifying a triple in the failing testing cases.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160920 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-28 17:51:24 +00:00
Manman Ren
0eb3edea9c X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-28 16:48:01 +00:00
Manman Ren
43d9ab1812 X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-28 03:15:46 +00:00
Evan Cheng
9c777a4844 Teach CodeGenPrep to look past bitcast when it's duplicating return instruction
into predecessor blocks to enable tail call optimization.

rdar://11958338


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160894 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen
72e7dbf88b Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160888 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen
369a4c7759 Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160816 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-26 21:40:42 +00:00
Akira Hatanaka
e11246c64e Fix call setup for PIC.
Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160774 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-26 02:24:43 +00:00
Manman Ren
24182757bf Update testing case for Atom when disabling rematerialization in
TwoAddressInstructionPass.

The generated code for Atom has a different code sequence. This is realted
to commit r160749.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 20:17:14 +00:00
Manman Ren
d68e8cda24 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 18:28:13 +00:00
Rafael Espindola
1cee71099c When a return struct pointer is passed in registers, the called has nothing
to pop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160725 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 13:41:10 +00:00
Akira Hatanaka
de4a127470 Eliminate the stack slot used to save the global base register.
The long branch pass (fixed in r160601) no longer uses the global base register
to compute addresses of branch destinations, so it is not necessary to reserve
a slot on the stack.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160703 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-25 03:16:47 +00:00
Rafael Espindola
d9fb65dee6 Add a cpu to the test. Should fix the atom bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160701 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 22:56:06 +00:00
Rafael Espindola
423ebb1f38 Add a triple to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 21:55:04 +00:00
Rafael Espindola
cde227bc2a In order to correctly compile
struct s {
  double x1;
  float x2;
};
__attribute__((regparm(3))) struct s f(int a, int b, int c);
void g(void) {
  f(41, 42, 43);
}

We need to be able to represent passing the address of s to f (sret) in a
register (inreg). Turns out that all that is needed is to not mark them as
mutually incompatible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 21:40:17 +00:00
David Chisnall
23a62cbaf5 ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160687 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-24 20:04:16 +00:00
Akira Hatanaka
3ee306cbc0 Add basic ability to setup call frame, and make procedure calls.
Hello world will compile and execute with this patch.

Patch by Reed Kotler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160651 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 23:45:54 +00:00
Sylvestre Ledru
c8e41c5917 Fix a typo (the the => the)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160621 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 08:51:15 +00:00
Nadav Rotem
ed1a335ece Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160619 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-23 07:59:50 +00:00
Akira Hatanaka
60287963c7 Fix Mips long branch pass.
This pass no longer requires that the global pointer value be saved to the
stack or register since it uses bal instruction to compute branch distance.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160601 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-21 03:30:44 +00:00
Jakob Stoklund Olesen
2ec0cda5d5 Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160575 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen
c321a20b2e Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160571 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-20 20:49:53 +00:00
Preston Gurd
5b50701b1c Fix remaining lit tests which were failing when run on an Atom
processor.

Patches by Tyler Nowicki, Andy Zhang, and Preston Gurd!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160520 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 18:53:21 +00:00
Jush Lu
ee649839a2 [arm-fast-isel] Add support for vararg function calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160500 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-19 09:49:00 +00:00
Manman Ren
62a89f5808 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160454 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 21:40:01 +00:00
Preston Gurd
d4d961615c This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160451 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 20:49:17 +00:00
Chandler Carruth
32d75bec4b Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160443 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 18:58:22 +00:00
Victor Oliveira
26ad67adcd test commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160438 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 17:53:05 +00:00
Jack Carter
a0f14afee1 Mips specific inline asm operand modifier 'M':
Print the high order register of a double word register operand.

In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using 
doubles in assembler and continue to control register to variable
relationships.

This patch also fixes a related bug in a previous patch:

    case 'D': // Second part of a double word register operand
    case 'L': // Low order register of a double word register operand
    case 'M': // High order register of a double word register operand

I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endianesses. I had 'L' and 'D'
be the opposite twins when 'L' and 'M' are.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160429 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 06:41:36 +00:00
Joel Jones
7c82e6a32a More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-18 00:02:16 +00:00
Evan Cheng
afb24cecff Add test case for r160387
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160389 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 19:40:05 +00:00
Nadav Rotem
5589a69f0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160357 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 09:07:37 +00:00
Evan Cheng
f5c0539092 Implement r160312 as target indepedenet dag combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160354 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 08:31:11 +00:00
Evan Cheng
70e10d3fe4 This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160346 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-17 06:53:39 +00:00
Akira Hatanaka
1bda7863e3 Fix function select_cc_f32 in test/CodeGen/Mips/selectcc.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160329 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 23:56:51 +00:00
Evan Cheng
98819c9d1e For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 19:35:43 +00:00
Nadav Rotem
7ee0e5ae60 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160305 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 18:34:53 +00:00
Tom Stellard
a4be0720c5 Revert "test/CodeGen/R600: Add some basic tests v6"
This reverts commit 11d3457afcda7848448dd7f11b2ede6552ffb9ea.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160300 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 18:19:43 +00:00
Alexey Samsonov
e0f5aedf97 Fix tests that failed on i686-win32 after r160248:
1. FileCheck-ize epilogue.ll and allow another asm instruction to restore %rsp.
2. Remove check in widen_arith-3.ll that was hitting instruction in epilogue instead of
vector add.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 14:33:36 +00:00
Tom Stellard
2015236dfc test/CodeGen/R600: Add some basic tests v6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 14:17:19 +00:00
Nadav Rotem
d93ea88cde Fix a bug in the 3-address conversion of LEA when one of the operands is an
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def, is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160260 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 10:52:25 +00:00
Alexey Samsonov
99a92f269d This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment.
It is intended to fix PR11468.

Old prologue and epilogue looked like this:
push %rbp
mov %rsp, %rbp
and $alignment, %rsp
push %r14
push %r15
...
pop %r15
pop %r14
mov %rbp, %rsp
pop %rbp

The problem was to reference the locations of callee-saved registers in exception handling:
locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would
take some effort to implement this in LLVM, as currently MachineLocation can only have the form
"Register + Offset". Funciton prologue and epilogue are now changed to:

push %rbp
mov %rsp, %rbp
push %14
push %15
and $alignment, %rsp
...
lea -$size_of_saved_registers(%rbp), %rsp
pop %r15
pop %r14
pop %rbp

Reviewed by Chad Rosier.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160248 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 06:54:09 +00:00
Nadav Rotem
46646572f7 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160235 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 20:39:08 +00:00
Nadav Rotem
d896e24299 Teach getTargetVShiftNode about TargetConstant nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 20:27:43 +00:00
NAKAMURA Takumi
a2a179dd7d llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Rewrite expressions to fit various targets.
- Make sure existence of "barrier".
  - Confirm reload corresponding to spill.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160232 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 14:38:35 +00:00
Nadav Rotem
aec9f382dd Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-15 12:26:30 +00:00
Nadav Rotem
65f489fd7d AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector.
This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160222 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-14 22:26:05 +00:00
Nadav Rotem
b7e230d999 Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160221 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-14 21:30:27 +00:00
Joel Jones
06a6a300c5 This is one of the first steps at moving to replace target-dependent
intrinsics with target-indepdent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160200 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-13 23:25:25 +00:00
Duncan Sands
ab6e47bee6 Restrict this to x86, hopefully fixing ARM buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160163 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-13 07:02:00 +00:00
Benjamin Kramer
feae00a68e Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it.
I already had the necessary things in place for IR-level passes but missed the machine passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 18:14:57 +00:00
Nadav Rotem
4dff258bfb The LIT tests below do not specify the exact cpu model and fail on AVX2 machines, because we select different instructions such as vbroadcast, new shuffles, etc.
Patch by Michael Liao.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160129 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 13:45:15 +00:00
NAKAMURA Takumi
223875e3f8 llvm/test/CodeGen/X86/rdrand.ll: Relax expression corresponding to Win64 CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160124 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 10:22:57 +00:00
Benjamin Kramer
51bf934e4d Use %s instead of the explicit name, the latter doesn't work in out-of-tree builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160120 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:36:29 +00:00
Benjamin Kramer
b9bee04995 Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same that is emitted by both
GCC and ICC.

Fixes PR13284.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160117 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:31:43 +00:00
Duncan Sands
4e8982a34d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160116 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 09:01:35 +00:00
Craig Topper
5aba78bd80 Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160110 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-12 06:52:41 +00:00
Manman Ren
45ed19499b ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160090 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 22:51:44 +00:00
Akira Hatanaka
ce66afb22d Test case for r160036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160067 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 19:50:46 +00:00
Manman Ren
84ae7e9034 X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair.
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables
removal of Cmp.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160066 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 19:35:12 +00:00
Akira Hatanaka
3fef29d881 Implement MipsTargetLowering::LowerSELECT_CC to custom lower SELECT_CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 19:32:27 +00:00
Benjamin Kramer
597f2950d8 PR13326: Fix a subtle edge case in the udiv -> magic multiply generator.
This caused 6 of 65k possible 8 bit udivs to be wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160058 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 18:31:59 +00:00
Nadav Rotem
5cd95e1478 When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160044 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 13:27:05 +00:00
Akira Hatanaka
ba584fe8fe Lower RETURNADDR node in Mips backend.
Patch by Sasa Stankovic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160031 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-11 00:53:32 +00:00
Jack Carter
bb78930489 Mips specific inline asm operand modifier 'L'.
Low order register of a double word register operand. Operands 
   are defined by the name of the variable they are marked with in
   the inline assembler code. This is a way to specify that the 
   operand just refers to the low order register for that variable.
   
   It is the opposite of modifier 'D' which specifies the high order
   register.
   
   Example:
   
 main()
{

    long long ll_input = 0x1111222233334444LL;
    long long ll_val = 3;
    int i_result = 0;

    __asm__ __volatile__( 
		   "or	%0, %L1, %2"
	     : "=r" (i_result) 
	     : "r" (ll_input), "r" (ll_val)); 
}

   Which results in:
   
   	lui	$2, %hi(_gp_disp)
	addiu	$2, $2, %lo(_gp_disp)
	addiu	$sp, $sp, -8
	addu	$2, $2, $25
	sw	$2, 0($sp)
	lui	$2, 13107
	ori	$3, $2, 17476     <-- Low 32 bits of ll_input
	lui	$2, 4369
	ori	$4, $2, 8738      <-- High 32 bits of ll_input
	addiu	$5, $zero, 3  <-- Low 32 bits of ll_val
	addiu	$2, $zero, 0  <-- High 32 bits of ll_val
	#APP
	or	$3, $4, $5        <-- or i_result, high 32 ll_input, low 32 of ll_val
	#NO_APP
	addiu	$sp, $sp, 8
	jr	$ra

If not direction is done for the long long for 32 bit variables results
in using the low 32 bits as ll_val shows.

There is an existing bug if 'L' or 'D' is used for the destination register
for 32 bit long longs in that the target value will be updated incorrectly
for the non-specified part unless explicitly set within the inline asm code.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160028 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 22:41:20 +00:00
Chad Rosier
542e35f62d Add newline.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160006 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 17:57:00 +00:00
Chad Rosier
3d3c75c665 Add test case accidentally omitted from r160002.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160004 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 17:49:39 +00:00
Chad Rosier
3f0dbab963 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.  Basically, this is a reapplication of r158087 with a few fixes.

Specifically, (1) the stack pointer is restored from the base pointer before
popping callee-saved registers and (2) in obscure cases (see comments in patch)
we must cache the value of the original stack adjustment in the prologue and
apply it in the epilogue.

rdar://11496434


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160002 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 17:45:53 +00:00
Nadav Rotem
2dd83eb1ab Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159991 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 13:25:08 +00:00
Akira Hatanaka
182ef6fcaa Make register Mips::RA allocatable if not in mips16 mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159971 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-10 00:19:06 +00:00
Owen Anderson
d9bf71fdd2 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159957 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-09 20:31:12 +00:00
Manman Ren
6209364834 X86: implement functions to analyze & synthesize CMOV|SET|Jcc
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond

No functional change intended.
If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
opcode to get the condition code, then update the condition code, finally
synthesize the new opcode form the new condition code.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159955 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-09 18:57:12 +00:00
Manman Ren
2d4215f759 X86: Fix optimizeCompare to correctly check safe condition.
It is safe if EFLAGS is killed or re-defined.
When we are done with the basic block, check whether EFLAGS is live-out.
Do not optimize away cmp if EFLAGS is live-out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159888 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-07 03:34:46 +00:00
Manman Ren
2af66dc51a X86: peephole optimization to remove cmp instruction
For each Cmp, we check whether there is an earlier Sub which make Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159838 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 17:36:20 +00:00
Chad Rosier
fd065bbed1 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159837 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 17:33:39 +00:00
Duncan Sands
4c3916f840 Attempt to fix windows buildbots. Patch by James Benton.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159826 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 14:43:16 +00:00
NAKAMURA Takumi
365f1b84d3 test/CodeGen/X86/sext-setcc-self.ll: Mark it as XFAIL: cygwin,mingw32,win32. Investigating.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159820 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 12:12:39 +00:00
NAKAMURA Takumi
bd985efa99 Revert r159804, "[arm-fast-isel] Add support for vararg function calls."
It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159817 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 11:12:44 +00:00
Jush Lu
a8c4d739f2 [arm-fast-isel] Add support for vararg function calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159804 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-06 03:02:37 +00:00
Jack Carter
244a84ee57 Mips specific inline asm operand modifier D.
Print the second half of a double word operand.
   
   The include list was cleaned up a bit as well.
   
   Also the test case was modified to test for both
   big and little patterns.
   


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159787 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-05 23:58:21 +00:00
Akira Hatanaka
d45e37a0a5 test case for r159770.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159771 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-05 19:29:31 +00:00
Duncan Sands
e7de3b29f7 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159739 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen
b872078701 Ensure CopyToReg nodes are always glued to the call instruction.
The CopyToReg nodes that set up the argument registers before a call
must be glued to the call instruction. Otherwise, the scheduler may emit
the physreg copies long before the call, causing long live ranges for
the fixed registers.

Besides disabling good register allocation, that can also expose
problems when EmitInstrWithCustomInserter() splits a basic block during
the live range of a physreg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159721 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 19:28:31 +00:00
Rafael Espindola
25dd5fc1cd Add a testcase for pr13209. It is not a great test, but it still fails if
159509 and 159479 are reverted. It would be really nice to be able to run
just the coalescer :-(

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159715 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 16:06:00 +00:00
Jakob Stoklund Olesen
59bde4d8a1 Add early if-conversion support to X86.
Implement the TII hooks needed by EarlyIfConversion to create cmov
instructions and estimate their latency.

Early if-conversion is still not enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-04 00:09:58 +00:00
NAKAMURA Takumi
84f2ae332f test/CodeGen/SPARC/private.ll: Fixup. Forgot to prune old RUN lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159643 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 04:29:20 +00:00
NAKAMURA Takumi
0176dfed27 test/CodeGen/SPARC/private.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159642 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 04:21:57 +00:00
NAKAMURA Takumi
ea957f0c56 test/CodeGen/X86/sincos.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159639 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:59:22 +00:00
NAKAMURA Takumi
a16d8c30cc test/CodeGen/X86/fabs.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159638 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:59:15 +00:00
NAKAMURA Takumi
40b7e7eb97 test/CodeGen/X86/2007-09-05-InvalidAsm.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159637 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:59:08 +00:00
NAKAMURA Takumi
0e0d62ebd9 test/CodeGen/X86/2004-03-30-Select-Max.ll: FileCheck-ize.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159636 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-03 03:58:59 +00:00
Jack Carter
10de025a67 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159625 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 23:35:23 +00:00
Eric Christopher
80c1b38eff Revert " mips32 long long register inline asm constraint support." as
it appears to be breaking the bots.

This reverts commit 1b055ce320.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159619 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 23:22:25 +00:00
Jack Carter
3aaefc12bb deleted test/CodeGen/Mips/inlineasm-cnstrnt-bad-r-1.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159617 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 23:21:22 +00:00
Jack Carter
1b055ce320 mips32 long long register inline asm constraint support.
inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.    This resulted in the removal of a negative test (inlineasm-cnstrnt-bad-r-1.ll)
    


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159610 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 22:39:45 +00:00
Bob Wilson
30a507a1f5 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159570 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 19:48:45 +00:00
Chandler Carruth
1de43ede89 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished though the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159547 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 19:09:46 +00:00
Chandler Carruth
49589f0d0e Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 18:37:59 +00:00
Bob Wilson
ac03af4ea9 Do not attempt to use ROR for Thumb1.
Patch by Matt Fischer!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159538 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 17:22:47 +00:00
Chandler Carruth
b9c5bd00e9 Fix the TCL-style quoting in one random test that somehow slipped
through my perl nets.

With this, the test suite passes even if I force it to run with the
built-in shell test logic, except for a test which REQUIREs shell.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159529 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 13:29:47 +00:00
Chandler Carruth
4177e6fff5 Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it require multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159525 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 12:47:22 +00:00
Chandler Carruth
38adf0bdaa Rewrite three tests that had truly egregious abuses of 'grep' in them to
use FileCheck.

Aside from removing a dependence on TCL-style quoting, this also makes
the tests ... significantly more robust. =] It would be really, *really*
great of the maintainer(s) of the CellSPU backend went through and
systematically rewrite these tests to use FileCheck. There are a lot
more that have nearly this bad of abuses.

Another step along the path to a TclTest-free testsuite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159523 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-02 12:20:14 +00:00
Rafael Espindola
9c3d5a70f4 Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159509 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-01 17:08:01 +00:00
Elena Demikhovsky
8f40f7b867 Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159504 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-01 06:12:26 +00:00
Jakob Stoklund Olesen
8ccaad526a Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159461 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 21:00:03 +00:00
Rafael Espindola
94e3b388e5 In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 04:22:35 +00:00
Manman Ren
40307c7dbe X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159402 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-29 00:54:20 +00:00
Nuno Lopes
85b408991a add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159383 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-28 22:30:12 +00:00
Jack Carter
7c3cd4d24e The Mips specific inline asm operand modifier 'z' has the
following description in the gnu sources:

    Print $0 if operand is zero otherwise print the op normally.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-28 01:33:40 +00:00
Akira Hatanaka
e3e12450e6 Test case for r159240.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-27 00:40:34 +00:00
Rafael Espindola
275c85f1a7 Fix llc's -print-before=pass and -print-after=pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159227 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 21:33:36 +00:00
Manman Ren
1f7a1b68a0 X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159221 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 19:47:59 +00:00
Jack Carter
0518fca843 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.vi


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 13:49:27 +00:00
Elena Demikhovsky
1596373671 Shuffle optimization for AVX/AVX2.
The current patch optimizes frequently used shuffle patterns and gives these instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
       vextractf128    $1, %ymm1, %xmm1
       vextractf128    $1, %ymm0, %xmm0
       vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
       vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159188 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 08:04:10 +00:00
Andrew Trick
c9b1e25493 Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159183 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-26 04:11:38 +00:00
Eli Friedman
52d418df5d Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159176 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 23:42:33 +00:00
Manman Ren
540cda34b0 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159166 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 21:49:38 +00:00
Jakob Stoklund Olesen
a4e6397fd9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen
5984d2b31f Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159149 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen
82d58b147f %RCX is not a function live-out in eh.return functions.
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159116 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 15:53:01 +00:00
Pete Cooper
b49998d76c DAG legalisation can now handle illegal fma vector types by scalarisation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159092 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-24 00:05:44 +00:00
Hans Wennborg
ce718ff9f4 Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159077 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 11:37:03 +00:00
Rafael Espindola
ce0a5cda8a Handle aliases to tls variables in all architectures, not just x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159058 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 00:30:03 +00:00
Evan Cheng
fc47253294 (sub X, imm) gets canonicalized to (add X, -imm)
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1

rdar://11726136


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159057 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-23 00:29:06 +00:00
Hal Finkel
009f7afbeb Add support for the PPC isel instruction.
The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159045 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 23:10:08 +00:00
Chad Rosier
e5457d2116 FileCheckize tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159044 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 23:04:02 +00:00
Lang Hames
59d454959f Rename fp-op fusion option (yet again) for compatibility with GCC option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159042 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 22:31:00 +00:00
Evan Cheng
c90a1fcf9f EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 20:14:46 +00:00
NAKAMURA Takumi
84f64f317f test/CodeGen/Generic/asm-large-immediate.ll: Mark it as XFAIL: powerpc, possibly due to r158939.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158994 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 13:41:00 +00:00
Jakob Stoklund Olesen
e208c49172 Functions calling __builtin_eh_return must have a frame pointer.
The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame
pointer exists, but the frame pointer was forced by the presence of
llvm.eh.unwind.init which isn't guaranteed.

If llvm.eh.unwind.init is actually required in functions calling
eh.return (is it?), we should diagnose that instead of emitting bad
machine code.

This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158961 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 03:04:27 +00:00
Andrew Trick
ef2d9e59ab ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158959 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 02:50:31 +00:00
Lang Hames
e023141322 Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158956 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-22 01:09:09 +00:00
Jack Carter
4db98becf7 The inline asm operand modifier 'n' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Negate the immediate constant

Several Architectures such as x86 have local implementations
of operand modifier 'n' which go beyond the above description
slightly. This won't affect them.

Affected files:

    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'n' to the switch cases.

    test/CodeGen/Generic/asm-large-immediate.ll
        Generic compiled test (x86 for me)

    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158939 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 21:37:54 +00:00
Akira Hatanaka
54c5bc8799 1. fix null program output after some other changes
2. re-enable null.ll test
3. fix some minor style violations

Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 20:39:10 +00:00
Hal Finkel
2bbc9193b4 Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC.
Thanks to Tobias von Koch for pointing out this problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158932 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 20:10:48 +00:00
Jack Carter
d5e11ad51a The inline asm operand modifier 'c' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several Architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding
local implementation one can make a call to the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change is needed when test/CodeGen/generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 17:14:46 +00:00
NAKAMURA Takumi
e42e9ce20f Revert r158209, "test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911."
It passes according to ppc changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158917 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 13:43:06 +00:00
Lang Hames
dc13d2ed2f Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158902 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 06:10:00 +00:00
Evan Cheng
8ef0968dc2 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158900 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen
c4118452bc Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158873 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen
7824152557 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 18:00:57 +00:00
Hal Finkel
0fcdd8b2cc Add support for generating reg+reg (indexed) pre-inc loads on PPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158823 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 15:43:03 +00:00
Craig Topper
703c38bf58 Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158792 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-20 05:39:26 +00:00
Lang Hames
d693cafcfb Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158757 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen
9efc6d3ac3 Add a triple.
The test was failing on Linux because of asm syntax differences.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158748 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 21:46:25 +00:00
Jakob Stoklund Olesen
7164288c3e Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158743 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 21:14:34 +00:00
Hal Finkel
ac81cc3282 Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 02:34:32 +00:00
Rafael Espindola
565bdbf598 really add a triple :-(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 02:17:35 +00:00
Rafael Espindola
e08c1347d9 Add a triple to the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158695 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 01:42:34 +00:00
Rafael Espindola
d6b43a317e Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-19 00:48:28 +00:00
Manman Ren
eda9fdf979 ARM: use NOEN loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158684 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 22:23:48 +00:00
Joel Jones
96ef284da4 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.  The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.

<rdar://problem/11481151>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158659 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 14:51:32 +00:00
Chandler Carruth
457dfbac8a Add a regression test for the bug exposed by r158087, which has been
temporarily reverted.

This test is annoyingly overspecified, but I don't know of another way
to thoroughly test the saving and restoring of the registers. While this
will have to be adjusted even with the issue fixed in order to re-apply
r158087, those adjustments should very clearly indicate that it is still
correct (%esp getting restored prior to pops), whereas without it, this
case can easily slip under the radar.

Still, any suggestions for improvements are very welcome.

All credit to Matt Beaumont-Gay for reducing this out of an insane
Address Sanitizer crash to a reasonably small seg-faulting C program
when built with -mstackrealign. I just reduced it to IR, which was much
simpler. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 09:15:04 +00:00
Chandler Carruth
43369249e7 Temporarily revert r158087.
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.

Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158654 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-18 07:03:12 +00:00
Hal Finkel
2741d2cfdf Cleanup trip-count finding for PPC CTR loops (and some bug fixes).
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.

In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158607 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-16 20:34:07 +00:00
Manman Ren
307473dec0 ARM: optimization for sub+abs.
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi

For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.

rdar: 11633193


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158551 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 21:32:12 +00:00
Jakob Stoklund Olesen
d628a587b6 Preserve <undef> flags in ARMExpandPseudo.
This probably mostly shows up in bugpoint-generated code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158527 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 17:46:54 +00:00
Akira Hatanaka
1418045472 1. introduce MipsPat in place of Pat in order to exclude those from
being used by Mips16 or Micro Mips
2. clean up a few lines too long encountered

Patch by Reed Kotler.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158470 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 21:03:23 +00:00
Akira Hatanaka
6b0cd9b9c6 Make machine verifier check the first instruction of the last bundle instead of
the last instruction of a basic block.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158468 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 20:51:13 +00:00
Manman Ren
39f5eb1563 Revert: test/CodeGen/ARM/iabs.ll in r158441
Sorry that I accidently checked in this file with my previous commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158442 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 06:04:02 +00:00
Manman Ren
7a0575b9a8 InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158441 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 05:57:42 +00:00
Akira Hatanaka
ce5c6fb55f Test case for MIPS long branch pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158438 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 02:12:21 +00:00
Akira Hatanaka
163d706c46 Fix test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158435 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-14 01:21:00 +00:00
Akira Hatanaka
8782707f50 Implement a DAGCombine in MipsISelLowering.cpp which transforms the following
pattern:

(add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt))

"tjt" is a TargetJumpTable node. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158419 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 20:33:18 +00:00
Akira Hatanaka
e193b32583 Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158414 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 19:33:32 +00:00
Akira Hatanaka
777a120285 Implement fastcc calling convention for MIPS.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 18:06:00 +00:00
Richard Osborne
aa08c8b2ba Fix pattern for MKMSK instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158409 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 17:59:12 +00:00
Craig Topper
cc95b57d42 Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158396 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 07:18:53 +00:00
Akira Hatanaka
942918d13e disable use of directive .set nomicromips
until this directive is pushed in gas to open source fsf

Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158381 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 02:41:14 +00:00
Andrew Trick
1c2d3c538c sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158380 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-13 02:39:03 +00:00
Chad Rosier
49d6fc02ef [arm-fast-isel] Add support for -arm-long-calls.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158368 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-12 19:25:13 +00:00
Jakob Stoklund Olesen
ad27086e66 Fix test that depends on register allocation.
The test is really checking the prolog/epilog load/store multiple
formation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158328 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 21:14:28 +00:00
Jakob Stoklund Olesen
7ed1ad54f2 Fix test case to work on ARM.
Patch by James Benton!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 16:01:14 +00:00
Bill Wendling
ad5c880892 Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-11 08:07:26 +00:00
Hal Finkel
71ffcfe9f8 Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158296 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-10 19:32:29 +00:00
Hal Finkel
0a3e33b633 Improve ext/trunc patterns on PPC64.
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.

Thanks to Jakob and Owen for their suggestions in this matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 22:10:19 +00:00
Craig Topper
c29106b36f Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 16:46:13 +00:00
Hal Finkel
8bf75ed41c Enable tail merging on PPC.
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).

Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%

Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%

This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158259 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-09 03:14:50 +00:00
Jakob Stoklund Olesen
6660ed5f2f Don't run RAFast in the optimizing regalloc pipeline.
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.

Fast register allocation in the optimizing pipeline is better done by
RABasic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 23:15:12 +00:00
Hal Finkel
7255d2a808 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158223 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 19:19:53 +00:00
Manman Ren
6620ccf5d8 Test case for r158160
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158218 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:42:37 +00:00
Chad Rosier
28dd960cd1 Fix a crash in APInt::lshr when shiftAmt > BitWidth.
Patch by James Benton <jbenton@vmware.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158213 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 18:04:52 +00:00
NAKAMURA Takumi
7c18922f85 test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158209 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 16:28:06 +00:00
Hal Finkel
09fdc7baae Disable the PPC CTR-Loops pass by default.
The pass itself works well, but the something in the Machine* infrastructure
does not understand terminators which define registers. Without the ability
to use the block-placement pass, etc. this causes performance regressions (and
so is turned off by default). Turning off the analysis turns off the problems
with the Machine* infrastructure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158206 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:25 +00:00
Hal Finkel
daa03ec604 Fix a bug in the new PPC CTR-Loops pass.
The code which tests for an induction operation cannot assume that any
ADDI instruction will have a register operand because the operand could
also be a frame index; for example:
    %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158205 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:23 +00:00
Hal Finkel
99f823f943 Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code.
This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon
pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are
no longer otherwise used. Also, invalid preheader DebugLoc is not used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158204 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 15:38:21 +00:00
Manman Ren
9236362a64 X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.

rdar://10695237


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158175 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 22:39:10 +00:00
Rafael Espindola
c07f5bbd3b Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158158 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 18:39:19 +00:00
Manman Ren
87253c2ebd X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158126 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 00:42:47 +00:00
Manman Ren
2afde7782d Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158122 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06 23:53:03 +00:00
Chad Rosier
a97b180fc4 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158087 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06 17:37:40 +00:00
Joel Jones
e061053051 Revert commit r157966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157972 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-05 00:47:21 +00:00
Joel Jones
dd52bf2ed8 This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157966 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 23:38:57 +00:00
Akira Hatanaka
f0378f4381 Add a test case for mips64 unaligned load/store instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157939 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 17:57:06 +00:00
Akira Hatanaka
8c345b4567 Rename test/CodeGen/Mips/load-shift-left-right.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157938 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 17:50:36 +00:00
Roman Divacky
fd42ed676e Implement local-exec TLS on PowerPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157935 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 17:36:38 +00:00
Nadav Rotem
fcb2c3cf5e Remove the "-promote-elements" flag. This flag is now enabled by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157925 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 11:27:21 +00:00
Hal Finkel
77838f9ca9 Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157911 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-04 02:21:00 +00:00
Craig Topper
a15f9d5311 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157903 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 18:58:46 +00:00
Craig Topper
529ce07c5f Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157898 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 07:26:46 +00:00
Manman Ren
c73ea9102b Revert r157831
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157896 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 03:14:24 +00:00
Craig Topper
57ae246a6a Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157895 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-03 01:40:43 +00:00
Manman Ren
be4258adc5 ARM: add testing case for struct byval
rdar://9877866


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157876 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 05:37:44 +00:00
Akira Hatanaka
3309dbab30 Add another test case which tests Mips' unaligned load/store instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157874 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 01:13:10 +00:00
Akira Hatanaka
bdd26783b8 Fix test cases in test/CodeGen/Mips.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157868 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-02 00:05:45 +00:00
Manman Ren
73c2f7f5ed X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157831 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 19:49:33 +00:00
Chris Lattner
2b76473929 testcase for PR13006, thanks to Duncan for filing it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 18:19:46 +00:00
Hans Wennborg
f0234fcbc9 Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 16:27:21 +00:00
Craig Topper
3a8172ad8d Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157804 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 06:07:48 +00:00
Chris Lattner
f59e4e3452 enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157800 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:29:15 +00:00
Chris Lattner
5b0d946537 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157798 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:16:33 +00:00
Chris Lattner
09c14c0836 add some simple 64-bit tail call tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157797 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:03:31 +00:00
Chris Lattner
e8ea60b8ba merge some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157795 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 05:00:54 +00:00
Chris Lattner
e109648880 rename test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157794 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-01 04:58:50 +00:00
Owen Anderson
a835f00702 Make this testcase independent of register allocation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157761 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 18:07:02 +00:00
Manman Ren
91c5346d91 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157755 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 17:20:29 +00:00
Elena Demikhovsky
177cf1e1a3 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157737 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 09:20:20 +00:00
Craig Topper
0559a2f8ae Add intrinsic for pclmulqdq instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157731 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-31 04:37:40 +00:00
Jakob Stoklund Olesen
9cda1be0aa Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157715 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 21:46:58 +00:00
Eric Christopher
6ab75b4dcb Add support for the mips inline asm 'm' output modifier.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157709 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 19:05:19 +00:00
Owen Anderson
f917d20561 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157708 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 18:54:50 +00:00
Owen Anderson
85ef6f4c99 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157707 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 18:50:39 +00:00
Chris Lattner
f186df0d3e it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157699 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 18:08:02 +00:00
Chris Lattner
5aaabbfe62 Extend the (abi-irrelevant) return convention to be able to return more than two values in
integer registers.  This is already supported by the fastcc convention, but it doesn't
hurt to support it in the standard conventions as well.

In cases where we can cheat at the calling convention, this allows us to avoid returning
things through memory in more cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 17:50:14 +00:00
Chad Rosier
ada759d5fa [arm-fast-isel] Add support for the llvm.frameaddress() intrinsic.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157696 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 17:23:22 +00:00
Evan Cheng
eb25bd2356 Teach taildup to update livein set. rdar://11538365
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157663 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 00:42:39 +00:00
Bob Wilson
6e1b812850 Add an insertPass API to TargetPassConfig. <rdar://problem/11498613>
Besides adding the new insertPass function, this patch uses it to
enhance the existing -print-machineinstrs so that the MachineInstrs
after a specific pass can be printed.

Patch by Bin Zeng!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157655 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-30 00:17:12 +00:00
Benjamin Kramer
1386e9b7b1 Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157634 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-29 19:05:25 +00:00
Peter Collingbourne
b34d3aa35b Add llvm.fabs intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157594 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-28 21:48:37 +00:00
Chris Lattner
c32cef6aa1 These tests used intrinsics with the wrong prototype. They weren't caught because
the old verifier just checked that something "was a pointer", but not that the pointee
was correct.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157544 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-27 19:35:41 +00:00
Benjamin Kramer
c511b2a5a1 SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157521 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-26 20:01:32 +00:00
Justin Holewinski
968b09d03f [NVPTX] Add a new test case for the newly-enabled call handling
NV_CONTRIB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157485 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 17:20:38 +00:00
NAKAMURA Takumi
f755e0001a test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157474 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 15:40:54 +00:00
NAKAMURA Takumi
a389a23bbb test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157471 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 15:12:21 +00:00
Eli Friedman
2db0e9ebb6 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157446 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-25 00:09:29 +00:00
David Blaikie
28a5ab2fb4 Fix for CHECK-NOT misspelling.
Patch by Nicklas Bo Jensen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157421 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-24 22:08:29 +00:00
Justin Holewinski
42a0b48dd3 Remove the PTX back-end and all of its artifacts (triple, etc.)
This back-end was deprecated in favor of the NVPTX back-end.

NV_CONTRIB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157417 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-24 21:38:21 +00:00
Akira Hatanaka
c784395a79 Turn on mips16 pseudo op when compiling for mips16.
Expand test case for this.

Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157410 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-24 18:37:43 +00:00
Akira Hatanaka
4a5a8949cd Enable Mips16 compiler to compile a null program.
First code from the Mips16 compiler. Includes trivial test program.

Patch by Reed Kotler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157408 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-24 18:32:33 +00:00
Jakob Stoklund Olesen
39867e6646 Add a test case for global live range splitting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157357 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-23 23:42:23 +00:00
Jakob Stoklund Olesen
d74d284757 Add a last resort tryInstructionSplit() to RAGreedy.
Live ranges with a constrained register class may benefit from splitting
around individual uses. It allows the remaining live range to use a
larger register class where it may allocate. This is like spilling to a
different register class.

This is only attempted on constrained register classes.

<rdar://problem/11438902>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157354 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-23 22:37:27 +00:00
Jakob Stoklund Olesen
e3b548219f Correctly deal with identity copies in RegisterCoalescer.
Now that the coalescer keeps live intervals and machine code in sync at
all times, it needs to deal with identity copies differently.

When merging two virtual registers, all identity copies are removed
right away. This means that other identity copies must come from
somewhere else, and they are going to have a value number.

Deal with such copies by merging the value numbers before erasing the
copy instruction. Otherwise, we leave dangling value numbers in the live
interval.

This fixes PR12927.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157340 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-23 20:21:06 +00:00
Chad Rosier
1c8fccbc12 [arm-fast-isel] Add support for non-global callee.
Patch by Jush Lu <jush.msn@gmail.com>.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-23 18:38:57 +00:00
Nuno Lopes
23e75da7e0 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157255 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen
76ff741836 Only erase virtregs with no uses left.
Also make sure registers aren't erased twice if the dead def mentions
the register twice.

This fixes PR12911.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157254 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-22 14:52:12 +00:00
Jim Grosbach
6b07bc604e FileCheck'ize test, and add a bit to test for r157221.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157222 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-21 23:50:00 +00:00
Craig Topper
8ae97baef2 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157175 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-21 06:40:16 +00:00
Peter Collingbourne
92d63ccfc7 When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157162 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 18:36:15 +00:00
Hal Finkel
2e8e5c0eca Add a missing PPC 64-bit stwu pattern.
This seems to fix the remaining compile-time failures on PPC64 when
compiling with -enable-ppc-preinc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157159 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 17:11:24 +00:00
Jakob Stoklund Olesen
027c32a14e Use the right register class for LDRrs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen
6e6269a976 Transfer memory operands to the right instruction.
They need to go on the PICLDR as the verifier points out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157151 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:42 +00:00
Jakob Stoklund Olesen
ee0d5d4398 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen
8e86929e3c Properly constrain register classes in 2-addr.
X86 has 2-addr instructions with different constraints on the tied def
and use operands. One is GR32, one is GR32_NOSP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157149 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-20 06:38:32 +00:00
Jakob Stoklund Olesen
7ebed91fdd Fix 12892.
Dead code elimination during coalescing could cause a virtual register
to be split into connected components. The following rewriting would be
confused about the already joined copies present in the code, but
without a corresponding value number in the live range.

Erase all joined copies instantly when joining intervals such that the
MI and LiveInterval representations are always in sync.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157135 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 23:34:59 +00:00
Jakob Stoklund Olesen
ccce1233a2 Erase joined copies immediately.
The late dead code elimination is no longer necessary.

The test changes are cause by a register hint that can be either %rdi or
%rax. The choice depends on the use list order, which this patch changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157131 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 20:54:07 +00:00
Nadav Rotem
87d35e8c71 On Haswell, perfer storing YMM registers using a single instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157129 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 20:30:08 +00:00
Nadav Rotem
4fc8a5de44 Add support for additional in-reg vbroadcast patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157127 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 19:57:37 +00:00
Eric Christopher
75f89b54b5 Add support for the 'd' mips inline asm output modifier.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-19 00:51:56 +00:00
Jim Grosbach
3e96531186 Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157062 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 19:12:01 +00:00
Eric Christopher
550c25e974 Add support for the mips 'x' inline asm modifier.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157057 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 17:39:35 +00:00
Craig Topper
b82b5abf78 Simplify handling of v16i8 shuffles and fix a missed optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157043 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 06:42:06 +00:00
Evan Cheng
ad75364815 Teach two-address pass to update the "source" map so it doesn't perform a
non-profitable commute using outdated info. The test case would still fail
because of poor pre-RA schedule. That will be fixed by MI scheduler.

rdar://11472010


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157038 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 01:33:51 +00:00
Jakob Stoklund Olesen
0e5e821a69 Remove a test that was only testing for physreg joining.
This is the same as the other tests: Clever tricks are required to make
the arguments and return value line up in a single-instruction function.
It rarely happens in real life.

We have plenty other examples of this behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157030 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen
ed18a3e6b2 Remove -join-physregs from the test suite.
This option has been disabled for a while, and it is going away so I can
clean up the coalescer code.

The tests that required physreg joining to be enabled were almost all of
the form "tiny function with interference between arguments and return
value". Such functions are usually inlined in the real world.

The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly
rare.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157027 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-17 23:44:19 +00:00
Tim Northover
44600d7081 Remove incorrect pattern for ARM SMML instruction.
Patch by Meador Inge.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156989 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-17 13:12:13 +00:00
Evan Cheng
6100366c2f Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156896 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen
83b3a29334 Enable sub-sub-register copy coalescing.
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:

  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]

We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156878 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 23:31:35 +00:00
Sirish Pande
0031af6615 Enable all Hexagon tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-15 16:13:12 +00:00
Jakob Stoklund Olesen
4d10829e12 Fix PR12821.
RAFast must add an <imp-def> operand when it is rewriting a sub-register
def that isn't a read-modify-write.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156777 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 21:10:25 +00:00
Brendon Cahoon
5262abb268 Revert 156634 upon request until code improvement changes are made.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156775 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 19:35:42 +00:00
Dan Gohman
a6063c6e29 Rename @llvm.debugger to @llvm.debugtrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156774 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-14 18:58:10 +00:00
Sirish Pande
b33857040f Support for Hexagon feature, New Value Jump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156698 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 05:10:30 +00:00
Akira Hatanaka
1da1cdf504 Fix test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156697 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 03:25:16 +00:00
Akira Hatanaka
4147e4d054 Make the following changes in MipsAsmPrinter.cpp:
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
  initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156689 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 00:48:43 +00:00
Akira Hatanaka
27ba61df9f Insert instructions to the entry basic block which initializes the global
pointer register. 


This is the first of the series of patches which clean up the way global pointer
register is used. The patches will make the following improvements:

- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
  allocator decide which register to assign to it or whether spill/reloads are
  needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
  for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156671 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-12 00:17:17 +00:00
Akira Hatanaka
3011a33a63 Do not replace operands of pseudo instructions with register $zero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156663 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 23:22:18 +00:00
Akira Hatanaka
f5b1e2802f Use regular expression to match register names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156656 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 23:00:40 +00:00
Chad Rosier
226ddf5278 [fast-isel] Add support for selecting @llvm.trap().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156646 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 21:33:49 +00:00
Brendon Cahoon
6d532d8860 Hexagon constant extender support.
Patch by Jyotsna Verma.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156634 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 19:56:59 +00:00
Chad Rosier
2b3b335f2d [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156632 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 19:40:25 +00:00
Chad Rosier
2a2e9d54e9 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156628 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 18:51:55 +00:00
Hans Wennborg
12447575dc Fix test/CodeGen/X86/tls-pie.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156612 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 10:19:54 +00:00
Hans Wennborg
228756c744 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156611 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 10:11:01 +00:00
Manman Ren
247c5ab07c ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156599 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 01:30:47 +00:00
Dan Gohman
d4347e1af9 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156593 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-11 00:19:32 +00:00
Eric Christopher
05b7a50210 Add support for the 'X' inline asm operand modifier.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156577 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 21:48:22 +00:00
Sirish Pande
7517bbc91a Hexagon V5 FP Support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156568 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 20:20:25 +00:00
Manman Ren
fe65d98dad Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156556 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 18:49:43 +00:00
Manman Ren
8ae4f062e4 ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156550 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 16:48:21 +00:00
Nadav Rotem
b210651654 AVX2: Add an additional broadcast idiom.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156540 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 12:39:13 +00:00
Nadav Rotem
5fc2187a02 Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.

Fix PR11900.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156539 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-10 12:22:05 +00:00
Danil Malyshev
fd39afb5e1 Added a regress test for the bug #9964 before close it.
This bug was fixed by Jim Grosbach in #138879, thanks Jim!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156505 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 19:07:04 +00:00
Nuno Lopes
30759542aa change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156473 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 15:52:43 +00:00
Akira Hatanaka
2b409b65d4 Add another peephole pattern for conditional moves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156460 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 02:29:29 +00:00
Akira Hatanaka
56ec9f2c09 Make register FP allocatable if the compiled function does not have dynamic
allocas.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156458 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 01:38:13 +00:00
Akira Hatanaka
a284acb8a7 Expand 64-bit shifts if target ABI is O32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156457 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-09 00:55:21 +00:00
Craig Topper
189bce48c7 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156375 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-08 06:58:15 +00:00
Owen Anderson
713e953118 Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 20:51:25 +00:00
Chad Rosier
42726835e3 Fix a regression from r147481. This combine should only happen if there is a
single use.
rdar://11360370


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 18:47:44 +00:00
Manman Ren
ed57984483 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 18:06:23 +00:00
Eric Christopher
4adbefebd2 Add support for the 'l' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156294 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 06:25:15 +00:00
Eric Christopher
1d5a392e2c Add support for the 'c' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156293 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 06:25:10 +00:00
Eric Christopher
54412a789a Add support for the 'P' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 06:25:02 +00:00
Eric Christopher
1ce2034e43 Add support for the 'O' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156285 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:48 +00:00
Eric Christopher
60cfc7908e Add support for the 'N' inline asm constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156284 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:43 +00:00
Eric Christopher
5ac47bba83 Add support for the 'L' inline asm constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156283 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:37 +00:00
Eric Christopher
f49f846eec Add support for the inline asm constraint 'K'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156282 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:46:29 +00:00
Craig Topper
5f9cccc509 Add SSE4A MOVNTSS/MOVNTSD instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156281 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 05:36:19 +00:00
Eric Christopher
e5076d484b Support the 'J' constraint.
Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156280 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 03:13:42 +00:00
Eric Christopher
50ab03954e Add support for the 'I' inline asm constraint. Also add tests
from the previous 2 patches.

Patch by Jack Carter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156279 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-07 03:13:32 +00:00
Benjamin Kramer
77c4ef8a47 Switch the select to branch transformation on by default.
The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156258 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-06 14:25:16 +00:00
Benjamin Kramer
59957500f9 CodeGenPrepare: Add a transform to turn selects into branches in some cases.
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic, it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-Order architectures are
unlikely to benefit as well, those are included in the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.


Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156234 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-05 12:49:22 +00:00
Justin Holewinski
49683f3c96 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156196 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04 20:18:50 +00:00
Sebastian Pop
2c7e5c714c Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156195 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04 19:53:56 +00:00
Craig Topper
f3640d7ec1 Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156156 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-04 04:44:49 +00:00
Sirish Pande
26f61a158b Support for target dependent Hexagon VLIW packetizer.
This patch creates and optimizes packets as per Hexagon ISA rules.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156109 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 21:52:53 +00:00
Craig Topper
6b28d356c5 Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156059 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 07:12:59 +00:00
Evan Cheng
d99d68bcee Fix two-address pass's aggressive instruction commuting heuristics. It's meant
to catch cases like:
 %reg1024<def> = MOV r1
 %reg1025<def> = MOV r0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

By commuting ADD, it let coalescer eliminate all of the copies. However, there
was a bug in the heuristics where it ended up commuting the ADD in:

 %reg1024<def> = MOV r0
 %reg1025<def> = MOV 0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

That did no benefit but rather ensure the last MOV would not be coalesced.

rdar://11355268


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156048 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-03 01:45:13 +00:00
Owen Anderson
062c0a5b58 Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156029 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 22:17:40 +00:00
Owen Anderson
363e4b90c0 Teach DAG combine that multiplication by 1.0 can always be constant folded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 21:32:35 +00:00
Manman Ren
e2849851b2 Revert r155853
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155992 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 15:24:32 +00:00
Craig Topper
a9a568a79d Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155982 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-02 08:03:44 +00:00
Bill Wendling
95dd442041 Strip the pointer casts off of allocas so that the selection DAG can find them.
PR10799


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155954 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 22:50:45 +00:00
Manman Ren
769ea2f93f X86: optimization for max-like struct
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0

FROM
movl    %edi, %ecx
subl    %esi, %ecx
cmpl    %edi, %esi
movl    $0, %eax
cmovll  %ecx, %eax
TO
xorl    %eax, %eax
subl    %esi, %edi
cmovll  %eax, %edi
movl    %edi, %eax

rdar: 10734411


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155919 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 17:16:15 +00:00
Jay Foad
4ab6fcaadc Regression test for PR2960.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155912 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-01 11:11:34 +00:00
Manman Ren
16a76519a5 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155853 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 22:51:25 +00:00
Manman Ren
1701105022 test/CodeGen/X86/select.ll: remove spaces
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155840 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 18:54:27 +00:00
Derek Schuff
ddc693bd22 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

(this time, actually commit what was reviewed!)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155825 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 16:57:15 +00:00
Bob Wilson
ff73d8fef9 Don't introduce illegal types when creating vmull operations. <rdar://11324364>
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155824 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-30 16:53:34 +00:00
Andrew Trick
2674a4acdb Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.
This time, also fix the caller of AddGlue to properly handle
incomplete chains. AddGlue had failure modes, but shamefully hid them
from its caller. It's luck ran out.

Fixes rdar://11314175: BuildSchedUnits assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155749 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-28 01:03:23 +00:00
Derek Schuff
f3db6b855e Revert r155745
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155746 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 23:37:41 +00:00
Derek Schuff
9dc28b0722 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155745 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-27 23:27:17 +00:00