18908 Commits

Author SHA1 Message Date
Manman Ren
6a0ccf041c Aliasing rules for struct-path aware TBAA.
Added PathAliases to check if two struct-path tags can alias.
Added command line option -struct-path-tbaa.

llvm-svn: 179337
2013-04-11 23:24:18 +00:00
Preston Gurd
8e4196dfe6 Use FileCheck instead of grep.
llvm-svn: 179322
2013-04-11 21:39:01 +00:00
David Majnemer
82ec1d080e Optimize icmp involving addition better
Allows LLVM to optimize sequences like the following:

%add = add nsw i32 %x, 1
%cmp = icmp sgt i32 %add, %y

into:

%cmp = icmp sge i32 %x, %y

as well as:

%add1 = add nsw i32 %x, 20
%add2 = add nsw i32 %y, 57
%cmp = icmp sge i32 %add1, %add2

into:

%add = add nsw i32 %y, 37
%cmp = icmp sle i32 %cmp, %x

llvm-svn: 179316
2013-04-11 20:05:46 +00:00
Jack Carter
834846d7a8 Mips specific inline asm memory operand modifier test case
These changes are based on commit responses for r179135.

llvm-svn: 179315
2013-04-11 19:39:19 +00:00
Rafael Espindola
6c9d485a40 Revert my last two commits while I debug what is wrong in a big endian host.
llvm-svn: 179303
2013-04-11 17:46:10 +00:00
Rafael Espindola
857bda0e10 Print more information about relocations.
With this patch llvm-readobj now prints if a relocation is pcrel, its length,
if it is extern and if it is scattered.

It also refactors the code a bit to use bit fields instead of shifts and
masks all over the place.

llvm-svn: 179294
2013-04-11 16:31:37 +00:00
Benjamin Kramer
3b38288ea2 Fix for wrong instcombine on vector insert/extract
When trying to collapse sequences of insertelement/extractelement
instructions into single shuffle instructions, there is one specific
case where the Instruction Combiner wrongly updates the resulting
Mask of shuffle indexes.

The problem is in function CollectShuffleElments.

If we have a sequence of insert/extract element instructions
like the one below:

  %tmp1 = extractelement <4 x float> %LHS, i32 0
  %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
  %tmp3 = extractelement <4 x float> %RHS, i32 2
  %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3

Where:
  . %RHS will have a mask of [4,5,6,7]
  . %LHS will have a mask of [0,1,2,3]

The Mask of shuffle indexes is wrongly computed to [4,1,6,7]
instead of [4,0,6,7].
When analyzing %tmp2 in order to compute the Mask for the
resulting shuffle instruction, the algorithm forgets to update
the mask index at position 1 with the index associated to the
element extracted from %LHS by instruction %tmp1.

Patch by Andrea DiBiagio!

llvm-svn: 179291
2013-04-11 15:10:09 +00:00
Eli Bendersky
0ce49fd520 Add a CHECK-NOT for a more faithful translation of the original grep | count 2.
Thanks to Reid Kleckner for catching this.

llvm-svn: 179289
2013-04-11 14:43:19 +00:00
Benjamin Kramer
f15ba24b8d Add missing colons to check lines.
llvm-svn: 179277
2013-04-11 12:41:41 +00:00
Benjamin Kramer
4413e71a39 FileCheckize a bunch of tests.
llvm-svn: 179276
2013-04-11 12:32:23 +00:00
Michael Liao
877d1576e6 Optimize vector select from all 0s or all 1s
As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane,
vector select could be simplified to AND/OR or removed if one or both values
being selected is all 0s or all 1s.

llvm-svn: 179267
2013-04-11 05:15:54 +00:00
Michael Liao
75c886a312 Add CLAC/STAC instruction encoding/decoding support
As these two instructions in AVX extension are privileged instructions for
special purpose, it's only expected to be used in inlined assembly.

llvm-svn: 179266
2013-04-11 04:52:28 +00:00
Michael Liao
87125582e9 Enhance bool simplifcation in X86 to handle more cases
This patch is revised based on patch from Victor Umansky
<victor.umansky@intel.com>. More cases are handled in X86's bool
simplification, i.e.
- SETCC_CARRY
- value is truncated to i1 with AND

As a by-product, PR5443 is also fixed.

llvm-svn: 179265
2013-04-11 04:43:09 +00:00
Rafael Espindola
3ac0776aa4 Add MachO-x86-64 tests.
The object was already checked in, but was not being tested.

llvm-svn: 179256
2013-04-11 02:52:29 +00:00
Eli Bendersky
90daaa543a Rewrite some of the test/CodeGen/X86 tests to use FileCheck instead of grep
llvm-svn: 179241
2013-04-10 23:30:20 +00:00
Nico Rieck
8e22855ea6 MC: Support COFF image-relative MCSymbolRefs
Add support for the COFF relocation types IMAGE_REL_I386_DIR32NB and
IMAGE_REL_AMD64_ADDR32NB for 32- and 64-bit respectively. These are
similar to normal 4-byte relocations except that they do not include
the base address of the image.

Image-relative relocations are used for debug information (32-bit) and
SEH unwind tables (64-bit).

A new MCSymbolRef variant called 'VK_COFF_IMGREL32' is introduced to
specify such relocations. For AT&T assembly, this variant can be accessed
using the symbol suffix '@imgrel'.

llvm-svn: 179240
2013-04-10 23:28:17 +00:00
Hal Finkel
03d47320aa Manually remove successors in if conversion when CopyAndPredicateBlock is used
In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is
used because the to-be-predicated block has other predecessors, we need to
explicitly remove the old copied block from the successors list. Normally if
conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges
to cleanup the successors list, but if the predicated block contained an
un-analyzable branch (such as a now-predicated return), then this will fail.

These extra successors were causing a problem on PPC because it was causing
later passes (such as PPCEarlyReturm) to leave dead return-only basic blocks in
the code.

llvm-svn: 179227
2013-04-10 22:05:25 +00:00
Jack Carter
8e319ed798 Mips specific inline asm memory operand modifier test case
These changes are based on commit responses for r179135.

llvm-svn: 179225
2013-04-10 22:02:32 +00:00
Kay Tiong Khoo
5f12d15d44 fixed xsave, xsaveopt, xrstor mnemonics with intel syntax; added test cases
llvm-svn: 179223
2013-04-10 21:52:25 +00:00
Eric Christopher
d419b3e326 Revert "Update the version of dwarf we say we're emitting to at least 3."
temporarily while we work on plumbing through some changes to continue
supporting gdb on darwin.

This reverts commit r179122.

llvm-svn: 179222
2013-04-10 21:45:07 +00:00
Jyotsna Verma
b244dff2f1 Add object-emission flag for lit tests. This flag is used
to disable following tests for Hexagon that require direct object
generation support.

DebugInfo/dwarf-public-names.ll
DebugInfo/dwarf-version.ll
DebugInfo/member-pointers.ll
DebugInfo/namespace.ll
DebugInfo/two-cus-from-same-file.ll

Fixes bug 15616 - http://llvm.org/bugs/show_bug.cgi?id=15616

llvm-svn: 179209
2013-04-10 19:53:26 +00:00
Nadav Rotem
6a6b998435 Make the SLP store-merger less paranoid about function calls. We check for function calls when we check if it is safe to sink instructions.
llvm-svn: 179207
2013-04-10 19:41:36 +00:00
Michel Danzer
c1562afdde R600/SI: Add pattern for AMDGPUurecip
21 more little piglits with radeonsi.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 179186
2013-04-10 17:17:56 +00:00
Reed Kotler
68e5128508 This is for an experimental option -mips-os16. The idea is to compile all
Mips32 code as Mips16 unless it can't be compiled as Mips 16. For now this
would happen as long as floating point instructions are not needed.
Probably it would also make sense to compile as mips32 if atomic operations
are needed too. There may be other cases too.

A module pass prescans the IR and adds the mips16 or nomips16 attribute
to functions depending on the functions needs.

Mips 16 mode can result in a 40% code compression by utililizing 16 bit
encoding of many instructions.

The hope is for this to replace the traditional gcc way of dealing with
Mips16 code using floating point which involves essentially using soft float
but with a library implemented using mips32 floating point. This gcc 
method also requires creating stubs so that Mips32 code can interact with
these Mips 16 functions that have floating point needs. My conjecture is
that in reality this traditional gcc method would never win over this
new method.

I will be implementing the traditional gcc method also. Some of it is already
done but I needed to do the stubs to finish the work and those required
this mips16/32 mixed mode capability.

I have more ideas for to make this new method much better and I think the old
method will just live in llvm for anyone that needs the backward compatibility
but I don't for what reason that would be needed.

llvm-svn: 179185
2013-04-10 16:58:04 +00:00
Peter Collingbourne
672b843bca Use a scheme closer to that of GNU as when deciding the type of a
symbol with multiple .type declarations.

Differential Revision: http://llvm-reviews.chandlerc.com/D607

llvm-svn: 179184
2013-04-10 16:52:15 +00:00
Vincent Lejeune
daa1e69206 R600: Add VTX_READ_* and RAT_WRITE_CACHELESS_* when computing cf addr
llvm-svn: 179174
2013-04-10 13:29:20 +00:00
Reid Kleckner
724d65c212 [test] Use lit's shell test runner on Windows
Summary:
I did a local comparison between using bash and using lit's runner, and
more of the suite passes with lit than passes with bash.  Most of the
bash failures have to do with /dev/null, which is nonsensical on
Windows, but the lit runner handles it.

The lit shell runner is also much faster than bash, so I would expect
most Windows devs would want it by default.

The behavior can be overridden on any OS by setting
LIT_USE_INTERNAL_SHELL to 0 or 1 in the environment.

Reviewers: chapuni, ddunbar

CC: llvm-commits, timurrrr

Differential Revision: http://llvm-reviews.chandlerc.com/D559

llvm-svn: 179173
2013-04-10 13:11:38 +00:00
Tim Northover
b82f729eb5 ARM: Make "SMC" instructions conditional on new TrustZone architecture feature.
These instructions aren't universally available, but depend on a specific
extension to the normal ARM architecture (rather than, say, v6/v7/...) so a new
feature is appropriate.

This also enables the feature by default on A-class cores which usually have
these extensions, to avoid breaking existing code and act as a sensible
default.

llvm-svn: 179171
2013-04-10 12:08:35 +00:00
Christian Konig
f40f671bab R600/SI: dynamical figure out the reg class of MIMG
Depending on the number of bits set in the writemask.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179166
2013-04-10 08:39:16 +00:00
Christian Konig
76cd1a76c2 R600/SI: adjust writemask to only the used components
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179165
2013-04-10 08:39:08 +00:00
Christian Konig
ffddac18a4 R600/SI: remove image sample writemask
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 179164
2013-04-10 08:39:01 +00:00
Evan Cheng
9f82233851 __sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in
xmm0 / xmm1.

rdar://13599493

llvm-svn: 179141
2013-04-10 01:26:07 +00:00
Jack Carter
03f8f98410 Mips specific inline asm operand modifier 'D'
Modifier 'D' is to use the second word of a double integer.

We had previously implemented the pure register varient of 
the modifier and this patch implements the memory reference.



#include "stdio.h"

int b[8] = {0,1,2,3,4,5,6,7};
void main()
{
    int i;
    
    // The first word. Notice, no 'D'
    {asm (
    "lw    %0,%1;"
    : "=r" (i)
    : "m" (*(b+4))
    );}
    
    printf("%d\n",i);

    // The second word
    {asm (
    "lw    %0,%D1;"
    : "=r" (i)
    : "m" (*(b+4))
    );}
    
    printf("%d\n",i);
}

llvm-svn: 179135
2013-04-09 23:19:50 +00:00
Hal Finkel
8b05494b58 Allow PPC B and BLR to be if-converted into some predicated forms
This enables us to form predicated branches (which are the same conditional
branches we had before) and also a larger set of predicated returns (including
instructions like bdnzlr which is a conditional return and loop-counter
decrement all in one).

At the moment, if conversion does not capture all possible opportunities. A
simple example is provided in early-ret2.ll, where if conversion forms one
predicated return, and then the PPCEarlyReturn pass picks up the other one. So,
at least for now, we'll keep both mechanisms.

llvm-svn: 179134
2013-04-09 22:58:37 +00:00
Eric Christopher
8a0c2a7dbd Update the version of dwarf we say we're emitting to at least 3.
Deals with a dwarf2 -> dwarf3 DW_FORM_ref_addr change.

llvm-svn: 179122
2013-04-09 20:22:47 +00:00
Reed Kotler
9b753510a5 This patch enables llvm to switch between compiling for mips32/mips64
and mips16 on a per function basis.

Because this patch is somewhat involved I have provide an overview of the
key pieces of it.

The patch is written so as to not change the behavior of the non mixed
mode. We have tested this a lot but it is something new to switch subtargets
so we don't want any chance of regression in the mainline compiler until
we have more confidence in this.

Mips32/64 are very different from Mip16 as is the case of ARM vs Thumb1.
For that reason there are derived versions of the register info, frame info, 
instruction info and instruction selection classes.

Now we register three separate passes for instruction selection.
One which is used to switch subtargets (MipsModuleISelDAGToDAG.cpp) and then
one for each of the current subtargets (Mips16ISelDAGToDAG.cpp and
MipsSEISelDAGToDAG.cpp).

When the ModuleISel pass runs, it determines if there is a need to switch
subtargets and if so, the owning pointers in MipsTargetMachine are
appropriately changed.

When 16Isel or SEIsel is run, they will return immediately without doing
any work if the current subtarget mode does not apply to them.

In addition, MipsAsmPrinter needs to be reset on a function basis.

The pass BasicTargetTransformInfo is substituted with a null pass since the
pass is immutable and really needs to be a function pass for it to be
used with changing subtargets. This will be fixed in a follow on patch.

llvm-svn: 179118
2013-04-09 19:46:01 +00:00
Nadav Rotem
96f8f45bd5 Add support for bottom-up SLP vectorization infrastructure.
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations.
The infrastructure has three potential users:

  1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).

  2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.

  3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.

This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code:

void SAXPY(int *x, int *y, int a, int i) {
  x[i]   = a * x[i]   + y[i];
  x[i+1] = a * x[i+1] + y[i+1];
  x[i+2] = a * x[i+2] + y[i+2];
  x[i+3] = a * x[i+3] + y[i+3];
}

llvm-svn: 179117
2013-04-09 19:44:35 +00:00
Eric Christopher
48e215c890 The .dwo section shouldn't contain the unrelocated values (and
therefore not at all) of the pc or statement list. We also don't
need to emit the compilation dir so save so space and time
and don't bother.

Fix up the testcase accordingly and verify that we don't emit
the attributes or the items that they use.

llvm-svn: 179114
2013-04-09 19:23:15 +00:00
Nadav Rotem
8743b338cb Revert r176408 and r176407 to address PR15540.
llvm-svn: 179111
2013-04-09 18:16:05 +00:00
Benjamin Kramer
788f55c7d4 DAGCombiner: Fold a shuffle on CONCAT_VECTORS into a new CONCAT_VECTORS if possible.
This pattern occurs in SROA output due to the way vector arguments are lowered
on ARM.

The testcase from PR15525 now compiles into this, which is better than the code
we got with the old scalarrepl:
_Store:
	ldr.w	r9, [sp]
	vmov	d17, r3, r9
	vmov	d16, r1, r2
	vst1.8	{d16, d17}, [r0]
	bx	lr

Differential Revision: http://llvm-reviews.chandlerc.com/D647

llvm-svn: 179106
2013-04-09 17:41:43 +00:00
Hal Finkel
c72f2476a9 Use virtual base registers on PPC
On PowerPC, non-vector loads and stores have r+i forms; however, in functions
with large stack frames these were not being used to access slots far from the
stack pointer because such slots were out of range for the signed 16-bit
immediate offset field. This increases register pressure because we need a
separate register for each offset (when the r+r form is used). By enabling
virtual base registers, we can deal with large stack frames without unduly
increasing register pressure.

llvm-svn: 179105
2013-04-09 17:27:09 +00:00
Hal Finkel
68742d4f1f Convert test PowerPC/2007-09-07-LoadStoreIdxForms to FileCheck
llvm-svn: 179104
2013-04-09 17:26:55 +00:00
Eli Bendersky
3fbc32c278 Rewrite test/Linker tests to use FileCheck instead of grep.
Some translations here are not 1x1 because there are grep|grep
chains that are non-trivial to implement in terms of FileCheck features. I
made an effort for the tests to remain as similar as possible; do let me know
if you notice anything fishy. The good news are that some buggy tests were
fixed (grep | not grep - a bug waiting to happen).

llvm-svn: 179102
2013-04-09 16:51:13 +00:00
Michael Gottesman
66864f3c32 Converted 8x tests of SimplifyCFG to use FileCheck instead of grep.
llvm-svn: 179087
2013-04-09 05:18:53 +00:00
Nadav Rotem
fad36d8034 Revert 179071 because it is not the right way to support non standard new/new[] operators.
llvm-svn: 179084
2013-04-09 04:43:46 +00:00
Jakob Stoklund Olesen
6a275455ac Compute correct frame sizes for SPARC v9 64-bit frames.
The save area is twice as big and there is no struct return slot. The
stack pointer is always 16-byte aligned (after adding the bias).

Also eliminate the stack adjustment instructions around calls when the
function has a reserved stack frame.

llvm-svn: 179083
2013-04-09 04:37:47 +00:00
Nadav Rotem
e99e862deb c++ new operators are not malloc-like functions because they do not return uninitialized memory.
Users may overide new-operators and implement any function that they like.

llvm-svn: 179071
2013-04-08 23:40:47 +00:00
Eli Bendersky
d65a755ddf Rewrite test/Integer tests to use FileCheck instead of grep
llvm-svn: 179047
2013-04-08 20:18:15 +00:00
Eli Bendersky
88f7dd304e Rewrite test/ExecutionEngine tests to use FileCheck instead of grep
llvm-svn: 179043
2013-04-08 19:51:36 +00:00
Eli Bendersky
ea93ef7358 Rewrite test/Verifier tests to use FileCheck instead of grep
llvm-svn: 179036
2013-04-08 18:33:51 +00:00