Commit Graph

17216 Commits

Author SHA1 Message Date
Duncan Sands
9db0be6c19 Add the testcase from pr13254 (the old scalarreply pass handles this wrong;
the new sroa pass handles it right).

llvm-svn: 165644
2012-10-10 18:41:19 +00:00
Michael Liao
29bbad4a7b Specify CPU model to avoid breaking ATOM builds
llvm-svn: 165638
2012-10-10 18:04:52 +00:00
Michael Liao
6a09ff62ba Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.

llvm-svn: 165631
2012-10-10 16:53:28 +00:00
NAKAMURA Takumi
1d6fb43bc9 [CMake] check-all: Don't include check-llvm into check-all without LLVM_BUILD_TOOLS.
FIXME: Would you like to run llvm/unittests w/o LLVM_BUILD_TESTS regardless of LLVM_BUILD_TOOLS?
llvm-svn: 165619
2012-10-10 13:33:00 +00:00
Stepan Dyatkovskiy
06c2fdd18f Fix for LDRB instruction:
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.

7 ops is needed, but SDNode with only 6 is created.

In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.

The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.

llvm-svn: 165617
2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy
5182bb8695 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 

llvm-svn: 165616
2012-10-10 11:37:36 +00:00
Craig Topper
96909f4ec9 Test case for r165480.
llvm-svn: 165594
2012-10-10 02:54:23 +00:00
Akira Hatanaka
6fbe997f0d Implement MipsTargetLowering::CanLowerReturn.
Patch by Sasa Stankovic. 

llvm-svn: 165585
2012-10-10 01:27:09 +00:00
Evan Cheng
7aa64222c5 When expanding atomic load arith instructions, do not lose target flags. rdar://12453106
llvm-svn: 165568
2012-10-09 23:48:33 +00:00
Jack Carter
f403d95eb4 Initial assembler implementation of Mips load address macro
This patch provides initial implementation of load address 
macro instruction for Mips. We have implemented two kinds 
of expansions with their variations depending on the size 
of immediate operand:

 1) load address with immediate value directly:
    * la d,j => addiu d,$zero,j   (for -32768 <= j <= 65535)
    * la d,j => lui d,hi16(j)
                ori d,d,lo16(j)   (for any other 32 bit value of j)

 2) load load address with register offset value
    * la d,j(s) => addiu d,s,j     (for -32768 <= j <= 65535)
    * la d,j(s) => lui d,hi16(j)   (for any other 32 bit value of j)
                   ori d,d,lo16(j)
                   addu d,d,s

This patch does not cover the case when the address is loaded 
from the value of the label or function.

Contributer: Vladimir Medic
llvm-svn: 165561
2012-10-09 23:29:45 +00:00
Bill Wendling
31ae26f466 Inline the checks for mutually exclusive attributes since they're used in only one module.
llvm-svn: 165539
2012-10-09 20:11:19 +00:00
Rafael Espindola
8e5adfa32c Enable response files in all tools. Patch by Liu, Yaxun (Sam). I have simplified
the test.

llvm-svn: 165535
2012-10-09 19:52:10 +00:00
Michael Ilseman
072e9fdbb9 New EarlyCSE tests for CSE-ing across commutativity.
llvm-svn: 165510
2012-10-09 16:58:13 +00:00
Alexey Samsonov
561aa02d50 Fix PR14016.
DeadArgumentElimination pass can replace one LLVM function with another,
invalidating a pointer stored in debug info metadata entry for this function.
To fix this, we collect debug info descriptors for functions before
running a DeadArgumentElimination pass and "patch" pointers in metadata nodes
if we replace a function.

llvm-svn: 165490
2012-10-09 08:13:15 +00:00
NAKAMURA Takumi
66a892ba2d Revert r117093, "test/Makefile: Force lit -j1 on Cygwin."
lit -jN works on cygwin in most cases, but still sometimes I can see stalls with iterative run on the buildbot.

llvm-svn: 165482
2012-10-09 05:07:18 +00:00
Chandler Carruth
e55d2920b1 Fix PR14034, an infloop / heap corruption / crash bug in the new SROA.
Thanks to Benjamin for the raw test case. This one took about 50 times
longer to reduce than to fix. =/

llvm-svn: 165476
2012-10-09 01:58:35 +00:00
Jakob Stoklund Olesen
28526a6248 Don't crash on extra evil irreducible control flow.
When the CFG contains a loop with multiple entry blocks, the traces
computed by MachineTraceMetrics don't always have the same nice
properties. Loop back-edges are normally excluded from traces, but
MachineLoopInfo doesn't recognize loops with multiple entry blocks, so
those back-edges may be included.

Avoid asserting when that happens by adding an isEarlierInSameTrace()
function that accurately determines if a dominating block is part of the
same trace AND is above the currrent block in the trace.

llvm-svn: 165434
2012-10-08 22:06:44 +00:00
Adhemerval Zanella
91fa3a3479 PR12716: PPC crashes on vector compare
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.

This patch changes the behavior by just returning a DAG with the 
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.

It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.

llvm-svn: 165419
2012-10-08 18:59:53 +00:00
Adhemerval Zanella
490909f7d1 Add floating-point to and from integer conversion
This patch add altivec support for v4i32 to v4f32 and for v4f32 to
v4i32 vector rounding conversion.

llvm-svn: 165409
2012-10-08 17:27:24 +00:00
Micah Villmow
fe3338a7eb Move TargetData to DataLayout.
llvm-svn: 165403
2012-10-08 16:39:34 +00:00
James Molloy
6addd26621 Some regression tests which are testing the old jit and are exercising functionality which is both known to be broken and not expected to be fixed in the old jit. To remove these from the regression test output, I've marked them XFAIL (for lit tests) and ifdef'd them out (unit tests). These modifications remove the last long-standing regression test failures from the buildbots (though updating the triple to reflect new ubuntu configuration has temporarily caused some new failures). Tested on x86-64 and ARM Linux.
Patch by David Tweed!

llvm-svn: 165390
2012-10-08 13:06:30 +00:00
Benjamin Kramer
ea6041a09a X86: fcmov doesn't handle all possible EFLAGS, fall back to a branch for the others.
Otherwise it will try to use SSE patterns and fail horribly if sse is disabled.
Fixes PR14035.

llvm-svn: 165377
2012-10-07 15:34:27 +00:00
Jack Carter
c5f946b170 Adding support for instructions mfc0, mfc2, mtc0, mtc2
move from and to coprocessors 0 and 2.

Contributer: Vladimir Medic
llvm-svn: 165351
2012-10-06 01:17:37 +00:00
Reed Kotler
edee8fd3c4 Patch for integer multiply, signed/unsigned, long/long long.
llvm-svn: 165322
2012-10-05 18:27:54 +00:00
NAKAMURA Takumi
da9a939cdb Enable llvm/test/ExecutionEngine/MCJIT also for cygwin.
llvm-svn: 165313
2012-10-05 14:10:29 +00:00
Rafael Espindola
2ebde0e0fb Convert to unix line endings.
llvm-svn: 165308
2012-10-05 13:32:38 +00:00
Eli Friedman
cf7d009911 Make sure to generate the right kind of MDNode for enum forward declarations.
PR14029, LLVM part.

llvm-svn: 165288
2012-10-05 01:49:14 +00:00
Evan Cheng
eb205858ec Follow up to r165072. Try a different approach: only move the load when it's going to be folded into the call. rdar://12437604
llvm-svn: 165287
2012-10-05 01:48:22 +00:00
Chandler Carruth
353476d536 Teach the new SROA a new trick. Now we zap any memcpy or memmoves which
are in fact identity operations. We detect these and kill their
partitions so that even splitting is unaffected by them. This is
particularly important because Clang relies on emitting identity memcpy
operations for struct copies, and these fold away to constants very
often after inlining.

Fixes the last big performance FIXME I have on my plate.

llvm-svn: 165285
2012-10-05 01:29:09 +00:00
Nadav Rotem
2f513f5a29 When merging connsecutive stores, use vectors to store the constant zero.
llvm-svn: 165267
2012-10-04 22:35:15 +00:00
Jim Grosbach
3237e667f1 ARM: locate user-defined text sections next to default text.
Make sure functions located in user specified text sections (via the
section attribute) are located together with the default text sections.
Otherwise, for large object files, the relocations for call instructions
are more likely to be out of range. This becomes even more likely in the
presence of LTO.

rdar://12402636

llvm-svn: 165254
2012-10-04 21:33:24 +00:00
Eric Christopher
1441bd0866 Update this a bit more to represent how the prologue should work:
a) frame setup instructions define the prologue
b) we shouldn't change our location mid-stream

Add a test to make sure that the stack adjustment stays within
the prologue.

llvm-svn: 165250
2012-10-04 20:46:14 +00:00
Benjamin Kramer
bbb006ad7d SimplifyCFG: Enhance the "remove CFG edge that leads to null pointer dereference" optimization to also handle instructions with multiple uses.
We conservatively only check the first use to avoid walking long use chains.
This catches the common case of having both a load and a store to a pointer
supplied by a PHI node.

llvm-svn: 165232
2012-10-04 16:11:49 +00:00
Duncan Sands
0ebee9338d In my recent change to avoid use of underaligned memory I didn't notice that
cpyDest can be mutated in some cases, which would then cause a crash later if
indeed the memory was underaligned.  This brought down several buildbots, so
I guess the underaligned case is much more common than I thought!

llvm-svn: 165228
2012-10-04 13:53:21 +00:00
Duncan Sands
448245e370 The alignment of an sret parameter is known: it must be at least the
alignment of the return type.  Teach the optimizers this.

llvm-svn: 165226
2012-10-04 13:36:31 +00:00
Chandler Carruth
9d83695de1 Fix PR13969, a mini-phase-ordering issue with the new SROA pass.
Currently, we re-visit allocas when something changes about the way they
might be *split* to allow better scalarization to take place. However,
we weren't handling the case when the *promotion* is what would change
the behavior of SROA. When an address derived from an alloca is stored
into another alloca, we consider the first to have escaped. If the
second is ever promoted to an SSA value, we will suddenly be able to run
the SROA pass on the first alloca.

This patch adds explicit support for this form if iteration. When we
detect a store of a pointer derived from an alloca, we flag the
underlying alloca for reprocessing after promotion. The logic works hard
to only do this when there is definitely going to be promotion and it
might remove impediments to the analysis of the alloca.

Thanks to Nick for the great test case and Benjamin for some sanity
check review.

llvm-svn: 165223
2012-10-04 12:33:50 +00:00
Duncan Sands
86f8827745 The memcpy optimizer was happily doing call slot forwarding when the new memory
was less aligned than the old.  In the testcase this results in an overaligned
memset: the memset alignment was correct for the original memory but is too much
for the new memory.  Fix this by either increasing the alignment of the new
memory or bailing out if that isn't possible.  Should fix the gcc-4.7 self-host
buildbot failure.

llvm-svn: 165220
2012-10-04 10:54:40 +00:00
Chandler Carruth
6e02238cee Teach the integer-promotion rewrite strategy to be endianness aware.
Sorry for this being broken so long. =/

As part of this, switch all of the existing tests to be Little Endian,
which is the behavior I was asserting in them anyways! Add in a new
big-endian test that checks the interesting behavior there.

Another part of this is to tighten the rules abotu when we perform the
full-integer promotion. This logic now rejects cases where there fully
promoted integer is a non-multiple-of-8 bitwidth or cases where the
loads or stores touch bits which are in the allocated space of the
alloca but are not loaded or stored when accessing the integer. Sadly,
these aren't really observable today as the rest of the pass will
already ensure the invariants hold. However, the latter situation is
likely to become a potential concern in the future.

Thanks to Benjamin and Duncan for early review of this patch. I'm still
looking into whether there are further endianness issues, please let me
know if anyone sees BE failures persisting past this.

llvm-svn: 165219
2012-10-04 10:39:28 +00:00
Jack Carter
a6d222bf00 Implement methods that enable expansion of load immediate
macro instruction (li) in the assembler.

We have identified three possible expansions depending on 
the size of immediate operand:
  1) for 0 ≤ j ≤ 65535.
     li d,j =>
     ori d,$zero,j

  2) for −32768 ≤ j < 0.
     li d,j =>
     addiu d,$zero,j

  3) for any other value of j that is representable as a 32-bit integer.
     li d,j =>
     lui d,hi16(j)
     ori d,d,lo16(j)

All of the above have been implemented in ths patch.

Contributer: Vladimir Medic
llvm-svn: 165199
2012-10-04 04:03:53 +00:00
Jack Carter
f160360b70 This patch is a partial implementation of mips .set assembler directive. Directive is defined as follows:
.set option
The patch implements following options

    at - lets the assembler use the $at register for macros,
         but generates warnings if the source program uses $at

    noat - let source programs use $at without issuingwarnings.

    noreorder - prevents the assembler from reordering machine 
                language instructions.
    nomacro - causes the assembler to print a warning whenever 
              an assembler operation generates more than one 
              machine language instruction.
    macro - lets the assembler generate multiple machine instructions 
            from a single assembler instruction
    reorder - lets the assembler reorder machine language 
               instructions to improve performance

The above variants are parsed and their boolean values set or unset.
The code to actually use them will come later.

Following options are not implemented yet:

nomips16
nomicromips
move
nomove

Contributer: Vladimir Medic
llvm-svn: 165194
2012-10-04 02:29:46 +00:00
Jakub Staszak
73d9bdcca5 Fix PR13967.
llvm-svn: 165187
2012-10-03 23:59:47 +00:00
Chad Rosier
711da11038 [ms-inline asm] Add support in the X86AsmPrinter for printing memory references
in the Intel syntax.

The MC layer supports emitting in the Intel syntax, but this would require the
inline assembly MachineInstr to be lowered to an MCInst before emission.  This
is potential future work, but for now emitting directly from the MachineInstr
suffices.

llvm-svn: 165173
2012-10-03 22:06:44 +00:00
Nadav Rotem
3ffafc2d33 Fix a cycle in the DAG. In this code we replace multiple loads with a single load and
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.

llvm-svn: 165148
2012-10-03 19:30:31 +00:00
Tim Northover
0f0719897b Implement .rel relocation for R_ARM_ABS32 in MCJIT.
Patch by Amara Emerson.

llvm-svn: 165128
2012-10-03 16:29:42 +00:00
Nadav Rotem
b537146d7f A DAGCombine optimization for mergeing consecutive stores to memory. The optimization
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:

1. Store of multiple consecutive constants:
  q->a = 3;
  q->4 = 5;
In this case we store a single legal wide integer.

2. Store of multiple consecutive loads:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.

llvm-svn: 165125
2012-10-03 16:11:15 +00:00
Dmitry Vyukov
f931f3a4d1 tsan: update the test for new atomic enums
llvm-svn: 165109
2012-10-03 13:19:20 +00:00
Dmitry Vyukov
52c2c1cd4f tsan: update the test for new atomic enums
llvm-svn: 165108
2012-10-03 13:13:54 +00:00
Silviu Baranga
c4986e5454 Fixed a bug in the ExecutionDependencyFix pass that caused dependencies to not propagate through implicit defs.
llvm-svn: 165102
2012-10-03 08:29:36 +00:00
Chandler Carruth
57e63536e6 Fix an issue where we failed to adjust the alignment constraint on
a memcpy to reflect that '0' has a different meaning when applied to
a load or store. Now we correctly use underaligned loads and stores for
the test case added.

llvm-svn: 165101
2012-10-03 08:26:28 +00:00
Chandler Carruth
c0353523f6 Try to use a better set of abstractions for computing the alignment
necessary during rewriting. As part of this, fix a real think-o here
where we might have left off an alignment specification when the address
is in fact underaligned. I haven't come up with any way to trigger this,
as there is always some other factor that reduces the alignment, but it
certainly might have been an observable bug in some way I can't think
of. This also slightly changes the strategy for placing explicit
alignments on loads and stores to only do so when the alignment does not
match that required by the ABI. This causes a few redundant alignments
to go away from test cases.

I've also added a couple of tests that really push on the alignment that
we end up with on loads and stores. More to come here as I try to fix an
underlying bug I have conjectured and produced test cases for, although
it's not clear if this bug is the one currently hitting dragonegg's
gcc47 bootstrap.

llvm-svn: 165100
2012-10-03 08:14:02 +00:00