Commit Graph

16930 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
45c5c57179 Allow overlaps between virtreg and physreg live ranges.
The RegisterCoalescer understands overlapping live ranges where one
register is defined as a copy of the other. With this change, register
allocators using LiveRegMatrix can do the same, at least for copies
between physical and virtual registers.

When a physreg is defined by a copy from a virtreg, allow those live
ranges to overlap:

  %CL<def> = COPY %vreg11:sub_8bit; GR32_ABCD:%vreg11
  %vreg13<def,tied1> = SAR32rCL %vreg13<tied0>, %CL<imp-use,kill>

We can assign %vreg11 to %ECX, overlapping the live range of %CL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163336 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 18:15:23 +00:00
Tim Northover
24b9f258f1 Diagnose invalid alignments on duplicating VLDn instructions.
Patch by Chris Lidbury.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163323 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 15:27:12 +00:00
Tim Northover
eae1d34029 Check for invalid alignment values when decoding VLDn/VSTn (single ln) instructions.
Patch by Chris Lidbury.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163321 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 15:17:49 +00:00
Arnold Schwaighofer
3d5f96ee1b BasicAA: Recognize cyclic NoAlias phis
Enhances basic alias analysis to recognize phis whose first incoming values are
NoAlias and whose other incoming values are just the phi node itself through
some amount of recursion.

Example: With this change basicaa reports that ptr_phi and ptr_phi2 do not alias
each other.

bb:
 ptr = ptr2 + 1

loop:
  ptr_phi = phi [bb, ptr], [loop, ptr_plus_one]
  ptr2_phi = phi [bb, ptr2], [loop, ptr2_plus_one]
  ...
  ptr_plus_one = gep ptr_phi, 1
  ptr2_plus_one = gep ptr2_phi, 1

This enables the elimination of one load in code like the following:

extern int foo;

int test_noalias(int *ptr, int num, int* coeff) {
  int *ptr2 = ptr;
  int result = (*ptr++) * (*coeff--);
  while (num--) {
    *ptr2++ = *ptr;
    result +=  (*coeff--) * (*ptr++);
  }
  *ptr = foo;
  return result;
}

Part 2/2 of fix for PR13564.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163319 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:41:53 +00:00
Tim Northover
64eacd9136 Use correct part of complex operand to encode VST1 alignment.
Patch by Chris Lidbury.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163318 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:36:55 +00:00
Arnold Schwaighofer
029032693f BasicAA: GEPs of NoAlias'ing base ptr with equivalent indices are NoAlias
If we can show that the base pointers of two GEPs don't alias each other using
precise analysis and the indices and base offset are equal then the two GEPs
also don't alias each other.
This is primarily needed for the follow up patch that analyses NoAlias'ing PHI
nodes.

Part 1/2 of fix for PR13564.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163317 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:31:51 +00:00
Nadav Rotem
79cb162e5d Disable stack coloring by default in order to resolve the i386 failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163316 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 14:27:06 +00:00
Elena Demikhovsky
4178946afb AVX2 optimization.
Added generation of VPSHUB instruction for <32 x i8> vector shuffle when possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163312 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 12:42:01 +00:00
Nadav Rotem
a76d7d64a4 Fix the test by specifying an exact cpu model.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163307 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 10:33:33 +00:00
Hans Wennborg
3bd51b8df3 Fix switch_to_lookup_table.ll test from r163302.
The lookup tables did not get built in a deterministic order.
This makes them get built in the order that the corresponding phi nodes
were found.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163305 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 10:10:35 +00:00
James Molloy
ba8562af44 Improve codegen for BUILD_VECTORs on ARM.
If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163304 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:55:02 +00:00
Hans Wennborg
486270aee6 Build lookup tables for switches (PR884)
This adds a transformation to SimplifyCFG that attemps to turn switch
instructions into loads from lookup tables. It works on switches that
are only used to initialize one or more phi nodes in a common successor
basic block, for example:

  int f(int x) {
    switch (x) {
    case 0: return 5;
    case 1: return 4;
    case 2: return -2;
    case 5: return 7;
    case 6: return 9;
    default: return 42;
  }

This speeds up the code by removing the hard-to-predict jump, and
reduces code size by removing the code for the jump targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163302 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:43:28 +00:00
Nadav Rotem
c05d30601c Add a new optimization pass: Stack Coloring, that merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163299 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:17:37 +00:00
James Molloy
6c822eea47 Optimize codegen for VSETLNi{8,16,32} operating on Q registers. Degenerate to a VSETLN on D registers, instead of an (INSERT_SUBREG (VSETLN (EXTRACT_SUBREG ))) sequence to help the register coalescer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163298 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 09:16:01 +00:00
Craig Topper
07149fe715 Add patterns for converting stores of subvector_extracts of lower 128-bits of a 256-bit vector to VMOVAPSmr/VMOVUPSmr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163292 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 05:15:01 +00:00
Jim Grosbach
ae6a2e2248 Revert "Enable MCJIT tests on Darwin."
This reverts commit 163278.

Works OK on x86_64, but not i386. Will re-enable when that's cleared up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163290 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 03:24:09 +00:00
Jim Grosbach
88a7e92fe1 Enable MCJIT tests on Darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163278 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 00:59:06 +00:00
Jack Carter
ad51a4a598 Mips specific llvm assembler support for branch and jump instructions.
Test case included.

Contributer: Vladimir Medic


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163277 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 00:43:26 +00:00
Jakob Stoklund Olesen
098c6a547f Use predication instead of pseudo-opcodes when folding into MOVCC.
Now that it is possible to dynamically tie MachineInstr operands,
predicated instructions are possible in SSA form:

  %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg
  %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR

Becomes a predicated SUBri with a tied imp-use:

  SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>

This means that any instruction that is safe to move can be folded into
a MOVCC, and the *CC pseudo-instructions are no longer needed.

The test case changes reflect that Thumb2SizeReduce recognizes the
predicated instructions. It didn't understand the pseudos.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163274 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:58:02 +00:00
Nick Lewycky
0c09e76e52 Add missing file for test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163272 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:52:20 +00:00
Nick Lewycky
033d182589 Teach libObject about some more ELF relocations. llvm-objdump -r now knows
every relocation in C++ hello world built with debug info.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163271 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:48:54 +00:00
Manman Ren
408853ea16 JumpThreading: when default destination is the destination of some cases in a
switch, make sure we include the value for the cases when calculating edge
value from switch to the default destination.

rdar://12241132


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163270 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:45:58 +00:00
Jack Carter
ec65be84cd Mips specific llvm assembler support for ALU instructions. This includes
register support. Test case included.

Contributer: Vladimir Medic


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163268 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 23:34:03 +00:00
Tim Northover
7bebddf55e Strip old MachineInstrs *after* we know we can put them back.
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163230 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 18:37:53 +00:00
Pranav Bhandarkar
4c3d3ecdf8 LLVM Bug Fix 13709: Remove needless lsr(Rp, #32) instruction access the
subreg_hireg of register pair Rp.

	* lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New
	 DenseMap similar to PeepholeMap that additionally records subreg info
	 too.
        (runOnMachineFunction): Record information in PeepholeDoubleRegsMap
        and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to
	the instruction Rx = COPY Rp1:logreg_subreg.
	* test/CodeGen/Hexagon/remove_lsr.ll: New test.
	



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163214 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 16:01:40 +00:00
Silviu Baranga
3d5e161fe4 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163203 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 08:57:21 +00:00
Logan Chien
fd91d8dd7e Fix UseInitArray option for MIPS target.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163193 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-05 06:17:17 +00:00
Dan Gohman
230768bd13 Make provenance checking conservative in cases when
pointers-to-strong-pointers may be in play. These can lead to retains and
releases happening in unstructured ways, foiling the optimizer. This fixes
rdar://12150909.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163180 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 23:16:20 +00:00
Jakob Stoklund Olesen
daddf07497 Move tie checks into MachineVerifier::visitMachineOperand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163152 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:38:28 +00:00
Preston Gurd
2e2efd9600 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163150 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 18:22:17 +00:00
Sergei Larin
3e59040810 Porting Hexagon MI Scheduler to the new API.
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163137 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:49:56 +00:00
Arnold Schwaighofer
67514e9066 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163136 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 14:37:49 +00:00
Elena Demikhovsky
3251020738 This patch optimizes shuffle instruction - generates 2 instructions instead of 4.
Since this specific shuffle is widely used in many workloads we have ~10% performance on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163134 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 12:49:02 +00:00
Nadav Rotem
7765492a7a LICM may hoist an instruction with undefined behavior above a trap.
Scan the body of the loop and find instructions that may trap.
Use this information when deciding if it is safe to hoist or sink instructions.
Notice that we can optimize the search of instructions that may throw in the case of nested loops.

rdar://11518836



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163132 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 10:25:04 +00:00
Alexey Samsonov
5eae90d727 Add support for fetching inlining context (stack of source code locations)
by instruction address from DWARF.

Add --inlining flag to llvm-dwarfdump to demonstrate and test this functionality,
so that "llvm-dwarfdump --inlining --address=0x..." now works much like
"addr2line -i 0x...", provided that the binary has debug info
(Clang's -gline-tables-only *is* enough).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163128 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-04 08:12:33 +00:00
Bob Wilson
84451a110d Fix more fallout from r158919, similar to PR13547.
This code used to only handle malloc-like calls, which do not read memory.
r158919 changed it to check isNoAliasFn(), which includes strdup-like and
realloc-like calls, but it was not checking for dependencies on the memory
read by those calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163106 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-03 05:15:15 +00:00
Nuno Lopes
ad5a0ce40b escape special char when handling CXX_FOR_OCAMLOPT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163098 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 15:16:51 +00:00
Nuno Lopes
3ba5de6a63 fix test's RUN lines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163097 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 15:07:25 +00:00
Nadav Rotem
9f40cb32ac Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 12:10:19 +00:00
Benjamin Kramer
7de7078933 LoopRotation: Make the brute force DomTree update more brute force.
We update until we hit a fixpoint. This is probably slow but also
slightly simplifies the code. It should also fix the occasional
invalid domtrees observed when building with expensive checking.

I couldn't find a case where this had a measurable slowdown, but
if someone finds a pathological case where it does we may have
to find a cleverer way of updating dominators here.

Thanks to Duncan for the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163091 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 11:57:22 +00:00
Nadav Rotem
f55ef64544 Generate better select code by allowing the target to use scalar select, and not sign-extend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163086 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-02 08:20:07 +00:00
Pete Cooper
0fc44aba18 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 17:37:55 +00:00
Logan Chien
8fccd013d8 Fix Thumb2 fixup kind in the integrated-as.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163063 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 15:06:36 +00:00
Owen Anderson
58d5729540 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163051 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 06:04:27 +00:00
NAKAMURA Takumi
5cf8bac4cc llvm/test/CodeGen/X86/fp-fast.ll: Suppress FMA4 on AMD Bulldozer host, corresponding to r162999.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163041 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:26:28 +00:00
Manman Ren
c11b7193a7 Fix Atom bots for r163036.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163040 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-01 00:17:06 +00:00
Manman Ren
2b7a2e8833 SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its
output chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://11457792


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163036 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 23:16:57 +00:00
Craig Topper
dfb1e4babd Mark FMA4 instructions as commutable and add them to the folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163035 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 23:10:34 +00:00
Michael Liao
265bcb1e5b Fix PR12359
- In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as
  well as PSHUFB will zero elements with negative indices.

  Patch by Sriram Murali <sriram.murali@intel.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163018 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 20:12:31 +00:00
Jack Carter
3185f9a2ea The instruction DINS may be transformed into DINSU or DEXTM depending
on the size of the extraction and its position in the 64 bit word.

This patch allows support of the dext transformations with mips64 direct
object output.

0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword

32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword

32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@163010 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 18:06:48 +00:00