Commit Graph

61 Commits

Author SHA1 Message Date
Sam McCall
c7c869be7e Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This crashes on boringSSL on PPC (will send reduced testcase)

This reverts commit r312328.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312490 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-04 15:47:00 +00:00
Geoff Berry
d168a77ec3 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312328 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 14:27:20 +00:00
Hans Wennborg
92b6b153a4 Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!")

> Issues identified by buildbots addressed since original review:
> - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
> - The pass no longer forwards COPYs to physical register uses, since
>   doing so can break code that implicitly relies on the physical
>   register number of the use.
> - The pass no longer forwards COPYs to undef uses, since doing so
>   can break the machine verifier by creating LiveRanges that don't
>   end on a use (since the undef operand is not considered a use).
>
>   [MachineCopyPropagation] Extend pass to do COPY source forwarding
>
>   This change extends MachineCopyPropagation to do COPY source forwarding.
>
>   This change also extends the MachineCopyPropagation pass to be able to
>   be run during register allocation, after physical registers have been
>   assigned, but before the virtual registers have been re-written, which
>   allows it to remove virtual register COPY LiveIntervals that become dead
>   through the forwarding of all of their uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312178 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-30 22:11:37 +00:00
Geoff Berry
62c7c252f8 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues identified by buildbots addressed since original review:
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312154 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-30 18:41:07 +00:00
Geoff Berry
6c9f36933c Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311142 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 01:43:11 +00:00
Geoff Berry
d93db263e5 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311135 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 23:06:55 +00:00
Geoff Berry
a6a5be21df Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311062 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 04:04:11 +00:00
Geoff Berry
31db6f3bd2 [MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
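
As a rough illustration of COPY source forwarding and the dead-COPY cleanup it enables, here is a minimal Python sketch on a toy instruction list. The tuple encoding and the names forward_copies and remove_dead_copies are hypothetical and are not part of LLVM. The sketch ignores physical registers, subregisters, undef operands, and LiveIntervals entirely.

```python
def forward_copies(block):
    """Rewrite uses of a COPY destination to read the COPY source instead.

    block is a list of (dest, opcode, sources) tuples (a toy encoding).
    """
    copy_src = {}  # COPY dest -> COPY source, while the copy is still valid
    out = []
    for dest, op, srcs in block:
        # Forward: read the original source rather than the copy.
        srcs = tuple(copy_src.get(s, s) for s in srcs)
        # A write to dest invalidates every tracked copy that involves it.
        copy_src = {d: s for d, s in copy_src.items()
                    if d != dest and s != dest}
        if op == "COPY":
            copy_src[dest] = srcs[0]
        out.append((dest, op, srcs))
    return out


def remove_dead_copies(block, live_out):
    """Drop COPYs whose destination is never read afterwards."""
    used = set(live_out)
    kept = []
    for dest, op, srcs in reversed(block):
        if op == "COPY" and dest not in used:
            continue  # all uses were forwarded, so the COPY is dead
        used.discard(dest)
        used.update(srcs)
        kept.append((dest, op, srcs))
    return kept[::-1]


block = [
    ("b", "COPY", ("a",)),
    ("c", "ADD", ("b", "x")),
    ("d", "MUL", ("c", "b")),
]
result = remove_dead_copies(forward_copies(block), live_out={"d"})
# result == [("c", "ADD", ("a", "x")), ("d", "MUL", ("c", "a"))]
```

Once every use of b has been forwarded to a, the COPY has no remaining readers and can be deleted, which is the effect the message describes for virtual register COPYs.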

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 20:50:01 +00:00
Daniel Jasper
7da5231e32 Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue"
I am 99% sure that this breaks the PPC ASAN build bot:
http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio

If it doesn't go back to green, we can recommit (and fix the original
commit message at the same time :) ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306676 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-29 13:58:24 +00:00
Petar Jovanovic
32d37d6720 [X86] Correct dwarf unwind information in function epilogue
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.

Majority of the changes in this patch:

1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.

These changes are target independent and described below.

Changed CFI instructions so that they:

1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal

Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.

Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.

Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D18046


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306529 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-28 10:21:17 +00:00
Simon Pilgrim
0261597a5e [X86][SSE] Change BUILD_VECTOR interleaving ordering to improve coalescing/combine opportunities
We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type:

e.g. for v4f32:

Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
      : unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
Step 2: unpcklps X, Y ==>    <3, 2, 1, 0>

The issue is that, because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns - consecutive scalar loads, extractions etc.

Instead, this patch unpacks progressively larger sequential vector elements together:

e.g. for v4f32:

Step 1: unpcklps 0, 1 ==> X: <?, ?, 1, 0>
      : unpcklps 2, 3 ==> Y: <?, ?, 3, 2>
Step 2: unpcklpd X, Y ==>    <3, 2, 1, 0>

This does mean that we are creating UNPCKL shuffle of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.
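
The two orderings can be compared with a small Python model of the unpack instructions. This is only a sketch: vectors are plain lists with the lowest element first (the message prints the highest element first), UNDEF stands in for "?", and the helper names are made up for illustration. The new ordering is modelled as pairing elements 0,1 and 2,3 in step 1, since that is what yields <?, ?, 1, 0> and <?, ?, 3, 2>.

```python
UNDEF = None  # stands in for the "?" lanes


def scalar_to_vector(x, width=4):
    """Place a scalar in lane 0; the remaining lanes are undefined."""
    return [x] + [UNDEF] * (width - 1)


def unpcklps(a, b):
    """Interleave the low two 32-bit lanes of a and b."""
    return [a[0], b[0], a[1], b[1]]


def unpcklpd(a, b):
    """Interleave the low 64-bit halves (two 32-bit lanes each) of a and b."""
    return [a[0], a[1], b[0], b[1]]


s = [scalar_to_vector(i) for i in range(4)]

# Old ordering: pair elements 0,2 and 1,3, then interleave 32-bit lanes.
old = unpcklps(unpcklps(s[0], s[2]), unpcklps(s[1], s[3]))

# New ordering: pair sequential elements 0,1 and 2,3, then interleave
# 64-bit halves, so adjacent elements sit together after step 1.
new = unpcklpd(unpcklps(s[0], s[1]), unpcklps(s[2], s[3]))
```

Both trees produce the same final vector; the difference is that the intermediate values in the new ordering already hold consecutive elements.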

Differential Revision: https://reviews.llvm.org/D33864

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304688 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-04 20:12:04 +00:00
Sanjay Patel
cf2a64aaaf [DAGCombiner] fix load narrowing transform to exclude loads with extension
The extending load possibility was missed in:
https://reviews.llvm.org/rL304072

We might want to handle these cases as a follow-up, but bailing out for now
to avoid miscompiling.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304153 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-29 13:24:58 +00:00
Ayman Musa
1ebb1b7a38 [X86][SSE2] Fix asm string for movq (Move Quadword) instruction.
Replace "mov{d|q}" with "movq".

Differential Revision: https://reviews.llvm.org/D32220



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301386 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-26 07:08:44 +00:00
Michael Kuperstein
bf82f16ca4 [X86] Revert r299387 due to AVX legalization infinite loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299720 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 22:33:25 +00:00
Simon Pilgrim
dc4f56dd52 [X86][SSE] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE
It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging.

This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs a unary shuffle to duplicate the values.
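
A minimal Python sketch of the idea, assuming a simplified layout in which the unique operands are packed into the low lanes (the real lowering may place them differently). The function names are hypothetical.

```python
UNDEF = None


def split_repeated_build_vector(elts):
    """Return (unique_build_vector, shuffle_mask) for a list of operands.

    Each distinct operand is inserted once; the mask says which lane of
    the smaller BUILD_VECTOR feeds each lane of the result.
    """
    lane_of = {}
    unique = []
    mask = []
    for e in elts:
        if e not in lane_of:
            lane_of[e] = len(unique)
            unique.append(e)
        mask.append(lane_of[e])
    # Pad with undef so the vector keeps its original width.
    unique += [UNDEF] * (len(elts) - len(unique))
    return unique, mask


def unary_shuffle(vec, mask):
    """Apply a single-input shuffle mask."""
    return [vec[i] for i in mask]


bv, mask = split_repeated_build_vector(["a", "b", "a", "b"])
# bv == ["a", "b", None, None], mask == [0, 1, 0, 1]
```

Only two scalar insertions remain in this example, and the shuffle restores the original four-element vector.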

There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.

Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.

Differential Revision: https://reviews.llvm.org/D31373

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299387 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-03 21:06:51 +00:00
Michael Zuckerman
414ca0751d [X86][TD][vpmovm2 ] New TD pattern for the vpmovm2 instruction
Up until now, vpmovm2 instruction described its destination operand size
by the source operand size. This patch adds new pattern for the vpmovm2
instruction. The node describes new expansion of the destination (from
{128|256} to 512).

Differential Revision: https://reviews.llvm.org/D30654


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298586 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 09:57:01 +00:00
Amjad Aboud
a90eb62d9e [X86] Generate VZEROUPPER for Skylake-avx512.
VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be.

Differential Revision: https://reviews.llvm.org/D29874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296859 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-03 09:03:24 +00:00
Simon Pilgrim
be2cd40ad4 [X86][SSE] Propagate undef upper elements from scalar_to_vector during shuffle combining
Only do this for integer types currently - float types (in particular insertps) load folding often fails with this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295208 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 17:41:33 +00:00
Craig Topper
3612268cf2 [AVX-512] Add patterns to use a zero masked VPTERNLOG instruction for vselects of all ones and all zeros.
Previously we emitted a VPTERNLOG and a separate masked move.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291415 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-09 02:44:34 +00:00
Matthias Braun
26256fb2f9 MCStreamer: Use "cfi" for CFI related temp labels.
Choosing a "cfi" name makes the intent a bit clearer in an assembly dump
and more importantly the assembly dumps are slightly more stable as the
numbers don't move around anymore when unrelated code calls
createTempSymbol() more or less often.
As they are temp labels the name doesn't influence the generated object
code.

Differential Revision: https://reviews.llvm.org/D27244

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288290 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-30 23:48:26 +00:00
Simon Pilgrim
8edff6c940 [X86][SSE] Added support for combining bit-shifts with shuffles.
Bit-shifts by a whole number of bytes can be represented as a shuffle mask suitable for combining.

Added a 'getFauxShuffleMask' function to allow us to create shuffle masks from other suitable operations.
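
To see why a byte-granular shift is just a shuffle, the following Python sketch builds a byte shuffle mask for a per-lane shift and checks it against an integer shift. It assumes little-endian lanes, and the ZERO sentinel and all function names are hypothetical; it does not reproduce the actual getFauxShuffleMask interface.

```python
ZERO = "zero"  # hypothetical sentinel: this result byte is known to be zero


def byte_shift_mask(num_bytes, lane_bytes, shift_bytes, left=True):
    """Byte shuffle mask equivalent to shifting every lane by whole bytes."""
    mask = []
    for i in range(num_bytes):
        pos = i % lane_bytes          # byte position inside its lane
        base = i - pos                # first byte of the lane
        # Little-endian: a left shift moves data to higher byte positions.
        src = pos - shift_bytes if left else pos + shift_bytes
        mask.append(base + src if 0 <= src < lane_bytes else ZERO)
    return mask


def apply_mask(data, mask):
    """Apply a byte shuffle mask to a bytes object."""
    return bytes(0 if m == ZERO else data[m] for m in mask)


def shift_lanes(data, lane_bytes, shift_bits, left=True):
    """Reference: logically shift each little-endian lane as an integer."""
    out = bytearray()
    lane_mask = (1 << (8 * lane_bytes)) - 1
    for i in range(0, len(data), lane_bytes):
        v = int.from_bytes(data[i:i + lane_bytes], "little")
        v = (v << shift_bits) if left else (v >> shift_bits)
        out += (v & lane_mask).to_bytes(lane_bytes, "little")
    return bytes(out)


data = bytes(range(1, 17))
mask = byte_shift_mask(16, 4, 1)  # shift each 32-bit lane left by 8 bits
```

Once the shift is expressed as a mask, it can be merged with neighbouring shuffles like any other mask.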

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288040 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-28 16:25:01 +00:00
Craig Topper
53bf46f680 [AVX-512] Add support for creating SIGN_EXTEND_VECTOR_INREG and ZERO_EXTEND_VECTOR_INREG for 512-bit vectors to support vpmovzxbq and vpmovsxbq.
Summary: The one tricky thing about this is that the sign/zero_extend_inreg uses v64i8 as an input type which isn't legal without BWI support. Though the vpmovsxbq and vpmovzxbq instructions themselves don't require BWI. To support this we need to add custom lowering for ZERO_EXTEND_VECTOR_INREG with v64i8 input. This can mostly reuse the existing sign extend code with a couple checks for sign extend vs zero extend added.

Reviewers: delena, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25594

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285053 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 04:00:29 +00:00
Igor Breger
d942282a2f [X86][AVX512] Fix sext v32i1 -> v32i8 lowering.
Fix PR30600.

Differential Revision: https://reviews.llvm.org/D25554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284134 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 17:20:38 +00:00
Craig Topper
bda8c85789 [AVX-512] Simplify X86InstrInfo::copyPhysReg for 128/256-bit vectors with AVX512, but not VLX. We should use the VEX opcodes and trust the register allocator to not use the extended XMM/YMM register space.
Previously we extended the copy to the whole ZMM register. The register allocator shouldn't use XMM16-31 or YMM16-31 in this configuration as the instructions to spill them aren't available.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280648 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-05 06:43:06 +00:00
Simon Pilgrim
751a136b07 [X86][AVX512BW] Add sext/zext AVX512BW 512-bit vector tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277957 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-07 12:41:36 +00:00
Simon Pilgrim
edf4139052 [X86][AVX512] Add sext/zext to 512-bit vector tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277956 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-07 12:10:46 +00:00
Simon Pilgrim
2e5c533f85 [X86][AVX2] Improve sign/zero extension on AVX2 targets
Split extensions to large vectors into 256-bit chunks - the equivalent of what we do with pre-AVX2 into 128-bit chunks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277939 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-06 21:21:12 +00:00
Craig Topper
22cd3ed2a2 [AVX512] Add ExeDomain to vector extend and truncate instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276394 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 05:46:44 +00:00
Craig Topper
3305a40150 [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:43 +00:00
Craig Topper
b6d6904481 [AVX512] Use vpternlog with an immediate of 0xff to create 512-bit all one vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275045 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 05:36:48 +00:00
Matthias Braun
79519fecc3 VirtRegMap: Replace some identity copies with KILL instructions.
An identity COPY like this:
   %AL = COPY %AL, %EAX<imp-def>
has no semantic effect, but encodes liveness information: Further users
of %EAX only depend on this instruction even though it does not define
the full register.

Replace the COPY with a KILL instruction in those cases to maintain this
liveness information. (This reverts a small part of r238588 but this
time adds a comment explaining why a KILL instruction is useful).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274952 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-09 00:19:07 +00:00
Simon Pilgrim
c1faee3baa [X86][AVX512] Added AVX512F vector sign extend tests
Now that Elena has confirmed that PR26474 has been fixed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273560 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 14:01:45 +00:00
Simon Pilgrim
58140e4f69 [X86] Updated test checks script to generalise LCPI symbol refs
The script now replaces '.LCPI888_8' style asm symbols with the {{\.LCPI.*}} re pattern - this helps stop hardcoded symbols in 32-bit x86 tests changing with every edit of the file

Refreshed some tests to demonstrate the new check
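
The substitution described above could look roughly like this in Python. The regular expression and the function name are assumptions made for illustration and are not copied from the script itself.

```python
import re

# Assumed shape of the symbols: ".LCPI<function number>_<constant number>".
LCPI_RE = re.compile(r"\.LCPI[0-9]+_[0-9]+")
LCPI_CHECK = r"{{\.LCPI.*}}"


def generalise_lcpi(line):
    """Replace hardcoded constant-pool symbols with a FileCheck regex."""
    # A callable replacement avoids escape processing of the backslash.
    return LCPI_RE.sub(lambda m: LCPI_CHECK, line)


checked = generalise_lcpi("movaps .LCPI888_8, %xmm0")
```

Lines without a constant-pool symbol pass through unchanged.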

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272488 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-11 20:39:21 +00:00
Simon Pilgrim
2cd2d0a7e6 [X86][SSE] Added 16i8 -> 8i64 sext test
Shows poor codegen for AVX2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266560 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-17 15:10:42 +00:00
Simon Pilgrim
574e4b288d [X86][SSE] Vectorize a bit (AND/XOR/OR) op if a BUILD_VECTOR has the same op for all their scalar elements.
If all a BUILD_VECTOR's source elements are the same bit (AND/XOR/OR) operation type and each has one constant operand, lower to a pair of BUILD_VECTOR and just apply the bit operation to the vectors.

The constant operands will form a constant vector meaning that we still only have a single BUILD_VECTOR to lower and we will have replaced all the scalarized operations with a single SSE equivalent.
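
The transform relies on bitwise operations acting independently on each lane. A small Python sketch, modelling a vector as 32-bit lanes packed into one wide integer (the helper names are hypothetical):

```python
import operator

OPS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}
LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1


def pack(lanes):
    """Model BUILD_VECTOR: pack 32-bit lanes into one wide integer."""
    v = 0
    for i, x in enumerate(lanes):
        v |= (x & LANE_MASK) << (i * LANE_BITS)
    return v


def unpack(v, n):
    """Split a wide integer back into n 32-bit lanes."""
    return [(v >> (i * LANE_BITS)) & LANE_MASK for i in range(n)]


def scalarized(op, xs, consts):
    """One scalar bit operation per element, then build the vector."""
    return [OPS[op](x, c) for x, c in zip(xs, consts)]


def vectorized(op, xs, consts):
    """Build both vectors first, then apply a single wide bit operation."""
    return unpack(OPS[op](pack(xs), pack(consts)), len(xs))


xs = [0x12345678, 0xFFFFFFFF, 0, 0xDEADBEEF]
consts = [0xFF, 0xFF00, 0xFF0000, 0xFF000000]
```

For AND, OR, and XOR the two functions agree lane for lane, which is why the scalar operations can be replaced by a single vector one.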

It's not in our interest to start making a general purpose vectorizer from this, but I'm seeing enough of these scalar bit operations from the later legalization/scalarization stages to support them at least.

Differential Revision: http://reviews.llvm.org/D18492

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264666 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-28 21:33:52 +00:00
Simon Pilgrim
6a78b654c3 [X86][AVX2] Fix SIGN_EXTEND vector handling on AVX2 targets.
On AVX2 targets we are poorly legalizing SIGN_EXTEND ops for which the input's legalized type doesn't have the same number of elements as the destination, resulting in an ANY_EXTEND followed by a SIGN_EXTEND_INREG.

This patch uses the existing SIGN_EXTEND -> SIGN_EXTEND_VECTOR_INREG combine to extend the input to the size of the result and uses SIGN_EXTEND_VECTOR_INREG instead.

Differential Revision: http://reviews.llvm.org/D16994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260210 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 08:19:19 +00:00
Simon Pilgrim
392b9d21fc [X86][SSE] Resolve target shuffle inputs to sentinels to permit more combines
The combineX86ShufflesRecursively only supports unary shuffles, but was missing the opportunity to combine binary shuffles with a zero / undef second input.

This patch resolves target shuffle inputs, converting the shuffle mask elements to SM_SentinelUndef/SM_SentinelZero where possible. It then resolves the updated mask to check if we have created a faux unary shuffle.

Additionally, we now attempt to recursively call combineX86ShufflesRecursively for all input operands (we used to just recurse for unary integer shuffles and unary unpacks) - it safely returns early if it's not a target shuffle.

Differential Revision: http://reviews.llvm.org/D16683


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260063 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-07 22:51:06 +00:00
Simon Pilgrim
6c77f5367f [X86][SSE] Added 8i8 to 8i64 sext/zext tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258868 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 22:19:22 +00:00
Michael Kuperstein
586219957f [X86] Improve shift combining
This folds (ashr (shl a, [56,48,32,24,16]), SarConst)
into       (shl, (sext (a), [56,48,32,24,16] - SarConst))
or into    (lshr, (sext (a), SarConst - [56,48,32,24,16]))
depending on sign of (SarConst - [56,48,32,24,16])

sexts in X86 are MOVs.
The MOVs have the same code size as above SHIFTs (only SHIFT by 1 has lower code size).
However the MOVs have 2 advantages to SHIFTs on x86:
1. MOVs can write to a register that differs from source.
2. MOVs accept memory operands.
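
The fold can be checked numerically with a short Python model of 64-bit arithmetic. This is a sketch under assumptions: it covers only the i64 shift amounts 56, 48, and 32, the helper names are hypothetical, and the right-shift case is modelled as an arithmetic shift of the sign-extended value so that the sign bits are preserved.

```python
def to_signed(x, bits=64):
    """Wrap x to a signed two's-complement integer of the given width."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def shl(x, n):
    """64-bit left shift with wraparound."""
    return to_signed(x << n)


def ashr(x, n):
    """64-bit arithmetic right shift (Python's >> floors on negatives)."""
    return to_signed(x) >> n


def original(a, shl_const, sar_const):
    """(ashr (shl a, ShlConst), SarConst)"""
    return ashr(shl(a, shl_const), sar_const)


def folded(a, shl_const, sar_const):
    """Sign-extend the low (64 - ShlConst) bits, then one smaller shift."""
    narrow = to_signed(a, 64 - shl_const)  # the sext, a MOVSX on x86
    if sar_const <= shl_const:
        return shl(narrow, shl_const - sar_const)
    return ashr(narrow, sar_const - shl_const)


example = folded(0x1234_5678_9ABC_DEF0, 56, 60)
```

The direction of the remaining shift depends on which constant is larger, matching the sign test in the message.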

This fixes PR24373.

Patch by: evgeny.v.stupachenko@intel.com
Differential Revision: http://reviews.llvm.org/D13161

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255761 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 11:22:37 +00:00
Andy Ayers
77a84a9451 findDeadCallerSavedReg needs to pay attention to calling convention
Caller saved regs differ between SysV and Win64. Use the tail call available set to scavenge from.

Refactor register info to create new helper to get at tail call GPRs. Added a new test case for windows. Fixed up a number of X64 tests since now RCX is preferred over RDX on SysV.

Differential Revision: http://reviews.llvm.org/D14878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253927 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-23 22:17:44 +00:00
James Y Knight
423c686bec Make utils/update_llc_test_checks.py note that the assertions are
autogenerated.

Also update existing test cases which appear to be generated by it and
weren't modified (other than addition of the header) by rerunning it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-23 21:33:58 +00:00
Cong Hou
59f2cd8a68 [X86][SSE] Combine UNPCKL with vector_shuffle into UNPCKH to save one instruction for sext from v16i8 to v16i16 and v8i16 to v8i32.
This patch is enabling combining UNPCKL with vector_shuffle that moves the upper
half of a vector into the lower half, into a UNPCKH instruction. For example:

t2: v16i8 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t1, undef:v16i8
t3: v16i8 = X86ISD::UNPCKL undef:v16i8, t2

will be combined to:

t3: v16i8 = X86ISD::UNPCKH undef:v16i8, t1
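
The equivalence can be confirmed with a small Python model of the three nodes. Vectors are lists of 16 byte lanes with the lowest lane first, UNDEF models the "u" lanes, and the helper names are hypothetical.

```python
UNDEF = None
N = 16  # v16i8


def vector_shuffle(mask, a, b):
    """Two-input shuffle; mask entries >= N select from b, UNDEF stays undef."""
    return [UNDEF if m is UNDEF else (a[m] if m < N else b[m - N])
            for m in mask]


def unpckl(a, b):
    """Interleave the low halves of a and b."""
    out = []
    for i in range(N // 2):
        out += [a[i], b[i]]
    return out


def unpckh(a, b):
    """Interleave the high halves of a and b."""
    out = []
    for i in range(N // 2, N):
        out += [a[i], b[i]]
    return out


undef = [UNDEF] * N
t1 = [f"t1[{i}]" for i in range(N)]

# Before: move the upper half down, then unpack the low half.
t2 = vector_shuffle(list(range(8, 16)) + [UNDEF] * 8, t1, undef)
before = unpckl(undef, t2)

# After: unpack the high half of t1 directly.
after = unpckh(undef, t1)
```

Both sequences place t1[8] through t1[15] in the odd lanes, so the shuffle can be dropped.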


Differential revision: http://reviews.llvm.org/D14399




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253067 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-13 19:47:43 +00:00
Simon Pilgrim
d97fb0831c [X86][SSE] Added load+sext tests for 16i1->16i8 and 32i1->32i8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251661 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 22:19:21 +00:00
Simon Pilgrim
fc0eed896e [X86][SSE] vector sext/zext tests - remove unnecessary mcpu arguments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251233 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-25 12:15:00 +00:00
Simon Pilgrim
878eaf48c1 [X86][SSE] Use lowerVectorShuffleWithUNPCK instead of custom matches.
Most 128-bit and 256-bit shuffles were manually matching UNPCK patterns - use lowerVectorShuffleWithUNPCK to be more thorough.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251211 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-24 22:45:04 +00:00
Simon Pilgrim
de8d7c41ca [X86][SSE] Match zero/any extension shuffles that don't start from the first element
This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128-bits or at the start of any higher 128-bit lane.

The motivation was to reduce the number of high cost pshufb calls, but it also improves the SSE2 case as well.

Differential Revision: http://reviews.llvm.org/D12561


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 08:16:08 +00:00
Simon Pilgrim
1b1bacb118 [X86] Added i1 vector sextload tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 15:36:41 +00:00
Simon Pilgrim
e6dad29b16 [X86][SSE] Added additional vector sign/zero load extension tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243216 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-25 14:07:20 +00:00
Simon Pilgrim
af824151a1 [X86][SSE] Added additional vector sign/zero extension tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243212 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-25 11:17:35 +00:00
Simon Pilgrim
07c08a6a50 [X86][SSE] Tidied up vector extend/truncation tests. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-12 17:40:49 +00:00