Commit Graph

22389 Commits

Author SHA1 Message Date
Matt Arsenault
ec61af4bcc AMDGPU: Fix crash on immediate operand
We can have a v_mac with an immediate src0.
We can still fold if it's an inline immediate,
otherwise it already uses the constant bus.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-21 00:45:59 +00:00
Artem Belevich
34fb94caca [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.
Differential Revision: https://reviews.llvm.org/D38090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313820 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:23:07 +00:00
Matt Arsenault
820b8a54fb AMDGPU: Start selecting v_mad_mixhi_f16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:01:24 +00:00
Saleem Abdulrasool
9d697e1c3c X86: treat SwiftCC as Win64_CC on Win64
The Swift CC is identical to Win64 CC with the exception of swift error
being passed in r12 which is a CSR.  However, since this calling
convention is only used in swift -> swift code, it does not impact
interoperability and can be treated entirely as Win64 CC.  We would
previously incorrectly lower the frame setup as we did not treat the
frame as conforming to Win64 specifications.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313813 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 21:00:40 +00:00
Matt Arsenault
7287fcb5d5 AMDGPU: Start selecting v_mad_mixlo_f16
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 20:28:39 +00:00
Matt Arsenault
ae40a10420 AMDGPU: Fix encoding of op_sel for mad_mix* opcodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313797 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 19:09:28 +00:00
Saleem Abdulrasool
a689afa09d CodeGen: support SwiftError SwiftCC on Windows x64
Add support for passing SwiftError through a register on the Windows x64
calling convention.  This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter.  This partially enables building the swift standard library
for Windows x86_64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313791 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 18:40:59 +00:00
Simon Pilgrim
652f5e624e [X86][SSE] Add PR22415 test case
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313755 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 13:49:52 +00:00
Florian Hahn
6aed42109d Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.
This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 11:54:37 +00:00
Mikael Holmen
6727c50d80 [IfConversion] Add testcases [NFC]
These tests should have been included in r310697 / D34099 but apparently
I missed them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313737 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 08:23:29 +00:00
Matt Arsenault
a942315e5f AMDGPU: Match load d16 hi instructions
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.

We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 05:01:53 +00:00
Stanislav Mekhanoshin
fbf0e1603c [AMDGPU] Port of HSAIL inliner
Differential Revision: https://reviews.llvm.org/D36849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313714 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 04:25:58 +00:00
Matt Arsenault
8e11a03a95 AMDGPU: Match store d16_hi instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313712 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 03:20:09 +00:00
Quentin Colombet
8726dc54e6 [MIRPrinter] Print empty successor lists when they cannot be guessed
This re-applies commit r313685, this time with the proper updates to
the test cases.

Original commit message:
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print

         entry
        /      \
   true (def)   false (no list of successors)
       |
 split.true (use)

The MIR parser would understand this:

         entry
        /      \
   true (def)   false
       |        /  <-- invalid edge
 split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313696 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 23:34:12 +00:00
Quentin Colombet
8451f3b13d Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed"
This reverts commit r313685.

I thought I had ran ninja check, but apparently I didn't...
Need to update a bunch of mir tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313686 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 22:03:50 +00:00
Quentin Colombet
4e58cdd6b0 [MIRPrinter] Print empty successor lists when they cannot be guessed
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print
          entry
         /      \
    true (def)   false (no list of successors)
        |
  split.true (use)

The MIR parser would understand this:
          entry
         /      \
    true (def)   false
        |        /  <-- invalid edge
  split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313685 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 21:55:51 +00:00
Stanislav Mekhanoshin
60873e298c [AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.

Differential Revision: https://reviews.llvm.org/D38014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313670 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 20:54:38 +00:00
Ulrich Weigand
5994e167a7 [SystemZ] Fix truncstore + bswap codegen bug
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.

This transformation is correct for regular stores, but not for
truncating stores.  The routine neglected to check for that case.

Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313669 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 20:50:05 +00:00
Tony Jiang
a2daaca0d6 [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.
Two blocks prior to the join each perform an li and the the join block has an
add using the initialized register. Optimize each predecessor block to instead
use addi and delete the li's and add.

Differential Revision: https://reviews.llvm.org/D36734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313639 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 16:14:37 +00:00
Evandro Menezes
fd1b754b1d [AArch64] Extend tests of loads and stores of register pairs
Include instances of FP register pairs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313638 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 15:46:35 +00:00
Daniel Sanders
68b21d6108 [globalisel] Add a G_BSWAP instruction and support bswap using it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 14:25:15 +00:00
Simon Pilgrim
9ba154704c [X86][SSE] Add 'redundant pand' test case from PR34620
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313632 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 14:02:16 +00:00
Sanjay Patel
031d34937f [x86] regenerate checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313631 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 13:43:09 +00:00
Daniel Sanders
8aded4e290 [globalisel] Add support for intrinsic_void
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313629 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 13:23:01 +00:00
Daniel Sanders
86721de9e5 [globalisel] Add support for intrinsic_w_chain.
This maps directly to G_INTRINSIC_W_SIDE_EFFECTS.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313627 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 12:56:36 +00:00
Jina Nahias
eef725fc85 [x86] Lowering Mask Set1 intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D37669

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313625 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 11:03:06 +00:00
Roger Ferrer Ibanez
081fd494d0 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564

Differential Revision: https://reviews.llvm.org/D35192



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313618 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 09:05:39 +00:00
Andrei Elovikov
8632b8a5bb Test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313617 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 07:56:20 +00:00
Matt Arsenault
4b385be048 AMDGPU: Run internalize symbols at -O0
The relocations used for externally visible functions
aren't supported, so the direct call emitted ends
up hitting a linker error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 07:40:11 +00:00
Gadi Haber
bbbe81ad6a [X86][Skylake] Adding the scheduling information for the SkylakeClient target
This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879.

Please expect some performance fluctuations due to code alignment effects.

Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena
Differential Revision: https://reviews.llvm.org/D37294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313613 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 06:19:27 +00:00
Craig Topper
4c92030df7 [X86] Add VPERMPD/VPERMQ and VPERMPS/VPERMD to the execution domain fixing table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313610 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 04:39:55 +00:00
Yonghong Song
2865ab6996 bpf: add inline-asm support
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313593 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 23:29:36 +00:00
Sanjay Patel
a3209ae52e [DAGCombiner] fold assertzexts separated by trunc
If we have an AssertZext of a truncated value that has already been AssertZext'ed, 
we can assert on the wider source op to improve the zext-y knowledge:
 assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN

This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.

Differential Revision: https://reviews.llvm.org/D37017



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313577 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 22:05:35 +00:00
Konstantin Zhuravlyov
fe0a82a17c AMDGPU: Start selecting s_xnor_{b32, b64}
Differential Revision: https://reviews.llvm.org/D37981


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 21:22:45 +00:00
Sanjay Patel
e3402afeee [DAG, x86] allow store merging before and after legalization (PR34217)
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217

We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.

I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.

Differential Revision: https://reviews.llvm.org/D37987


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313564 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 20:54:26 +00:00
Craig Topper
1ae3ba04b8 [X86] Make sure we still emit zext for GR32 to GR64 when the source of the zext is AssertZext
The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext.

This fixes PR28540.

Differential Revision: https://reviews.llvm.org/D37729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313563 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 20:49:13 +00:00
Sanjay Patel
d800c2eba5 [x86] add tests for PR34217; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313548 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 18:07:50 +00:00
Simon Pilgrim
d7c4504964 [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for 256-bit vector compare results.
As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313547 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 17:58:31 +00:00
Sanjay Patel
fe612a2f7f [x86] regenerate checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313545 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 17:33:47 +00:00
Manoj Gupta
31cc06a8d8 [LoopVectorizer] Add more testcases for PR33804.
Summary:
Add test cases when float <-> pointer types conversion is triggered
in presence of load instructions.

Reviewers: Ayal, srhines, mkuper, rengolin

Reviewed By: rengolin

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D37967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313544 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 17:28:15 +00:00
Simon Pilgrim
3632da7880 [SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign bits.
For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements.

We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results.

Differential Revision: https://reviews.llvm.org/D37849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313543 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 16:45:05 +00:00
Craig Topper
09f2a0775a [X86] Fix two more places to prefer VPERMQ/PD over VPERM2X128 when AVX2 is enabled
The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2XF128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there.

If we have VPERMQ/PD we should prefer those since they only have a single input.

Differential Revision: https://reviews.llvm.org/D37947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313542 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 16:39:49 +00:00
Simon Pilgrim
00fb00243b [X86][SSE] Improve support for vselect(Cond, 0, X) -> ANDN(Cond, X)
As discussed on PR28925 and D37849.

Differential Revision: https://reviews.llvm.org/D37975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313532 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 14:23:23 +00:00
Simon Pilgrim
70c6728fba [X86][SSE] Add vselect with zero tests (PR28925)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313529 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 13:32:33 +00:00
Nikolai Bozhenov
289d643bfb [X86FixupBWInsts] More precise register liveness if no <imp-use> on MOVs.
Summary:
Subregister liveness tracking is not implemented for X86 backend, so
sometimes the whole super register is said to be live, when only a
subregister is really live. That might happen if the def and the use
are located in different MBBs, see added fixup-bw-isnt.mir test.

However, using knowledge of the specific instructions handled by the
bw-fixup-pass we can get more precise liveness information which this
change does.

Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper

Reviewed By: craig.topper

Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D37559


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313524 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 10:17:59 +00:00
Mohammed Agabaria
bcb17980a9 [X86][Codegen] adding masked gathers tests for avx2
related to patch: https://reviews.llvm.org/D35772
adding llvm gathers test before gathers codegen support.

Differential Revision: https://reviews.llvm.org/D37800



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313516 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 06:49:54 +00:00
Craig Topper
d28a1eae08 [X86] Teach the execution domain fixing tables to use movlhps inplace of unpcklpd for the packed single domain.
MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313509 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 04:40:58 +00:00
Craig Topper
2dc0224dbf [X86] Teach execution domain fixing to convert between FP and int unpack instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313508 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 03:29:54 +00:00
Craig Topper
bcd1d60259 [X86] Teach execution domain fixing to convert between VPERMILPS and VPSHUFD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313507 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-18 03:29:47 +00:00
Craig Topper
1f42623bce [X86] Teach shuffle lowering to use MOVLHPS/MOVHLPS for lowering v4f32 unary shuffles with SSE1 only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313504 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-17 22:36:41 +00:00