26024 Commits

Author SHA1 Message Date
Krzysztof Parzyszek
15f33310a8 [Hexagon] Disable packets in test to avoid ordering issues in checks
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337624 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 21:55:55 +00:00
Roman Tereshin
6bb56ed117 Reapply "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reapplies commit r337489 reverted by r337541
Additionally, this commit contains a speculative fix to the issue reported in r337541
(the report does not contain an actionable reproducer, just a stack trace)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337606 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 20:10:04 +00:00
Craig Topper
34f6b250b3 [X86] Remove isel patterns for MOVSS/MOVSD ISD opcodes with integer types.
Ideally our ISD node types going into the isel table would have types consistent with their instruction domain. This prevents us having to duplicate patterns with different types for the same instruction.

Unfortunately, it seems our shuffle combining currently relies on this a little to remove some bitcasts. This seems to enable some switching between shufps and shufd. Hopefully there's some way we can address this in the combining.

Differential Revision: https://reviews.llvm.org/D49280

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337590 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 17:57:53 +00:00
Simon Pilgrim
354367414b [X86][XOP] Fix SUB constant folding for VPSHA/VPSHL shift lowering
We can safely use getConstant here as we're still lowering, which allows constant folding to kick in and simplify the vector shift codegen.

Noticed while working on D49562.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337578 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 16:55:18 +00:00
Evandro Menezes
6649de34e9 [ARM] Add new feature to enable optimizing the VFP registers
Enable the optimization of operations on DPR and SPR via a feature instead
of checking the target.

Differential revision: https://reviews.llvm.org/D49463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337575 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 16:49:28 +00:00
Simon Pilgrim
8ec2a959fa [X86][SSE] Use SplitOpsAndApply to improve HADD/HSUB lowering
Improve AVX1 256-bit vector HADD/HSUB matching by using SplitOpsAndApply to split into 128-bit instructions.
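
A minimal sketch of why such a split preserves the result, assuming the usual in-lane behaviour of the 256-bit horizontal add (each 128-bit lane is processed independently). The names `hadd128`, `hadd256`, `half`, `concat` and `hadd256Split` are hypothetical and model the values only; this is not the SplitOpsAndApply helper itself.

```cpp
#include <array>
#include <cstddef>

using V4 = std::array<float, 4>;  // models a 128-bit vector of four floats
using V8 = std::array<float, 8>;  // models a 256-bit vector of eight floats

// 128-bit horizontal add: pairwise sums of a, followed by pairwise sums of b.
V4 hadd128(const V4 &a, const V4 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]};
}

// 256-bit horizontal add: the same operation applied to each 128-bit lane.
V8 hadd256(const V8 &a, const V8 &b) {
  return {a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3],
          a[4] + a[5], a[6] + a[7], b[4] + b[5], b[6] + b[7]};
}

// Hypothetical helper: extract the low (hi == 0) or high (hi == 1) half.
V4 half(const V8 &v, std::size_t hi) {
  const std::size_t base = 4 * hi;
  return {v[base], v[base + 1], v[base + 2], v[base + 3]};
}

// Hypothetical helper: join two 128-bit halves into one 256-bit vector.
V8 concat(const V4 &lo, const V4 &hi) {
  return {lo[0], lo[1], lo[2], lo[3], hi[0], hi[1], hi[2], hi[3]};
}

// Split form: two 128-bit horizontal adds, then concatenate the results.
V8 hadd256Split(const V8 &a, const V8 &b) {
  return concat(hadd128(half(a, 0), half(b, 0)),
                hadd128(half(a, 1), half(b, 1)));
}
```

For any inputs, `hadd256(a, b)` and `hadd256Split(a, b)` produce the same eight elements, which is what makes the 128-bit form a valid lowering in this model.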

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337568 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 16:20:45 +00:00
Simon Pilgrim
42411242d6 [X86][AVX] Add support for i16 256-bit vector horizontal op redundant shuffle removal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337566 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 15:51:01 +00:00
Simon Pilgrim
0ec1f16b4e [X86][AVX] Add v16i16 horizontal op redundant shuffle tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337565 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 15:41:15 +00:00
Nirav Dave
708378d355 [DAG] Avoid Node Update assertion due to AND simplification
Check for construction-time folding for incomplete AND nodes in
BackwardsPropagateMask.

Fixes PR38185.

Reviewers: RKSimon, samparker

Reviewed By: samparker

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D49444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337563 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 15:27:24 +00:00
Simon Pilgrim
09babe53b1 [X86][AVX] Add support for 32/64 bits 256-bit vector horizontal op redundant shuffle removal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337561 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 15:24:12 +00:00
Nirav Dave
d5c8d8bb01 [DAG] Fix Memory ordering check in ReduceLoadOpStore.
When merging through a TokenFactor we need to check that the
load may be ordered such that no other aliasing memory operations may
happen. It is not sufficient to just check that the load is a member
of the chain token factor, as there may be an indirect chain. Require
the load's chain has only one use.

This fixes PR37826.

Reviewers: spatel, davide, efriedma, craig.topper, RKSimon

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D49388

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337560 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 15:20:50 +00:00
Simon Pilgrim
7b7da9f3c1 [X86][AVX] Add 256-bit vector horizontal op redundant shuffle tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337558 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 15:07:53 +00:00
Simon Pilgrim
6dde8495c0 Regenerate partial vector fold test. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337551 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 13:58:57 +00:00
Simon Pilgrim
077efdaae0 Regenerate remainder test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337546 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 13:14:29 +00:00
Ulrich Weigand
b15f0ae4d4 [SystemZ] Test case formatting fixes
Fix systematically wrong whitespace from a prior automated change.

NFC.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337542 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 12:12:10 +00:00
Sam McCall
1642979851 Revert "[LSV] Refactoring + supporting bitcasts to a type of different size"
This reverts commit r337489.
It causes asserts to fire in some TensorFlow tests, e.g.
tensorflow/compiler/tests/gather_test.py on GPU.

Example stack trace:
Start test case: GatherTest.testHigherRank
assertion failed at third_party/llvm/llvm/lib/Support/APInt.cpp:819 in llvm::APInt llvm::APInt::trunc(unsigned int) const: width && "Can't truncate to 0 bits"
    @     0x5559446ebe10  __assert_fail
    @     0x55593ef32f5e  llvm::APInt::trunc()
    @     0x55593d78f86e  (anonymous namespace)::Vectorizer::lookThroughComplexAddresses()
    @     0x55593d78f2bc  (anonymous namespace)::Vectorizer::areConsecutivePointers()
    @     0x55593d78d128  (anonymous namespace)::Vectorizer::isConsecutiveAccess()
    @     0x55593d78c926  (anonymous namespace)::Vectorizer::vectorizeInstructions()
    @     0x55593d78c221  (anonymous namespace)::Vectorizer::vectorizeChains()
    @     0x55593d78b948  (anonymous namespace)::Vectorizer::run()
    @     0x55593d78b725  (anonymous namespace)::LoadStoreVectorizer::runOnFunction()
    @     0x55593edf4b17  llvm::FPPassManager::runOnFunction()
    @     0x55593edf4e55  llvm::FPPassManager::runOnModule()
    @     0x55593edf563c  (anonymous namespace)::MPPassManager::runOnModule()
    @     0x55593edf5137  llvm::legacy::PassManagerImpl::run()
    @     0x55593edf5b71  llvm::legacy::PassManager::run()
    @     0x55593ced250d  xla::gpu::IrDumpingPassManager::run()
    @     0x55593ced5033  xla::gpu::(anonymous namespace)::EmitModuleToPTX()
    @     0x55593ced40ba  xla::gpu::(anonymous namespace)::CompileModuleToPtx()
    @     0x55593ced33d0  xla::gpu::CompileToPtx()
    @     0x55593b26b2a2  xla::gpu::NVPTXCompiler::RunBackend()
    @     0x55593b21f973  xla::Service::BuildExecutable()
    @     0x555938f44e64  xla::LocalService::CompileExecutable()
    @     0x555938f30a85  xla::LocalClient::Compile()
    @     0x555938de3c29  tensorflow::XlaCompilationCache::BuildExecutable()
    @     0x555938de4e9e  tensorflow::XlaCompilationCache::CompileImpl()
    @     0x555938de3da5  tensorflow::XlaCompilationCache::Compile()
    @     0x555938c5d962  tensorflow::XlaLocalLaunchBase::Compute()
    @     0x555938c68151  tensorflow::XlaDevice::Compute()
    @     0x55593f389e1f  tensorflow::(anonymous namespace)::ExecutorState::Process()
    @     0x55593f38a625  tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady()::$_1::operator()()
*** SIGABRT received by PID 7798 (TID 7837) from PID 7798; ***

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337541 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 12:03:00 +00:00
Jonas Paulsson
8c93523c81 [SystemZ] Reimplement SchedModel IssueWidth and WriteRes/ReadAdvance mappings.
As a consequence of recent discussions
(http://lists.llvm.org/pipermail/llvm-dev/2018-May/123164.html), this patch
changes the SystemZ SchedModels so that the IssueWidth is 6, which is the
decoder capacity, and NumMicroOps become the number of decoder slots needed
per instruction.

In addition, the SchedWrite latencies now match the MachineInstructions
def-operand indexes, and ReadAdvances have been added on instructions with
one register operand and one memory operand.

Review: Ulrich Weigand
https://reviews.llvm.org/D47008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337538 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 09:40:43 +00:00
Matt Arsenault
06b493f7f0 Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
Reverts r337079 with fix for msan error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337535 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 09:05:08 +00:00
Stephen Canon
14b97bb630 Add x86_64-unknown triple to llc for x86 test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337523 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 03:50:55 +00:00
Craig Topper
2bbe26162e [DAGCombiner] Fold X - (-Y *Z) -> X + (Y * Z)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337518 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 01:40:03 +00:00
Stephen Canon
7e9afd4946 Skip out of SimplifyDemandedBits for BITCAST of f16 to i16
Mirrors the existing exit path for f128, avoiding a crash later on.

Differential Revision: https://reviews.llvm.org/D49524



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337506 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 22:46:42 +00:00
Craig Topper
f97a90d958 [DAGCombiner] Teach DAGCombiner that A-(-B) is A+B.
We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub.
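
A small sketch of the identity behind this fold, assuming 32-bit wrapping (two's-complement) integer arithmetic. The function names are hypothetical and only model the arithmetic; they are not the code in visitSub.

```cpp
#include <cstdint>

// Wrapping negation of a 32-bit value, modelled with unsigned arithmetic.
constexpr std::uint32_t neg(std::uint32_t v) { return 0u - v; }

// The pattern before the fold: A - (-B).
constexpr std::uint32_t subOfNeg(std::uint32_t a, std::uint32_t b) {
  return a - neg(b);
}

// The pattern after the fold: A + B.
constexpr std::uint32_t plainAdd(std::uint32_t a, std::uint32_t b) {
  return a + b;
}

// The identity also holds at the wrap-around edges.
static_assert(subOfNeg(7u, 5u) == plainAdd(7u, 5u), "simple case");
static_assert(subOfNeg(0u, 0x80000000u) == plainAdd(0u, 0x80000000u),
              "B is the value whose negation is itself");
static_assert(subOfNeg(0xFFFFFFFFu, 1u) == plainAdd(0xFFFFFFFFu, 1u),
              "sum wraps to zero");
```

Because the arithmetic wraps modulo 2^32, the two forms agree for every pair of inputs, so the rewrite needs no overflow precondition in this model.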

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337502 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 22:24:43 +00:00
Simon Pilgrim
91284ba1d6 [X86][AVX] Use extract_subvector to reduce vector op widths (PR36761)
We have a number of cases where we fail to reduce vector op widths, performing the op in a larger vector and then extracting a subvector. This is often because by default it would create illegal types.

This peephole patch attempts to handle a few common cases detailed in PR36761, which typically involved extension+conversion to vX2f64 types.

Differential Revision: https://reviews.llvm.org/D49556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337500 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 21:52:06 +00:00
Roman Tereshin
b2f9f92413 [LSV] Refactoring + supporting bitcasts to a type of different size
This is mostly preparation work for adding limited support for
select instructions. That proved difficult because of the size and
irregularity of Vectorizer::isConsecutiveAccess, which I believe is
fixed here.

It also turned out that these changes make it simpler to finish one of
the TODOs and fix a number of other small issues, namely:

1. Looking through bitcasts to a type of a different size (requires
careful tracking of the original load/store size and some math
converting sizes in bytes to expected differences in indices of GEPs).

2. Reusing partial analysis of pointers done by first attempt in proving
them consecutive instead of starting from scratch. This added limited
support for nested GEPs co-existing with difficult sext/zext
instructions. This also required a careful handling of negative
differences between constant parts of offsets.

3. Handling a case where the first pointer index is not an add, but
something else (a function parameter for instance).

I observe an increased number of successful vectorizations on a large
set of shader programs. Only a few shaders are affected, but those that
are have more than 5% fewer loads and stores than before the patch.

Reviewed By: rampitec

Differential-Revision: https://reviews.llvm.org/D49342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337489 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 19:42:43 +00:00
Stefan Pintilie
f9fb677cc2 [Power9] Code Cleanup - Remove needsAggressiveScheduling()
As we already return true from needsAggressiveScheduling() for the most recent
hardware it would be cleaner to just return true for all PowerPC hardware.

Differential Revision: https://reviews.llvm.org/D48663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337488 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 19:34:18 +00:00
Andrea Di Biagio
3b09b9e80e [X86][BtVer2] correctly model the latency/throughput of LEA instructions.
This patch fixes the latency/throughput of LEA instructions in the BtVer2
scheduling model.

On Jaguar, a 3-operand LEA has a latency of 2cy and a reciprocal throughput of
1. That is because it uses one cycle of SAGU followed by 1cy of ALU1. An LEA
with a "Scale" operand is also slow, and it has the same latency profile as the
3-operand LEA. An LEA16r has a latency of 3cy and a throughput of 0.5 (i.e.
RThroughput of 2.0).

This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td.
The tablegen backend (for instruction-info) expands that definition into this
(file X86GenInstrInfo.inc):
```
static bool isThreeOperandsLEA(const MachineInstr &MI) {
  return (
    (
      MI.getOpcode() == X86::LEA32r
      || MI.getOpcode() == X86::LEA64r
      || MI.getOpcode() == X86::LEA64_32r
      || MI.getOpcode() == X86::LEA16r
    )
    && MI.getOperand(1).isReg()
    && MI.getOperand(1).getReg() != 0
    && MI.getOperand(3).isReg()
    && MI.getOperand(3).getReg() != 0
    && (
      (
        MI.getOperand(4).isImm()
        && MI.getOperand(4).getImm() != 0
      )
      || (MI.getOperand(4).isGlobal())
    )
  );
}
```

A similar method is generated in the X86_MC namespace, and included into
X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h).

Back to the BtVer2 scheduling model:
A new scheduling predicate named JSlowLEAPredicate now checks whether the
instruction is either a three-operand LEA or an LEA with a Scale value
other than 1.
A variant scheduling class uses that new predicate to correctly select the
appropriate latency profile.
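
The "slow LEA" check described above can be sketched as plain C++. This is a simplified model, not the generated predicate: `LeaAddress` and `isSlowLEA` are hypothetical names, the opcode check is omitted, and treating the scale as a separate field is an assumption about the operand layout.

```cpp
#include <cstdint>

// Hypothetical, simplified view of an LEA's address operands. The generated
// code inspects MachineInstr operands instead.
struct LeaAddress {
  unsigned BaseReg;    // 0 means "no register"
  std::int64_t Scale;  // scale factor applied to the index register
  unsigned IndexReg;   // 0 means "no register"
  std::int64_t Disp;   // immediate displacement
  bool DispIsGlobal;   // displacement is a global address, not an immediate
};

// Mirrors the operand checks of isThreeOperandsLEA shown earlier: base and
// index registers are present, and the displacement is a non-zero immediate
// or a global.
bool isThreeOperandsLEA(const LeaAddress &A) {
  return A.BaseReg != 0 && A.IndexReg != 0 &&
         (A.Disp != 0 || A.DispIsGlobal);
}

// Models the condition attributed to JSlowLEAPredicate: a three-operand LEA,
// or any LEA whose scale is not 1.
bool isSlowLEA(const LeaAddress &A) {
  return isThreeOperandsLEA(A) || A.Scale != 1;
}
```

In this model a base-plus-index LEA with scale 1 and no displacement is not slow, while adding a non-zero displacement or changing the scale to 4 makes it slow.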

Differential Revision: https://reviews.llvm.org/D49436


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337469 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 16:42:15 +00:00
Simon Pilgrim
19ad9309e1 [X86][SSE] Add FPEXT vXf32 - vXf64 tests
Some basic subvector special cases based on PR36761

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337464 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 15:32:45 +00:00
Tim Northover
95f104ac2d ARM: switch armv7em MachO triple to hard-float defaults and libcalls.
We were emitting incorrect calls to libm functions that LLVM had decided it
knew about because the default is soft-float.

Recommitted without breaking ELF this time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337450 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 12:44:51 +00:00
Simon Pilgrim
3a2e78e840 [DAGCombiner] Add rotate-extract tests
Add new tests from D47681 to current codegen. Also added i686 codegen tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337445 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-19 09:27:34 +00:00
Heejin Ahn
9dc116b0c7 [WebAssembly] Add missing -mattr=+exception-handling guards
Summary:
The use of exception handling instructions should only be enabled with
`-mattr=+exception-handling` option.

Reviewers: jgravelle-google

Subscribers: dschuff, sbc100, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D49391

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337425 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 21:42:22 +00:00
Tim Northover
f7eb2f0fcb Revert "ARM: switch armv7em triple to hard-float defaults and libcalls."
This reverts commit r337385 until it can be targeted at MachO only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337424 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 21:32:49 +00:00
Simon Pilgrim
6e4fb35e80 [X86][SSE] Canonicalize scalar fp arithmetic shuffle patterns
As discussed on PR38197, this canonicalizes MOVS*(N0, OP(N0, N1)) --> MOVS*(N0, SCALAR_TO_VECTOR(OP(N0[0], N1[0])))

This returns the scalar-fp codegen lost by rL336971.

Additionally it handles the OP(N1, N0) case for commutable (FADD/FMUL) ops.
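
A rough value-level sketch of why the two forms of MOVS*(N0, OP(N0, N1)) are equivalent, using four-element float vectors and FADD as the op. All names are hypothetical; `scalarToVector` fills the upper elements with zero here, whereas they are undefined in the DAG, and that does not matter because the MOVSS model only reads element 0 of its second operand.

```cpp
#include <array>

using V4 = std::array<float, 4>;  // models a 128-bit vector of four floats

// MOVSS model: take element 0 from b and the upper three elements from a.
V4 movss(const V4 &a, const V4 &b) { return {b[0], a[1], a[2], a[3]}; }

// Full-width vector add (ADDPS model).
V4 addps(const V4 &a, const V4 &b) {
  return {a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]};
}

// SCALAR_TO_VECTOR model: the scalar goes in element 0. The upper elements
// are zero in this sketch only; the DAG node leaves them undefined.
V4 scalarToVector(float s) { return {s, 0.0f, 0.0f, 0.0f}; }

// Pattern before canonicalization: MOVSS(N0, FADD(N0, N1)).
V4 beforeCanon(const V4 &n0, const V4 &n1) {
  return movss(n0, addps(n0, n1));
}

// Pattern after canonicalization:
// MOVSS(N0, SCALAR_TO_VECTOR(FADD(N0[0], N1[0]))).
V4 afterCanon(const V4 &n0, const V4 &n1) {
  return movss(n0, scalarToVector(n0[0] + n1[0]));
}
```

Both functions return the same vector for any inputs, since only lane 0 of the add is ever observed; the canonical form makes that explicit, so only a scalar operation is needed.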

Differential Revision: https://reviews.llvm.org/D49474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337419 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 19:55:19 +00:00
Nirav Dave
1b4d2184e1 [DAG] Add testcase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337414 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 18:34:52 +00:00
Nirav Dave
461ab257e2 [ScheduleDAG] Fix unfolding of SUnits to already existent nodes.
Summary:
If unfolding an SUnit yields a load, or an operation using it, that
already exists in the DAG, abort the unfold if that node is already scheduled.
If not, make sure we don't add duplicate dependencies.

This fixes PR37916.

Reviewers: davide, eli.friedman, fhahn, bogner

Subscribers: MatzeB, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48666

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337409 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 18:01:03 +00:00
Roman Lebedev
c279de70f6 [NFC][X86][AArch64][DAGCombine] More tests for optimizeSetCCOfSignedTruncationCheck()
At least one of these cases is more canonical,
so we really do have to handle it.
https://godbolt.org/g/pkzP3X
https://rise4fun.com/Alive/pQyh

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337400 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 16:19:06 +00:00
Simon Atanasyan
bd2bb96fb1 [mips] Fix predicate for the MipsTruncIntFP pattern
This is a follow-up to rL337171. This patch fixes a regression
introduced by r337171 and enables the MipsTruncIntFP pattern.

Differential revision: https://reviews.llvm.org/D49469

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337392 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 14:11:22 +00:00
Tim Northover
feb1bb8b82 ARM: switch armv7em triple to hard-float defaults and libcalls.
We were emitting incorrect calls to libm functions that LLVM had decided it
knew about because the default is soft-float.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337385 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 12:37:04 +00:00
Simon Pilgrim
0c24604731 [X86][SSE] Add extra scalar fop + blend tests for commuted inputs
While working on PR38197, I noticed that we don't make use of FADD/FMUL being able to commute the inputs to support the addps+movss -> addss style combine

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337375 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 10:54:13 +00:00
Daniel Cederman
9500eff899 Revert "[Sparc] Use the IntPair reg class for r constraints with value type f64"
This reverts commit 55222c9183.

I missed that this patch has a dependency on https://reviews.llvm.org/D49219
that has not been approved yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337373 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 10:05:30 +00:00
Daniel Cederman
55222c9183 [Sparc] Use the IntPair reg class for r constraints with value type f64
Summary: This is how it appears to be handled in GCC and it prevents an
"Unknown mismatch" error in the SelectionDAGBuilder.

Reviewers: venkatra, jyknight, jrtc27

Reviewed By: jyknight, jrtc27

Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D49218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337370 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 09:25:33 +00:00
Craig Topper
b20ca5fcaa [X86] Enable commuting of VUNPCKHPD to VMOVLHPS to enable load folding by using VMOVLPS with a modified address.
This required an annoying amount of tablegen multiclass changes to make only VUNPCKHPDZ128rr commutable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337357 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 07:31:32 +00:00
Craig Topper
de4051f80c [X86] Add test case for missed opportunity to commute vunpckhpd to enable use of vmovlps to fold a load.
We do this transform for SSE, but not AVX or AVX512VL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337356 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 07:31:30 +00:00
Craig Topper
a29d420a5f [X86] Regenerate fma.ll checks using current version of the script which produces different regular expressions on spills and reloads. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337354 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 07:08:28 +00:00
Craig Topper
b14a5d35fa [X86] Generate v2f64 X86ISD::UNPCKL/UNPCKH instead of X86ISD::MOVLHPS/MOVHLPS for unary v2f64 {0,0} and {1,1} shuffles with SSE2.
I'm trying to restrict the MOVLHPS/MOVHLPS ISD nodes to SSE1 only. With SSE2 we can use unpcks. I believe this will allow some patterns to be cleaned up to require fewer bitcasts.

I've put in an odd isel hack to still select MOVHLPS instruction from the unpckh node to avoid changing tests and because movhlps is a shorter encoding. Ideally we'd do execution domain switching on this, but the operands are in the wrong order and are tied. We might be able to try a commute in the domain switching using custom code.

We already support domain switching for UNPCKLPD and MOVLHPS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337348 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 05:10:51 +00:00
Justin Hibbits
c486a43e86 Introduce codegen for the Signal Processing Engine
Summary:
The Signal Processing Engine (SPE) is found on NXP/Freescale e500v1,
e500v2, and several e200 cores.  This adds support targeting the e500v2,
as this is more common than the e500v1, and is in SoCs still on the
market.

This patch is very intrusive because the SPE is binary incompatible with
the traditional FPU.  After discussing with others, the cleanest
solution was to make both SPE and FPU features on top of a base PowerPC
subset, so all FPU instructions are now wrapped with HasFPU predicates.

Supported by this are:
* Code generation following the SPE ABI at the LLVM IR level (calling
conventions)
* Single- and Double-precision math at the level supported by the APU.

Still to do:
* Vector operations
* SPE intrinsics

As this changes the Callee-saved register list order, one test, which
tests the precise generated code, was updated to account for the new
register order.

Reviewed by: nemanjai
Differential Revision: https://reviews.llvm.org/D44830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337347 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 04:25:10 +00:00
Peter Collingbourne
70df6e955a CodeGen: Don't create address significance table entries for thread-local variables.
The presence of these symbols in the symbol table can cause symbol type
mismatch errors (or undefined symbol errors on emulated TLS targets)
and they can't be ICF'd anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337338 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-18 00:21:40 +00:00
Craig Topper
9722c06a8a [X86] Remove the vector alignment requirement from the patterns added in r337320.
The resulting instruction will only load 64 bits so alignment isn't required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337334 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 23:26:20 +00:00
Peter Collingbourne
03bb42af9d CodeGen: Add a target option for emitting .addrsig directives for all address-significant symbols.
Differential Revision: https://reviews.llvm.org/D48143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337331 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 22:40:08 +00:00
Craig Topper
8fe83c7092 [X86] Add patterns for folding full vector load into MOVHPS and MOVLPS with SSE1 only.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337320 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 20:16:18 +00:00
Craig Topper
372f8c4f07 [X86] Add test case for missed opportunity to use MOVLPS on the SSE1 only targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337319 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-17 20:16:15 +00:00