25842 Commits

Author SHA1 Message Date
Craig Topper
c9994c8acf [X86] Remove the last of the 'x86.fma.' intrinsics and autoupgrade them to 'llvm.fma'. Add upgrade tests for all.
Still need to remove the AVX512 masked versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336383 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 18:43:58 +00:00
Craig Topper
6e0b82fc61 [X86] Add SHUF128 to target shuffle decoding.
Differential Revision: https://reviews.llvm.org/D48954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336376 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 17:10:17 +00:00
Matt Arsenault
5ad067fad4 AMDGPU: Don't use spir_kernel in a test
Also use verify-machineinstrs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336374 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 17:01:29 +00:00
Matt Arsenault
e5d3d15134 AMDGPU/GlobalISel: Implement custom kernel arg lowering
Avoid using allocateKernArg / AssignFn. We do not want any
of the type splitting properties of normal calling convention
lowering.

For now at least this exists alongside the IR argument lowering
pass. This is necessary to handle struct padding correctly while
some arguments are still skipped by the IR argument lowering
pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336373 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 17:01:20 +00:00
Lei Huang
37aa5135e5 [Power9] Add lib calls for float128 operations with no equivalent PPC instructions
Map the following instructions to the proper float128 lib calls:
  pow[i], exp[2], log[2|10], sin, cos, fmin, fmax

Differential Revision: https://reviews.llvm.org/D48544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336361 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 15:21:37 +00:00
Simon Pilgrim
a86a3c2c45 [X86][SSE] Add srem x, (1 << c) combine tests
Now that D45806 has landed we can start trying to avoid scalarizing srem by constant - these tests demonstrate some example cases.
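
As a rough illustration of why srem by a power of two need not be scalarized into a division, the sketch below computes the truncating remainder with only an add, a mask and a subtract. The function names are hypothetical and this is a standard bit trick, not necessarily the sequence the backend emits.

```python
def srem_pow2(x: int, c: int, bits: int = 32) -> int:
    """Truncating signed remainder of x by (1 << c), without dividing."""
    assert 0 <= c < bits - 1
    assert -(1 << (bits - 1)) <= x < (1 << (bits - 1))
    mask = (1 << c) - 1
    # Negative inputs are biased so that the mask rounds toward zero.
    bias = mask if x < 0 else 0
    return ((x + bias) & mask) - bias


def srem_reference(x: int, d: int) -> int:
    """Reference remainder with C-style (round toward zero) semantics."""
    q = abs(x) // abs(d)
    if (x < 0) != (d < 0):
        q = -q
    return x - q * d
```

The same expression applies lane by lane, which is what makes it a candidate for a vector lowering.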

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336360 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 15:15:47 +00:00
Sanjay Patel
b06fd49497 [AArch64, PowerPC, x86] add tests for signbit bit hacks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336348 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 13:16:46 +00:00
Ryan Taylor
e941c76442 [AMDGPU] Add VALU to V_INTERP Instructions
Wait states are not properly being inserted after buffer_store for v_interp instructions.

Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can
check and insert the appropriate wait states when needed.

Differential Revision: https://reviews.llvm.org/D48772

Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336339 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 12:02:07 +00:00
Krasimir Georgiev
816b1d58d7 Partially revert r336268 in address-offsets.ll
Summary: The typos there are intentional; they were explicitly introduced to disable these cases in r280285.

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: dschuff, sbc100, jgravelle-google, aheejin, llvm-commits

Differential Revision: https://reviews.llvm.org/D48962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336336 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 11:30:15 +00:00
Simon Pilgrim
5ad902a628 [X86][SSE] Add extra v16i16 shl x,c -> pmullw test
We want to compare shifts with repeated vs non-repeated v8i16 shuffle masks (for PBLENDW ymm) 
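
The identity behind the shl-to-pmullw tests can be sketched as follows: a per-lane left shift by a constant gives the same 16-bit result as a per-lane multiply by the matching power of two. The helper names are illustrative only and lanes are modeled as unsigned 16-bit values.

```python
LANE_MASK = 0xFFFF  # one i16 lane


def shl_lanes(values, amounts):
    """Per-lane left shift, truncated to 16 bits."""
    return [(v << c) & LANE_MASK for v, c in zip(values, amounts)]


def mul_lanes(values, amounts):
    """Per-lane multiply by (1 << amount), truncated to 16 bits like pmullw."""
    multipliers = [1 << c for c in amounts]
    return [(v * m) & LANE_MASK for v, m in zip(values, multipliers)]
```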

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336333 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 09:54:53 +00:00
Aleksandar Beserminji
f9df18f4ce [mips] Fix atomic operations at O0, v3
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the stores can cause the
atomic sequence to fail.

This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.

This version addresses issues with the initial implementation and covers
all atomic operations.

This resolves PR/32020.

Thanks to James Cowgill for reporting the issue!

Patch By: Simon Dardis

Differential Revision: https://reviews.llvm.org/D31287


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336328 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 09:27:05 +00:00
Ivan A. Kosarev
e816e74216 [NEON] Fix combining of vldx_dup intrinsics with updating of base addresses
Resolves:
Unsupported ARM Neon intrinsics in Target-specific DAG combine
function for VLDDUP
https://bugs.llvm.org/show_bug.cgi?id=38031

Related diff: D48439

Differential Revision: https://reviews.llvm.org/D48920


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336325 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 08:59:49 +00:00
Mikael Holmen
53a066525e Partial revert of "NFC - Various typo fixes in tests"
This partially reverts r336268 since it causes buildbot failures.

Added FIXME at the places where the CHECKs are misspelled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336323 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 08:42:16 +00:00
Sjoerd Meijer
802e5e3d9a [ARM] ParallelDSP: only support i16 loads for now
We were miscompiling i8 loads, so reject them as unsupported narrow operations
for now.

Differential Revision: https://reviews.llvm.org/D48944


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336319 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 08:21:40 +00:00
Lei Huang
bd76e146be [Power9] Optimize codegen for conversions of int to float128
Optimize code sequences for integer conversion to fp128 when the integer is a result of:
  * float->int
  * float->long
  * double->int
  * double->long

Differential Revision: https://reviews.llvm.org/D48429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336316 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 07:46:01 +00:00
Craig Topper
5f1cfe90f3 [X86] Remove X86 specific scalar FMA intrinsics and upgrade to target independent FMA and extractelement/insertelement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336315 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 06:52:55 +00:00
Lei Huang
e5da5ca56a [Power9][NFC] add back-end tests for passing homogeneous fp128 aggregates by value
Tests to verify that we are passing fp128 via VSX registers as per ABI.
These are related to clang commit rL336308.

Differential Revision: https://reviews.llvm.org/D48310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336314 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 06:51:38 +00:00
Lei Huang
d50e092736 [Power9] Add tests for passing float128 in VSX reg for non-homogeneous aggregates
Add missing testcase for rL336310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336313 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 06:29:28 +00:00
Lei Huang
1ae65deb24 [Power9]Legalize and emit code for quad-precision convert from single-precision
Legalize and emit code for converting a single-precision floating-point
value to quad-precision.

Differential Revision: https://reviews.llvm.org/D47569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336307 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 04:18:37 +00:00
Lei Huang
b42f206bec [Power9] Implement float128 parameter passing and return values
This patch enables parameter passing and return by value for float128 types.
Passing aggregates/unions that contain float128 members will be submitted in
subsequent patches.

Differential Revision: https://reviews.llvm.org/D47552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336306 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 04:10:15 +00:00
Craig Topper
585ca3a7f5 [X86] Add support for combining FMSUB/FNMADD/FNMSUB ISD nodes with an fneg input.
Previously we could only negate the FMADD opcodes. This used to be mostly ok when we lowered FMA intrinsics during lowering. But with the move to llvm.fma from target specific intrinsics, we can combine (fneg (fma)) to (fmsub) earlier. So if we start with (fneg (fma (fneg))) we would get stuck at (fmsub (fneg)).

This patch fixes that so we can also combine things like (fmsub (fneg)).
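
The algebra behind these combines can be checked with a small sketch. It assumes the usual X86 naming (FMADD = a*b+c, FMSUB = a*b-c, FNMADD = -(a*b)+c, FNMSUB = -(a*b)-c) and uses exact integers, so it says nothing about NaN or signed-zero corner cases.

```python
def fmadd(a, b, c):
    return a * b + c


def fmsub(a, b, c):
    return a * b - c


def fnmadd(a, b, c):
    return -(a * b) + c


def fnmsub(a, b, c):
    return -(a * b) - c


def fneg(x):
    return -x
```

With these definitions, `fneg(fmadd(a, b, c))` equals `fnmsub(a, b, c)`, and negating one multiplicand of `fmsub` likewise gives `fnmsub`. That second fold is the kind that was previously only available for FMADD.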

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336304 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 02:52:56 +00:00
Craig Topper
3d92cb5cdd [X86] Remove some of the packed FMA3 intrinsics since we no longer use them in clang.
There's a regression in here due to inability to combine fneg inputs of X86ISD::FMSUB/FNMSUB/FNMADD nodes.

More removals to come, but I wanted to stop and fix the regression that showed up in this first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336303 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 02:52:54 +00:00
Lei Huang
4fcda06d85 [Power9]Legalize and emit code for round & convert quad-precision values
Legalize and emit code for round & convert float128 to double precision and
single precision.

Differential Revision: https://reviews.llvm.org/D46997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336299 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 21:59:16 +00:00
Vladimir Stefanovic
d04690c89b [mips] Warn when crc, ginv, virt flags are used with too old revision
CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.
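
A minimal sketch of the rule described above, using a hypothetical lookup table and message wording (not the actual LLVM diagnostic text):

```python
# Minimum ISA revision per feature, as stated in the commit message.
MIN_REVISION = {"crc": 6, "ginv": 6, "virt": 5}


def revision_warnings(enabled_features, revision):
    """Return one warning string per feature whose required revision is not met."""
    warnings = []
    for feature in enabled_features:
        required = MIN_REVISION.get(feature)
        if required is not None and revision < required:
            warnings.append(
                f"'{feature}' requires revision {required} or later, got {revision}"
            )
    return warnings
```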

Differential Revision: https://reviews.llvm.org/D48843



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336296 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 19:26:31 +00:00
Stefan Pintilie
1631efd1b8 [PowerPC] Replace the Post RA List Scheduler with the Machine Scheduler
We want to run the Machine Scheduler instead of the List Scheduler after RA.
Checked with a performance run on a Power 9 machine with SPEC 2006. While
some benchmarks improved and others degraded, the geomean was slightly improved
with the Machine Scheduler.

Differential Revision: https://reviews.llvm.org/D45265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336295 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 18:54:25 +00:00
Simon Pilgrim
4b43817e2b [X86][SSE] Add v16i16 shl x,c -> pmullw test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336277 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 14:20:58 +00:00
Simon Pilgrim
6c03d59fdb [X86][SSE] Add SSE2 target to some shift tests
Show the difference in behaviour cf SSE41 (no PMULLD, PBLENDW etc.)

Raised by D48936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336271 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 13:58:13 +00:00
Gabor Buella
0aae914817 NFC - Various typo fixes in tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336268 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 13:28:39 +00:00
Simon Pilgrim
c8da7b28b9 [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (REAPPLIED)
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.
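
The idea can be sketched as follows, assuming left shifts on 16-bit lanes and a hypothetical helper name: when a vector uses only two distinct shift amounts, both uniform (splat) shifts are computed independently and each lane then picks its result through a blend, whatever the lane pattern is.

```python
def shift_by_blend(values, amounts, lane_mask=0xFFFF):
    """Shift via two splat shifts plus a per-lane select; None if not applicable."""
    unique = sorted(set(amounts))
    if len(unique) != 2:
        return None  # this sketch only covers exactly two shift amounts
    lo, hi = unique
    shifted_lo = [(v << lo) & lane_mask for v in values]  # splat shift 1
    shifted_hi = [(v << hi) & lane_mask for v in values]  # splat shift 2
    # Blend: take each lane from whichever splat shift matches its amount.
    return [
        shifted_hi[i] if amounts[i] == hi else shifted_lo[i]
        for i in range(len(values))
    ]
```

The two splat shifts do not depend on each other, which is why they can execute in parallel.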

Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336250 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 09:12:48 +00:00
Simon Pilgrim
19a6aea02d [X86][SSE] Add reduced crash test case for r336113 - [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values
The patch was reverted at r336189 due to crashes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336247 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 08:55:23 +00:00
Max Kazantsev
2dfeba53c3 [ImplicitNullChecks] Check for rewrite of register used in 'test' instruction
The following code pattern:

       mov %rax, %rcx
       test %rax, %rax
       %rax = ....
       je  throw_npe
       mov (%rcx), %r9
       mov (%rax), %r10

gets transformed into the following incorrect code after the implicit null check pass:

       mov %rax, %rcx
       %rax = ....
       faulting_load_op("movl (%rax), %r10", throw_npe)
       mov (%rcx), %r9

For the implicit null check pass, if the register that is checked for null (i.e., the register used in the 'test' instruction) is written to before the conditional jump, we should avoid doing the optimization.
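
The legality condition can be sketched on a simplified, hypothetical instruction model (the `Instr` class and `can_fold_null_check` are illustrative, not the pass's real data structures): the fold is rejected when anything between the 'test' and the branch redefines the tested register.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical machine instruction: opcode plus defined/used registers."""
    opcode: str
    defs: tuple = ()
    uses: tuple = ()


def can_fold_null_check(block, test_idx, branch_idx):
    """False if the tested register is written between 'test' and the branch."""
    checked_reg = block[test_idx].uses[0]
    between = block[test_idx + 1:branch_idx]
    return all(checked_reg not in instr.defs for instr in between)


# The pattern from the commit message: %rax is overwritten before 'je'.
unsafe = [
    Instr("mov", defs=("rcx",), uses=("rax",)),
    Instr("test", uses=("rax", "rax")),
    Instr("def", defs=("rax",)),  # %rax = ....
    Instr("je"),
    Instr("load", defs=("r9",), uses=("rcx",)),
    Instr("load", defs=("r10",), uses=("rax",)),
]
# Same block without the redefinition of %rax.
safe = [i for i in unsafe if i.opcode != "def"]
```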

Patch by Surya Kumari Jangala!

Differential Revision: https://reviews.llvm.org/D48627
Reviewed By: skatkov


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336241 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 08:01:26 +00:00
Benjamin Kramer
4bbb8b7d1f [NVPTX] Expand v2f16 INSERT_VECTOR_ELT
Vectorization can create them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336227 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 20:40:04 +00:00
Roman Lebedev
cfa89ea7f9 [X86] Add tests for low/high bit clearing with different attributes.
D48768 may turn some of these into shifts.

Reviewers: spatel

Reviewed By: spatel

Subscribers: spatel, RKSimon, llvm-commits, craig.topper

Differential Revision: https://reviews.llvm.org/D48767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336224 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 19:12:37 +00:00
Craig Topper
773b6e5431 [X86][AsmParser] Don't consider %eip as a valid register outside of 32-bit mode.
This might make the error message added in r335668 unneeded, but I'm not sure yet.

The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336217 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 17:40:51 +00:00
Amara Emerson
ebcc0927e3 [AArch64][GlobalISel] Fix fallbacks introduced in r336120 due to unselectable stores.
r336120 resulted in falling back to SelectionDAG more often due to the G_STORE
MMOs not matching the vreg size. This fixes that by explicitly any-extending the
value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336209 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 15:59:26 +00:00
Simon Pilgrim
6a79620306 [DAGCombiner] visitSDIV - Permit MIN_SIGNED_VALUE in pow2 vector codegen
Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code
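
One way to see that a MIN_SIGNED_VALUE divisor fits the same scheme as other power-of-two divisors is the arithmetic sketch below. It models signed division by +/-(1 << c) with a bias, an arithmetic shift and an optional negate. The names are hypothetical and this is not the DAG that the combiner actually builds.

```python
def sdiv_pow2(x: int, c: int, negative_divisor: bool = False) -> int:
    """Truncating x / (1 << c), or x / -(1 << c) when negative_divisor is set."""
    mask = (1 << c) - 1
    bias = mask if x < 0 else 0   # round negative dividends toward zero
    q = (x + bias) >> c           # Python's >> on ints is an arithmetic shift
    return -q if negative_divisor else q


def trunc_div(x: int, d: int) -> int:
    """Reference division with C-style (round toward zero) semantics."""
    q = abs(x) // abs(d)
    return -q if (x < 0) != (d < 0) else q
```

For 32-bit values, INT_MIN is -(1 << 31), so it corresponds to `c = 31` with `negative_divisor=True`. The quotient is then 1 when the dividend is INT_MIN and 0 otherwise.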

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336198 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 14:11:32 +00:00
Benjamin Kramer
d5d94ca3a7 Revert "[X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values"
This reverts commit r336113. It causes crashes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336189 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 11:15:17 +00:00
Petar Jovanovic
c95c8c872b [MIPS GlobalISel] Lower arguments using stack
Lower more than 4 arguments using stack. This patch targets MIPS32.
It supports only functions with arguments of type i32.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D47934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336185 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 09:31:48 +00:00
Craig Topper
2ef7a1ca8c [X86] Add avx512vl command line to break-false-dep.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336169 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 04:43:49 +00:00
Heejin Ahn
bb8c53976b [WebAssembly] Support for atomic stores
Summary: Add support for atomic store instructions.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336145 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 21:22:59 +00:00
Vadzim Dambrouski
2546414701 [ARM] Fix PR37382: Don't optimize mul.with.overflow on thumbv6m.
Reviewers: efriedma, rogfer01, javed.absar

Reviewed By: efriedma, rogfer01

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D48846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336144 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 21:05:26 +00:00
Dan Gohman
df015f19fc [WebAssembly] Fix fast-isel optimization of branch conditions.
LLVM doesn't guarantee anything about the high bits of a register holding
an i1 value at the IR level, so don't translate LLVM IR i1 values directly
into WebAssembly conditional branch operands. WebAssembly's conditional
branches do demand all 32 bits be valid.
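
The problem can be illustrated with a small model, in which `br_if_taken` stands in for a WebAssembly conditional branch that tests the whole 32-bit operand and `i1_to_branch_operand` is a hypothetical helper that keeps only the defined low bit:

```python
def br_if_taken(cond_i32: int) -> bool:
    """A conditional branch is taken when the full 32-bit operand is nonzero."""
    return (cond_i32 & 0xFFFFFFFF) != 0


def i1_to_branch_operand(reg: int) -> int:
    """Keep only bit 0, the only bit an i1 value is guaranteed to define."""
    return reg & 1
```

A register holding `0xABCD0000` represents an i1 `false`, since bit 0 is clear. Fed directly to the branch it would be treated as true, whereas masking first gives the intended result.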

Fixes PR38019.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336138 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 19:45:57 +00:00
Krzysztof Parzyszek
1c305da514 [X86] Add phony registers for high halves of regs with low halves
Add registers still missing after r328016 (D43353):
- for bits 15-8  of SI, DI, BP, SP (*H), and R8-R15 (*BH),
- for bits 31-16 of R8-R15 (*WH).

Thanks to Craig Topper for pointing it out.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336134 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 19:05:09 +00:00
Craig Topper
e5e0703516 [X86] Don't use aligned load/store instructions for fp128 if the load/store isn't aligned.
Similarly, don't fold fp128 loads into SSE instructions if the load isn't aligned, unless we're targeting an AMD CPU that doesn't check alignment on arithmetic instructions.

Should fix PR38001

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336121 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 17:01:54 +00:00
Amara Emerson
636e853b42 [AArch64][GlobalISel] Any-extend vararg parameters to stack slot size on Darwin.
We currently don't any-extend vararg parameters before storing them to the stack
locations on Darwin. SelectionDAG, however, does this, and so there is user code
in the wild that inadvertently relies on this extension. This can manifest
in cases where the value stored is (int)0, but the actual parameter is interpreted
by va_arg as a pointer, and so not extending to 64 bits causes the callee to
load additional undefined bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336120 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 16:39:09 +00:00
Sam Clegg
20c17e173f [WebAssembly] Convert remaining tests from elf to wasm output format
Differential Revision: https://reviews.llvm.org/D48748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336116 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 16:03:49 +00:00
Simon Pilgrim
273949e71b [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336113 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 15:14:07 +00:00
Simon Pilgrim
3e5a75b28d [X86][SSE] Add v8i16 shift test for 2 shift values that doesn't match basic blend
We have special case support for 2 shift values for basic blends, but irregular shift patterns end up using the generic lowering, despite shuffle lowering being good enough to handle more complex blends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336112 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 14:53:41 +00:00
Petar Jovanovic
783ea2aeed [Mips][FastISel] Do not duplicate condition while lowering branches
This change fixes the issue that arises when we duplicate condition from
the predecessor block. If the condition's arguments are not considered alive
across the blocks, fast regalloc gets confused and starts generating reloads
from the slots that have never been spilled to. This change also leads to
smaller code given that, unlike on architectures with condition codes, on
Mips we can branch directly on register value, thus we gain nothing by
duplication.

Patch by Dragan Mladjenovic.

Differential Revision: https://reviews.llvm.org/D48642


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336084 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 08:56:57 +00:00
QingShan Zhang
38051ae89a [PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's multiple for i64 pre-inc load/store
For the case below, pre-inc prep thinks it is a good candidate for using pre-inc for the bucket, but the 64-bit integer load/store with update (pre-inc) instructions on Power require the displacement field to be DS-form (a multiple of 4). Since the constraint can't be satisfied, we have to do some fix-ups later. As shown below, the original loads/stores may already be well-formed, so the transformation makes things worse.

unsigned long long result = 0;
unsigned long long foo(char *p, unsigned long long n) {
  for (unsigned long long i = 0; i < n; i++) {
    unsigned long long x1 = *(unsigned long long *)(p - 50000 + i);
    unsigned long long x2 = *(unsigned long long *)(p - 61024 + i);
    unsigned long long x3 = *(unsigned long long *)(p - 62048 + i);
    unsigned long long x4 = *(unsigned long long *)(p - 64096 + i);
    result *= x1 * x2 * x3 * x4;
  }
  return result;
}
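
The DS-form constraint mentioned above can be sketched as a simple predicate. The function name is hypothetical; it assumes the usual encoding in which the displacement is a signed 16-bit value whose low two bits must be zero.

```python
def is_ds_form_displacement(disp: int) -> bool:
    """True if disp fits a signed 16-bit field and is a multiple of 4."""
    return -32768 <= disp <= 32767 and disp % 4 == 0
```

A candidate whose displacement fails this check cannot be encoded directly in a 64-bit update-form load/store, so extra fix-up code would be needed.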

Patch by jedilyn(Kewen Lin).

Differential Revision: https://reviews.llvm.org/D48813 

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336074 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 05:46:09 +00:00