Commit Graph

18976 Commits

Author SHA1 Message Date
Craig Topper
fa494a82ff [AVX-512] Remove intrinsics for valignd/q and autoupgrade them to native shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287744 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 06:54:55 +00:00
Kuba Mracek
a64c974a27 [xray] Add XRay support for Mach-O in CodeGen
Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support.

Differential Revision: https://reviews.llvm.org/D26983



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287734 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-23 02:07:04 +00:00
Simon Pilgrim
2d1bdd9d61 [X86][AVX512DQ] Add fp <-> int tests for AVX512DQ/AVX512DQ+VL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287706 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 22:04:50 +00:00
Nemanja Ivanovic
aa687a6ca9 [PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LE
This patch corresponds to review:
https://reviews.llvm.org/D26861

It also fixes PR30730.

Committing on behalf of Lei Huang.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287679 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 19:02:07 +00:00
Simon Pilgrim
fcc1f76b4d [X86][SSE] Combine UNPCKL(FHADD,FHADD) -> FHADD for v2f64 shuffles.
This occurs during UINT_TO_FP v2f64 lowering. 

We can easily generalize this to other horizontal ops (FHSUB, PACKSS, PACKUS) as required - we are doing something similar with PACKUS in lowerV2I64VectorShuffle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287676 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 17:50:06 +00:00
Simon Pilgrim
60c1ebdc81 Fix line endings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287638 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 13:27:29 +00:00
Benjamin Kramer
ef60bb05ab [wasm] hack around test failure after r287553.
This test is very brittle as small changes to block layout break the
check patterns. Hack around a change one more time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287637 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 13:13:33 +00:00
Simon Pilgrim
3d4bb54474 [SelectionDAG] ComputeNumSignBits of TRUNCATE operations
Add basic ComputeNumSignBits support for TRUNCATE ops for cases where the source's number of sign bits overlaps with the truncated size.

Improves X86 SIGN_EXTEND_IN_REG vector cases which were needlessly sign extending boolean vector results.

Differential Revision: https://reviews.llvm.org/D26851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287635 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 11:29:19 +00:00
Craig Topper
aa9982b218 [AVX-512] Add support for commuting VPERMT2(B/W/D/Q/PS/PD) to/from VPERMI2(B/W/D/Q/PS/PD).
Summary:
The index and one of the table operands can be swapped by changing the opcode to the other version. Neither of these operands are the one that can load from memory so this can't be used to increase memory folding opportunities.

We need to handle the unmasked forms and the kz forms. Since the load operand isn't being commuted we can commute the load and broadcast instructions too.

Reviewers: igorb, delena, Ayal, Farhana, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287621 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 04:57:34 +00:00
Craig Topper
000f65d1b1 [AVX-512] Add support for changing the element size of PALIGNR/VALIGND/VALIGNQ shuffles if they feed a vselect with a different type
Summary:
Shuffle lowering widens the element size of a shuffle if elements are contiguous. This is sometimes help because wider element types have more shuffle options. If the shuffle is one of the arguments to a vselect this shuffle widening can introduce a bitcast between the vselect and the shuffle. This will prevent isel from selecting a masked operation. If the shuffle can be written equally efficiently with a different element size to match the vselect type we should change the shuffle type to allow masking.

This patch does this conversion for all VALIGND/VALIGNQ sizes. It also supports turning 128-bit PALIGNR into VALIGND/VALIGNQ. This fixes the case shown in PR31018.

I plan to add support for more operations in future patches.

Reviewers: RKSimon, zvi, delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287612 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 03:51:53 +00:00
Stanislav Mekhanoshin
64620b1c31 [AMDGPU] Fix multiple vreg definitions in si-lower-control-flow
Differential Revision: https://reviews.llvm.org/D26939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-22 01:42:34 +00:00
Matt Arsenault
8b1782955d DAG: Ignore call site attributes when emitting target intrinsic
A target intrinsic may be defined as possibly reading memory,
but the call site may have additional knowledge that it doesn't read
memory. The intrinsic lowering will expect the pessimistic
assumption of the intrinsic definition, so the chain should
still be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287593 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 22:56:42 +00:00
Geoff Berry
e6b0810799 [AArch64LoadStoreOptimizer] Don't treat write to XZR/WZR as a clobber.
Summary:
When searching for load/store instructions to pair/merge don't treat
writes to WZR/XZR as clobbers since they don't change the value read
from WZR/XZR (which is always 0).

Reviewers: mcrosier, junbuml, jmolloy, t.p.northover

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287592 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 22:51:10 +00:00
Simon Dardis
981f5857a8 [mips] Add tests for half precision floating point support.
These should have been part of r287349.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287574 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 20:34:10 +00:00
Jun Bum Lim
b68036c70c [CodeGenPrep] Skip merging empty case blocks
Summary: Merging an empty case block into the header block of switch could cause
ISel to add COPY instructions in the header of switch, instead of the case
block, if the case block is used as an incoming block of a PHI. This could
potentially increase dynamic instructions, especially when the switch is in a
loop. I added a test case which was reduced from the benchmark I was targetting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl

Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287553 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 16:47:28 +00:00
Simon Pilgrim
ab814ecdff [X86][SSE] Add SSE reciprocal estimate tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287543 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 15:28:21 +00:00
Simon Pilgrim
09211df264 [SelectionDAG] Add ComputeNumSignBits support for CONCAT_VECTORS opcode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287541 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 14:36:19 +00:00
Benjamin Kramer
b65e1c7f35 Adjust arm64-irtranslator.ll test to changes from r287368
The test is currently broken, and this CL should fix it.

Patch by Adrian Kuegel!

Differential Revision: https://reviews.llvm.org/D26910

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287536 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 13:15:38 +00:00
Simon Pilgrim
f3db8f562f [X86][SSE] Allow PACKSS to be used to truncate any type of all/none sign bits input
At the moment we only use truncateVectorCompareWithPACKSS with direct vector comparison results (just one example of a known all/none signbits input).

This change relaxes the direct matching of a SETCC opcode by moving the logic up into SelectionDAG::ComputeNumSignBits and accepting any input with a known splatted signbit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287535 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 12:05:49 +00:00
Craig Topper
fd72811874 [AVX-512] Add EVEX form of VMOVZPQILo2PQIZrm to load folding tables to match SSE and AVX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287523 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 07:51:31 +00:00
Alexei Starovoitov
7362a66c70 [bpf] attempt to fix big-endian bots
attempt to fix big-endian bots failing on new dwarfdump test

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287522 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 07:26:23 +00:00
Alexei Starovoitov
db476dbf2e [bpf] fix dwarf elf relocs and line numbers
- teach RelocVisitor to recognize bpf relocations
- fix AsmInfo->PointerSize to make sure dwarf is emitted correctly
- add a test for the above

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287521 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 06:21:23 +00:00
Dean Michael Berris
4fedda5369 [XRay][AArch64] Implemented a test for the compile-time sleds emitted, and fixed a bug in the jump instruction
This patch adds a test for the assembly code emitted with XRay
instrumentation. It also fixes a bug where the operand of a jump
instruction must be not the number of bytes to jump over, but rather the
number of 4-byte instructions.

Author: rSerge

Reviewers: dberris, rengolin

Differential Revision: https://reviews.llvm.org/D26805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287516 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-21 03:01:43 +00:00
Simon Dardis
544f14c6b2 [mips] Restrict tail call optimization
The tail call optimization was being used without proper consideration of
ABI requirements for saving and restoring the GP. This patch restricts tail
call optimization to functions within the same translation unit.

Reviewers: vkalintiris

Differential Revision: https://reviews.llvm.org/D24763


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287505 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 21:23:08 +00:00
Simon Pilgrim
cbc1c49eda [X86][SSE] Add some initial combine tests that could (should?) use PACKSS
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287504 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 21:12:49 +00:00
Craig Topper
b19918a691 [AVX-512] Add tests for masked palignr/valignd/valignq shuffles, many of which show failures to fold the masking into the operation.
Many of these problems are because shuffle lowering widens element size and reduces element count when possible. This causes the shuffle to become separated from the select by a bitcast. Future patches will work to improve these cases by rewriting the shuffle back to a narrow element type if we think it can result in folding the mask.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287503 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 19:50:32 +00:00
Simon Pilgrim
d14ddee539 [X86][AVX512] Combine unary + zero target shuffles to VPERMV3 with a zero vector where possible
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287497 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 16:11:36 +00:00
Simon Pilgrim
73373f7847 [X86][AVX512] Add support for VBMI VPERMV3 target shuffle combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287496 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 15:24:38 +00:00
Simon Pilgrim
ccb6a0ca98 [X86][AVX512] Add support for VBMI VPERMV target shuffle combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 15:05:45 +00:00
Simon Pilgrim
c04ba41915 [X86][AVX512] Add some initial VBMI target shuffle combine tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287494 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 14:45:46 +00:00
Simon Pilgrim
d40266bd28 [X86][AVX512F] Add support for uint_to_fp v2i32 to v2f64 on AVX512F-only targets
Use 512-bit instructions (we already do something similar for uint_to_fp v4i32 to v4f64)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287491 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 14:03:23 +00:00
Oren Ben Simhon
88688e8d48 [X86] RegCall - Handling long double arguments
The change is part of RegCall calling convention support for LLVM.
Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack).
This review present the change and the corresponding tests.

Differential Revision: https://reviews.llvm.org/D26151


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287485 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 11:06:07 +00:00
Alexei Starovoitov
e58b8cc117 [bpf] add BPF disassembler
add BPF disassembler, so tools like llvm-objdump can be used:
$ llvm-objdump -d -no-show-raw-insn ./sockex1_kern.o

./sockex1_kern.o:	file format ELF64-BPF

Disassembly of section socket1:
bpf_prog1:
       0:	r6 = r1
       8:	r0 = *(u8 *)skb[23]
      10:	*(u32 *)(r10 - 4) = r0
      18:	r1 = *(u32 *)(r6 + 4)
      20:	if r1 != 4 goto 8
      28:	r2 = r10
      30:	r2 += -4

ld_imm64 (the only 16-byte insn) and special ld_abs/ld_ind instructions
had to be treated in a special way. The decoders for the rest of the insns
are automatically generated.

Add tests to cover new functionality.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287477 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-20 02:25:00 +00:00
Simon Pilgrim
c6fddca294 [X86][SSE] Improve PSHUFB lowering from either input
Canonicalization may leave the zeroable vector in the first input.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287461 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-19 20:41:48 +00:00
Simon Pilgrim
b799ae96d6 [X86][AVX512] Add VPERMV/VPERMV3 v64i8 byte shuffles on avx512vbmi targets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287459 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-19 20:12:34 +00:00
Simon Pilgrim
23f587bb5d [X86][AVX512] Add avx512vbmi tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287447 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-19 18:12:48 +00:00
Simon Pilgrim
cee7f67a7d [X86][AVX512] Added some more complex v64i8 shuffles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287444 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-19 17:50:14 +00:00
Konstantin Zhuravlyov
5527d64b74 [AMDGPU] Change frexp.exp intrinsic to return i16 for f16 input
Differential Revision: https://reviews.llvm.org/D26862


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287389 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 22:31:08 +00:00
Simon Pilgrim
3000aab2fb [SelectionDAG] Add knowbits support for CONCAT_VECTOR opcode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287387 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 22:21:22 +00:00
Simon Pilgrim
d4e6a9ba02 [X86] Add knownbits concat_vector test
Support coming in a future patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287385 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 21:59:38 +00:00
Geoff Berry
24feb85418 [MIRPrinter] XFAIL test for powerpc
This test introduced in r287368 is failing on powerpc for reasons
unrelated to branch probabilities.  See PR31062.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287375 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 20:08:05 +00:00
Geoff Berry
181c24a90c [MIRPrinter] Print raw branch probabilities as expected by MIRParser
Fixes PR28751.

Reviewers: MatzeB, qcolombet

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26775

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287368 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 19:37:24 +00:00
Simon Pilgrim
0f37c5c43a [X86][AVX512] Split AVX512F/AVX512VL tests to demonstrate missed int2fp opportunities without AVX512VL
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 15:31:36 +00:00
Tom Stellard
eb3384582f GlobalISel: Fix unconditional fallback with global isel abort is disabled
Reviewers: t.p.northover, ab, qcolombet

Subscribers: mehdi_amini, vkalintiris, wdng, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D26765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287344 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 14:14:35 +00:00
Tom Stellard
a006842d47 AMDGPU/SI: Remove zero_extend patterns for i16 ops selected to 32-bit insts
Summary:
The 32-bit instructions don't zero the high 16-bits like the 16-bit
instructions do.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D26828

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 13:53:34 +00:00
Nicolai Haehnle
1e27a618c6 AMDGPU: Fix legalization of MUBUF instructions in shaders
Summary:
The addr64-based legalization is incorrect for MUBUF instructions with idxen
set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions.  This affects
e.g.  shaders that access buffer textures.

Since we never actually need the addr64-legalization in shaders, this patch
takes the easy route and keys off the calling convention.  If this ever
affects (non-OpenGL) compute, the type of legalization needs to be chosen
based on some TSFlag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D26747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287339 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 11:55:52 +00:00
Ehsan Amiri
3d73fcad55 [Power9] Add patterns for vnegd, vnegw
Exploit new instructions by adding patterns to .td file.
https://reviews.llvm.org/D26551



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287334 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 11:05:55 +00:00
Simon Pilgrim
f04d638332 [X86][AVX2] Add v8i32->v8i64 mul test (PR30845)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287332 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 11:00:36 +00:00
Ehsan Amiri
072e86da0c [PPC][DAGCombine] Convert SETCC to subtract when the result is zero extended
When we see a SETCC whose only users are zero extend operations, we can replace
it with a subtraction. This results in doing all calculations in GPRs and
avoids CR use.

Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are
ways that this can be extended. For example for signed condition codes. In that
case we will be introducing additional sign extend instructions, so more careful
profitability analysis may be required.

Another direction to extend this is for equal, not equal conditions. Also when
users of SETCC are any_ext or sign_ext, we might be able to do something 
similar.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287329 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 10:41:44 +00:00
Craig Topper
e5e77e4a92 [AVX-512] Replace masked 16-bit element variable shift intrinsics with new unmasked versions and selects.
The same thing was done to 32-bit and 64-bit element sizes previously.

This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287312 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 05:04:44 +00:00