Commit Graph

45189 Commits

Author SHA1 Message Date
Diana Picus
5281112161 [ARM GlobalISel] Move the check for Thumb higher up
We're currently bailing out for Thumb targets while lowering formal
parameters, but there used to be some other checks before it, which
could've caused some functions (e.g. those without formal parameters) to
sneak through unnoticed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317312 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03 10:30:12 +00:00
Martin Storsjo
19a3ba35df [AArch64] Use dwarf exception handling on MinGW
Ideally we should probably produce WinEH here as well, but until
then, we can use dwarf exceptions, without any further changes
required in clang, libunwind or libcxxabi.

Differential Revision: https://reviews.llvm.org/D39535

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317304 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03 07:33:20 +00:00
Craig Topper
c43a693efb [X86] Remove PALIGNR/VALIGN handling from combineBitcastForMaskedOp and move to isel patterns instead. Prefer 128-bit VALIGND/VALIGNQ over PALIGNR during lowering when possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317299 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03 06:48:02 +00:00
Sriraman Tallam
1bd292583c Avoid PLT for external calls when attribute nonlazybind is used.
Differential Revision: https://reviews.llvm.org/D39065

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317292 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-03 00:10:19 +00:00
Quentin Colombet
d8375d7368 [AArch64][RegisterBankInfo] Add mapping for G_FPEXT.
This fixes http://llvm.org/PR32560. We were missing a description for
half floating point type and as a result were using the FPR 32 mapping.
Because of the size mismatch the generic code was complaining that the
default mapping is not appropriate. Fix the mapping description so that
the default mapping can be properly applied.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317287 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 23:38:19 +00:00
Quentin Colombet
87cdca2231 [AArch64][RegisterBankInfo] Add FPR16 support in value mapping.
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317286 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 23:38:13 +00:00
Craig Topper
6d06c89303 [X86] Give AVX512VL instructions priority over their AVX equivalents.
I thought we had gotten all these priority bugs worked out, but I guess not.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317283 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 23:23:37 +00:00
Konstantin Zhuravlyov
f79fab6f98 AMDGPU: Fix warning discovered by r317266 [-Wunused-private-field]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317280 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 22:35:22 +00:00
Krzysztof Parzyszek
da35e5e8be [Hexagon] Prefer L2_loadrub_io over L4_loadrub_rr
If the offset is an immediate, avoid putting it in a register
to get Rs+Rt<<#0.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317275 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 21:56:59 +00:00
Konstantin Zhuravlyov
37bbee84d8 AMDGPU: Remove outdated fixme (it was already fixed)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317266 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 20:48:06 +00:00
Simon Dardis
652842ec9a [mips] Use register scavenging with MSA.
MSA stores and loads to the stack are more likely to require an
emergency GPR spill slot due to the smaller offsets available
with those instructions.

Handle this by overestimating the size of the stack by determining
the largest offset presuming that all callee save registers are
spilled and accounting of incoming arguments when determining
whether an emergency spill slot is required.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D39056


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317204 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 12:47:22 +00:00
Sam Parker
b7c0518566 [ARM] and, or, xor and add with shl combine
The generic dag combiner will fold:

(shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
(shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)

This can create constants which are too large to use as an immediate.
Many ALU operations are also able of performing the shl, so we can
unfold the transformation to prevent a mov imm instruction from being
generated.

Other patterns, such as b + ((a << 1) | 510), can also be simplified
in the same manner.

Differential Revision: https://reviews.llvm.org/D38084


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317197 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 10:43:10 +00:00
Andrew V. Tischenko
4746ebdd8b The patch updates sched numbers for YMM AVX instrs such as VMOVx, VORx, VXOR, VPERMILx, VBROADCASTx, etc.
PR32857 should be closed.
Differential Revision: https://reviews.llvm.org/D39227


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317196 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 10:33:41 +00:00
Petar Jovanovic
5616b72dcf Revert "Correct dwarf unwind information in function epilogue for X86"
This reverts r317100 as it introduced sanitizer-x86_64-linux-autoconf
buildbot failure (build #15606).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317136 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 23:05:52 +00:00
Craig Topper
01dd53d4bd [X86] Use foreach in X86.td to combine some of the CPU names that are obviously aliases. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317134 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 22:15:49 +00:00
Craig Topper
3f1a9263fc [X86] Add CMOV feature to 'i686' processor, making it a proper alias of pentiumpro which I believe it should be.
This is consistent with current gcc behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317133 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 22:15:40 +00:00
Simon Pilgrim
d005962cad [X86][SSE] Add PACKUS support to LowerTruncate
Similar to the existing code to lower to PACKSS, we can use PACKUS if the input vector's leading zero bits extend all the way to the packed/truncated value.

We have to account for pre-SSE41 targets not supporting PACKUSDW

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317128 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 21:52:29 +00:00
Craig Topper
e58f980c35 [X86] Add custom code to EVEX to VEX pass to turn unmasked 128-bit VPALIGND/Q into VPALIGNR if the extended registers aren't being used.
This will enable us to prefer VALIGND/Q during shuffle lowering in order to get the extended register encoding space when BWI isn't available. But if we end up not using the extended registers we can switch VPALIGNR for the shorter VEX encoding.

Differential Revision: https://reviews.llvm.org/D39401

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317122 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 21:00:59 +00:00
Konstantin Zhuravlyov
63dcaeaca4 AMDGPU: Fix set but not used warnings related to AMDGPUAS
Differential Revision: https://reviews.llvm.org/D39499


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317114 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 19:12:38 +00:00
Craig Topper
949005a477 [X86] Prevent fast isel from folding loads into the instructions listed in hasPartialRegUpdate.
This patch moves the check for opt size and hasPartialRegUpdate into the lower level implementation of foldMemoryOperandImpl to catch the entry point that fast isel uses.

We're still folding undef register instructions in AVX that we should also probably disable, but that's a problem for another patch.

Unfortunately, this requires reordering a bunch of functions which is why the diff is so large. I can do the function reordering separately if we want.

Differential Revision: https://reviews.llvm.org/D39402

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317112 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 18:10:06 +00:00
Graham Yiu
3031a585fb Adds code to PPC ISEL lowering to recognize half-word inserts from vector_shuffles, and use P9 shift and vector insert instructions instead of vperm.
Differential Revision: https://reviews.llvm.org/D34160

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317111 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 18:06:56 +00:00
Craig Topper
2a4a9564d6 [X86] Add 64-bit int to float/double conversion with AVX to X86FastISel::X86SelectSIToFP
Summary:
[X86] Teach fast isel to handle i64 sitofp with AVX.

For some reason we only handled i32 sitofp with AVX. But with SSE only we support i64 so we should do the same with AVX.

Also add i686 command lines for the 32-bit tests. 64-bit tests are in a separate file to avoid a fast-isel abort failure in 32-bit mode.

Reviewers: RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39450

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317102 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 16:23:06 +00:00
Andrew V. Tischenko
078a3381eb Update VCVTx, VMOVNTPx and VROUNDYPx instructions scheduling on btver2.
Differential Revision: https://reviews.llvm.org/D39059


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317101 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 16:10:20 +00:00
Petar Jovanovic
bb38652ad4 Correct dwarf unwind information in function epilogue for X86
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.

It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.

The second part is platform independent and ensures that:

- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
  different passes. This is done in a late pass by analyzing information
  about cfa offset and cfa register in BBs and inserting additional CFI
  directives where necessary.

Changed CFI instructions so that they:

- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal

Added CFIInstrInserter pass:

- analyzes each basic block to determine cfa offset and register valid at
  its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
  incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
  rule for calculating CFA

Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.

CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.


Patch by Violeta Vukobrat.

Differential Revision: https://reviews.llvm.org/D35844


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317100 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 16:04:11 +00:00
Simon Pilgrim
41efea75bc [X86][SSE] Begun generalizing truncateVectorWithPACKSS to work with PACKSS/PACKUS functions
Renamed to truncateVectorWithPACK

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317098 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 15:31:51 +00:00
Roger Ferrer Ibanez
697969187c Revert r313618 "[ARM] Use ADDCARRY / SUBCARRY"
That change causes PR35103, so reverting until I figure it out.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317092 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 14:06:57 +00:00
NAKAMURA Takumi
598658d792 Fix warnings discovered by rL317076. [-Wunused-private-field]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317091 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 13:47:55 +00:00
NAKAMURA Takumi
800f768b3d Suppress a warning discovered by rL317076. [-Wunused-private-field]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317090 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 13:47:51 +00:00
Simon Pilgrim
2452271635 [X86][SSE] Truncate with PACKSS any input with sufficient sign-bits
So far we've only been using PACKSS truncations with 'all-bits or zero-bits' patterns (vector comparison results etc.). When really we can safely use it for any case as long as the number of sign bits reach down to the last 16-bits (or 8-bits if we're truncating to bytes).

The next steps after this is add the equivalent support for PACKUS and to support packing to sub-128 bit vectors for truncating stores etc.

Differential Revision: https://reviews.llvm.org/D39476

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317086 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 11:47:44 +00:00
Craig Topper
5d7d418e3b [X86] Add more type qualifiers to INSERT_SUBREG operations in rotate patterns so they don't get created with a v64i8 type.
Not sure why tablegen didn't error on this.

Fixes PR35158.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317079 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 07:11:32 +00:00
Craig Topper
38316c07ef [X86] Add AVX512 support to X86FastISel::fastMaterializeFloatZero.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317059 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 00:47:45 +00:00
Benjamin Kramer
63e6891819 [AMDGPU] Clean up symbols in the global namespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317051 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 23:21:30 +00:00
Marek Olsak
0ce6825d9e AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset
Summary:
Apps that benefit:
- alien isolation
- bioshock infinite
- civilization: beyond earth
- company of heroes 2
- dirt showdown
- dota 2
- F1 2015
- grid autosport
- hitman
- legend of grimrock
- serious sam 3: bfe
- shadow warrior
- talos principle
- total war: warhammer
- UE4 demos: effects cave, elemental, sun temple

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 21:06:42 +00:00
Reid Kleckner
95c7aaf451 [X86][AsmParser] Treat '%' as the modulo operator under Intel syntax
It can't be a register prefix, anyway. This is consistent with the masm
docs on MSDN: https://msdn.microsoft.com/en-us/library/t4ax90d2.aspx

This is a straight-forward extension of our support for "MOD"
implemented in https://reviews.llvm.org/D33876 / r306425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317011 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 16:47:38 +00:00
Simon Pilgrim
efb1cd4c28 [X86][SSE] Add VSRLI/VSRAI/VSLLI demanded elts support to computeKnownBits/ComputeNumSignBits
Mainly a perf improvements as most combines will have occurred before we lower to these instructions



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317005 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 16:06:21 +00:00
Michael Zuckerman
51db7fc904 [AVX512] Adding new patterns for extract_subvector of vXi1
extract subvector of vXi1 from vYi1 is poorly supported by LLVM and most of the time end with an assertion.
This patch fixes this issue by adding new patterns to the TD file.

Reviewers:
1. guyblank
2. igorb
3. zvi
4. ayman
5. craig.topper

Differential Revision: https://reviews.llvm.org/D39292

Change-Id: Ideb4d7e946c8d40cfce2920891f2d89fe64c58f8

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316981 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 10:00:19 +00:00
Craig Topper
a4989ddb03 [X86] Make AVX512_512_SET0 XMM16-31 lower to 128-bit XOR when AVX512VL is enabled. Use 128-bit VLX instruction when VLX is enabled.
Unfortunately, this weakens our ability to do domain fixing when AVX512DQ is not enabled, but it is consistent with our 256-bit behavior.

Maybe we should add custom handling to domain fixing to allow EVEX integer XOR/AND/OR/ANDN to switch to VEX encoded fp instructions if the high registers aren't being used?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316978 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 06:01:04 +00:00
Craig Topper
432bab12b8 [X86] Clang-format some code. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316973 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 02:34:29 +00:00
Javed Absar
8dddb8d5bb [AArch64]: range loopify frame-lowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316960 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 22:00:06 +00:00
Craig Topper
fb90a6544e [X86] Add AVX512 support to fast isel's X86ChooseCmpOpcode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316955 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 21:09:19 +00:00
Stefan Pintilie
b2d41e210a Revert "[PowerPC] Try to simplify a Swap if it feeds a Splat"
Revert r316478.
A test case has failed.
Will recommit this change once we find and fix the failure.

This reverts commit 7c330fabae.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316952 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 19:55:38 +00:00
Jina Nahias
1cf2bb25c9 [X86][AVX512] Adding a pattern for broadcastm intrinsic.
Differential Revision: https://reviews.llvm.org/D38312

Change-Id: I71c8605a8e4c98013ef25289694afc5cfd46bb0b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316921 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 16:37:28 +00:00
Rafael Espindola
afc231c459 Move isDSOLocal check and add a comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316920 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 16:32:31 +00:00
Fangrui Song
f98d7636f4 [PPC CodeGen] Fix the bitreverse.i64 intrinsic.
Summary: The two 32-bit words were swapped. Update a test omitted in reverted r316270.

Reviewers: jtony, aaron.ballman

Subscribers: nemanjai, kbarton

Differential Revision: https://reviews.llvm.org/D39163

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316916 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 16:03:44 +00:00
Craig Topper
87eb2fe42c [X86] Make sure we don't create locked inc/dec instructions when the carry flag is being used.
Summary:
INC/DEC don't update the carry flag so we need to make sure we don't try to use it.

This patch introduces new X86ISD opcodes for locked INC/DEC. Teaches lowerAtomicArithWithLOCK to emit these nodes if INC/DEC is not slow or the function is being optimized for size. An additional flag is added that allows the INC/DEC to be disabled if the caller determines that the carry flag is being requested.

The test_sub_1_cmp_1_setcc_ugt test is currently showing this bug. The other test case changes are recovering cases that were regressed in r316860.

This should fully fix PR35068 finishing the fix started in r316860.

Reviewers: RKSimon, zvi, spatel

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39411

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316913 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:51:37 +00:00
Craig Topper
7f36a3ae35 [X86] Remove AVX512 early out from X86FastISel::X86SelectCmp.
This shouldn't be needed anymore since i1 isn't a legal type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316912 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:50:11 +00:00
Yaxun Liu
a52756b2c9 [AMDGPU] Emit metadata for hidden arguments for kernel enqueue
Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernels are
marked with calls-enqueue-kernel function attribute by
AMDGPUOpenCLEnqueueKernelLowering pass and later on
hidden kernel arguments metadata HiddenDefaultQueue and
HiddenCompletionAction are emitted for them.

Differential Revision: https://reviews.llvm.org/D39255


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316907 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:30:28 +00:00
Clement Courbet
4ccf677f27 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For examples, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:19:33 +00:00
Krzysztof Parzyszek
4d7518052e [Hexagon] Allow the RDF optimizations to be run in .mir testcases
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:11:52 +00:00
Javed Absar
3db281db10 [GlobalISel|ARM] : Allow legalizing G_FSUB
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316902 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 13:51:56 +00:00