127579 Commits

Author SHA1 Message Date
Craig Topper
0eff36e127 [X86] Guard against leaving a dangling node in combineTruncateWithSat.
When handling the packus pattern for i32->i8 we do a two step
process using a packss to i16 followed by a packus to i8. If the
final i8 step is a type with less than 64-bits the packus step
will return SDValue(), but the i32->i16 step might have succeeded.
This leaves the nodes from the middle step dangling.

Guard against this by pre-checking that the number of elements is
at least 8 before doing the middle step.

With that check in place this should mean the only other
case the middle step itself can fail is when SSE2 is disabled. So
add an early SSE2 check then just assert that neither the middle
or final step ever fail.

llvm-svn: 374460
2019-10-10 21:46:52 +00:00
Marcello Maggioni
c808ecf651 [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values.
In GISel we have both G_CONSTANT and G_FCONSTANT, but because
in GISel we don't really have a concept of Float vs Int value
the only difference between the two is where the data originates
from.

What both G_CONSTANT and G_FCONSTANT return is just a bag of bits
with the constant representation in it.

By making getConstantVRegVal() return G_FCONSTANTs bit representation
as well we allow ConstantFold and other things to operate with
G_FCONSTANT.

Adding tests that show ConstantFolding to work on mixed G_CONSTANT
and G_FCONSTANT sources.

Differential Revision: https://reviews.llvm.org/D68739

llvm-svn: 374458
2019-10-10 21:46:26 +00:00
Stanislav Mekhanoshin
09a67d006e [AMDGPU] Handle undef old operand in DPP combine
It was missing an undef flag.

Differential Revision: https://reviews.llvm.org/D68813

llvm-svn: 374455
2019-10-10 21:32:41 +00:00
Rong Xu
9154340be5 [ValueTracking] Improve pointer offset computation for cases of same base
This patch improves the handling of pointer offset in GEP expressions where
one argument is the base pointer. isPointerOffset() is being used by memcpyopt
where current code synthesizes consecutive 32 bytes stores to one store and
two memset intrinsic calls. With this patch, we convert the stores to one
memset intrinsic.

Differential Revision: https://reviews.llvm.org/D67989

llvm-svn: 374454
2019-10-10 21:30:43 +00:00
Evandro Menezes
ab3432d4a6 [InstCombine] Add test case for PR43617 (NFC)
Also, refactor check in `LibCallSimplifier::optimizeLog()`.

llvm-svn: 374453
2019-10-10 21:29:10 +00:00
Alina Sbirlea
4f6544812b [MemorySSA] Additional handling of unreachable blocks.
Summary:
Whenever we get the previous definition, the assumption is that the
recursion starts ina  reachable block.
If the recursion starts in an unreachable block, we may recurse
indefinitely. Handle this case by returning LoE if the block is
unreachable.

Resolves PR43426.

Reviewers: george.burgess.iv

Subscribers: Prazek, sanjoy.google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68809

llvm-svn: 374447
2019-10-10 20:43:06 +00:00
David Greene
5e281d2d23 [System Model] [TTI] Move default cache/prefetch implementations
Move the default implementations of cache and prefetch queries to
TargetTransformInfoImplBase and delete them from NoTIIImpl.  This brings these
interfaces in line with how other TTI interfaces work.

Differential Revision: https://reviews.llvm.org/D68804

llvm-svn: 374446
2019-10-10 20:39:27 +00:00
Zachary Turner
0f81feda8a [PDB] Fix bug when using multiple PCH header objects with the same name.
A common pattern in Windows is to have all your precompiled headers
use an object named stdafx.obj.  If you've got a project with many
different static libs, you might use a separate PCH for each one of
these.

During the final link step, a file from A might reference the PCH
object from A, but it will have the same name (stdafx.obj) as any
other PCH from another project.  The only difference will be the
path.  For example, A might be A/stdafx.obj while B is B/stdafx.obj.

The existing algorithm checks only the filename that was passed on
the command line (or stored in archive), but this is insufficient in
the case where relative paths are used, because depending on the
command line object file / library order, it might find the wrong
PCH object first resulting in a signature mismatch.

The fix here is to simply check whether the absolute path of the
PCH object (which is stored in the input obj file for the file that
references the PCH) *ends with* the full relative path of whatever
is specified on the command line (or is in the archive).

Differential Revision: https://reviews.llvm.org/D66431

llvm-svn: 374442
2019-10-10 20:25:51 +00:00
Jordan Rose
fa21e7962b ADT: Save a word in every StringSet entry
Add a specialization to StringMap (actually StringMapEntry) for a
value type of NoneType (the type of llvm::None), and use it for
StringSet. This'll save us a word from every entry in a StringSet,
used for alignment with the size_t that stores the string length.

I could have gone all the way to some kind of empty base class
optimization, but that seemed like overkill. Someone can consider
adding that in the future, though.

https://reviews.llvm.org/D68586

llvm-svn: 374440
2019-10-10 20:22:53 +00:00
Craig Topper
5e7bacf493 [X86] Use packusdw+vpmovuswb to implement v16i32->V16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX.
If we've disable zmm registers, the v16i32 will need to be split. This split will propagate through min/max the truncate. This creates two sequences that need to be concatenated back to v16i8. We can instead use packusdw to do part of the clamping, truncating, and concatenating all at once. Then we can use a vpmovuswb to finish off the clamp.

Differential Revision: https://reviews.llvm.org/D68763

llvm-svn: 374431
2019-10-10 19:40:44 +00:00
Nico Weber
ecefcc7891 win: Move Parallel.h off concrt to cross-platform code
r179397 added Parallel.h and implemented it terms of concrt in 2013.

In 2015, a cross-platform implementation of the functions has appeared
and is in use everywhere but on Windows (r232419).  r246219 hints that
<thread> had issues in MSVC2013, but r296906 suggests they've been fixed
now that we require 2015+.

So remove the concrt code. It's less code, and it sounds like concrt has
conceptual and performance issues, see PR41198.

I built blink_core.dll in a debug component build with full symbols and
in a release component build without any symbols.  I couldn't measure a
performance difference for linking blink_core.dll before and after this
patch.

Differential Revision: https://reviews.llvm.org/D68820

llvm-svn: 374421
2019-10-10 18:57:23 +00:00
Xiangling Liao
404b38b311 [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcode
Add a helper function getMCSymbolForTOCPseudoMO to clean up PPCAsmPrinter
a little bit.

Differential Revision: https://reviews.llvm.org/D68721

llvm-svn: 374420
2019-10-10 18:56:42 +00:00
Reid Kleckner
70a99e77f2 Print quoted backslashes in LLVM IR as \\ instead of \5C
This improves readability of Windows path string literals in LLVM IR.
The LLVM assembler has supported \\ in IR strings for a long time, but
the lexer doesn't tolerate escaped quotes, so they have to be printed as
\22 for now.

llvm-svn: 374415
2019-10-10 18:31:57 +00:00
Nico Weber
c82a4f49bc Fix Windows build after r374381
llvm-svn: 374413
2019-10-10 18:20:16 +00:00
Reid Kleckner
c8d38b27a7 Remove strings.h include to fix GSYM Windows build
Fifth time's the charm.

llvm-svn: 374411
2019-10-10 18:17:24 +00:00
Greg Clayton
a8235efc12 Unbreak buildbots.
llvm-svn: 374410
2019-10-10 18:13:13 +00:00
Greg Clayton
89c18fb30a Fix buildbots by using memset instead of bzero.
llvm-svn: 374409
2019-10-10 18:11:49 +00:00
Michael Liao
b70f431791 Fix build by adding the missing dependency.
llvm-svn: 374406
2019-10-10 18:04:52 +00:00
Greg Clayton
f8926d8dc5 Unbreak llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast buildbot.
llvm-svn: 374398
2019-10-10 17:52:33 +00:00
Sanjay Patel
fa57be68ec [DAGCombiner] fold select-of-constants to shift
This reverses the scalar canonicalization proposed in D63382.

Pre: isPowerOf2(C1)
%r = select i1 %cond, i32 C1, i32 0
=>
%z = zext i1 %cond to i32
%r = shl i32 %z, log2(C1)

https://rise4fun.com/Alive/Z50

x86 already tries to fold this pattern, but it isn't done
uniformly, so we still see a diff. AArch64 probably should
enable the TLI hook to benefit too, but that's a follow-on.

llvm-svn: 374397
2019-10-10 17:52:02 +00:00
Greg Clayton
1bedfd683c Unbreak windows buildbots.
llvm-svn: 374396
2019-10-10 17:49:33 +00:00
Greg Clayton
14bc9870e9 Add GsymCreator and GsymReader.
This patch adds the ability to create GSYM files with GsymCreator, and read them with GsymReader. Full testing has been added for both new classes.

This patch differs from the original patch https://reviews.llvm.org/D53379 in that is uses a StringTableBuilder class from llvm instead of a custom version. Support for big and little endian files has been added. If the endianness matches the current host, we use efficient extraction for the header, address table and address info offset tables.

Differential Revision: https://reviews.llvm.org/D68744

llvm-svn: 374381
2019-10-10 17:10:11 +00:00
David Green
61e346ad96 [ARM] VQSUB instruction
Same as VQADD, VQSUB can be selected from llvm.ssub.sat intrinsics.

Differential Revision: https://reviews.llvm.org/D68567

llvm-svn: 374377
2019-10-10 16:34:30 +00:00
David Green
733e784b92 [Codegen] Alter the default promotion for saturating adds and subs
The default promotion for the add_sat/sub_sat nodes currently does:
   1. ANY_EXTEND iN to iM
   2. SHL by M-N
   3. [US][ADD|SUB]SAT
   4. L/ASHR by M-N
If the promoted add_sat or sub_sat node is not legal, this can produce code
that effectively does a lot of shifting (and requiring large constants to be
materialised) just to use the overflow flag. It is simpler to just do the
saturation manually, using the higher bitwidth addition and a min/max against
the saturating bounds. That is what this patch attempts to do.

Differential Revision: https://reviews.llvm.org/D68643

llvm-svn: 374373
2019-10-10 16:04:49 +00:00
Kadir Cetinkaya
6959e479b6 Fix assertions disabled builds after rL374367
llvm-svn: 374372
2019-10-10 16:04:45 +00:00
Sanjay Patel
ab5b3e32b1 [DAGCombiner] reduce code duplication; NFC
llvm-svn: 374370
2019-10-10 15:38:29 +00:00
Guillaume Chatelet
c0ceeabb02 [Alignment][NFC] Use llv::Align in GISelKnownBits
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68786

llvm-svn: 374369
2019-10-10 15:38:22 +00:00
Yonghong Song
8e8851874c [BPF] Remove relocation for patchable externs
Previously, patchable extern relocations are introduced to patch
external variables used for multi versioning in
compile once, run everywhere use case. The load instruction
will be converted into a move with an patchable immediate
which can be changed by bpf loader on the host.

The kernel verifier has evolved and is able to load
and propagate constant values, so compiler relocation
becomes unnecessary. This patch removed codes related to this.

Differential Revision: https://reviews.llvm.org/D68760

llvm-svn: 374367
2019-10-10 15:33:09 +00:00
Simon Pilgrim
0de2f12106 Fix Wdocumentation warnings. NFCI.
llvm-svn: 374364
2019-10-10 15:25:16 +00:00
Dmitri Gribenko
0c3e83e627 Revert "[FileCheck] Implement --ignore-case option."
This reverts commit r374339. It broke tests:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19066

llvm-svn: 374359
2019-10-10 14:27:14 +00:00
Simon Pilgrim
cc3d2c101e [X86] combineFMA - Convert to use isNegatibleForFree/GetNegatedExpression.
Split off from D67557.

llvm-svn: 374356
2019-10-10 14:14:12 +00:00
Simon Pilgrim
7586e67625 [X86] combineFMADDSUB - Convert to use isNegatibleForFree/GetNegatedExpression.
Split off from D67557, fixes the compile time regression mentioned in rL372756

llvm-svn: 374351
2019-10-10 13:46:44 +00:00
Jay Foad
2f06e6d163 Revert "[AMDGPU] Run unreachable-mbb-elimination after isel to clean up PHIs."
Summary:
This has been superseded by "[AMDGPU]: PHI Elimination hooks added for custom COPY insertion."

This reverts the code changes from commit 53f967f2bdb6aa7b08596880c3689d1ecad6f0ff
but keeps the test case.

Reviewers: hliao, arsenm, tpr, dstuttard

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68769

llvm-svn: 374347
2019-10-10 13:34:31 +00:00
Simon Pilgrim
a4f61c7e17 [DAG][X86] Add isNegatibleForFree/GetNegatedExpression override placeholders. NFCI.
Continuing to undo the rL372756 reversion.

Differential Revision: https://reviews.llvm.org/D67557

llvm-svn: 374345
2019-10-10 13:29:35 +00:00
Amaury Sechet
f31eb759e2 [DAGCombine] Match more patterns for half word bswap
Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68250

llvm-svn: 374340
2019-10-10 13:20:10 +00:00
Kai Nacke
28535e7629 [FileCheck] Implement --ignore-case option.
The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. case not specified by Posix).

Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D68146

llvm-svn: 374339
2019-10-10 13:15:41 +00:00
Florian Hahn
3d17a2eeec [LV][NFC] Factor out calculation of "best" estimated trip count.
This is just small refactoring to minimize changes in upcoming patch.
In the next path I'm going to introduce changes into heuristic for vectorization of "tiny trip count" loops.

Patch by Evgeniy Brevnov <evgueni.brevnov@gmail.com>

Reviewers: hsaito, Ayal, fhahn, reames

Reviewed By: hsaito

Differential Revision: https://reviews.llvm.org/D67690

llvm-svn: 374338
2019-10-10 13:07:01 +00:00
Pavel Labath
210872d9b9 MinidumpYAML: Add support for the memory info list stream
Summary:
The implementation is fairly straight-forward and uses the same patterns
as the existing streams. The yaml form does not attempt to preserve the
data in the "gaps" that can be created by setting a larger-than-required
header or entry size in the stream header, because the existing consumer
(lldb) does not make use of the information in the gap in any way, and
attempting to preserve that would make the implementation more
complicated.

Reviewers: amccarth, jhenderson, clayborg

Subscribers: llvm-commits, lldb-commits, markmentovai, zturner, JosephTremoulet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68645

llvm-svn: 374337
2019-10-10 13:05:46 +00:00
David Green
0b1f0050d2 [ARM] VQADD instructions
This selects MVE VQADD from the vector llvm.sadd.sat or llvm.uadd.sat
intrinsics.

Differential Revision: https://reviews.llvm.org/D68566

llvm-svn: 374336
2019-10-10 13:05:04 +00:00
Guillaume Chatelet
08dec91263 [Alignment][NFC] Make VectorUtils uas llvm::Align
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, rogfer01, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68784

llvm-svn: 374330
2019-10-10 12:35:04 +00:00
Simon Pilgrim
65cdfba041 Fix -Wparentheses warning. NFCI.
llvm-svn: 374326
2019-10-10 12:21:52 +00:00
Mirko Brkusanin
26b8ffd7ae [Mips] Fix 374055
EXPENSIVE_CHECKS build was failing on new test.
This is fixed by marking $ra register as undef.
Test now has -verify-machineinstrs to check for operand flags.

llvm-svn: 374320
2019-10-10 12:02:14 +00:00
Oliver Stannard
2a451dc9d1 [IfCvt][ARM] Optimise diamond if-conversion for code size
Currently, the heuristics the if-conversion pass uses for diamond if-conversion
are based on execution time, with no consideration for code size. This adds a
new set of heuristics to be used when optimising for code size.

This is mostly target-independent, because the if-conversion pass can
see the code size of the instructions which it is removing. For thumb,
there are a few passes (insertion of IT instructions, selection of
narrow branches, and selection of CBZ instructions) which are run after
if conversion and affect these heuristics, so I've added target hooks to
better predict the code-size effect of a proposed if-conversion.

Differential revision: https://reviews.llvm.org/D67350

llvm-svn: 374301
2019-10-10 09:58:28 +00:00
Matt Arsenault
a8a00a4e2a AMDGPU: Use SGPR_128 instead of SReg_128 for vregs
SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds
the additional non-allocatable TTMP registers. There's no point in
allocating SReg_128 vregs. This shrinks the size of the classes
regalloc needs to consider, which is usually good.

llvm-svn: 374284
2019-10-10 07:11:33 +00:00
Johannes Doerfert
2cea006c92 [Attributor][NFC] clang format
llvm-svn: 374281
2019-10-10 05:34:21 +00:00
Johannes Doerfert
ffed554346 [Attributor] Handle null differently in capture and alias logic
Summary:
`null` in the default address space (=AS 0) cannot be captured nor can
it alias anything. We make this clear now as it can be important for
callbacks and other cases later on. In addition, this patch improves the
debug output for noalias deduction.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68624

llvm-svn: 374280
2019-10-10 05:33:21 +00:00
Cyndy Ishida
cd2fbbf9fc Reland "[TextAPI] Introduce TBDv4"
Original Patch broke for compilations w/ gcc and exposed asan fail.
This reland repairs those bugs.

Differential Revision: https://reviews.llvm.org/D67529

llvm-svn: 374277
2019-10-10 04:24:44 +00:00
Reid Kleckner
ff93e765b6 [codeview] Try to avoid emitting .cv_loc with line zero
Summary:
Visual Studio doesn't like it while stepping. It kicks you out of the
source view of the file being stepped through and tries to fall back to
the disassembly view.

Fixes PR43530

The fix is incomplete, because it's possible to have a basic block with
no source locations at all. In this case, we don't emit a .cv_loc, but
that will result in wrong stepping behavior in the debugger if the
layout predecessor of the location-less BB has an unrelated source
location. We could try harder to find a valid location that dominates or
post-dominates the current BB, but in general it's a dataflow problem,
and one still might not exist. I left a FIXME about this.

As an alternative, we might want to consider having the middle-end check
if its emitting codeview and get it to stop using line zero.

Reviewers: akhuang

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68747

llvm-svn: 374267
2019-10-10 01:06:01 +00:00
Philip Reames
b0ac285ee2 Conservatively add volatility and atomic checks in a few places
As background, starting in D66309, I'm working on support unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86.

As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces in code loop suspicious though I can't figure out how to exercise them.

This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions.

Differential Revision: https://reviews.llvm.org/D68419

llvm-svn: 374261
2019-10-09 23:43:33 +00:00
Matt Arsenault
983c9e76c0 AMDGPU: Don't fold copies to physregs
In a future patch, this will help cleanup m0 handling.

The register coalescer handles copies from a register that
materializes an immediate, but doesn't handle move immediates
itself. The virtual register uses will often be allocated to the same
register, so there end up being no real copy.

llvm-svn: 374257
2019-10-09 22:51:42 +00:00