204613 Commits

Author SHA1 Message Date
Fangrui Song
9382312eec [DomTree] findNearestCommonDominator: assert the nodes are in tree
i.e. they cannot be unreachable from the entry (which usually indicate usage errors).
This change allows the removal of some nullptr checks.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D88758
2020-10-04 15:35:14 -07:00
Craig Topper
2fdaef6a15 [X86] Correct the implicit defs/uses for the MWAITX pseudo instructions.
MWAITX doesn't touch EFLAGS so no pseudos should def EFLAGS.

The SAVE_EBX/RBX pseudos only needs to def the EBX register that
the expansion overwrites. The EAX and ECX registers are only read.

The pseudo emitted during isel that is used by the custom inserter
shouldn't have any implicit defs or uses since everything is in
vregs.
2020-10-04 15:28:38 -07:00
Craig Topper
36c60c4742 [X86] Remove usesCustomInserter from MWAITX_SAVE_EBX and MWAITX_SAVE_RBX. NFC
These are now emitted by a CustomInserter rather than using a custom
inserter themselves.
2020-10-04 15:28:38 -07:00
Stephen Neuendorffer
bcb409d3b1 Revert "[RFC] Factor out repetitive cmake patterns for llvm-style projects"
This reverts commit e9b87f43bde8b5f0d8a79c5884fdce639b12e0ca.

There are issues with macros generating macros without an obvious simple fix
so I'm going to revert this and try something different.
2020-10-04 15:17:34 -07:00
Arthur Eubanks
e7f476cce6 [Coroutines][NewPM] Fix coroutine tests under new pass manager
Some new function parameter attributes are derived under NPM.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D88760
2020-10-04 14:19:29 -07:00
Roman Lebedev
59058ed191 [NFC][SCEV] Add a test with some patterns where we could treat inttoptr/ptrtoint as semi-transparent 2020-10-05 00:05:39 +03:00
David Blaikie
8abb392de3 llvm-dwarfdump: Skip tombstoned address ranges
Make the dumper & API a bit more informative by using the new tombstone
addresses to filter out or otherwise render more explicitly dead code
ranges.
2020-10-04 13:43:29 -07:00
Nikita Popov
37e53cdb15 [MemCpyOpt] Add tests for call slot optimization with GEPs (NFC) 2020-10-04 22:26:05 +02:00
Nikita Popov
c98933e02d [MemCpyOpt] Don't use array allocas in tests (NFC)
Apparently querying dereferenceability of array allocations is
being intentionally penalized (https://reviews.llvm.org/D41398),
so avoid using them in tests.
2020-10-04 21:50:27 +02:00
Martin Storsjö
5874815ad0 [X86] Remove an accidentally added file. NFC.
This file seems to have been accidentally added as part of commit
413577a8790407d75ba834fa5668c2632fe1851e.
2020-10-04 22:33:26 +03:00
Fangrui Song
5757d9663a [SDA] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off builds 2020-10-04 12:17:16 -07:00
LLVM GN Syncbot
90e91d175f [gn build] Port 6c6cd5f8a97 2020-10-04 19:10:39 +00:00
Craig Topper
e24e1a2fb1 [X86] Sync AESENC/DEC Key Locker builtins with gcc.
For the wide builtins, pass a single input and output pointer to
the builtins. Emit the GEPs and input loads from CGBuiltin.
2020-10-04 12:09:41 -07:00
Craig Topper
91a84f8e39 [X86] Synchronize the encodekey builtins with gcc. Don't assume void* is 16 byte aligned.
We were taking multiple pointer arguments in the builtin.
gcc accepts a single void*.

The cast from void* to _m128i* caused the IR generation to assume
the pointer was aligned.

Instead make the builtin take a single void*, emit i8* GEPs to
adjust then cast to <2 x i64>* and perform a store with align of 1.
2020-10-04 12:09:35 -07:00
Craig Topper
8d06bb420a [X86] Synchronize the loadiwkey builtin operand order with gcc version. 2020-10-04 12:09:29 -07:00
Florian Hahn
8d81d843da [VPlan] Add VPRecipeBase::toVPUser helper (NFC).
This adds a helper to convert a VPRecipeBase pointer to a VPUser, for
recipes that inherit from VPUser. Once VPRecipeBase directly inherits
from VPUser this helper can be removed.
2020-10-04 19:43:27 +01:00
Florian Hahn
253efbceef [VPlan] Account for removed users in replaceAllUsesWith.
Make sure we do not iterate using an invalid iterator.

Another small fix/step towards traversing the def-use chains in VPlan.
2020-10-04 18:18:58 +01:00
Esme-Yi
b654b312b6 [PowerPC] Add builtins for xvtdiv(dp|sp) and xvtsqrt(dp|sp).
Summary: This patch implements the builtins for xvtdivdp, xvtdivsp, xvtsqrtdp, xvtsqrtsp.
The instructions correspond to the following builtins:
int vec_test_swdiv(vector double v1, vector double v2);
int vec_test_swdivs(vector float v1, vector float v2);
int vec_test_swsqrt(vector double v1);
int vec_test_swsqrts(vector float v1);
This patch depends on D88274, which fixes the bug in copying from CRRC to GPRC/G8RC.

Reviewed By: steven.zhang, amyk

Differential Revision: https://reviews.llvm.org/D88278
2020-10-04 16:24:20 +00:00
Sanjay Patel
2e5d7a734d [SDAG] fold x * 0.0 at node creation time
In the motivating case from https://llvm.org/PR47517
we create a node that does not get constant folded
before getNegatedExpression is attempted from some
other node, and we crash.

By moving the fold into SelectionDAG::simplifyFPBinop(),
we get the constant fold sooner and avoid the problem.
2020-10-04 11:31:57 -04:00
Nikita Popov
04cd7397ef [MemCpyOpt] Add additional call slot tests (NFC)
The case of a destination read between call and memcpy was not
covered anywhere (but is handled correctly).

However, a potentially throwing call between the call and the
memcpy appears to be miscompiled.
2020-10-04 17:26:46 +02:00
Simon Pilgrim
d6bcf4c690 [X86][SSE] isTargetShuffleEquivalent - ensure shuffle inputs are the correct size.
Preliminary patch for the next stage of PR45974 - we don't want to be creating 'padded' vectors on-the-fly at all in combineX86ShufflesRecursively, and only pad the source inputs if we have a definite match inside combineX86ShuffleChain.

This means that the inputs to combineX86ShuffleChain might soon be smaller than the final root value type, so we should ensure that isTargetShuffleEquivalent only matches with the inputs if they are the correct size.
2020-10-04 15:32:05 +01:00
Anatoly Parshintsev
b8a39db2c9 [RISCV][ASAN] instrumentation pass now uses proper shadow offset
[10/11] patch series to port ASAN for riscv64

Depends On D87580

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D87581
2020-10-04 16:30:38 +03:00
Roman Lebedev
09fdde59e8 [OldPM] Pass manager: run SROA after (simple) loop unrolling
I have stumbled into this pretty accidentally, when rewriting
some spaghetti-like code into something more structured,
which involved using some `std::array<>`s. And to my surprise,
the `alloca`s remained, causing about `+160%` perf regression.

https://llvm-compile-time-tracker.com/compare.php?from=bb6f4d32aac3eecb51909f4facc625219307ee68&to=d563e66f40f9d4d145cb2050e41cb961e2b37785&stat=instructions
suggests that this has geomean compile-time cost of `+0.08%`.

Note that D68593 / cecc0d27ad58c0aed8ef9ed99bbf691e137a0f26
already did this chage for NewPM, but left OldPM in a pessimized state.

This fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40011 | PR40011 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=42794 | PR42794 ]] and probably some other reports.

Reviewed By: nikic, xbolva00

Differential Revision: https://reviews.llvm.org/D87972
2020-10-04 11:53:50 +03:00
Craig Topper
a586463b2b [X86] LOADIWKEY, ENCODEKEY128 and ENCODEKEY256 clobber EFLAGS. 2020-10-03 21:55:03 -07:00
Craig Topper
08e9221b58 [X86] Add memory operand to AESENC/AESDEC Key Locker instructions.
This removes FIXMEs from selectAddr.
2020-10-03 21:42:16 -07:00
Craig Topper
3d4336f0f9 [X86] Move ENCODEKEY128/256 handling from lowering to selection.
We should avoid emitting MachineSDNodes from lowering.

We can use the the implicit def handling in InstrEmitter to avoid
manually copying from each xmm result register. We only need to
manually emit the copies for the implicit uses.
2020-10-03 18:44:53 -07:00
Craig Topper
6394824cc4 [X86] Remove X86ISD::MWAITX_DAG. Just match the intrinsic to the custom inserter pseudo instruction during isel. 2020-10-03 18:44:53 -07:00
Stephen Neuendorffer
192dd947b2 [RFC] Factor out repetitive cmake patterns for llvm-style projects
New projects (particularly out of tree) have a tendency to hijack the existing
llvm configuration options and build targets (add_llvm_library,
add_llvm_tool).  This can lead to some confusion.

1) When querying a configuration variable, do we care about how LLVM was
configured, or how these options were configured for the out of tree project?
2) LLVM has lots of defaults, which are easy to miss
(e.g. LLVM_BUILD_TOOLS=ON).  These options all need to be duplicated in the
CMakeLists.txt for the project.

In addition, with LLVM Incubators coming online, we need better ways for these
incubators to do things the "LLVM way" without alot of futzing.  Ideally, this
would happen in a way that eases importing into the LLVM monorepo when
projects mature.

This patch creates some generic infrastructure in llvm/cmake/modules and
refactors MLIR to use this infrastructure.  This should expand to include
add_xxx_library, which is by far the most complicated bit of building a
project correctly, since it has to deal with lots of shared library
configuration bits.  (MLIR currently hijacks the LLVM infrastructure for
building libMLIR.so, so this needs to get refactored anyway.)

Differential Revision: https://reviews.llvm.org/D85140
2020-10-03 17:12:35 -07:00
Craig Topper
ec7068c19c [X86] Add X86ISD opcodes for the Key Locker AESENC*KL and AESDEC*KL instructions
Instead of emitting MachineSDNodes during lowering, emit X86ISD
opcodes. These opcodes will either be selected by tablegen
patterns or custom selection code.

Emitting MachineSDNodes during lowering is uncommon so this makes
things more consistent. It also allows selectAddr to be called to
perform address matching during instruction selection.

I had trouble getting tablegen to accept XMM0-XMM7 as results in
an isel pattern for the WIDE instructions so I had to use custom
instruction selection.
2020-10-03 16:55:19 -07:00
Alexander Shaposhnikov
b5d3f2e87c [Object][MachO] Refactor MachOUniversalWriter
This diff refactors writeUniversalBinary and adds writeUniversalBinaryToBuffer.
This is a preparation for adding support for universal binaries to llvm-objcopy.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D88372
2020-10-03 14:18:38 -07:00
Mircea Trofin
956b8004b9 [MC] Assert that MCRegUnitIterator operates over MCRegisters
The signature of the ctor expects a MCRegister, but currently any
unsigned value can be converted to a MCRegister.

This patch checks that indeed the provided value is a physical register
only. We want to eventually stop implicitly converting unsigned or
Register to MCRegister (which is incorrect). The next step after this
patch is changing uses of MCRegUnitIterator to explicitly cast Register
or unsigned values to MCRegister. To that end, this patch also
introduces 2 APIs that make that conversion checked and explicit.

Differential Revision: https://reviews.llvm.org/D88705
2020-10-03 13:18:25 -07:00
Florian Hahn
245d1e80c9 [VPlan] Properly update users when updating operands.
When updating operands of a VPUser, we also have to adjust the list of
users for the new and old VPValues. This is required once we start
transitioning recipes to become VPValues.
2020-10-03 20:54:58 +01:00
Roman Lebedev
19460feaba [NFC][InstCombine] Autogenerate a few tests being affected by an upcoming patch 2020-10-03 22:49:58 +03:00
Roman Lebedev
9f25ff7107 [NFC][PhaseOrdering] Add a test showing new inttoptr casts after SROA due to InstCombine (PR47592)
We could either try to make SROA more picky to the new type
and/or prevent InstCombine from creating the original problem (converting load-stores to operate on ints),
and/or make InstCombine recover the situation by cleaning up all that cruft.
2020-10-03 22:49:58 +03:00
Florian Hahn
43822a788a [LV] Add another test case with unsinkable first-order recurrences. 2020-10-03 20:41:41 +01:00
Martin Storsjö
7b893d09e0 [AArch64] Prefer prologues with sp adjustments merged into stp/ldp for WinCFI, if optimizing for size
This makes the prologue match the windows canonical layout, for
cases without a frame pointer.

This can potentially be a slower (a longer dependency chain of the
sp register, and potentially one arithmetic operation more on some
cores), but gives notable size improvements.

The previous two commits shrinks a 166 KB xdata section by 49 KB,
and if the change from this commit is enabled, it shrinks the xdata
section by another 25 KB.

In total, since the start of the recent arm64 unwind info cleanups
and optimizations (since before commit 37ef743cbf3), the xdata+pdata
sections of the same test DLL has shrunk from 407 KB in total
originally, to 163 KB now.

Differential Revision: https://reviews.llvm.org/D88701
2020-10-03 21:37:22 +03:00
Martin Storsjö
fda1e859d8 [AArch64] Allow pairing lr with other GPRs for WinCFI
This saves one instruction per prologue/epilogue for any function with
an odd number of callee-saved GPRs, but more importantly, allows such
functions to match the packed unwind format.

Differential Revision: https://reviews.llvm.org/D88699
2020-10-03 21:37:22 +03:00
Martin Storsjö
ba79f9fef9 [AArch64] Match the windows canonical callee saved register order
On windows, the callee saved registers in a canonical prologue are
ordered starting from a lower register number at a lower stack
address (with the possible gap for aligning the stack at the top);
this is the opposite order that llvm normally produces.

To achieve this, reverse the order of the registers in the
assignCalleeSavedSpillSlots callback, to get the stack objects
laid out by PrologEpilogInserter in the right order, and adjust
computeCalleeSaveRegisterPairs to lay them out from the bottom up.

This allows generated prologs more often to match the format that
allows the unwind info to be written as packed info.

Differential Revision: https://reviews.llvm.org/D88677
2020-10-03 21:37:22 +03:00
Michał Górny
52e0eadb29 [asan] Stop instrumenting user-defined ELF sections
Do not instrument user-defined ELF sections (whose names resemble valid
C identifiers).  They may have special use semantics and modifying them
may break programs.  This is e.g. the case with NetBSD __link_set API
that expects these sections to store consecutive array elements.

Differential Revision: https://reviews.llvm.org/D76665
2020-10-03 19:54:38 +02:00
Simon Pilgrim
b0714107ad [InstCombine] Add tests for or(shl(x,c1),lshr(y,c2)) patterns that could fold to funnel shifts
Some initial test coverage toward fixing PR46896 - these are just copied from rotate.ll
2020-10-03 18:32:47 +01:00
Simon Pilgrim
c4daa0a11e [Analysis] resolveAllCalls - fix use after std::move warning. NFCI.
We can't use Use.Calls after its std::move()'d to TmpCalls as it will be in an undefined state. Instead, swap with the known empty map in TmpCalls so we can then safely emplace_back into the now empty Use.Calls.

Fixes clang static analyzer warning.
2020-10-03 17:52:20 +01:00
Simon Pilgrim
6d69920c01 [InstCombine] Add or(shl(v,and(x,bw-1)),lshr(v,bw-and(x,bw-1))) rotate tests
If we know the shift amount is less than the bitwidth we should be able to convert this to a rotate/funnel shift
2020-10-03 17:17:42 +01:00
David Green
8533523010 [ARM] Fix pointer offset when splitting stores from VMOVDRR
We were not accounting for the pointer offset when splitting a store from
a VMOVDRR node, which could lead to incorrect aliasing info. In this
case it is the fneg via integer arithmetic that gives us a store->load
pair that we started getting wrong.

Differential Revision: https://reviews.llvm.org/D88653
2020-10-03 16:47:50 +01:00
Simon Pilgrim
662d641c46 [InstCombine] recognizeBSwapOrBitReverseIdiom - add vector support
Add basic vector handling to recognizeBSwapOrBitReverseIdiom/collectBitParts - this works at the element level, all vector element operations must match (splat constants etc.) and there is no cross-element support (insert/extract/shuffle etc.).
2020-10-03 16:26:46 +01:00
Simon Pilgrim
5e4c1a8615 [InstCombine] recognizeBSwapOrBitReverseIdiom - use generic CreateIntegerCast
Try to appease buildbots breakages due to D88578
2020-10-03 15:29:22 +01:00
Simon Pilgrim
aa429e1d50 [InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied)
If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.

Reapplied with early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type.

Differential Revision: https://reviews.llvm.org/D88578
2020-10-03 14:52:42 +01:00
David Green
9586e118d1 [ARM] Test to show incorrect pointer info. NFC 2020-10-03 12:25:34 +01:00
Nikita Popov
d2133bd3a6 [MemCpyOpt] Make moveUp() a member method (NFC)
So we don't have to pass through more parameters in the future.
2020-10-03 11:28:49 +02:00
Nikita Popov
e295c195d1 [MemCpyOpt] Remove unnecessary -dse from test (NFC)
This one doesn't even have any dead stores to eliminate...
2020-10-03 11:28:49 +02:00
Craig Topper
a51256dabf [X86] Key Locker instructions should use VR128 regclass not VR128X. 2020-10-02 21:55:07 -07:00