and also "clang-format GenericDomTreeConstruction.h, since the current
formatting makes it look like there is a bug in the loop indentation, and there
is not"
This reverts commit r296535.
There are still some open design questions which I would like to discuss. I
revert this for Daniel (who gave the OK), as he is on vacation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296812 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes pr32063.
Current code in PPCTargetLowering::PerformDAGCombine can transform
bswap
store
into a single PPCISD::STBRX instruction, but it doesn't consider the case where the operand size of bswap is larger than the store size. When that occurs, we need two modifications:
1. For the last operand of PPCISD::STBRX, we should not use DAG.getValueType(N->getOperand(1).getValueType()); instead we should use cast<StoreSDNode>(N)->getMemoryVT().
2. Before PPCISD::STBRX, we need to shift the original operand of bswap to the right.
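The arithmetic behind the shift can be checked with a small model. This is only an illustrative sketch, not the DAG combine itself: the helper names are hypothetical, and a "byte-reversed store" is modelled as byte-swapping the low `store_bits` of the register value.

```python
def bswap(value, bits):
    """Reverse the byte order of a `bits`-wide unsigned integer."""
    value &= (1 << bits) - 1
    return int.from_bytes(value.to_bytes(bits // 8, "big"), "little")


def truncating_store_of_bswap(x, op_bits, store_bits):
    """Value written by the original pattern: store(trunc(bswap(x)))."""
    return bswap(x, op_bits) & ((1 << store_bits) - 1)


def byte_reversed_store(x, op_bits, store_bits, shift_first):
    """Value written by a byte-reversed store of the low `store_bits` of x.

    With shift_first=True the operand is first shifted right by the size
    difference, which is the adjustment the commit message describes.
    """
    if shift_first:
        x >>= op_bits - store_bits
    return bswap(x & ((1 << store_bits) - 1), store_bits)


x = 0x11223344
expected = truncating_store_of_bswap(x, 32, 16)
with_shift = byte_reversed_store(x, 32, 16, shift_first=True)
without_shift = byte_reversed_store(x, 32, 16, shift_first=False)
print(hex(expected), hex(with_shift), hex(without_shift))
```

For a 32-bit bswap feeding a 16-bit store, only the shifted variant matches the original pattern; the unshifted one reverses the wrong two bytes.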
Differential Revision: https://reviews.llvm.org/D30362
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296811 91177308-0d34-0410-b5e6-96231b3b80d8
This patch extends the current functionality of the AArch64 redundant copy
elimination pass to handle non-zero cases such as:
BB#0:
cmp x0, #1
b.eq .LBB0_1
.LBB0_1:
orr x0, xzr, #0x1 ; <-- redundant copy; x0 known to hold #1.
Differential Revision: https://reviews.llvm.org/D29344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296809 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for struct return values to the MSP430
target backend. It also reverses the order of argument and return
registers in the calling convention to bring it into closer
alignment with the published EABI from TI.
Patch by Andrew Wygle (awygle).
Differential Revision: https://reviews.llvm.org/D29069
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296807 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Extend -unroll-partial-threshold to 200 for the runtime-loop3.ll test,
as epilogue unrolling initially adds one more IV to the loop.
From: Evgeny Stupachenko <evstupac@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296803 91177308-0d34-0410-b5e6-96231b3b80d8
Make opcode selection code for the load instruction a bit easier
to read and maintain.
This patch also catches a number of f16 load/store variants that were
not handled before.
Differential Revision: https://reviews.llvm.org/D30513
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296785 91177308-0d34-0410-b5e6-96231b3b80d8
MMX extraction often ends up as extract_i32(bitcast_v2i32(extract_i64(bitcast_v1i64(x86mmx v), 0)), 0) which fails to simplify on 32-bit targets as i64 isn't legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296782 91177308-0d34-0410-b5e6-96231b3b80d8
Original commit message:
"Allow externally dlopen-ed libraries to be registered as permanent libraries.
This is also useful in cases when llvm is in a shared library. First we dlopen
the llvm shared library and then we register it as a permanent library in order
to keep the JIT and other services working.
Patch reviewed by Vedant Kumar (D29955)!"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296774 91177308-0d34-0410-b5e6-96231b3b80d8
This patch reduces the stack frame size by not allocating the parameter area if
it is not required. In the current implementation LowerFormalArguments_64SVR4
already handles the parameter area, but LowerCall_64SVR4 does not
(when calculating the stack frame size). What this patch does is make
LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4.
Committing on behalf of Hiroshi Inoue.
Differential Revision: https://reviews.llvm.org/D29881
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296771 91177308-0d34-0410-b5e6-96231b3b80d8
This bug was introduced with:
https://reviews.llvm.org/rL296699
There may be a way to loosen the restriction, but for now just bail out
on any opaque constant.
The tests show that opacity is target-specific. This goes back to cost
calculations in ConstantHoisting based on TTI->getIntImmCost().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296768 91177308-0d34-0410-b5e6-96231b3b80d8
This tool generates the difference between two optimization record
files. The result is also a YAML file, which can be visualized with opt-viewer.
This is very useful for seeing which optimizations were added and removed by a
change.
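As a rough sketch of the idea (not the tool's actual implementation), a diff of two remark sets can be modelled as two set differences. The tuple layout used as a remark key here is an assumption made for illustration, and YAML parsing is left out.

```python
def diff_remarks(before, after):
    """Return (added, removed) remark keys between two optimization records.

    `before` and `after` are iterables of hashable remark keys; the
    (pass, name, file, line) layout below is a hypothetical choice.
    """
    before_keys = set(before)
    after_keys = set(after)
    added = sorted(after_keys - before_keys)
    removed = sorted(before_keys - after_keys)
    return added, removed


old_run = [("inline", "Inlined", "a.c", 10), ("licm", "Hoisted", "a.c", 20)]
new_run = [("inline", "Inlined", "a.c", 10), ("gvn", "LoadElim", "a.c", 30)]
added, removed = diff_remarks(old_run, new_run)
print("added:", added)
print("removed:", removed)
```

Remarks present in both runs drop out, so only what the change added or removed is left to visualize.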
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296767 91177308-0d34-0410-b5e6-96231b3b80d8
We used to exclude arguments but for a diffed YAML file, it's interesting to
show these as changes.
Turns out this also affects gvn/LoadClobbered because we used to squash
multiple entries of this on the same line even if they reported clobbers
by *different* instructions. This increases the number of unique entries now
and the share of gvn/LoadClobbered.
Total number of remarks 902287
Top 10 remarks by pass:
inline 43%
gvn 37%
licm 11%
loop-vectorize 4%
asm-printer 3%
regalloc 1%
loop-unroll 1%
inline-cost 0%
slp-vectorizer 0%
loop-delete 0%
Top 10 remarks:
gvn/LoadClobbered 33%
inline/Inlined 16%
inline/CanBeInlined 14%
inline/NoDefinition 7%
licm/Hoisted 6%
licm/LoadWithLoopInvariantAddressInvalidated 5%
gvn/LoadElim 3%
asm-printer/InstructionCount 3%
inline/TooCostly 2%
loop-vectorize/MissedDetails 2%
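The effect on the unique-entry count can be illustrated with a minimal sketch. The dictionary fields and the key layout are assumptions chosen for the example, not the script's real data model.

```python
def unique_remarks(remarks, include_args):
    """Drop duplicate remarks, optionally treating arguments as part of the key."""
    seen = set()
    result = []
    for remark in remarks:
        key = (remark["pass"], remark["name"], remark["file"], remark["line"])
        if include_args:
            # Arguments distinguish, e.g., clobbers by different instructions.
            key += (tuple(remark["args"]),)
        if key not in seen:
            seen.add(key)
            result.append(remark)
    return result


remarks = [
    {"pass": "gvn", "name": "LoadClobbered", "file": "a.c", "line": 7,
     "args": ["load", "clobbered by", "store"]},
    {"pass": "gvn", "name": "LoadClobbered", "file": "a.c", "line": 7,
     "args": ["load", "clobbered by", "call"]},
]
print(len(unique_remarks(remarks, include_args=False)))
print(len(unique_remarks(remarks, include_args=True)))
```

With arguments excluded, the two remarks on the same line collapse into one; with arguments included, both survive, which is why the share of gvn/LoadClobbered grows.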
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296766 91177308-0d34-0410-b5e6-96231b3b80d8
__getattr__ does not work well with debugging. If the attribute function has
a run-time error, a missing attribute is reported instead.
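One common way this masking happens in Python is sketched below; the class and attribute names are hypothetical and are not taken from the scripts being changed. When a property body raises AttributeError, Python falls back to `__getattr__` with the property's own name, so the error that reaches the user names the wrong attribute.

```python
class Remark:
    def __init__(self, data):
        self._data = data

    @property
    def key(self):
        # Deliberate bug for the demonstration: "Nmae" is a misspelling,
        # so this lookup raises AttributeError inside the property.
        return self.Nmae

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(
                f"{type(self).__name__!r} object has no attribute {name!r}"
            )


remark = Remark({"Name": "Inlined"})
try:
    remark.key
except AttributeError as err:
    # The message blames "key", hiding the real culprit "Nmae".
    print(err)
```

The real failure is the misspelled lookup inside the property, but the final message claims `key` is missing, which is what makes debugging awkward.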
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296765 91177308-0d34-0410-b5e6-96231b3b80d8
Bugpoint's functionality needs extending because of patch D29185 by bd1976llvm (Allow llvm's build and test systems to support paths with spaces). That patch requires Bugpoint to accept spaces within ‘--compile-command’ tokens.
Details
Bugpoint uses the ‘--compile-command’ argument to pass in a command line as a string; the ‘lexCommand’ function tokenizes that string using spaces as delimiters. Patch D29185 will cause the unit test compile-custom.ll to fail, because spaces are now required both within tokens and as delimiters. This patch allows the use of escape characters as follows:
Two consecutive '\' evaluate to a single '\'.
A space after a '\' evaluates to a space that is not interpreted as a delimiter.
Any other instances of the '\' character are removed.
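The three rules can be modelled with a short tokenizer. This is an illustrative sketch in Python rather than the actual ‘lexCommand’ code; the function name `lex_command` and the choice to skip empty tokens produced by repeated spaces are assumptions.

```python
def lex_command(command):
    """Split `command` on unescaped spaces, applying the three escape rules."""
    tokens = []
    current = []
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "\\":
            nxt = command[i + 1] if i + 1 < len(command) else ""
            if nxt in ("\\", " "):
                # "\\" yields one backslash; "\ " yields a literal space.
                current.append(nxt)
                i += 2
            else:
                # Any other backslash is simply removed.
                i += 1
            continue
        if ch == " ":
            # An unescaped space ends the current token.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        tokens.append("".join(current))
    return tokens


print(lex_command(r"my\ compiler --flag a\\b c\d"))
```

In this example the escaped space keeps `my compiler` together as one token, the doubled backslash survives as a single one, and the stray backslash before `d` disappears.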
Committed on behalf of Owen Reynolds
Differential revision: https://reviews.llvm.org/D29940
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296763 91177308-0d34-0410-b5e6-96231b3b80d8
This re-applies r289696, which caused a TSan perf regression that has
since been addressed in separate changes (see PR for details).
See PR31382.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296759 91177308-0d34-0410-b5e6-96231b3b80d8
The CallingConv.td rules allocate 8 bytes for these kinds of arguments
on AAPCS targets, but we were only recording the smaller amount. The
difference is theoretical on AArch64 because we don't actually store
more than the smaller amount, but it's still much better to have these
two components in agreement.
Based on Diana Picus's ARM equivalent patch (where it matters a lot
more).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296754 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When InstCombine is optimizing certain select-cmp-br patterns
it replaces the result of the select in uses outside of the
basic block containing the select. This is only legal if the
path from the select to the outside use is disjoint from all
other paths out from the originating basic block.
The problem found was that InstCombiner::replacedSelectWithOperand
did not consider the case when both edges out from the br pointed
to the same label. In that case the paths aren't disjoint and the
transformation is illegal. This patch avoids the faulty rewrites
by verifying that there is a single flow to the successor where
we want to replace uses.
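The condition added here can be pictured with a tiny model. It covers only the "single flow to the successor" check described above, not the full legality test in InstCombine, and the function and block names are hypothetical.

```python
def has_single_edge_to(branch_successors, target_block):
    """True if exactly one edge of the branch leads to `target_block`.

    `branch_successors` lists the successor label of each outgoing edge,
    e.g. (true_label, false_label) for a conditional br.
    """
    return list(branch_successors).count(target_block) == 1


# Distinct successors: the true edge is the only way into "bb.then".
print(has_single_edge_to(("bb.then", "bb.else"), "bb.then"))
# Both edges point to the same label: the paths are not disjoint.
print(has_single_edge_to(("bb.join", "bb.join"), "bb.join"))
```

When both edges reach the same block, the branch condition says nothing about which select operand is live there, so the replacement has to be skipped.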
Reviewers: llvm-commits, spatel, majnemer
Differential Revision: https://reviews.llvm.org/D30455
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296752 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types
to interleaved access intrinsics as long as the types are multiples of the
vector register width. A "wide" access will now be mapped to multiple
interleave intrinsics similar to the way in which non-interleaved accesses with
illegal types are legalized into multiple accesses. I'll update the associated
TTI costs (in getInterleavedMemoryOpCost) as a follow-on.
Differential Revision: https://reviews.llvm.org/D29466
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296750 91177308-0d34-0410-b5e6-96231b3b80d8
When computing the smallest and largest types for selecting the maximum
vectorization factor, we currently ignore loads and stores of pointer types if
the memory access is non-consecutive. We do this because such accesses must be
scalarized regardless of vectorization factor, and thus shouldn't be considered
when determining the factor. This patch makes this check less aggressive by
also considering non-consecutive accesses that may be vectorized, such as
interleaved accesses. Because we don't know at the time of the check if an
access will certainly be vectorized (this is a cost model decision given a
particular VF), we consider all accesses that can potentially be vectorized.
Differential Revision: https://reviews.llvm.org/D30305
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296747 91177308-0d34-0410-b5e6-96231b3b80d8
If the dominator tree is not calculated or is invalidated, set the corresponding
pointer in the pass state to nullptr. Such a pointer value indicates
that operations on the dominator tree are not allowed. In particular, this
allows verification to be skipped for such a pass state. The dominator tree is
not calculated if the machine dominator pass was skipped, which occurs in
the case of entities with available_externally linkage.
The change fixes some test failures observed when expensive checks
are enabled.
Differential Revision: https://reviews.llvm.org/D29280
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296742 91177308-0d34-0410-b5e6-96231b3b80d8
Surprisingly, one of the three interference checks in LiveRegMatrix was
using the main live range instead of the appropriate subregister range
resulting in unnecessarily conservative results.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296722 91177308-0d34-0410-b5e6-96231b3b80d8
Original commit message:
[ARM] Fix insert point for store rescheduling.
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation as
LastOp.
This patch fixes some cases where we would sink stores for no reason.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296718 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This can be used to optimize large multiplications after legalization.
Depends on D29565
Reviewers: mkuper, spatel, RKSimon, zvi, bkramer, aaboud, craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296711 91177308-0d34-0410-b5e6-96231b3b80d8
Until now, we've had to use -global-isel to enable GISel. But using
that on other targets that don't support it will result in an abort, as we
can't build a full pipeline.
Additionally, we want to experiment with enabling GISel by default for
some targets: we can't just enable GISel by default, even among those
targets that do have some support, because the level of support varies.
This first step adds an override for the target to explicitly define its
level of support. For AArch64, do that using
a new command-line option (I know..):
-aarch64-enable-global-isel-at-O=<N>
Where N is the opt-level below which GISel should be used.
Default that to -1, so that we still don't enable GISel anywhere.
We're not there yet!
While there, remove a couple LLVM_UNLIKELYs. Building the pipeline is
such a cold path that in practice that shouldn't matter at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296710 91177308-0d34-0410-b5e6-96231b3b80d8
In ARMPreAllocLoadStoreOpt::RescheduleOps, LastOp should be the last
operation which we want to merge. If we break out of the loop because
an operation has the wrong offset, we shouldn't use that operation as
LastOp.
This patch fixes some cases where we would sink stores for no reason.
Differential Revision: https://reviews.llvm.org/D30124
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296708 91177308-0d34-0410-b5e6-96231b3b80d8
This code starts from the high end of the sorted vector of offsets, and
works backwards: it tries to find contiguous offsets, process them, then
pops them from the end of the vector. Most of the code agrees with this
order of processing, but one loop doesn't: it instead processes elements
from the low end of the vector (which are nodes with unrelated offsets).
Fix that loop to process the correct elements.
This has a few implications. One, we don't incorrectly return early when
processing multiple groups of offsets in the same block (which allows
rescheduling prera-ldst-insertpt.mir). Two, we pick the correct insert
point for loads, so they're correctly sorted (which affects the
scheduling of vldm-liveness.ll). I think it might also impact some of
the heuristics slightly.
Differential Revision: https://reviews.llvm.org/D30368
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296701 91177308-0d34-0410-b5e6-96231b3b80d8
This is part of the ongoing attempt to improve select codegen for all targets and select
canonicalization in IR (see D24480 for more background). The transform is a subset of what
is done in InstCombine's FoldOpIntoSelect().
I first noticed a regression in the x86 avx512-insert-extract.ll tests with a patch that
hopes to convert more selects to basic math ops. This appears to be a general missing DAG
transform though, so I added tests for all standard binops in rL296621
(PowerPC was chosen semi-randomly; it has scripted FileCheck support, but so do ARM and x86).
The poor output for "sel_constants_shl_constant" is tracked with:
https://bugs.llvm.org/show_bug.cgi?id=32105
Differential Revision: https://reviews.llvm.org/D30502
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296699 91177308-0d34-0410-b5e6-96231b3b80d8
Now that terminators can be EH pads, this code needs to iterate over the
immediate dominators of the EH pad to find a valid insertion point.
Fix for PR32107
Patch by Robert Olliff!
Differential Revision: https://reviews.llvm.org/D30511
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296698 91177308-0d34-0410-b5e6-96231b3b80d8