For an invoke with operand bundles, the [op_begin(), op_end()-3] range
can contain things other than invoke arguments. This change teaches
PruneEH to use arg_begin() and arg_end() explicitly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255073 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes failure when trying to select
insertelement <4 x half> undef, half %a, i64 0
which gets transformed to a scalar_to_vector node.
The accompanying v4 and v8 tests fail instruction selection without this
patch.
Reviewers: ab, jmolloy
Subscribers: srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D15322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255072 91177308-0d34-0410-b5e6-96231b3b80d8
On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector.
This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element.
Fix for PR23022
Differential Revision: http://reviews.llvm.org/D15310
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255061 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the fully redundant load part of EarlyCSE how to forward from atomic and volatile loads and stores, and how to eliminate unordered atomics (only). This patch does not include dead store elimination support for unordered atomics, that will follow in the near future.
The basic idea is that we allow all loads and stores to be tracked by the AvailableLoad table. We store a bit in the table which tracks whether load/store was atomic, and then only replace atomic loads with ones which were also atomic.
No attempt is made to refine our handling of ordered loads or stores. Those are still treated as full fences. We could pretty easily extend the release fence handling to release stores, but that should be a separate patch.
Differential Revision: http://reviews.llvm.org/D15337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255054 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse4a-builtins.c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255053 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/ssse3-builtins.c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255052 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse3-builtins.c
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255051 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986;
so we need the special fixup in the epilogue.
Reviewers: jroelofs, qcolombet
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D15126
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255047 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r255043, as per post-review concern were raised on the correctness.
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255045 91177308-0d34-0410-b5e6-96231b3b80d8
Per LangRef: "Globals with available_externally linkage are
allowed to be discarded at will, and are otherwise the same
as linkonce_odr", since linkonce_odr is in this list it makes
sense to have available_externally there as well.
Reviewers: rafael
Differential Revision: http://reviews.llvm.org/D15323
From: Mehdi Amini <mehdi.amini@apple.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255043 91177308-0d34-0410-b5e6-96231b3b80d8
AND/BIC instructions do accept SP/PC, so the register class should be
more generic (rGPR -> GPR) to cope with that case. Adding more tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255034 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We don't check the size operand on ext/dext*/ins/dins* yet because the
permitted range depends on the pos argument and we can't check that using
this mechanism.
The bug was that dextu/dinsu accepted 0..31 in the pos operand instead of 32..63.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D15190
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255015 91177308-0d34-0410-b5e6-96231b3b80d8
ARMv8.2-A adds 16-bit floating point versions of all existing SIMD
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
Note that VFP without SIMD is not a valid combination for any version of
ARMv8-A, but I have ensured that these instructions all depend on both
FeatureNEON and FeatureFullFP16 for consistency.
The ".2h" vector type specifier is now legal (for the scalar pairwise
reduction instructions), so some unrelated tests have been modified as
different error messages are emitted. This is not a problem as the
invalid operands are still caught.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255010 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, vectors of halfs end up as ConstantVectors, but there isn't
a good reason they can't be ConstantDataVectors. This should save some
memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254991 91177308-0d34-0410-b5e6-96231b3b80d8
It's strange to duplicate the logic for emitting FP values into
emitGlobalConstantDataSequential, and it's even stranger that we end
up printing the verbose assembly comments differently between the two
paths. Just call into emitGlobalConstantFP rather than crudely
duplicating its logic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254988 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also add a stricter post-condition for IndVarSimplify.
Fixes PR25578. Test case by Michael Zolotukhin.
Reviewers: hfinkel, atrick, mzolotukhin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15059
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254977 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
(Note: the problematic invocation of hoistIVInc that caused PR24804 came
from IndVarSimplify, not from SCEVExpander itself)
Fixes PR24804. Test case by David Majnemer.
Reviewers: hfinkel, majnemer, atrick, mzolotukhin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15058
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254976 91177308-0d34-0410-b5e6-96231b3b80d8
A manipulation (in this case, mkdir) can make slack between creating and touching %t.older/evenlen.
I would make this rewrote with python if this were still unstable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254965 91177308-0d34-0410-b5e6-96231b3b80d8
The 2-element vector case shows a surprising bug: we failed to
eliminate ops on undefs, so there are 4 fmax calls even though
there can only be 2 valid elements in the inputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254920 91177308-0d34-0410-b5e6-96231b3b80d8
FP logic instructions are supported in DQ extension on AVX-512 target.
I use integer operations instead.
Added tests.
I also enabled FABS in this patch in order to check ANDPS.
The operations are FOR, FXOR, FAND, FANDN.
The instructions, that supported for 512-bit vector under DQ are:
VORPS/PD, VXORPS/PD, VANDPS/PD, FANDNPS/PD.
Differential Revision: http://reviews.llvm.org/D15110
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254913 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
valid-xfail.s is for instructions that should be valid in the given ISA but
incorrectly fail. DSP/DSPr2 instructions are correct to fail since DSP/DSPr2 is
not enabled.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15072
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254911 91177308-0d34-0410-b5e6-96231b3b80d8
Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store.
This intrinsic comes with all legal types:
<8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru),
but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only.
All data operands should be widened to 512-bit vector.
The mask operand should be widened to v16i1 with zeroes.
Differential Revision: http://reviews.llvm.org/D15265
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254909 91177308-0d34-0410-b5e6-96231b3b80d8
Now instead of changing it to the new format and then linking, it just
handles the old format while copying it over.
The main differences are:
* There is no rauw in the source module.
* An old format input is always upgraded.
The first item helps with having a sane API that passes in a GV list to
the linker.
The second one is a small step in deprecating the old format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254907 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We are inserting both Scope and SP into the Seen map and check whether
it was already there in which case we skip the validation (the idea
being that we already checked this Subprogram before). However,
if (Scope == SP) as MDNodes, then inserting the Scope, will trigger
the Seen check causing us to incorrectly not validate this !dbg
attachment. Fix this by not performing the SP Seen check if Scope == SP
Reviewers: pcc, dexonsmith, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14697
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254887 91177308-0d34-0410-b5e6-96231b3b80d8