Summary:
As per title. DAGCombiner only matches the special case where b = 0; this patch extends the pattern to match any value of b.
Depends on D57302
Reviewers: hfinkel, RKSimon, craig.topper
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59208
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366214 91177308-0d34-0410-b5e6-96231b3b80d8
We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y).
This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well.
In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demanded-elts/combine calls occur too late to split large vector ops.
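Conceptually (my notation, not code from the patch), for a low-half extract the fold turns a wide binop into a narrow one:

    // Illustrative semantics only:
    //   extract_subvector(add(concat(a,b), concat(c,d)), 0) --> add(a, c)
    // shown here for 4-wide halves concatenated into 8-wide vectors.
    void foldLowHalf(const int a[4], const int b[4],
                     const int c[4], const int d[4], int out[4]) {
      (void)b; (void)d;           // the discarded high halves
      for (int i = 0; i < 4; ++i)
        out[i] = a[i] + c[i];     // only the extracted halves are added
    }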
The fast-isel-store.ll load folding regression is annoying, but I don't think it is that critical.
Differential Revision: https://reviews.llvm.org/D63653
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365785 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The unsafe-fp-math flag alone does not map well to each of these three cases, as it is missing the no-NaNs context when accessed directly with clang. I have migrated the fold guards to reflect the expectation of handling the NaN and signed-zero contexts directly (NoNaN, NSZ), and updated some tests accordingly. Unsafe does include NSZ, but there is already precedent for using the target option directly to reflect that context.
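To illustrate why a blanket flag is a poor fit, each fold needs its own context (generic examples, not necessarily the exact folds this patch touches):

    // Folding a(x) to x needs NSZ: (-0.0) + 0.0 is +0.0, not -0.0.
    double a(double x) { return x + 0.0; }
    // Folding b(x) to 0.0 needs the no-NaN (and no-Inf) context:
    // NaN - NaN and Inf - Inf both yield NaN, not 0.0.
    double b(double x) { return x - x; }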
Reviewers: spatel, wristow, hfinkel, craig.topper, arsenm
Reviewed By: arsenm
Subscribers: michele.scandale, wdng, javed.absar
Differential Revision: https://reviews.llvm.org/D64450
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365679 91177308-0d34-0410-b5e6-96231b3b80d8
Keep the uint64_t type returned by getZExtValue() to stop MSVC truncation/extension overflow warnings in the subvector index math.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365621 91177308-0d34-0410-b5e6-96231b3b80d8
Basically the problem is that X86 doesn't set the Fast flag from
allowsMemoryAccess on certain CPUs due to slow unaligned memory
subtarget features. This prevents bitcasts from being folded into
loads and stores. But all vector loads and stores of the same width
are the same cost on X86.
This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it.
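A toy sketch of the design change (mock classes, not LLVM's actual interfaces): the legality check moves inside the target hook, so a target that knows better can simply skip it:

    struct TargetLoweringMock {
      virtual ~TargetLoweringMock() = default;
      virtual bool allowsMemoryAccess() const { return true; }
      // Post-patch shape: the hook performs the legality check itself.
      virtual bool isLoadBitCastBeneficial() const {
        return allowsMemoryAccess();
      }
    };
    struct X86Mock : TargetLoweringMock {
      // Same-width vector loads/stores cost the same on X86, so the
      // unaligned-memory check is irrelevant to this fold.
      bool isLoadBitCastBeneficial() const override { return true; }
    };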
Differential Revision: https://reviews.llvm.org/D64295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365549 91177308-0d34-0410-b5e6-96231b3b80d8
Keep the uint64_t type returned by getOffsetFromBase() to stop MSVC truncation/extension overflow warnings in the alignment math.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365504 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The uaddo won't be removed and the addcarry will still be
dependent on the uaddo. So we'll just increase the use count
of X and Y and potentially require a COPY.
Reviewers: spatel, RKSimon, deadalnix
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64190
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365149 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This diff improves the ability of DAGCombine to generate linear carry propagation in the presence of a diamond pattern. It is now able to match a large variety of different patterns rather than a few hardcoded ones.
Arguably, the codegen in the test cases is not better, but this is to be expected. The goal of this transformation is more about canonicalisation than actual optimisation.
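For context, a diamond of this kind typically arises from open-coded multiword addition like the following (my example; the patch generalises which shapes of the pattern get matched):

    // Add two 64-bit limbs plus an incoming carry, recovering the
    // outgoing carry; c1 and c2 can never both be set.
    unsigned long long addWithCarry(unsigned long long a, unsigned long long b,
                                    unsigned carryIn, unsigned *carryOut) {
      unsigned long long s = a + b;
      unsigned c1 = s < a;            // carry out of a + b
      unsigned long long r = s + carryIn;
      unsigned c2 = r < s;            // carry out of s + carryIn
      *carryOut = c1 | c2;            // the "diamond" joins here
      return r;
    }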
Reviewers: hfinkel, RKSimon, craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D57302
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365051 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is the backend part of [[ https://bugs.llvm.org/show_bug.cgi?id=42457 | PR42457 ]].
In the middle-end we'd want to prefer the form with two adds (D63992),
but as this diff shows, not every target prefers that pattern.
Of the four targets for which I added tests, all seem to be fine with inc-of-add for scalars,
but only X86 prefers that same pattern for vectors.
Here I'm adding a new TLI hook, always defaulting to the inc-of-add,
with AArch64, ARM, and PowerPC overrides that prefer inc-of-add only for scalars.
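For reference, the two forms are equivalent by a standard two's-complement identity (my illustration, not code from the patch):

    // ~y == -y - 1, so x - ~y == x + y + 1 for any x, y.
    unsigned incOfAdd(unsigned x, unsigned y) { return (x + y) + 1; }
    unsigned subOfNot(unsigned x, unsigned y) { return x - ~y; }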
Reviewers: spatel, RKSimon, efriedma, t.p.northover, hfinkel
Reviewed By: efriedma
Subscribers: nemanjai, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64090
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365010 91177308-0d34-0410-b5e6-96231b3b80d8
For a given floating point load / store pair, if the load value isn't used by any other operations,
then consider transforming the pair to integer load / store operations if the target deems the transformation profitable.
We can exploit the transformation even further when there are other operation nodes with a chain operand between the load/store pair,
so long as we keep the original chain ordering. We only change the registers used for the load/store from floating point to integer.
I only added test cases for ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled for the ARM target.
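A minimal source-level example of the pattern (my illustration, not a test from the patch):

    // The loaded double is only stored, never used as a float, so the
    // copy can go through integer registers and skip the FP unit.
    void copyDouble(double *dst, const double *src) {
      *dst = *src;
    }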
Differential Revision: https://reviews.llvm.org/D60601
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364883 91177308-0d34-0410-b5e6-96231b3b80d8
We support 'big to little' (e.g. extract_subvector(v16i8 bitcast(v2i64))) but not 'little to big' cases (e.g. extract_subvector(v2i64 bitcast(v16i8))).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364405 91177308-0d34-0410-b5e6-96231b3b80d8
This can occur under certain circumstances when undefs are created later on in the constant multipliers (e.g. in this case due to SimplifyDemandedVectorElts). It's better to let the shift by zero occur and perform any cleanup afterward.
Fixes OSS Fuzz #15429
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364179 91177308-0d34-0410-b5e6-96231b3b80d8
The code divides the alignment by 2 if the original alignment is
equal to the original VT size. But this wouldn't be correct
if the alignment was larger than the VT size.
The memory operand object already takes care of calling MinAlign
on the base alignment and the memory pointer offset. So we don't
need any special code at all.
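For reference, the helper computes the standard minimum-alignment guarantee (a self-contained sketch of its semantics):

    // minAlign(A, B): the largest power of two dividing both A and B,
    // i.e. the alignment that can be guaranteed at (base + offset).
    unsigned long long minAlign(unsigned long long A, unsigned long long B) {
      return (A | B) & (1ULL + ~(A | B)); // isolate the lowest set bit
    }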
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364151 91177308-0d34-0410-b5e6-96231b3b80d8
We tend to only test for scalar/scalar consts when really we could support non-uniform vectors using ISD::matchUnaryPredicate/matchBinaryPredicate etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363924 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes a cppcheck warning.
Use the more capable getAPIntValue() instead of getZExtValue() as well while I'm here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363921 91177308-0d34-0410-b5e6-96231b3b80d8
Use getAPIntValue() in a few more places. Most of the time getZExtValue() is fine, but occasionally there's fuzzed code, or someone decides to create an i65536 or similar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363887 91177308-0d34-0410-b5e6-96231b3b80d8
Use matchBinaryPredicate instead of isConstOrConstSplat to let us handle non-uniform shift cases.
This requires us to tweak matchBinaryPredicate to allow it to (optionally) handle constants with different type widths.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363792 91177308-0d34-0410-b5e6-96231b3b80d8
Some GEPs were not being split, presumably because that split would just be
undone by the DAGCombiner. Not performing those splits can prevent important
optimizations, such as (partially) folding the element indices / member offsets
into load/store instruction immediates (see the illustration below). This patch:
- Makes the splits also occur in the cases where the base address and the GEP
are in the same BB.
- Ensures that the DAGCombiner doesn't reassociate them back again.
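An illustration of the payoff (my example, not a test from the patch):

    // With the GEP split, the constant member offset can be folded into the
    // load's immediate rather than materialised by a separate add.
    struct S { int a, b, c; };
    int f(struct S *base, long i) {
      // address = base + i*sizeof(S) + offsetof(S, c); the trailing constant
      // is a natural candidate for a load-immediate offset.
      return base[i].c;
    }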
Differential Revision: https://reviews.llvm.org/D60294
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363544 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts rL363474. -debug-only=isel was added to some tests that
don't specify `REQUIRES: asserts`. This causes failures on
-DLLVM_ENABLE_ASSERTIONS=off builds.
I chose to revert instead of fixing the tests because I'm not sure
whether we should add `REQUIRES: asserts` to more tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363482 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on D62910, we need to check whether particular types of memory access are allowed, not just their alignment/address-space.
This NFC patch adds a MachineMemOperand::Flags argument to allowsMemoryAccess and allowsMisalignedMemoryAccesses, and wires up calls to pass the relevant flags to them.
If people are happy with this approach I can then update X86TargetLowering::allowsMisalignedMemoryAccesses to handle misaligned NT load/stores.
Differential Revision: https://reviews.llvm.org/D63075
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363179 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested by @arsenm on D63075 - this adds a TargetLowering::allowsMemoryAccess wrapper that takes a Load/Store node's MachineMemOperand to handle the AddressSpace/Alignment arguments and will also implicitly handle the MachineMemOperand::Flags change in D63075.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363048 91177308-0d34-0410-b5e6-96231b3b80d8
This opportunity was found in SPEC 2017 557.xz_r, where it is used by the SHA encrypt/decrypt code; see sha-2/sha512.c:
    static void store64(u64 x, unsigned char* y)
    {
        for (int i = 0; i != 8; ++i)
            y[i] = (x >> ((7 - i) * 8)) & 255;
    }

    static u64 load64(const unsigned char* y)
    {
        u64 res = 0;
        for (int i = 0; i != 8; ++i)
            res |= (u64)(y[i]) << ((7 - i) * 8);
        return res;
    }
The load64 pattern was already handled by https://reviews.llvm.org/D26149;
this patch implements the store pattern.
Match a pattern where a wide type scalar value is stored by several narrow
stores. Fold it into a single store, or a BSWAP and a store, if the target
supports it.
Assuming a little-endian target:

    i8 *p = ...
    i32 val = ...
    p[0] = (val >> 0) & 0xFF;
    p[1] = (val >> 8) & 0xFF;
    p[2] = (val >> 16) & 0xFF;
    p[3] = (val >> 24) & 0xFF;
  =>
    *((i32)p) = val;

    i8 *p = ...
    i32 val = ...
    p[0] = (val >> 24) & 0xFF;
    p[1] = (val >> 16) & 0xFF;
    p[2] = (val >> 8) & 0xFF;
    p[3] = (val >> 0) & 0xFF;
  =>
    *((i32)p) = BSWAP(val);
Differential Revision: https://reviews.llvm.org/D62897
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362921 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads/stores:
1 - When merging loads/stores we must ensure that they all have the same non-temporal flag. A mismatch is unlikely, but can happen in strange cases where we're storing at the end of one page and the beginning of another.
2 - The merged load/store node must retain the non-temporal flag (see the example below).
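For context, non-temporal stores typically originate from intrinsics such as this x86 SSE2 example (my illustration, not a test from the patch):

    #include <emmintrin.h>

    void ntStorePair(int *p, int a, int b) {
      _mm_stream_si32(p, a);     // non-temporal store: bypasses the cache
      _mm_stream_si32(p + 1, b); // if merged, the result must stay NT
    }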
Differential Revision: https://reviews.llvm.org/D62910
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362723 91177308-0d34-0410-b5e6-96231b3b80d8
This is a special case of a more general transform: (not (sub Y, X)) -> (add X, ~Y). InstCombine knows the general form. I've restricted it to the special case to fix the motivating case, PR42118. I tried handling any case where Y was constant, but got some changes on Mips tests that I couldn't quickly prove were beneficial.
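The underlying two's-complement identity, for reference (my illustration, not code from the patch):

    // ~a == -a - 1, so ~(y - x) == -(y - x) - 1 == x - y - 1 == x + ~y.
    unsigned beforeFold(unsigned x, unsigned y) { return ~(y - x); }
    unsigned afterFold(unsigned x, unsigned y) { return x + ~y; }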
Fixes PR42118
Differential Revision: https://reviews.llvm.org/D62828
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362533 91177308-0d34-0410-b5e6-96231b3b80d8
The proposal in D62498 showed that x86 would benefit from vector
store splitting, but that may conflict with the generic DAG
combiner's store merging transforms.
Add memory type to the existing TLI hook that enables the merging
transforms, so we can limit those changes to scalars only for x86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362507 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This *might* be the last fold for `sink-addsub-of-const.ll`, but I'm not sure yet.
As far as I can tell, there are no regressions here (ignoring x86-32);
all changes are either good or neutral.
This, somewhat surprisingly to me, fixes the motivating tests (in `shift-amount-mod.ll`)
`@reg32_lshr_by_sub_from_negated` from [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].
https://rise4fun.com/Alive/vMd3
Reviewers: RKSimon, t.p.northover, craig.topper, spatel, efriedma
Reviewed By: RKSimon
Subscribers: sdardis, javed.absar, arichardson, kristof.beyls, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62774
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362488 91177308-0d34-0410-b5e6-96231b3b80d8