This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328395 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r328252. This change broke building a number
of projects when targeting ARM and AArch64, see PR36873.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328297 91177308-0d34-0410-b5e6-96231b3b80d8
In our real world application, we found the following optimization is missed in DAGCombiner
(zext (and/or/xor (shl/shr (load x), cst), cst)) -> (and/or/xor (shl/shr (zextload x), (zext cst)), (zext cst))
If the user of original zext is an add, it may enable further lea optimization on x86.
This patch add a new function CombineZExtLogicopShiftLoad to do this optimization.
Differential Revision: https://reviews.llvm.org/D44402
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328252 91177308-0d34-0410-b5e6-96231b3b80d8
BUILD_VECTORs aren't themselves legalized until LegalizeDAG so we should still be able to create an "illegal" one before that. This helps combine with BUILD_VECTORS that are introduced during LegalizeVectorOps due to unrolling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327446 91177308-0d34-0410-b5e6-96231b3b80d8
All uses conservatively assume in early exit case that it will be a
predecessor. Changing default removes checking code in all uses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327169 91177308-0d34-0410-b5e6-96231b3b80d8
Loading a constant into a k-register in AVX512 requires a bitcast from a scalar constant. In the test case here we have a k-register store that gets split into multiple parts of KNL. MergeConsecutiveStores sees each of these pieces as a consecutive store and looks through the bitcast to find the underly scalar constant. But when we went to create the combined store we didn't look through the same bitcast.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326677 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There are transformation that change setcc into other constructs, and transform that try to reconstruct a setcc from the brcond condition. Depending on what order these transform are done, the end result differs.
Most of the time, it is preferable to get a setcc as a brcond argument (and this is why brcond try to recreate the setcc in the first place) so we ensure this is done every time by also doing it at the setcc level when the only user is a brcond.
Reviewers: spatel, hfinkel, niravd, craig.topper
Subscribers: nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D41235
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325892 91177308-0d34-0410-b5e6-96231b3b80d8
We looked through a BITCAST, but the bitcast might be a from a scalar type rather than a vector.
I don't have a test case. I stumbled onto it while prototyping another change that isn't ready yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325750 91177308-0d34-0410-b5e6-96231b3b80d8
DAGCombiner and SimplifySetCC both use getPointerTy for shift amounts pre-legalization. DAGCombiner uses a single helper function to hide this. SimplifySetCC does it in multiple places.
This patch adds a defaulted parameter to getShiftAmountTy that can make it return getPointerTy for scalar types. Use this parameter to simplify the SimplifySetCC and DAGCombiner.
Additionally, there were two places in SimplifySetCC that were creating shifts using the target's preferred shift amount pre-legalization. If the target uses a narrow type and the type is illegal, this can cause SimplfiySetCC to create a shift with an amount that can't represent all possible shift values for the type. To fix this we should use pointer type there too.
Alternatively we could make getScalarShiftAmountTy for each target return a safe value for large types as proposed in D43445. And maybe we should still do that, but fixing the SimplifySetCC code keeps other targets from tripping over this in the future.
Fixes PR36250.
Differential Revision: https://reviews.llvm.org/D43449
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325602 91177308-0d34-0410-b5e6-96231b3b80d8
Same for the sign extend case.
Currently we check for multiple uses on the binop. Then we call ExtendUsesToFormExtLoad to capture SetCCs that use the load. So we only end up finding any setccs when the and has additional uses and the load is used by a setcc. I don't think the and having multiple uses is relevant here. I think we should only be checking for the load having multiple uses.
This changes an NVPTX test because we now find that the load has a second use by a truncate, but ExtendUsesToFormExtLoad only looks at setccs it can extend. All other operations just check isTruncateFree. Maybe we should allow widening of an existing truncate even if its not free?
Differential Revision: https://reviews.llvm.org/D43063
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325289 91177308-0d34-0410-b5e6-96231b3b80d8
This is mainly a move of simplifyShuffleOperands from DAGCombiner::visitVECTOR_SHUFFLE to create a more general purpose TargetLowering::SimplifyDemandedVectorElts implementation.
Further features can be moved/added in future patches.
Differential Revision: https://reviews.llvm.org/D42896
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325232 91177308-0d34-0410-b5e6-96231b3b80d8
When creating high MachineMemOperand for MSTORE/MLOAD we supply
it with the original PointerInfo, while the pointer itself had been incremented.
The patch adds the proper offset to the PointerInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325135 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If the and has an additional use we shouldn't invert it. That creates an additional instruction.
While there add a one use check to the transform above that looked similar.
Reviewers: spatel, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43225
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325019 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::getBoolConstant was recently introduced. At the time I didn't know getConstTrueVal existed, but I think getBoolConstant is better as it will use the source VT to make sure it can properly detect floating point if it is configured differently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324832 91177308-0d34-0410-b5e6-96231b3b80d8
We're passing the binary op that uses the load instead of the load.
Noticed by inspection. Not sure how to test this because this just prevents the introduction of an extend that will later be truncated and will probably be combined out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324568 91177308-0d34-0410-b5e6-96231b3b80d8
The truncate is being used to replace other users of of the load, but we checked that the load only has one use so there are no other uses to replace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324567 91177308-0d34-0410-b5e6-96231b3b80d8
The truncate is only needed if the load has additional users. It used to get passed to extendSetCCUses so was created early, but that's no longer the case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324562 91177308-0d34-0410-b5e6-96231b3b80d8
We were calling a load LN0 but it came from N0.getOperand(0) so its really more like LN00 if we follow the name used in other places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324561 91177308-0d34-0410-b5e6-96231b3b80d8
X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov all together.
For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa
Differential Revision: https://reviews.llvm.org/D42985
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324427 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This method is trying to use the truncate node to find which SETCC operand should be replaced directly with the extended load.
This used to work correctly because all uses of the original load were replaced by the truncate before this function was called. So this was used to effectively bypass the truncate and find the load under it.
All but one of the callers now call this before the truncate has replaced the laod so the setcc doesn't yet use the truncate. To account for this we should pass the original load instead.
I changed the order of that one caller to make this work there too.
I don't have a test case because this is probably hidden by later DAG combines causing the extend and truncate to cancel out. I assume this way is a little more efficient and matches what was originally intended.
Reviewers: RKSimon, spatel, niravd
Reviewed By: niravd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42878
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324311 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If the load is already an extended load we should be using the memory VT for the legality check, not just the VT of the current extension.
I don't have a test case, just noticed it while investigating some load extension improvements.
Reviewers: RKSimon, spatel, niravd
Reviewed By: niravd
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42783
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324181 91177308-0d34-0410-b5e6-96231b3b80d8
We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.
Fixes PR36199
Differential Revision: https://reviews.llvm.org/D42809
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324002 91177308-0d34-0410-b5e6-96231b3b80d8
As shown in the example in PR34994:
https://bugs.llvm.org/show_bug.cgi?id=34994
...we can return a very wrong answer (inf instead of 0.0) for square root when
using a reciprocal square root estimate instruction.
Here, I've conditionalized the filtering out of denorms based on the function
having "denormal-fp-math"="ieee" in its attributes. The other options for this
attribute are 'preserve-sign' and 'positive-zero'.
So we don't generate this extra code by default with just '-ffast-math' (because
then there's no denormal attribute string at all), but it works if you specify
'-ffast-math -fdenormal-fp-math=ieee' from clang.
As noted in the review, there may be other problems in clang that affect the
results depending on platform (Linux x86 at least), but this should allow
creating the desired codegen.
Differential Revision: https://reviews.llvm.org/D42323
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323981 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There's a check in the code to only check getSetCCResultType after LegalOperations or if the type is MVT::i1. But the i1 check is only allowing scalar types through. I think it should check that the scalar type is MVT::i1 so that it will work for vectors.
The changed test already does this combine with AVX512VL where getSetCCResultType returns vXi1. But with avx512f and no VLX getSetCCResultType returns a type matching the width of the input type.
Reviewers: spatel, RKSimon
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42619
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323631 91177308-0d34-0410-b5e6-96231b3b80d8
For the included test case, the DAG transformation
concat_vectors(scalar, undef) -> scalar_to_vector(sclr)
would attempt to create a v2i32 vector for a v9i8
concat_vector. Bail out to avoid creating a bitcast with
mismatching sizes later on.
Differential Revision: https://reviews.llvm.org/D42379
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323312 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r322085. Internal PPC testing is still showing the
same symptoms as when this patch landed the last time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322474 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Fold cases such as:
(v8i8 truncate (v8i32 extract_subvector (v16i32 sext (v16i8 V), Idx)))
->
(v8i8 extract_subvector (v16i8 V), Idx)
This can be generalized to cases where the truncate and extend do not
fully cancel each other out, but it may require querying the target
about profitability.
Reviewers: RKSimon, craig.topper, spatel, efriedma
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41927
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322300 91177308-0d34-0410-b5e6-96231b3b80d8
Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend.
Most of this patch is just making sure we copy the scale around everywhere.
Differential Revision: https://reviews.llvm.org/D40055
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322210 91177308-0d34-0410-b5e6-96231b3b80d8
Handle this in DAGCombiner::visitEXTRACT_VECTOR_ELT the same as we already do in SelectionDAG::getNode and use APInt instead of getZExtValue.
This should also fix oss-fuzz #4910
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321767 91177308-0d34-0410-b5e6-96231b3b80d8
Our internal testing has revealed has discovered bugs in PPC builds.
I have forward reproduction instructions to the original author (Nirav).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321649 91177308-0d34-0410-b5e6-96231b3b80d8
For example, float operations may fail to constant fold under certain circumstances (inf/nan/denormal creation etc.)
Reduced from oss-fuzz #4802 test case
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321488 91177308-0d34-0410-b5e6-96231b3b80d8