The patch comes in 2 parts:
1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types.
2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled.
Fix for PR30760
Differential Revision: https://reviews.llvm.org/D29568
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294749 91177308-0d34-0410-b5e6-96231b3b80d8
Currently we only combine shuffle nodes if they have a single user to prevent us from causing code bloat by splitting the shuffles into several different combines.
We don't take into account that in some cases we will already have combined all the users during recursively calling up the shuffle tree.
This patch keeps a list of all the shuffle nodes that have been combined so far and permits combining of further shuffle nodes if all its users are in that list.
Differential Revision: https://reviews.llvm.org/D29399
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294183 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes emitting conversions of constants on targets
without legal f16 that need to use these for legalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293499 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This teaches getNode to simplify extracting from Undef. This is similar to what is done for EXTRACT_VECTOR_ELT. It also adds support for extracting from CONCAT_VECTOR when we can reuse one of the inputs to the concat. These seem like simple non-target specific optimizations.
For X86 we currently handle undef in extractSubvector, but not all EXTRACT_SUBVECTOR creations go through there.
Ultimately, my motivation here is to simplify extractSubvector and remove custom lowering for EXTRACT_SUBVECTOR since we don't do anything but handle undef and BUILD_VECTOR optimizations, but those should be DAG combines.
Reviewers: RKSimon, delena
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29000
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292876 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves the knownbits logic for unsigned integer min/max opcodes.
For UMIN we know that the result will have the maximum of the inputs' known leading zero bits in the result, similarly for UMAX the maximum of the inputs' leading one bits.
This is particularly useful for simplifying clamping patterns,. e.g. as SSE doesn't have a uitofp instruction we want to use sitofp instead where possible and for that we need to confirm that the top bit is not set.
Differential Revision: https://reviews.llvm.org/D28853
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292528 91177308-0d34-0410-b5e6-96231b3b80d8
We were relying on constant folding of the legalized instructions to do what constant folding we had previously
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292114 91177308-0d34-0410-b5e6-96231b3b80d8
The usage of some MIPS MSA instrinsics that took immediates could crash LLVM
during lowering. This patch addresses that behaviour. Crucially this patch
also makes the use of intrinsics with out of range immediates as producing an
internal error.
The ld,st instrinsics would trigger an assertion failure for MIPS64 as their
lowering would attempt to add an i32 offset to a i64 pointer.
Reviewers: vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D25438
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291571 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid extra (recursive) calls to computeKnownBits if we already know that there are no common known bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290490 91177308-0d34-0410-b5e6-96231b3b80d8
There are helpers for testing for constant or constant build_vector,
and for splat ConstantFP vectors, but not for a constantfp or
non-splat ConstantFP vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290317 91177308-0d34-0410-b5e6-96231b3b80d8
Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases.
Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value.
Differential Revision: https://reviews.llvm.org/D27714
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289654 91177308-0d34-0410-b5e6-96231b3b80d8
We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289534 91177308-0d34-0410-b5e6-96231b3b80d8
Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289329 91177308-0d34-0410-b5e6-96231b3b80d8
Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289232 91177308-0d34-0410-b5e6-96231b3b80d8
Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289214 91177308-0d34-0410-b5e6-96231b3b80d8
Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together.
We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them.
Differential Revision: https://reviews.llvm.org/D27129
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289200 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r288916 as it is currently causing a crasher in
Halide. Reproducer on llvm.org/PR31323. While it might be that halide is
generating invalid IR, llc shouldn't crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289194 91177308-0d34-0410-b5e6-96231b3b80d8
getNode already prevents formation of out of bounds constant
extract_vector_elts. Do the same for insert_vector_elt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288603 91177308-0d34-0410-b5e6-96231b3b80d8
We have the following DAGCombiner transformations:
(mul (shl X, c1), c2) -> (mul X, c2 << c1)
(mul (shl X, C), Y) -> (shl (mul X, Y), C)
(shl (mul x, c1), c2) -> (mul x, c1 << c2)
Usually the constant shift is optimised by SelectionDAG::getNode when it is
constructed, by SelectionDAG::FoldConstantArithmetic, but when we're dealing
with vectors and one of those vector constants contains an undef element
FoldConstantArithmetic does not fold and we enter an infinite loop.
Fix this by making FoldConstantArithmetic use getNode to decide how to fold each
vector element, the same as FoldConstantVectorArithmetic does, and rather than
adding the constant shift to the work list instead only apply the transformation
if it's already been folded into a constant, as if it's not we're going to loop
endlessly. Additionally add missing NoOpaques to one of those transformations,
which I noticed when writing the tests for this.
Differential Revision: https://reviews.llvm.org/D26605
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287766 91177308-0d34-0410-b5e6-96231b3b80d8
Add basic ComputeNumSignBits support for TRUNCATE ops for cases where the source's number of sign bits overlaps with the truncated size.
Improves X86 SIGN_EXTEND_IN_REG vector cases which were needlessly sign extending boolean vector results.
Differential Revision: https://reviews.llvm.org/D26851
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287635 91177308-0d34-0410-b5e6-96231b3b80d8
Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.
This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.
The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used.
I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course.
DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.
This looked like this had caused compile time regressions on some buildbots (and was reverted in rL285381), but appears to have just been a harmless bystander!
Differential Revision: https://reviews.llvm.org/D25691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285494 91177308-0d34-0410-b5e6-96231b3b80d8