Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, even when we may only be interested in one or some of the elements.
This patch adds a DemandedElts argument that lets us specify which elements we actually care about. The original computeKnownBits interface calls through with a DemandedElts mask demanding all elements, to match the current behaviour; scalar types pass a single demanded element.
This approach proved easier than adding a full per-element known-bits solution, while being of similar usefulness for the combines where computeKnownBits is typically used.
I've only added support for a few opcodes so far (the ones that have proven straightforward to test); all others default to demanding all elements, but can be updated in due course.
DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.
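As a minimal sketch (assuming the D25691-era interface with APInt KnownZero/KnownOne out-params; the helper name is hypothetical), a combine could now ask about a single lane like this:

  // Hypothetical helper: query the known bits of one vector element
  // instead of the bits common to every element.
  static void knownBitsOfElement(SelectionDAG &DAG, SDValue Vec,
                                 unsigned EltIdx, APInt &KnownZero,
                                 APInt &KnownOne) {
    unsigned NumElts = Vec.getValueType().getVectorNumElements();
    // Demand only the element we care about; all other lanes are ignored.
    APInt DemandedElts = APInt::getOneBitSet(NumElts, EltIdx);
    DAG.computeKnownBits(Vec, KnownZero, KnownOne, DemandedElts);
  }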
This looked like it had caused compile time regressions on some buildbots (and was reverted in rL285381), but it appears to have just been a harmless bystander!
Differential Revision: https://reviews.llvm.org/D25691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285494 91177308-0d34-0410-b5e6-96231b3b80d8
No need to clear the KnownOne2/KnownZero2 bits, as the next call to computeKnownBits will overwrite them anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285398 91177308-0d34-0410-b5e6-96231b3b80d8
Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, even when we may only be interested in one or some of the elements.
This patch adds a DemandedElts argument that lets us specify which elements we actually care about. The original computeKnownBits interface calls through with a DemandedElts mask demanding all elements, to match the current behaviour; scalar types pass a single demanded element.
This approach proved easier than adding a full per-element known-bits solution, while being of similar usefulness for the combines where computeKnownBits is typically used.
I've only added support for a few opcodes so far (the ones that have proven straightforward to test); all others default to demanding all elements, but can be updated in due course.
DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.
Differential Revision: https://reviews.llvm.org/D25691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285296 91177308-0d34-0410-b5e6-96231b3b80d8
Use the isConstOrConstSplat helper.
Also use APInt instead of getZExtValue directly, to avoid out-of-range issues.
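As a sketch of the pattern (the surrounding combine is hypothetical), comparing the splat value as an APInt avoids the 64-bit assumption that getZExtValue() makes:

  // Recognise a uniform (splat) constant shift amount.
  if (ConstantSDNode *SA = isConstOrConstSplat(N->getOperand(1))) {
    // Compare as an APInt rather than via getZExtValue(), which asserts
    // if the constant does not fit in 64 bits.
    const APInt &ShAmt = SA->getAPIntValue();
    if (ShAmt.uge(BitWidth))
      return DAG.getUNDEF(VT); // out-of-range shift amount
  }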
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285033 91177308-0d34-0410-b5e6-96231b3b80d8
It is already part of the type (which is part of the global, which is already
being added), so there's no need to do it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285002 91177308-0d34-0410-b5e6-96231b3b80d8
As noted in:
https://reviews.llvm.org/D25685
This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch.
In a minor attempt to keep some structure, we're pulling the FP helper over along with its
integer sibling, but clearly we can and should do more refactoring of the similar helper
functions in DAGCombiner and SelectionDAG to simplify and avoid duplicating functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284421 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::getConstantPool will automatically determine an appropriate alignment if one is not specified. It does this by querying the type's preferred alignment. This can end up creating quite a lot of padding when the preferred alignment for vectors is 128.
In optimize-for-size mode, it makes sense to instead query the ABI type alignment which is often smaller and causes less padding.
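A minimal sketch of the heuristic, assuming 2016-era APIs (the helper name is hypothetical):

  // Pick a constant-pool alignment: prefer the (usually smaller) ABI
  // alignment when optimising for size, the preferred alignment otherwise.
  static unsigned pickCPAlignment(const MachineFunction &MF,
                                  const DataLayout &DL, Type *Ty) {
    return MF.getFunction()->optForSize() ? DL.getABITypeAlignment(Ty)
                                          : DL.getPrefTypeAlignment(Ty);
  }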
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284381 91177308-0d34-0410-b5e6-96231b3b80d8
The masked expand-load node represents a load operation that loads a variable number of elements from memory, according to the number of "true" bits in the mask, and expands the loaded elements according to their positions in the mask vector.
Right now, the node is used in intrinsics for VEXPAND* instructions.
This work is a step towards implementing the masked.expandload and masked.compressstore intrinsics.
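For illustration only, here is a scalar model of the expanding-load semantics (not the DAG implementation): consecutive memory elements are distributed to the "true" lanes, while "false" lanes keep the pass-through value.

  // Scalar model of masked expand-load semantics (illustrative only).
  void expandLoadModel(const int *Mem, const bool *Mask,
                       const int *PassThru, int *Result, unsigned NumElts) {
    unsigned MemIdx = 0; // index into the compressed memory stream
    for (unsigned i = 0; i != NumElts; ++i)
      Result[i] = Mask[i] ? Mem[MemIdx++] // next consecutive element
                          : PassThru[i];  // lane keeps the pass-through
  }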
Differential Revision: https://reviews.llvm.org/D25322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283694 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Both computeKnownBits and ComputeNumSignBits can now do a simple
look-through of EXTRACT_VECTOR_ELT. They compute the result based on the
known bits (or known sign bits) of the vector that the element is
extracted from.
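Roughly, the computeKnownBits half of the look-through (a sketch; BitWidth, KnownZero and KnownOne are the surrounding function's locals, and the real code is more careful):

  case ISD::EXTRACT_VECTOR_ELT: {
    SDValue InVec = Op.getOperand(0);
    unsigned EltBitWidth = InVec.getValueType().getScalarSizeInBits();
    // If the result is any-extended beyond the element width, we know
    // nothing about the extended bits.
    if (BitWidth > EltBitWidth) {
      KnownZero = KnownZero.trunc(EltBitWidth);
      KnownOne = KnownOne.trunc(EltBitWidth);
    }
    computeKnownBits(InVec, KnownZero, KnownOne, Depth + 1);
    if (BitWidth > EltBitWidth) {
      KnownZero = KnownZero.zext(BitWidth);
      KnownOne = KnownOne.zext(BitWidth);
    }
    break;
  }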
Reviewers: bogner, tstellarAMD, mkuper
Subscribers: wdng, RKSimon, jyknight, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D25007
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283347 91177308-0d34-0410-b5e6-96231b3b80d8
With D24253 we can now use SelectionDAG::SignBitIsZero with vector operations.
This patch uses SelectionDAG::SignBitIsZero to recognise that a zero sign bit means that we can use a sitofp instead of a uitofp (which is not directly supported on pre-AVX512 hardware).
While AVX512 does provide support for uitofp, the conversion to sitofp should not cause any regressions.
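The shape of the combine, as a sketch (the surrounding lowering code is hypothetical):

  // If the sign bit of the integer source is known zero, the unsigned
  // conversion is equivalent to the cheaper signed one.
  if (Op.getOpcode() == ISD::UINT_TO_FP &&
      DAG.SignBitIsZero(Op.getOperand(0)))
    return DAG.getNode(ISD::SINT_TO_FP, SDLoc(Op), Op.getValueType(),
                       Op.getOperand(0));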
Differential Revision: https://reviews.llvm.org/D24343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281852 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
An IR load can be invariant, dereferenceable, neither, or both. But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.
This patch splits up the notions of invariance and dereferenceability at
the MI level. It's NFC, so it adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D23371
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281151 91177308-0d34-0410-b5e6-96231b3b80d8
Add the ability for computeKnownBits and SimplifyDemandedBits to extract the known zero/one bits from a BUILD_VECTOR, returning the known bits that are shared by every vector element.
This is an initial step towards determining the sign bits of a vector (PR29079).
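Roughly, the BUILD_VECTOR handling (a sketch; the width adjustments needed for implicitly truncated operands are elided):

  case ISD::BUILD_VECTOR:
    // Start from "everything known" and intersect with each element, so
    // only the bits common to every element survive.
    KnownZero = APInt::getAllOnesValue(BitWidth);
    KnownOne = APInt::getAllOnesValue(BitWidth);
    for (SDValue SrcOp : Op->ops()) {
      APInt KnownZero2, KnownOne2;
      computeKnownBits(SrcOp, KnownZero2, KnownOne2, Depth + 1);
      KnownZero &= KnownZero2;
      KnownOne &= KnownOne2;
    }
    break;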
Differential Revision: https://reviews.llvm.org/D24253
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280927 91177308-0d34-0410-b5e6-96231b3b80d8
If we are extracting a subvector that has just been inserted then we should just use the original inserted subvector.
This has come up in several x86 shuffle lowering cases where we are crossing 128-bit lanes.
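The fold itself is small; a sketch with assumed local names (N is the EXTRACT_SUBVECTOR node, NVT its result type):

  // extract_subvector (insert_subvector V, Sub, Idx), Idx --> Sub
  SDValue V = N->getOperand(0);
  if (V.getOpcode() == ISD::INSERT_SUBVECTOR &&
      V.getOperand(1).getValueType() == NVT && // same subvector type
      V.getOperand(2) == N->getOperand(1))     // same (constant) index
    return V.getOperand(1);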
Differential Revision: https://reviews.llvm.org/D24254
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280715 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This greatly simplifies our handling of SDNode::SubclassData.
NFC, hopefully. :)
See D23035 for discussion of the API design of these
bitfields.
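The shape of the scheme, roughly (a sketch, not the exact field list): each subclass hierarchy declares its own bitfield struct, and the structs share storage through a union so they still pack into the space SubclassData used to occupy.

  class SDNodeBitfields {
    friend class SDNode;
    uint16_t HasDebugValue : 1;
    // ... further SDNode-wide flags ...
  };
  class LoadSDNodeBitfields {
    friend class LoadSDNode;
    uint16_t ExtTy : 2; // ISD::LoadExtType
  };
  union {
    char RawSDNodeBits[sizeof(uint16_t)];
    SDNodeBitfields SDNodeBits;
    LoadSDNodeBitfields LoadSDNodeBits;
  };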
Reviewers: chandlerc
Subscribers: llvm-commits, rnk
Differential Revision: https://reviews.llvm.org/D23036
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279537 91177308-0d34-0410-b5e6-96231b3b80d8
This is a mechanical change replacing comments in switches such as
"fallthrough", "fall-through", or "fall-thru" with the LLVM_FALLTHROUGH macro.
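For example (hypothetical cases):

  void classify(unsigned Opcode, bool &ProducesCarry, unsigned &NumArith) {
    switch (Opcode) {
    case ISD::ADDC:
      ProducesCarry = true;
      LLVM_FALLTHROUGH; // was: // fall-thru
    case ISD::ADD:
      ++NumArith;
      break;
    default:
      break;
    }
  }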
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278902 91177308-0d34-0410-b5e6-96231b3b80d8
[DAG] Check debug values for invalidation before transferring and mark
old debug values invalid when transferring to another SDValue.
This fixes PR28613.
Reviewers: jyknight, hans, dblaikie, echristo
Subscribers: yaron.keren, ismail, llvm-commits
Differential Revision: https://reviews.llvm.org/D22858
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277135 91177308-0d34-0410-b5e6-96231b3b80d8
[DAG] Relocate TransferDbgValues in ReplaceAllUsesWith(SDValue, SDValue)
to before we modify the CSE maps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277027 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead, we take a single flags arg (a bitset).
Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.
This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted. It also makes it much easier to add another flag to
getLoad.
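A before/after sketch of a callsite (values illustrative):

  // Before: a run of bool arguments that was easy to transpose:
  //   DAG.getLoad(VT, DL, Chain, Ptr, PtrInfo,
  //               /*isVolatile=*/true, /*isNonTemporal=*/false,
  //               /*isInvariant=*/false, /*Alignment=*/4);
  // After: alignment first, then a single flags bitset:
  SDValue L = DAG.getLoad(VT, DL, Chain, Ptr, PtrInfo, /*Alignment=*/4,
                          MachineMemOperand::MOVolatile);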
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits
Differential Revision: http://reviews.llvm.org/D22249
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275592 91177308-0d34-0410-b5e6-96231b3b80d8