Add an intrinsic that takes two signed integers, with their scale provided as
the third argument, and performs fixed point multiplication on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
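A minimal sketch of the arithmetic such an intrinsic describes, written as plain C++ for 32-bit operands. The function name is hypothetical, and rounding toward negative infinity is an assumption of this sketch rather than something stated above:

```cpp
#include <cassert>
#include <cstdint>

// Multiply two signed 32-bit fixed-point values that share `scale`
// fractional bits. The product is formed at double width, where it carries
// 2 * scale fractional bits, and is shifted right by `scale` to return to
// the original scale. No saturation is performed; the result is truncated.
int32_t smul_fix(int32_t a, int32_t b, unsigned scale) {
  assert(scale < 32 && "scale must be smaller than the bit width");
  int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  // Arithmetic right shift (assumed for negative values) rounds down.
  return static_cast<int32_t>(wide >> scale);
}
```

With a scale of 2, the value 1.5 is stored as 6 and 2.0 as 8, so smul_fix(6, 8, 2) yields 12, which is 3.0 at the same scale.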
Differential Revision: https://reviews.llvm.org/D54719
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348912 91177308-0d34-0410-b5e6-96231b3b80d8
If all of the demanded elements in a SimplifyDemandedVectorElts call are known to be UNDEF, we can simplify to an ISD::UNDEF node.
Zero constant folding will be handled in a future patch - it's a little trickier as we often have bitcasted zero values.
Differential Revision: https://reviews.llvm.org/D55511
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348784 91177308-0d34-0410-b5e6-96231b3b80d8
This is an initial patch to add a minimum level of support for funnel shifts to the SelectionDAG and to begin wiring it up to the X86 SHLD/SHRD instructions.
Some partial legalization code has been added to handle the 'SlowSHLD' case, where we want to expand instead, and I've added a few DAG combines so we don't get regressions from the existing DAG builder expansion code.
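For reference, a sketch of what a funnel shift computes, as plain C++ on 32-bit values. The function names are hypothetical and this is not the SelectionDAG code: the two operands are treated as one 64-bit value, shifted by the amount modulo 32, and one half is kept.

```cpp
#include <cstdint>

// Funnel shift left: concatenate hi:lo, shift left by amt % 32, and keep
// the upper 32 bits.
uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t amt) {
  amt %= 32;
  if (amt == 0)
    return hi; // avoid an out-of-range shift by 32 below
  return (hi << amt) | (lo >> (32 - amt));
}

// Funnel shift right: concatenate hi:lo, shift right by amt % 32, and keep
// the lower 32 bits.
uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amt) {
  amt %= 32;
  if (amt == 0)
    return lo;
  return (hi << (32 - amt)) | (lo >> amt);
}
```

Passing the same value for both operands gives a rotate, for example fshl32(x, x, 4) rotates x left by 4.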
Differential Revision: https://reviews.llvm.org/D54698
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348353 91177308-0d34-0410-b5e6-96231b3b80d8
Fix potential issue with the ISD::INSERT_VECTOR_ELT case tweaking the DemandedElts mask instead of using a local copy, which meant that later uses of the mask saw the tweaked version.
Noticed while investigating adding zero/undef folding to SimplifyDemandedVectorElts and the altered DemandedElts mask was causing mismatches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348348 91177308-0d34-0410-b5e6-96231b3b80d8
PR17686 demonstrates that for some targets FP exceptions can fire in cases where the FP_TO_UINT is expanded using a FP_TO_SINT instruction.
The existing code converts both the in-range and out-of-range cases using FP_TO_SINT and then selects the result; this patch changes this for 'strict' cases to pre-select the FP_TO_SINT input and the offset adjustment.
The X87 cases don't need the strict flag but generate much nicer code with it.
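A minimal sketch of the pre-select form, assuming a double to 64-bit unsigned conversion. The function name is hypothetical and this is plain C++, not the legalizer code. Only one signed conversion is executed, and its input is always within the signed range:

```cpp
#include <cstdint>

// Convert a non-negative double below 2^64 to uint64_t using a single
// signed conversion. The input and the offset are selected first, so the
// conversion never sees a value outside the int64_t range.
uint64_t fp_to_uint64(double x) {
  const double two63 = 9223372036854775808.0; // 2^63
  const bool inRange = x < two63;
  const double input = inRange ? x : x - two63;
  const uint64_t offset = inRange ? 0 : 0x8000000000000000ULL;
  const int64_t converted = static_cast<int64_t>(input);
  // Add 2^63 back by flipping the top bit, which is clear at this point.
  return static_cast<uint64_t>(converted) ^ offset;
}
```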
Differential Revision: https://reviews.llvm.org/D53794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348251 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for ISD::*_EXTEND and ISD::*_EXTEND_VECTOR_INREG opcodes.
The extra broadcast in trunc-subvector.ll will be fixed in an upcoming patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348246 91177308-0d34-0410-b5e6-96231b3b80d8
D52935 introduced the ability for SimplifyDemandedBits to call SimplifyDemandedVectorElts through BITCASTs if the demanded bit mask entirely covered the sub element.
This patch relaxes this to demanding an element if we need any bit from it.
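The relaxed rule can be sketched on a plain 64-bit value viewed as a vector of narrower elements. The helper name is hypothetical, and little-endian element order is assumed; the real code works on APInt masks:

```cpp
#include <cassert>
#include <cstdint>

// Map a demanded-bits mask on a 64-bit value to a demanded-elements mask on
// the same 64 bits viewed as 64 / eltBits elements. An element is demanded
// as soon as any one of its bits is demanded.
unsigned demandedEltsFromBits(uint64_t demandedBits, unsigned eltBits) {
  assert(eltBits != 0 && 64 % eltBits == 0 && "elements must tile 64 bits");
  const unsigned numElts = 64 / eltBits;
  const uint64_t eltMask = eltBits == 64 ? ~0ULL : ((1ULL << eltBits) - 1);
  unsigned elts = 0;
  for (unsigned i = 0; i != numElts; ++i)
    if ((demandedBits >> (i * eltBits)) & eltMask)
      elts |= 1u << i;
  return elts;
}
```

For 16-bit elements, demanding only bit 8 of the wide value demands element 0, even though the other 15 bits of that element are not needed.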
Differential Revision: https://reviews.llvm.org/D54761
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348073 91177308-0d34-0410-b5e6-96231b3b80d8
This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347313 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on D53794, for float types with ranges smaller than the destination integer type, we should be able to just use a regular FP_TO_SINT opcode.
I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary.
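The range argument can be written down as a small check. This is only a sketch with hypothetical names; it is not how the legalizer tests the condition. Finite values of a binary float format lie below 2^maxExponent, and a signed integer of intBits bits holds every non-negative value below 2^(intBits - 1):

```cpp
#include <limits>

// True if every finite, non-negative value of the float format also fits a
// signed integer of the given width, so a signed conversion is sufficient.
constexpr bool fitsSignedConversion(int maxExponent, int intBits) {
  return maxExponent <= intBits - 1;
}

// IEEE half has finite values below 2^16. The constant is written out by
// hand because C++ has no standard half type to query.
constexpr int kHalfMaxExponent = 16;

static_assert(fitsSignedConversion(kHalfMaxExponent, 32),
              "fp16 -> u32 can use a signed conversion");
static_assert(!fitsSignedConversion(std::numeric_limits<float>::max_exponent, 64),
              "f32 -> u64 still needs the full expansion");
```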
Differential Revision: https://reviews.llvm.org/D54703
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@347251 91177308-0d34-0410-b5e6-96231b3b80d8
Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346139 91177308-0d34-0410-b5e6-96231b3b80d8
SimplifySetCC could shrink a load without checking the
profitability or legality of such a shrink with the target.
Added checks to prevent shrinking of aligned scalar loads
in AMDGPU below dword as the scalar engine does not support it.
Differential Revision: https://reviews.llvm.org/D53846
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345778 91177308-0d34-0410-b5e6-96231b3b80d8
Add an intrinsic that takes two integers and performs saturation subtraction on
them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
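A sketch of saturating subtraction on 32-bit values in plain C++. The function names are hypothetical, and showing both a signed and an unsigned form is an assumption of this sketch:

```cpp
#include <cstdint>
#include <limits>

// Signed: compute at double width, then clamp to the int32_t range.
int32_t ssub_sat(int32_t a, int32_t b) {
  const int64_t wide = static_cast<int64_t>(a) - static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (wide < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(wide);
}

// Unsigned: the only way to leave the range is to go below zero.
uint32_t usub_sat(uint32_t a, uint32_t b) {
  return a > b ? a - b : 0u;
}
```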
Differential Revision: https://reviews.llvm.org/D53783
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345512 91177308-0d34-0410-b5e6-96231b3b80d8
Add vector support to TargetLowering::expandFP_TO_UINT.
This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunc this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345473 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested on D52965, this patch moves the i64 to f64 UINT_TO_FP expansion code from LegalizeDAG into TargetLowering and makes it available to LegalizeVectorOps as well.
Not only does this help perform X86 lowering as a true vectorization instead of (partially vectorized) scalar conversions, it avoids the HADDPD op from the scalar code which can be slow on most targets.
AVX512F does have the vcvtusi2sdq scalar operation but we don't unroll to use it as it seems to only help for the v2f64 case - otherwise the unrolling cost will certainly be too high. My feeling is that we should leave it to the vectorizers - and if it generates the vector UINT_TO_FP we should use it.
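For context, a sketch of the well-known bit-trick form of an i64 to f64 unsigned conversion, which needs no horizontal add. The names are hypothetical, and whether it matches the moved expansion step for step is an assumption. It relies on IEEE doubles, round-to-nearest, and no fast-math reassociation:

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as a double.
static double bitsToDouble(uint64_t bits) {
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// Split the input into 32-bit halves and place each half in the mantissa of
// a double with a fixed exponent. Both subtracting the combined bias and the
// high-half difference are exact, so only the final add rounds.
double uint64_to_double(uint64_t x) {
  const uint64_t lo = x & 0xFFFFFFFFULL;
  const uint64_t hi = x >> 32;
  const double loD = bitsToDouble(lo | 0x4330000000000000ULL); // 2^52 + lo
  const double hiD = bitsToDouble(hi | 0x4530000000000000ULL); // 2^84 + hi * 2^32
  const double bias = bitsToDouble(0x4530000000100000ULL);     // 2^84 + 2^52
  return (hiD - bias) + loD;
}
```

Every step is a lane-wise integer or floating-point operation, which is why the same sequence applies unchanged to each element of a vector.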
Differential Revision: https://reviews.llvm.org/D53649
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345256 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested on D53258, this patch moves the CTPOP expansion code from SelectionDAGLegalize to TargetLowering to allow it to be reused by the VectorLegalizer.
Proper vector support will be added by D53258.
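For readers unfamiliar with this kind of expansion, a sketch of the classic parallel bit-count sequence for 32 bits. The function name is hypothetical, and that the moved code uses exactly these steps is an assumption:

```cpp
#include <cstdint>

// Count set bits by summing adjacent fields of 1, 2, and 4 bits in
// parallel, then adding the four byte counts with one multiply.
uint32_t ctpop32(uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                 // 2-bit sums
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // 4-bit sums
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // 8-bit sums
  return (v * 0x01010101u) >> 24;                   // total in the top byte
}
```

Only shifts, masks, adds, and a multiply are used, so the sequence works per lane on vectors as well.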
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345066 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested on D53258, this patch shares common CTLZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
Extension to D53474
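A sketch of one common CTLZ expansion in terms of CTPOP, for 32 bits. The function name is hypothetical, std::bitset stands in for CTPOP, and that the shared code follows this exact form is an assumption:

```cpp
#include <bitset>
#include <cstdint>

// Smear the highest set bit into every lower position, then count the bits
// that are still clear; those are exactly the leading zeros.
uint32_t ctlz32(uint32_t x) {
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  return static_cast<uint32_t>(std::bitset<32>(~x).count());
}
```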
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345060 91177308-0d34-0410-b5e6-96231b3b80d8
As suggested on D53258, this patch demonstrates sharing common CTTZ expansion code between VectorLegalizer and SelectionDAGLegalize by putting it in TargetLowering.
I intend to move CTLZ and (scalar) CTPOP over as well and then update D53258 accordingly.
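A sketch of one common CTTZ expansion in terms of CTPOP, for 32 bits. The function name is hypothetical, std::bitset stands in for CTPOP, and that the shared code follows this exact form is an assumption:

```cpp
#include <bitset>
#include <cstdint>

// ~x & (x - 1) keeps exactly the bits below the lowest set bit of x, so its
// population count is the number of trailing zeros. For x == 0 the mask is
// all ones and the result is 32.
uint32_t cttz32(uint32_t x) {
  const uint32_t belowLowest = ~x & (x - 1u);
  return static_cast<uint32_t>(std::bitset<32>(belowLowest).count());
}
```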
Differential Revision: https://reviews.llvm.org/D53474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345039 91177308-0d34-0410-b5e6-96231b3b80d8
Add an intrinsic that takes two integers and performs unsigned saturation
addition on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
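A sketch of unsigned saturating addition on 32-bit values in plain C++; the function name is hypothetical:

```cpp
#include <cstdint>
#include <limits>

// Unsigned addition wraps on overflow, and a wrapped sum is always smaller
// than either operand, so one compare detects it.
uint32_t uadd_sat(uint32_t a, uint32_t b) {
  const uint32_t sum = a + b;
  return sum < a ? std::numeric_limits<uint32_t>::max() : sum;
}
```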
Differential Revision: https://reviews.llvm.org/D53340
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344971 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce new versions that follow the IEEE semantics
to help with legalization that may need quieted inputs.
There are some regressions from inserting unnecessary
canonicalizes when these are matched from fast math
fcmp + select which should be fixed in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344914 91177308-0d34-0410-b5e6-96231b3b80d8
Add an intrinsic that takes two integers and performs saturation addition on them.
This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.
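A sketch of saturating addition on 32-bit values in plain C++, assuming the signed form; the function name is hypothetical:

```cpp
#include <cstdint>
#include <limits>

// Add at double width so the exact sum is available, then clamp it to the
// int32_t range.
int32_t sadd_sat(int32_t a, int32_t b) {
  const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (wide < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(wide);
}
```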
Differential Revision: https://reviews.llvm.org/D53053
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344629 91177308-0d34-0410-b5e6-96231b3b80d8
This is intended to make the backend on par with functionality that was
added to the IR version of SimplifyDemandedVectorElts in:
rL343727
...and the original motivation is that we need to improve demanded-vector-elements
in several ways to avoid problems that would be exposed in D51553.
Differential Revision: https://reviews.llvm.org/D52912
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344541 91177308-0d34-0410-b5e6-96231b3b80d8
Help stop bugs like rL343935 by making the 'original' DemandedBits arg more obviously not the mask that is actually used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344138 91177308-0d34-0410-b5e6-96231b3b80d8
Part of a minor cleanup to make all the switch statements more consistent prior to improving vector support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344136 91177308-0d34-0410-b5e6-96231b3b80d8
Similar to what already happens in the DAGCombiner wrappers, this patch adds the root nodes back onto the worklist if the DCI wrappers' SimplifyDemandedBits/SimplifyDemandedVectorElts were successful.
Differential Revision: https://reviews.llvm.org/D53026
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344132 91177308-0d34-0410-b5e6-96231b3b80d8
rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving).
Annoyingly, many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ.
Thanks to Jan Vesely (@jvesely) for bisecting the bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343935 91177308-0d34-0410-b5e6-96231b3b80d8
This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector.
There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify.
As well as that, I had to disable handling of bool vectors until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.).
Fixes PR39178
Differential Revision: https://reviews.llvm.org/D52935
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343913 91177308-0d34-0410-b5e6-96231b3b80d8
Adding NonNull as attributes to returned pointers has the unfortunate side
effect of disabling tail calls. This patch ignores the NonNull attribute when
we decide whether to tail merge, in the same way that we ignore the NoAlias
attribute, as it has no effect on the call sequence.
Differential Revision: https://reviews.llvm.org/D52238
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343091 91177308-0d34-0410-b5e6-96231b3b80d8
This was trying to scalarize a scalar FP type,
resulting in an assert.
Fixes unaligned f64 stack stores for AMDGPU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342132 91177308-0d34-0410-b5e6-96231b3b80d8
This is the DAG equivalent of D51433.
If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.
The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit
vectors because we don't need those anyway.
I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed
to be running there?
Differential Revision: https://reviews.llvm.org/D51696
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341762 91177308-0d34-0410-b5e6-96231b3b80d8
This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and uses the numerator scale factor to multiply by +1/-1.
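A sketch of how one lane of such a sequence can be parameterized so that +1/-1 divisors share the same steps as other constants. The struct, field, and function names are hypothetical. The magic constants for 3, 7, and -7 are the standard 32-bit values from published tables, not taken from this patch, and arithmetic right shift of negative values is assumed:

```cpp
#include <cstdint>

// Per-lane parameters for a signed divide-by-constant sequence.
struct LaneParams {
  int32_t magic;     // multiplier for the MULHS step
  int32_t factor;    // 0, +1 or -1: numerator added to or subtracted from the product
  unsigned shift;    // arithmetic shift applied afterwards
  uint32_t signMask; // 1 to add the sign bit of the quotient, 0 to skip it
};

int32_t sdivLane(int32_t n, LaneParams p) {
  int64_t q = (static_cast<int64_t>(p.magic) * n) >> 32; // MULHS
  q += static_cast<int64_t>(n) * p.factor;               // optional add/sub of n
  int32_t q32 = static_cast<int32_t>(q) >> p.shift;
  q32 += static_cast<int32_t>((static_cast<uint32_t>(q32) >> 31) & p.signMask);
  return q32;
}

constexpr LaneParams kDiv3{0x55555556, 0, 0, 1};
constexpr LaneParams kDiv7{static_cast<int32_t>(0x92492493u), 1, 2, 1};
constexpr LaneParams kDivNeg7{0x6DB6DB6D, -1, 2, 1};
// Divisors +1 and -1: zero magic, zero shift, no sign fix-up, so only the
// numerator factor contributes.
constexpr LaneParams kDiv1{0, 1, 0, 0};
constexpr LaneParams kDivNeg1{0, -1, 0, 0};
```

With these parameters a vector divide by <3, 7, -7, 1> runs the same operations in every lane and differs only in the constant operands.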
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340260 91177308-0d34-0410-b5e6-96231b3b80d8
This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators.
This is the last patch necessary to close PR36545
Differential Revision: https://reviews.llvm.org/D50765
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339908 91177308-0d34-0410-b5e6-96231b3b80d8
Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339898 91177308-0d34-0410-b5e6-96231b3b80d8
Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339763 91177308-0d34-0410-b5e6-96231b3b80d8