This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.
This is particularly useful for linear interpolation, e.g. (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))).
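The algebra behind these rewrites can be checked with a small sketch. This is not the DAG combiner code; `fma` here is a hypothetical stand-in that evaluates a*b + c exactly over rationals, and the interpolation decomposition shown is one plausible way the patterns compose, assumed for illustration.

```python
from fractions import Fraction

def fma(a, b, c):
    """Exact a*b + c over rationals; stands in for a fused multiply-add."""
    return a * b + c

def mul_add_one(x, y):
    # FMUL(FADD(1.0, x), y) is algebraically FMA(x, y, y)
    return fma(x, y, y)

def mul_sub_one(x, y):
    # FMUL(FSUB(x, 1.0), y) is algebraically FMA(x, y, -y)
    return fma(x, y, -y)

def lerp(x, y, t):
    # FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))
    # y * (1 - t) is FMA(-t, y, y); x * t is then fused on top of it.
    return fma(x, t, fma(-t, y, y))

x, y, t = Fraction(3, 7), Fraction(-5, 2), Fraction(1, 4)
print(mul_add_one(x, y) == (1 + x) * y)
print(mul_sub_one(x, y) == (x - 1) * y)
print(lerp(x, y, t) == x * t + y * (1 - t))
```

The identities are exact over rationals. In floating point the fused and unfused forms can round differently, so the two sides are not guaranteed to be bit-identical.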
Differential Revision: http://reviews.llvm.org/D13003
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248210 91177308-0d34-0410-b5e6-96231b3b80d8
The vext pseudo-instruction takes the number of elements that need to be
extracted, not the number of bytes. Hence, use the number of elements
directly instead of scaling them with a factor.
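To illustrate why scaling matters, here is a simplified model, not the ARM backend code. It assumes vext selects a window from the concatenation of two equally sized vectors, with the offset counted in elements; the function name, lane width, and sample vectors are hypothetical.

```python
def vext(lo, hi, n):
    """Model: take len(lo) consecutive elements of lo + hi, starting at element n."""
    assert len(lo) == len(hi) and 0 <= n < len(lo)
    return (lo + hi)[n:n + len(lo)]

ELEM_BYTES = 2  # hypothetical 16-bit lanes

a = [0, 1, 2, 3]
b = [4, 5, 6, 7]

# Operand given in elements, as the pseudo-instruction expects.
in_elements = vext(a, b, 1)

# Operand wrongly scaled to bytes: the window starts too far in.
in_bytes = vext(a, b, 1 * ELEM_BYTES)

print(in_elements)
print(in_bytes)
```

In this model the scaled operand selects a different window, which is the kind of mismatch the fix avoids.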
Reviewers: Silviu Baranga, James Molloy
(not reflected in the differential revision)
Differential Revision: http://reviews.llvm.org/D12974
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248208 91177308-0d34-0410-b5e6-96231b3b80d8
We're currently losing any fast-math flags when synthesizing fcmps for
min/max reductions. In LV, make sure we copy over the scalar inst's
flags. In LoopUtils, we know we only ever match patterns with
hasUnsafeAlgebra, so apply that to any synthesized ops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248201 91177308-0d34-0410-b5e6-96231b3b80d8
Because mod is always exact, this function should never have taken a rounding mode argument. The current implementation still has issues, which I'll look at resolving in a subsequent patch.
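The exactness claim can be illustrated outside of APFloat. The sketch below uses Python's `math.fmod` on doubles as a stand-in (an assumption; it is not the function this commit changes) and compares it with the remainder computed exactly over rationals. `exact_mod` and `fmod_is_exact` are hypothetical helper names.

```python
import math
from fractions import Fraction

def exact_mod(a: float, b: float) -> Fraction:
    """Remainder a - trunc(a / b) * b, computed exactly over rationals."""
    fa, fb = Fraction(a), Fraction(b)
    return fa - math.trunc(fa / fb) * fb

def fmod_is_exact(a: float, b: float) -> bool:
    """True if the floating-point remainder equals the exact one, i.e. no rounding occurred."""
    return Fraction(math.fmod(a, b)) == exact_mod(a, b)

for a, b in [(5.5, 0.1), (-7.25, 2.5), (1e22, 7.0)]:
    print(a, b, fmod_is_exact(a, b))
```

Since the exact remainder is itself representable, there is nothing for a rounding mode to decide.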
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248195 91177308-0d34-0410-b5e6-96231b3b80d8
Based on conversations with Justin and a few others, these constructors
are really useful to have in the executable so that you can call them
from the debugger. After some measurements, these *particular* calls
aren't problematic enough to justify marking them always inline.
Please let me know if there are other functions really needed for
debugging. The always inline attribute is a hack that we should only
really employ when it doesn't hurt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248188 91177308-0d34-0410-b5e6-96231b3b80d8
The definition of the DivergenceAnalysis pass was in a .cpp
file, so users of the analysis could not obtain it
through "getAnalysis<>()".
This patch extracts the definition into a separate header that
can be used by users of the analysis to fetch the results.
Patch by Volkan Keles (vkeles@apple.com)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248186 91177308-0d34-0410-b5e6-96231b3b80d8
evaluate whether 'readonly' or 'readnone' apply to a given function.
This both reduces indentation and will make it easy to share the logic
with a new pass manager implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248181 91177308-0d34-0410-b5e6-96231b3b80d8
The ISD::FPOW and ISD::FSINCOS opcodes default to Legal, but there
is no legal instruction for those on SystemZ. This could cause
LLVM internal errors. Fixed by setting the operation action to
Expand for those opcodes.
Also added test cases for all other LLVM IR intrinsics that should
generate a library call. (Those already work correctly since the
default operation action is fine.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248180 91177308-0d34-0410-b5e6-96231b3b80d8
If storing multiple FP constants, some subset of the stores
would be replaced with integers due to visit order, so
MergeConsecutiveStores would only partially merge
these.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248169 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.
This patch changes the handling of +t2dsp to be in line with other
architecture extensions.
Following review comments, this patch also updates the description of
FeatureDSPThumb2 in ARM.td.
Differential Revision: http://reviews.llvm.org/D12937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248152 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also tightened up the test and made a trivial fix to prevent double-newline
after emitting .cpsetup directives.
Reviewers: vkalintiris
Subscribers: seanbruno, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D12956
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248143 91177308-0d34-0410-b5e6-96231b3b80d8
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.
This change also adds a test case that demonstrates the problem with
r248045.
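A small model of why the property has to be per use rather than per IV. This is not the pass's logic; it only assumes a 32-bit narrow type and compares sign extension with zero extension, and all helper names are hypothetical. The two extensions agree for a non-negative IV but not for the derived use `-1 * IV`.

```python
BITS = 32  # assumed width of the narrow induction variable

def wrap(v):
    """Reduce v to its BITS-bit two's-complement bit pattern."""
    return v & ((1 << BITS) - 1)

def sext(v):
    """Value obtained by sign-extending the narrow bit pattern."""
    v = wrap(v)
    return v - (1 << BITS) if v >> (BITS - 1) else v

def zext(v):
    """Value obtained by zero-extending the narrow bit pattern."""
    return wrap(v)

def extensions_agree(narrow_value):
    return sext(narrow_value) == zext(narrow_value)

for iv in range(1, 4):
    # The IV itself is never negative, but its use -1 * IV is.
    print(iv, extensions_agree(iv), extensions_agree(-1 * iv))
```

A flag stored once for the whole IV would wrongly cover the second column as well.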
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248107 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have fast vector CTPOP implementations, we can use them to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1)).
Additionally, for AVX512CD, which provides lzcnt instructions, we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)).
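Both identities can be checked on scalars with a short sketch. It assumes a 32-bit element width and uses plain Python integers rather than vector code; the helper names are illustrative, not LLVM APIs.

```python
WIDTH = 32  # assumed element width
MASK = (1 << WIDTH) - 1

def ctpop(x):
    return bin(x & MASK).count("1")

def ctlz(x):
    return WIDTH - (x & MASK).bit_length()

def cttz_reference(x):
    """Count trailing zeros by shifting; returns WIDTH for zero."""
    x &= MASK
    if x == 0:
        return WIDTH
    n = 0
    while x & 1 == 0:
        x >>= 1
        n += 1
    return n

def cttz_via_ctpop(x):
    # x & -x isolates the lowest set bit; subtracting 1 turns the
    # trailing zeros into ones, which ctpop then counts.
    x &= MASK
    lowest = x & -x
    return ctpop((lowest - 1) & MASK)

def cttz_undef_via_ctlz(x):
    # Only meaningful for non-zero x, matching the "undef at zero" variant.
    x &= MASK
    assert x != 0
    return (WIDTH - 1) - ctlz(x & -x)

samples = [1, 2, 12, 0x80000000, 0xFFFF0000, 0x12345678]
print(all(cttz_via_ctpop(v) == cttz_reference(v) for v in samples + [0]))
print(all(cttz_undef_via_ctlz(v) == cttz_reference(v) for v in samples))
```

For zero, `x & -x` is 0, so `(x & -x) - 1` is all ones and the ctpop form yields the full width. The ctlz form has no such fallback, hence the `cttz_undef` restriction.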
Differential Revision: http://reviews.llvm.org/D12663
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248091 91177308-0d34-0410-b5e6-96231b3b80d8