This recommits r335562 and 335563 as a single commit.
The frontend will surround the intrinsic with the appropriate marshalling to/from a scalar type to match the sigature of the builtin that software expects.
By exposing the vXi1 type directly in the llvm intrinsic we make it available to optimizers much earlier. This can enable the scalar marshalling code to be optimized away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335568 91177308-0d34-0410-b5e6-96231b3b80d8
We have unmasked intrinsics now and wrap them with a select. This is a net reduction of 36 intrinsics from before the unmasked intrinsics were added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333388 91177308-0d34-0410-b5e6-96231b3b80d8
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.
Patch by tkrupa
Differential Revision: https://reviews.llvm.org/D44785
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330322 91177308-0d34-0410-b5e6-96231b3b80d8
This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics.
We lost some InstCombine SimplifyDemandedBit optimizations through this change as we aren't able to fold 'and', bitcast, shuffle very well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329990 91177308-0d34-0410-b5e6-96231b3b80d8
The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329774 91177308-0d34-0410-b5e6-96231b3b80d8
These are were previously grouped in small groups of similarish intrinsics. But all the intrinsics have the same number of arguments and the same order. So we can move them all into a larger group for handling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329549 91177308-0d34-0410-b5e6-96231b3b80d8
The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325559 91177308-0d34-0410-b5e6-96231b3b80d8
The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.
Fixes PR36360.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324953 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit and in the IR.
This makes the intrinsic look much more similar to an fcmp instruction that we wish we could use for these but can't. We already use icmp instructions for integer compares.
Previously the lowering step of isel would turn the intrinsic into an X86 specific ISD node and a emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.
This should make any issues with gpr<->mask sequences the same between integer and fp. Meaning we only have to fix them once.
Reviewers: spatel, delena, RKSimon, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324827 91177308-0d34-0410-b5e6-96231b3b80d8
Clang already stopped using these a couple months ago.
The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324177 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch changes the kunpck intrinsic autoupgrade to use vXi1 shufflevector operations to perform vector extracts and concats. This more closely matches the definition of the kunpck instructions. Currently we rely on a DAG combine to turn the scalar shift/and/or code into a concat vectors operation. By doing it in the IR we get this for free.
Reviewers: spatel, RKSimon, zvi, jina.nahias
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42018
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322462 91177308-0d34-0410-b5e6-96231b3b80d8
I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322050 91177308-0d34-0410-b5e6-96231b3b80d8
The bitcode reader looks specifically for `__DATA, __objc_catlist` as a
section name. However, SVN r304661 removed the spaces (the two names
are functionally equivalent but do not compare equally
lexicographically). This causes compatibility issues. Add an
auto-upgrade path for removing the spaces as well as use the new name in
the LTO plugin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315086 91177308-0d34-0410-b5e6-96231b3b80d8
This came out of a recent discussion on llvm-dev
(https://reviews.llvm.org/D38042). Currently the Verifier will strip
the debug info metadata from a module if it finds the dbeug info to be
malformed. This feature is very valuable since it allows us to improve
the Verifier by making it stricter without breaking bcompatibility,
but arguable the Verifier pass should not be modifying the IR. This
patch moves the stripping of broken debug info into AutoUpgrade
(UpgradeDebugInfo to be precise), which is a much better location for
this since the stripping of malformed (i.e., produced by older, buggy
versions of Clang) is a (harsh) form of AutoUpgrade.
This change is mostly NFC in nature, the one big difference is the
behavior when LLVM module passes are introducing malformed debug
info. Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info, after this patch the Verifier
will report a fatal error. I believe this behavior is actually more
desirable anyway.
Differential Revision: https://reviews.llvm.org/D38184
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314699 91177308-0d34-0410-b5e6-96231b3b80d8
Removing X86 broadcast(f/i)32x2 intrinsics from llvm.
Adding autoUpgrade support.
Moving matching tests from avx512dq-intrinsics.ll to avx512dq-intrinsics-upgrade.ll and from avx512dqvl-intrinsics.ll to avx512dqvl-intrinsics-upgrade.ll.
Differential Revision: https://reviews.llvm.org/D38220
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314195 91177308-0d34-0410-b5e6-96231b3b80d8