------------------------------------------------------------------------
r275981 | rksimon | 2016-07-19 08:07:43 -0700 (Tue, 19 Jul 2016) | 13 lines
[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
A companion clang patch is at D22105
Differential Revision: https://reviews.llvm.org/D22106
------------------------------------------------------------------------
------------------------------------------------------------------------
r276740 | rksimon | 2016-07-26 03:41:28 -0700 (Tue, 26 Jul 2016) | 5 lines
[X86][SSE] Fixed issue with memory folding of (v)cvtsd2ss intrinsics
Fixed typo in the intrinsic definitions of (v)cvtsd2ss with memory folding.
This was only unearthed when rL276102 started using the intrinsic again.....
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_39@276990 91177308-0d34-0410-b5e6-96231b3b80d8
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274043 91177308-0d34-0410-b5e6-96231b3b80d8
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Reviewed By: reames
Differential Revision: http://reviews.llvm.org/D17270
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273892 91177308-0d34-0410-b5e6-96231b3b80d8
Auto-upgrade to generic shuffles like sse/avx2 implementations now that we can lower to VPSLLDQ/VPSRLDQ
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272308 91177308-0d34-0410-b5e6-96231b3b80d8
This patch begins adding support for lowering to the XOP VPERMIL2PD/VPERMIL2PS shuffle instructions - adding the X86ISD::VPERMIL2 opcode and cleaning up the usage.
The internal llvm intrinsics were assuming the shuffle mask operand was the same type as the float/double input operands (I guess to simplify the intrinsic definitions in X86InstrXOP.td to a single value type). These needed changing to integer types (matching the clang builtin and the AMD intrinsics definitions), an auto upgrade path is added to convert old calls.
Mask decoding/target shuffle support will be added in future patches.
Differential Revision: http://reviews.llvm.org/D20049
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271633 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead.
Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower.
Differential Revision: http://reviews.llvm.org/D20860
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271510 91177308-0d34-0410-b5e6-96231b3b80d8
Looks like something isn't quite right still. Also forgot to move the test cases to an autoupgrade test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271363 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.
Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.
Differential Revision: http://reviews.llvm.org/D20686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271131 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.
A companion patch (D20684) removes/auto-upgrade the clang intrinsics.
Differential Revision: http://reviews.llvm.org/D20686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270973 91177308-0d34-0410-b5e6-96231b3b80d8
When we have "Image Info Version" module flag but don't have "Class Properties"
module flag, set "Class Properties" module flag to 0, so we can correctly emit
errors when one module has the flag set and another module does not.
rdar://26469641
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270791 91177308-0d34-0410-b5e6-96231b3b80d8
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer. I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.
This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.
Differential Revision: http://reviews.llvm.org/D19098
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266818 91177308-0d34-0410-b5e6-96231b3b80d8