Summary:
This patch is crucial for proving equality of laundered/stripped
pointers, e.g.:
bool foo(A *a) {
  return a == std::launder(a);
}
Clang with -fstrict-vtable-pointers will emit something like:
define dso_local zeroext i1 @_Z3fooP1A(%struct.A* %a) {
entry:
  %c = bitcast %struct.A* %a to i8*
  %call = tail call i8* @llvm.launder.invariant.group.p0i8(i8* %c)
  %0 = bitcast %struct.A* %a to i8*
  %1 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %0)
  %2 = tail call i8* @llvm.strip.invariant.group.p0i8(i8* %call)
  %cmp = icmp eq i8* %1, %2
  ret i1 %cmp
}
Because %2 can be replaced with @llvm.strip.invariant.group(%0),
and %1 and %2 will then produce the same value (strip is readnone),
we can replace the compare with true.
Reviewers: rsmith, hfinkel, majnemer, amharc, kuhar
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D47423
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336963 91177308-0d34-0410-b5e6-96231b3b80d8
This converts them to what clang is now using for codegen. Unfortunately, there seem to be a few kinks still to work out; I'll try to address those with follow-up patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336871 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Support for this option is needed for building Linux kernel.
This is a very frequently requested feature by kernel developers.
More details: https://lkml.org/lkml/2018/4/4/601
GCC's option description for -fdelete-null-pointer-checks:
Assume that programs cannot safely dereference null pointers,
and that no code or data element resides at address zero.
-fno-delete-null-pointer-checks is the inverse of this, implying that
dereferencing a null pointer is not undefined behaviour.
In LLVM IR, this CL implements the feature as the function attribute
"null-pointer-is-valid"="true" (under review at D47894).
The CL updates several passes that assumed null pointer dereferencing is
undefined so that they do not optimize when the "null-pointer-is-valid"="true"
attribute is present.
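For illustration only, a minimal sketch of a function carrying the attribute (the
function and its body are invented for this note, not taken from the CL):

define i32 @read_maybe_null(i32* %p) "null-pointer-is-valid"="true" {
entry:
  ; Without the attribute, the load in %at_zero would be undefined behaviour,
  ; so passes could fold %is_null to false and delete the null check. With the
  ; attribute present, the check and the branch must stay.
  %is_null = icmp eq i32* %p, null
  br i1 %is_null, label %at_zero, label %normal
at_zero:
  %v0 = load i32, i32* null
  ret i32 %v0
normal:
  %v1 = load i32, i32* %p
  ret i32 %v1
}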
Reviewers: t.p.northover, efriedma, jyknight, chandlerc, rnk, srhines, void, george.burgess.iv
Reviewed By: efriedma, george.burgess.iv
Subscribers: eraman, haicheng, george.burgess.iv, drinkcat, theraven, reames, sanjoy, xbolva00, llvm-commits
Differential Revision: https://reviews.llvm.org/D47895
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336613 91177308-0d34-0410-b5e6-96231b3b80d8
Better NaN handling for AMDGCN fmed3.
All operands are checked for NaN now. The checks
were moved before the canonicalization to provide
a better mapping from fclamp. Changed the behaviour
of fmed3(x,y,NaN) to return max(x,y) instead of
min(x,y) in light of this. Updated tests as a result
and added some new cases to cover the fix.
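A hedged illustration of the NaN case (assuming the usual form of the intrinsic;
the third operand below is a quiet NaN):

  ; This is now treated as max(%x, %y) rather than min(%x, %y).
  %r = call float @llvm.amdgcn.fmed3.f32(float %x, float %y, float 0x7FF8000000000000)

declare float @llvm.amdgcn.fmed3.f32(float, float, float)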
Patch by Alan Baker
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336375 91177308-0d34-0410-b5e6-96231b3b80d8
I think the intrinsics named 'avx512.mask.' should refer to the previous behavior of taking a mask argument in the intrinsic, instead of using a 'select' or 'and' instruction in IR to accomplish the masking. This is more consistent with the goal that eventually we will have no intrinsics that have masking built in. When we reach that goal, we should have no intrinsics named "avx512.mask".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335744 91177308-0d34-0410-b5e6-96231b3b80d8
This patch introduces two helpers to make it easier to ignore debug
intrinsics:
- Instruction::getNextNonDebugInstruction()
This is just like Instruction::getNextNode(), except that it skips debug
info.
- skipDebugInfo(BasicBlock::iterator)
A free function which advances a BasicBlock iterator past any debug
info. This is a no-op when the iterator already points to a non-debug
instruction.
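For example (a sketch with made-up values and metadata IDs), given IR with a
debug intrinsic between two 'real' instructions:

  %x = add i32 %a, %b
  call void @llvm.dbg.value(metadata i32 %x, metadata !10, metadata !DIExpression()), !dbg !14
  %y = mul i32 %x, %c

getNextNonDebugInstruction() on %x returns %y, and skipDebugInfo() on an
iterator pointing at the dbg.value call advances it to %y.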
Part of: llvm.org/PR37728
Related to: https://reviews.llvm.org/D47874
Differential Revision: https://reviews.llvm.org/D48305
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335083 91177308-0d34-0410-b5e6-96231b3b80d8
This patch replaces calls to X86-specific intrinsics with floor-ceil semantics
with calls to target-independent @llvm.floor.* and @llvm.ceil.* intrinsics. This
doesn't affect the resulting machine code, as those intrinsics are lowered to
the same instructions, but exposes these specific rounding cases to generic
optimizations.
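For instance, something like the following (a sketch; the immediate 9 is my
reading of the _MM_FROUND_FLOOR encoding and should be treated as an assumption):

  ; Before: target-specific rounding with the floor rounding mode.
  %f0 = call <4 x float> @llvm.x86.sse41.round.ps(<4 x float> %v, i32 9)
  ; After: the target-independent form, visible to generic optimizations.
  %f1 = call <4 x float> @llvm.floor.v4f32(<4 x float> %v)

declare <4 x float> @llvm.x86.sse41.round.ps(<4 x float>, i32)
declare <4 x float> @llvm.floor.v4f32(<4 x float>)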
Differential Revision: https://reviews.llvm.org/D48067
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335039 91177308-0d34-0410-b5e6-96231b3b80d8
We should never get different CodeGen based on whether the code is being
compiled with debug info, so we must skip over @llvm.dbg.value (and similar)
calls.
Should fix at least the worst part of PR37690.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334090 91177308-0d34-0410-b5e6-96231b3b80d8
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway, which
was fine for the one function that was used, since it's a
template/inlined in the header, but not in general.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333954 91177308-0d34-0410-b5e6-96231b3b80d8
Convert a vector load intrinsic into an llvm load instruction.
This is beneficial when the underlying object being addressed
comes from a constant, since we get constant-folding for free.
Differential Revision: https://reviews.llvm.org/D46273
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333643 91177308-0d34-0410-b5e6-96231b3b80d8
Turning a table lookup intrinsic into a shuffle vector instruction
can be beneficial. If the mask used for the lookup is the constant
vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse
instructions instead.
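A hedged sketch of the idea, assuming the AArch64 tbl1 form of the intrinsic:

  ; Lookup with the constant mask {7,6,5,4,3,2,1,0} ...
  %r0 = call <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8> %table, <8 x i8> <i8 7, i8 6, i8 5, i8 4, i8 3, i8 2, i8 1, i8 0>)
  ; ... becomes a shuffle, which the back-end can select as a byte reverse.
  %r1 = shufflevector <16 x i8> %table, <16 x i8> undef, <8 x i32> <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>

declare <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8>, <8 x i8>)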
Differential Revision: https://reviews.llvm.org/D46133
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333550 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM/ARM64 AESE and AESD instructions have a built-in XOR as the first step in
the instruction. Therefore, if the AES key is zero and the AES data was
previously XORed, the two operations can be combined into a single instruction.
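For example (a sketch using the AArch64 form of the intrinsic; the ARM form is
analogous):

  ; %data has already been XORed with %key and the key operand is zero ...
  %xored = xor <16 x i8> %data, %key
  %r0 = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %xored, <16 x i8> zeroinitializer)
  ; ... so the XOR can be folded into the instruction's built-in XOR:
  %r1 = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %data, <16 x i8> %key)

declare <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8>, <16 x i8>)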
Differential Revision: https://reviews.llvm.org/D47239
Patch by Michael Brase!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333193 91177308-0d34-0410-b5e6-96231b3b80d8
We can eliminate the old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf.
This is an alternative implementation that works on the intrinsic in InstCombine.
Original review for past-ISel optimization: D46570.
Differential Revision: https://reviews.llvm.org/D46596
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332956 91177308-0d34-0410-b5e6-96231b3b80d8
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually change the docs, as the regex doesn't match them.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change adds handling of the atomic memset intrinsic to the
code path that simplifies the regular memset. In practice this means
that we will now also expand a small constant-length atomic memset
into a single unordered atomic store.
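For example (a sketch; the pointer names and alignments are illustrative):

  ; A constant-length, element-atomic memset of a single 4-byte element ...
  call void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* align 4 %dst, i8 0, i32 4, i32 4)
  ; ... can be expanded into one unordered atomic store:
  %dst.i32 = bitcast i8* %dst to i32*
  store atomic i32 0, i32* %dst.i32 unordered, align 4

declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* nocapture writeonly, i8, i32, i32)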
Reviewers: apilipenko, skatkov, mkazantsev, anna, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D46660
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332132 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change reworks the handling of atomic memcpy within the instcombine pass.
Previously, a constant length atomic memcpy would be lowered into loads & stores
as long as no more than 16 load/store pairs were created. This is quite different
from the lowering done for a non-atomic memcpy, which only ever lowers into a single
load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are
expanded to load/stores in later passes, such as SelectionDAG lowering.
In this change the behaviour for atomic memcpy is unified with non-atomic memcpy;
atomic memcpy is now treated in the same way as non-atomic memcpy has always been.
We leave it to later passes to lower longer-length atomic memcpy calls.
Due to the structure of the pass's handling of memtransfer intrinsics, this change
also gives us handling of atomic memmove that we did not previously have.
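For example (a sketch; names and sizes are illustrative), a small constant-length
atomic memcpy is now lowered into a single load/store pair, just like its
non-atomic counterpart:

  ; An 8-byte, single-element atomic memcpy ...
  call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* align 8 %dst, i8* align 8 %src, i32 8, i32 8)
  ; ... becomes one unordered atomic load/store pair:
  %s = bitcast i8* %src to i64*
  %d = bitcast i8* %dst to i64*
  %v = load atomic i64, i64* %s unordered, align 8
  store atomic i64 %v, i64* %d unordered, align 8

declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* nocapture writeonly, i8* nocapture readonly, i32, i32)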
Reviewers: apilipenko, skatkov, mkazantsev, anna, reames
Reviewed By: reames
Subscribers: reames, llvm-commits
Differential Revision: https://reviews.llvm.org/D46658
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332093 91177308-0d34-0410-b5e6-96231b3b80d8
The previous handling for guard widening in InstCombine was extremely restrictive. In particular, it didn't handle the common case where we had two guards separated by a single icmp. Handle this by scanning through a small fixed window of instructions to find the next guard if needed.
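The newly handled pattern looks roughly like this (a sketch; the conditions are
made up):

  ; Two guards separated only by the icmp feeding the second one; these can
  ; now be widened into a single guard.
  call void (i1, ...) @llvm.experimental.guard(i1 %cond1) [ "deopt"() ]
  %cond2 = icmp ult i32 %a, %b
  call void (i1, ...) @llvm.experimental.guard(i1 %cond2) [ "deopt"() ]

declare void @llvm.experimental.guard(i1, ...)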
Differential Revision: https://reviews.llvm.org/D46203
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331935 91177308-0d34-0410-b5e6-96231b3b80d8
This completes the work started in r329604 and r329605 when we changed clang to no longer use the intrinsics.
We lost some InstCombine SimplifyDemandedBits optimizations through this change, as we aren't able to fold the 'and', bitcast, shuffle sequences very well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329990 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The cast simplifications that instcombine does here do not make any
attempt to obey the verifier rules for musttail calls. Therefore we have
to disable them.
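A sketch of the kind of case this protects (constructed for illustration):

define i8* @caller(i8* %p) {
  ; The verifier requires a musttail call's prototype to match the caller's.
  ; Stripping the bitcast and casting the argument and result instead, as the
  ; disabled simplification would do, changes the call's prototype and breaks
  ; that rule.
  %r = musttail call i8* bitcast (i32* (i32*)* @callee to i8* (i8*)*)(i8* %p)
  ret i8* %r
}

declare i32* @callee(i32*)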
Reviewers: efriedma, majnemer, pcc
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45186
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329027 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Just updating a call to MemSetInst::getAlignment() to MemSetInst::getDestAlignment(). The
former has been deprecated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328227 91177308-0d34-0410-b5e6-96231b3b80d8
Remove the #include of Transforms/Scalar.h from Transforms/Utils to fix layering.
Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the unused "createStripGCRelocatesPass" function
declaration (and definition) that motivated this dependency.
Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328165 91177308-0d34-0410-b5e6-96231b3b80d8
With this patch in place, when a new-format TBAA tag is available
for a memory-transfer intrinsic call, we prefer propagating that
new-format tag. Otherwise, we fall back to the old approach where
we try to construct a proper TBAA access tag from 'tbaa.struct'
metadata.
Differential Revision: https://reviews.llvm.org/D41543
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325488 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change is part of step five in the series of changes to remove the alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, it changes the
InstCombine pass to cease using the deprecated MemoryIntrinsic::getAlignment() method, and
instead use the separate getSourceAlignment and getDestAlignment APIs to simplify
the source and destination alignment attributes separately.
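For reference, the shape of the change at the IR level (a sketch with made-up
alignments; the two forms have different signatures, so declarations are omitted):

  ; Old form: a single alignment argument shared by both pointers.
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src, i32 16, i32 4, i1 false)
  ; New form: per-operand alignment expressed as parameter attributes.
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dst, i8* align 4 %src, i32 16, i1 false)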
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
References:
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
Reviewers: majnemer, bollu, efriedma
Reviewed By: efriedma
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D42871
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324960 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Small NFC change to rename the functions used for getting and setting
the alignment of a memset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324148 91177308-0d34-0410-b5e6-96231b3b80d8