Summary:
Previously IntrinsicInfo::size was an unsigned what can't represent the
64 bit value used by MemoryLocation::UnknownSize.
Reviewers: jmolloy
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68219
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373214 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Remove use of FileCheckPatternContext and FileCheckString concrete types
from FileCheck API to allow moving it and the other implementation only
only declarations into a private header file.
Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68186
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373211 91177308-0d34-0410-b5e6-96231b3b80d8
Initially SLP vectorizer replaced all going-to-be-vectorized
instructions with Undef values. It may break ScalarEvaluation and may
cause a crash.
Reworked SLP vectorizer so that it does not replace vectorized
instructions by UndefValue anymore. Instead vectorized instructions are
marked for deletion inside if BoUpSLP class and deleted upon class
destruction.
Reviewers: mzolotukhin, mkuper, hfinkel, RKSimon, davide, spatel
Subscribers: RKSimon, Gerolf, anemet, hans, majnemer, llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D29641
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373166 91177308-0d34-0410-b5e6-96231b3b80d8
m_SExtOrSelf() is for consistency.
m_ZExtOrSExtOrSelf() is motivated by the D68103/r373106 :
sometimes it is useful to look past any extensions of the shift amount,
and m_ZExtOrSExtOrSelf() may be exactly the tool to do that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373128 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This is a cleanup patch for MachineDominatorTree. It would be an NFC, except for replacing custom DomTree verification with the generic one.
Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao
Reviewed By: arsenm
Subscribers: wdng, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67976
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373101 91177308-0d34-0410-b5e6-96231b3b80d8
This caused severe compile-time regressions, see PR43455.
> Modern processors predict the targets of an indirect branch regardless of
> the size of any jump table used to glean its target address. Moreover,
> branch predictors typically use resources limited by the number of actual
> targets that occur at run time.
>
> This patch changes the semantics of the option `-max-jump-table-size` to limit
> the number of different targets instead of the number of entries in a jump
> table. Thus, it is now renamed to `-max-jump-table-targets`.
>
> Before, when `-max-jump-table-size` was specified, it could happen that
> cluster jump tables could have targets used repeatedly, but each one was
> counted and typically resulted in tables with the same number of entries.
> With this patch, when specifying `-max-jump-table-targets`, tables may have
> different lengths, since the number of unique targets is counted towards the
> limit, but the number of unique targets in tables is the same, but for the
> last one containing the balance of targets.
>
> Differential revision: https://reviews.llvm.org/D60295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373060 91177308-0d34-0410-b5e6-96231b3b80d8
hasDedicatedExits.
For the compile time problem described in https://reviews.llvm.org/D67359,
turns out the root cause is there are many duplicates in ExitBlocks so
the algorithm complexity of hasDedicatedExits gets very high. If we remove
the duplicates, the compile time issue is gone.
Thanks to Philip Reames for raising a good question and it leads me to
find the root cause.
Differential Revision: https://reviews.llvm.org/D68107
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373045 91177308-0d34-0410-b5e6-96231b3b80d8
We can't use short granules with stack instrumentation when targeting older
API levels because the rest of the system won't understand the short granule
tags stored in shadow memory.
Moreover, we need to be able to let old binaries (which won't understand
short granule tags) run on a new system that supports short granule
tags. Such binaries will call the __hwasan_tag_mismatch function when their
outlined checks fail. We can compensate for the binary's lack of support
for short granules by implementing the short granule part of the check in
the __hwasan_tag_mismatch function. Unfortunately we can't do anything about
inline checks, but I don't believe that we can generate these by default on
aarch64, nor did we do so when the ABI was fixed.
A new function, __hwasan_tag_mismatch_v2, is introduced that lets code
targeting the new runtime avoid redoing the short granule check. Because tag
mismatches are rare this isn't important from a performance perspective; the
main benefit is that it introduces a symbol dependency that prevents binaries
targeting the new runtime from running on older (i.e. incompatible) runtimes.
Differential Revision: https://reviews.llvm.org/D68059
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373035 91177308-0d34-0410-b5e6-96231b3b80d8
For a runtime loop if we can compute its trip count upperbound:
Don't unroll if:
1. loop is not guaranteed to run either zero or upperbound iterations; and
2. trip count upperbound is less than UnrollMaxUpperBound
Unless user or TTI asked to do so.
If unrolling, limit unroll factor to loop's trip count upperbound.
Differential Revision: https://reviews.llvm.org/D62989
Change-Id: I6083c46a9d98b2e22cd855e60523fdc5a4929c73
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373017 91177308-0d34-0410-b5e6-96231b3b80d8
for extreme large case.
We had a case that a single loop which has 4000 exits and the average number
of predecessors of each exit is > 1000, and we found compiling the case spent
a significant amount of time on checking whether a loop has dedicated exits.
This patch adds a limit for the iterations to the check. With the patch, the
time to compile our testcase reduced from 1000s to 200s (clang release build).
Differential Revision: https://reviews.llvm.org/D67359
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372990 91177308-0d34-0410-b5e6-96231b3b80d8
atomicrmw and cmpxchg have a volatile flag, so allow them to be get and set with LLVM{Get,Set}Volatile. atomicrmw and fence have orderings, so allow them to be get and set with LLVM{Get,Set}Ordering. Add missing LLVMAtomicRMWBinOpFAdd and LLVMAtomicRMWBinOpFSub enum constants. AtomicCmpXchg also has a weak flag, add a getter/setter for that too. Add a getter/setter for the binary-op of an atomicrmw.
atomicrmw and cmpxchg have a volatile flag, so allow it to be set/get with LLVMGetVolatile and LLVMSetVolatile. Add missing LLVMAtomicRMWBinOpFAdd and LLVMAtomicRMWBinOpFSub enum constants. AtomicCmpXchg also has a weak flag, add a getter/setter for that too. Add a getter/setter for the binary-op of an atomicrmw.
Add LLVMIsA## for CatchSwitchInst, CallBrInst and FenceInst, as well as AtomicCmpXchgInst and AtomicRMWInst.
Update llvm-c-test to include atomicrmw and fence, and to copy volatile for the four applicable instructions.
Differential Revision: https://reviews.llvm.org/D67132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372938 91177308-0d34-0410-b5e6-96231b3b80d8
Rename old function to explicitly show that it cares only about alignment.
The new allowsMemoryAccess call the function related to alignment by default
and can be overridden by target to inform whether the memory access is legal or
not.
Differential Revision: https://reviews.llvm.org/D67121
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372935 91177308-0d34-0410-b5e6-96231b3b80d8
As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.
This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifFMulInst as SimplifyFMAFMul.
Reviewers: spatel, lebedev.ri, reames, scanon
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D67434
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372899 91177308-0d34-0410-b5e6-96231b3b80d8
When checking for tail call eligibility, we should use the correct CCAssignFn
for each argument, rather than just checking if the caller/callee is varargs or
not.
This is important for tail call lowering with varargs. If we don't check it,
then basically any varargs callee with parameters cannot be tail called on
Darwin, for one thing. If the parameters are all guaranteed to be in registers,
this should be entirely safe.
On top of that, not checking for this could potentially make it so that we have
the wrong stack offsets when checking for tail call eligibility.
Also refactor some of the stuff for CCAssignFnForCall and pull it out into a
helper function.
Update call-translator-tail-call.ll to show that we can now correctly tail call
on Darwin. Also add two extra tail call checks. The first verifies that we still
respect the caller's stack size, and the second verifies that we still don't
tail call when a varargs function has a memory argument.
Differential Revision: https://reviews.llvm.org/D67939
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372897 91177308-0d34-0410-b5e6-96231b3b80d8
Modern processors predict the targets of an indirect branch regardless of
the size of any jump table used to glean its target address. Moreover,
branch predictors typically use resources limited by the number of actual
targets that occur at run time.
This patch changes the semantics of the option `-max-jump-table-size` to limit
the number of different targets instead of the number of entries in a jump
table. Thus, it is now renamed to `-max-jump-table-targets`.
Before, when `-max-jump-table-size` was specified, it could happen that
cluster jump tables could have targets used repeatedly, but each one was
counted and typically resulted in tables with the same number of entries.
With this patch, when specifying `-max-jump-table-targets`, tables may have
different lengths, since the number of unique targets is counted towards the
limit, but the number of unique targets in tables is the same, but for the
last one containing the balance of targets.
Differential revision: https://reviews.llvm.org/D60295
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372893 91177308-0d34-0410-b5e6-96231b3b80d8
Currently m_Br only takes references to BasicBlock*, which limits its
flexibility. For example, you have to declare a variable, even if you
ignore the result or you have to have additional checks to make sure the
matched BB matches an expected one.
This patch adds m_BasicBlock and m_SpecificBB matchers, which can be
used like the existing matchers for constants or values.
I also had a look at the existing uses and updated a few. IMO it makes
the code a bit more explicit.
Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D68013
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372885 91177308-0d34-0410-b5e6-96231b3b80d8
Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future.
Committed on behalf of @hvdijk (Harald van Dijk)
Differential Revision: https://reviews.llvm.org/D66138
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372882 91177308-0d34-0410-b5e6-96231b3b80d8
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917 <https://reviews.llvm.org/D61917>
As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086
The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535
But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.
The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.
Differential Revision: https://reviews.llvm.org/D67564
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372878 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch fixes a bug that originated from passing a virtual exit block (nullptr) to `MachinePostDominatorTee::findNearestCommonDominator` and resulted in assertion failures inside its callee. It also applies a small cleanup to the class.
The patch introduces a new function in PDT that given a list of `MachineBasicBlock`s finds their NCD. The new overload of `findNearestCommonDominator` handles virtual root correctly.
Note that similar handling of virtual root nodes is not necessary in (forward) `DominatorTree`s, as right now they don't use virtual roots.
Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao
Reviewed By: hliao
Subscribers: hliao, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits
Tags: #amdgpu, #llvm
Differential Revision: https://reviews.llvm.org/D67974
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372874 91177308-0d34-0410-b5e6-96231b3b80d8
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917
As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086
The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535
But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.
The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.
Differential Revision: https://reviews.llvm.org/D67564
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372866 91177308-0d34-0410-b5e6-96231b3b80d8
Currently we can't use unique suffixes in section names to describe
stack sizes sections. E.g. '.stack_sizes [1]' will be treated as a regular section.
This happens because we recognize stack sizes section by name and
do not yet drop the suffix before the check.
The patch fixes it.
Differential revision: https://reviews.llvm.org/D68018
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372853 91177308-0d34-0410-b5e6-96231b3b80d8
This overload was left over from an operator new that was removed in
r123027. Fix it to match another operator new that was added in r248453.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372828 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The Regex "match" and "sub" member functions were previously not "const"
because they wrote to the "error" member variable. This commit removes
those assignments, and instead assumes that the validity of the regex
is already known after the initial compilation of the regular
expression. As a result, these member functions were possible to make
"const". This makes it easier to do things like pre-compile Regexes
up-front, and makes "match" and "sub" thread-safe. The error status is
now returned as an optional output, which also makes the API of "match"
and "sub" more consistent with each other.
Also, some uses of Regex that could be refactored to be const were made const.
Patch by Nicolas Guillemot
Reviewers: jankratochvil, thopre
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67241
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372764 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This branch is currently dead since we don't use C++17.
#if __cplusplus > 201402L && LLVM_HAS_CPP_ATTRIBUTE(nodiscard)
#define LLVM_NODISCARD [[nodiscard]]
This branch is Clang-only.
#elif LLVM_HAS_CPP_ATTRIBUTE(clang::warn_unused_result)
#define LLVM_NODISCARD [[clang::warn_unused_result]]
While we could use gnu variant [[gnu::warn_unused_result]], it is not ideal because it works only on functions.
/home/xbolva00/LLVM/llvm/include/llvm/ADT/ArrayRef.h:41:24: warning: ‘warn_unused_result’ attribute only applies to function types [-Wattributes]
GCC (checked 5,6,7,8) seems to enable [[nodiscard]] even in C++14 mode and does not produce warnings that nodiscard is C++17 feature. but Clang does - but we do not reach it due the code above. So it affects only GCC and does what we want.
Reviewers: jfb, rsmith, echristo, aaron.ballman
Reviewed By: aaron.ballman
Subscribers: MaskRay, dexonsmith
Differential Revision: https://reviews.llvm.org/D67663
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372761 91177308-0d34-0410-b5e6-96231b3b80d8