This recommits r292526 which is reverted in r292529 after fixing the test case.
The original summary:
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292570 91177308-0d34-0410-b5e6-96231b3b80d8
loops in a function.
These are relatively confusing to talk about and compute correctly so it
seems really good to write down their implementation in one place. I've
replaced one place we needed this in the loop PM infrastructure and
I have another place in a pending patch that wants it.
We can't quite use this for the core loop PM walk because there we're
sometimes working on a sub-forest.
I'll add the expected unittests before committing this but wanted to
make sure folks were happy with these names / comments.
Credit goes to Richard Smith for the idea for naming the order where siblings
are in reverse program order but the tree traversal remains preorder.
Differential Revision: https://reviews.llvm.org/D28932
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292569 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Fence instructions are currently marked as `ModRef` for all memory locations.
We can improve this for constant memory locations (such as constant globals),
since fence instructions cannot modify these locations.
This helps us to forward constant loads across fences (added test case in GVN).
There were no changes in behaviour for similar test cases in early-cse and licm.
Reviewers: dberlin, sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28914
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292546 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, a GEP is considered free only if its indices are all constant.
TTI::getGEPCost() can give target-specific more accurate analysis. TTI is
already used for the cost of many other instructions.
Differential Revision: https://reviews.llvm.org/D28693
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292526 91177308-0d34-0410-b5e6-96231b3b80d8
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.
Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.
No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.
Differential Revision: https://reviews.llvm.org/D28587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292449 91177308-0d34-0410-b5e6-96231b3b80d8
Before, it would print a sequence of:
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
...
for every single function in the module.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292442 91177308-0d34-0410-b5e6-96231b3b80d8
r292188 confused MSVC because of the combined lack of a default
case and return statement.
Move the unreachable outside of the NumLibFuncs case, to make it
obvious that all cases should be handled.
llvm_unreachable is __declspec(noreturn), so I'm assuming this
does appease MSVC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292246 91177308-0d34-0410-b5e6-96231b3b80d8
This is another step towards unifying all LibFunc prototype checks.
This work started in r267758 (D19469); add the remaining checks.
Also add a unittest that checks each libfunc declared with a known-valid
and known-invalid prototype. New libfuncs added in the future are
required to have prototype checking in place; the known-valid test will
fail otherwise.
Differential Revision: https://reviews.llvm.org/D28030
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292188 91177308-0d34-0410-b5e6-96231b3b80d8
When transferring affected values in the cache from an old value, identified by
the value of the current callback, to the specified new value we might need to
insert a new entry into the DenseMap which constitutes the cache. Doing so
might delete the current callback object. Move the copying logic into a new
function, a member of the assumption cache itself, so that we don't run into UB
should the callback handle itself be removed mid-copy.
Differential Revision: https://reviews.llvm.org/D28749
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292133 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Use getLoopLatch in place of isLoopSimplifyForm. we do not need
to know whether the loop has a preheader nor dedicated exits.
Reviewers: hfinkel, sanjoy, atrick, mkuper
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28724
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292078 91177308-0d34-0410-b5e6-96231b3b80d8
events.
This pass sometimes has a pointer to BlockFrequencyInfo so it needs
custom invalidation logic. It is also otherwise immutable so we can
reduce the number of invalidations that happen substantially.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292058 91177308-0d34-0410-b5e6-96231b3b80d8
a function's CFG when that CFG is unchanged.
This allows transformation passes to simply claim they preserve the CFG
and analysis passes to check for the CFG being preserved to remove the
fanout of all analyses being listed in all passes.
I've gone through and removed or cleaned up as many of the comments
reminding us to do this as I could.
Differential Revision: https://reviews.llvm.org/D28627
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292054 91177308-0d34-0410-b5e6-96231b3b80d8
mark it as never invalidated in the new PM.
The old PM already required this to work, and after a discussion with
Hal this seems to really be the only sensible answer. The cache
gracefully degrades as the IR is mutated, and most things which do this
should already be incrementally updating the cache.
This gets rid of a bunch of logic preserving and testing the
invalidation of this analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292039 91177308-0d34-0410-b5e6-96231b3b80d8
This fallthrough if other cases are added between fabs and default
could cause fabs to fall to the next case resulting in a bug.
Better getting rid of it immediately just to be sure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292003 91177308-0d34-0410-b5e6-96231b3b80d8
extractProfTotalWeight checks if the profile type is sample profile, but
before that we have to ensure that summary is available. Also expanded
the unittest to test the case where there is no summar
Differential Revision: https://reviews.llvm.org/D28708
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291982 91177308-0d34-0410-b5e6-96231b3b80d8
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291904 91177308-0d34-0410-b5e6-96231b3b80d8
Running tests with expensive checks enabled exhibits some problems with
verification of pass results.
First, the pass verification may require results of analysis that are not
available. For instance, verification of loop info requires results of dominator
tree analysis. A pass may be marked as conserving loop info but does not need to
be dependent on DominatorTreePass. When a pass manager tries to verify that loop
info is valid, it needs dominator tree, but corresponding analysis may be
already destroyed as no user of it remained.
Another case is a pass that is skipped. For instance, entities with linkage
available_externally do not need code generation and such passes are skipped for
them. In this case result verification must also be skipped.
To solve these problems this change introduces a special flag to the Pass
structure to mark passes that have valid results. If this flag is reset,
verifications dependent on the pass result are skipped.
Differential Revision: https://reviews.llvm.org/D27190
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291882 91177308-0d34-0410-b5e6-96231b3b80d8
* Add is{Hot|Cold}CallSite methods
* Fix a bug in isHotBB where it was looking for MD_prof on a return instruction
* Use MD_prof data only if sample profiling was used to collect profiles.
* Add an unit test to ProfileSummaryInfo
Differential Revision: https://reviews.llvm.org/D28584
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291878 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Memory Dependence Analysis was limited to return only local dependencies
for invariant.group handling. Now it returns NonLocal when it finds it
and then by asking getNonLocalPointerDependency we get found dep.
Thanks to this we are able to devirtualize loops!
void indirect(A &a, int n) {
for (int i = 0 ; i < n; i++)
a.foo();
}
void test(int n) {
A a;
indirect(a);
}
After inlining a.foo() will be changed to direct call, even if foo and A::A()
is external (but only if vtable definition is be available).
Reviewers: nlewycky, dberlin, chandlerc, rsmith
Subscribers: mehdi_amini, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D28137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291762 91177308-0d34-0410-b5e6-96231b3b80d8
Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.
As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).
Differential Revision: https://reviews.llvm.org/D28459
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291671 91177308-0d34-0410-b5e6-96231b3b80d8
the latter to the Transforms library.
While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.
Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.
We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.
This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.
I haven't split the unittest though because testing one component
without the other seems nearly intractable.
Differential Revision: https://reviews.llvm.org/D28452
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291662 91177308-0d34-0410-b5e6-96231b3b80d8
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.
special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq.
In case if the real operands bitwidth <= 16.
Differential Revision: https://reviews.llvm.org/D28104
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291657 91177308-0d34-0410-b5e6-96231b3b80d8
arguments much like the CGSCC pass manager.
This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.
An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.
This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.
While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.
I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.
Differential Revision: https://reviews.llvm.org/D28292
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291651 91177308-0d34-0410-b5e6-96231b3b80d8
Functional change: Previously, if a callee is cold, we used ColdThreshold if it minimizes the existing threshold. This was irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses very low threshold to begin with and the inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For default values of -inlinethreshold and -inlinecold-threshold, this change has no effect and this simplifies the code.
NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize() and within that group threshold updates that require profile summary info.
Differential revision: https://reviews.llvm.org/D28369
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291487 91177308-0d34-0410-b5e6-96231b3b80d8
invalid.
This fixes use-after-free bugs that will arise with any interesting use
of SCEV.
I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291426 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Using the linker-supplied list of "preserved" symbols, we can compute
the list of "dead" symbols, i.e. the one that are not reachable from
a "preserved" symbol transitively on the reference graph.
Right now we are using this information to mark these functions as
non-eligible for import.
The impact is two folds:
- Reduction of compile time: we don't import these functions anywhere
or import the function these symbols are calling.
- The limited number of import/export leads to better internalization.
Patch originally by Mehdi Amini.
Reviewers: mehdi_amini, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23488
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291177 91177308-0d34-0410-b5e6-96231b3b80d8
Should fix some more bot failures from r291108.
This should have been a DenseSet, since GUID is not a pointer type.
It caused some bots to fail, but for some reason I wasnt't getting a
build failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291115 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds a new summary flag NotEligibleToImport that subsumes
several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal
and IsNotViableToInline). It also subsumes the checking of references
on the summary that was being done during the thin link by
eligibleForImport() for each candidate. It is much more efficient to
do that checking once during the per-module summary build and record
it in the summary.
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28169
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291108 91177308-0d34-0410-b5e6-96231b3b80d8
This code seems to be target dependent which may not be the same for all targets.
Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'.
Specifically at X86 targets we dont see any significant address computation cost in case of the strided access in general.
Differential Revision: https://reviews.llvm.org/D27518
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291106 91177308-0d34-0410-b5e6-96231b3b80d8