187 Commits

Author SHA1 Message Date
Chandler Carruth
377af8df2b [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particualr pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the cal graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrsatructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279618 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 09:37:14 +00:00
Tim Shen
22fca38c9c [GraphTraits] Replace all NodeType usage with NodeRef
This should finish the GraphTraits migration.

Differential Revision: http://reviews.llvm.org/D23730


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279475 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-22 21:09:30 +00:00
Justin Bogner
7d7a23e700 Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 20:30:52 +00:00
Duncan P. N. Exon Smith
370879ff09 IPO: Swap || operands to avoid dereferencing end()
IsOperandBundleUse conveniently indicates  whether
std::next(F->arg_begin(),UseIndex) will get to (or past) end().  Check
it first to avoid dereferencing end().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278884 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 01:23:58 +00:00
Sean Silva
2fb9a98752 Consistently use ModuleAnalysisManager
Besides a general consistently benefit, the extra layer of indirection
allows the mechanical part of https://reviews.llvm.org/D23256 that
requires touching every transformation and analysis to be factored out
cleanly.

Thanks to David for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278078 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-09 00:28:38 +00:00
Sean Silva
6d4afae8c0 Add some comments linking back to PR28400.
Thanks to Mehdi for the suggestion!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277984 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-08 07:03:49 +00:00
Sean Silva
d4b4a02255 [PM] Invalidate CallGraphAnalysis because it holds AssertingVH
This is essentially PR28400. The fix here is similar to that implemented
in r274656.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277980 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-08 05:38:01 +00:00
Tim Shen
a9ed4cc01c [ADT] NFC: Generalize GraphTraits requirement of "NodeType *" in interfaces to "NodeRef", and migrate SCCIterator.h to use NodeRef
Summary: By generalize the interface, users are able to inject more flexible Node token into the algorithm, for example, a pair of vector<Node>* and index integer. Currently I only migrated SCCIterator to use NodeRef, but more is coming. It's a NFC.

Reviewers: dblaikie, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277399 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-01 22:32:20 +00:00
David Majnemer
baf88b3b1a [FunctionAttrs] Correct the safety analysis for inference of 'returned'
We skipped over ReturnInsts which didn't return an argument which would
lead us to incorrectly conclude that an argument returned by another
ReturnInst was 'returned'.

This reverts commit r275756.

This fixes PR28610.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276008 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 18:50:26 +00:00
NAKAMURA Takumi
73585958d4 Revert r275678, "Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute""
This reverts also r275029, "Update Clang tests after adding inference for the returned argument attribute"

It broke LTO build. Seems miscompilation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275756 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 03:23:25 +00:00
Hal Finkel
4fe9cc7cb0 Revert "Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute"
This reverts commit r275042; the initial commit triggered self-hosting failures
on ARM/AArch64. James Molloy identified the problematic backend code, which has
been disabled in r275677. Trying again...

Original commit message:

Let FuncAttrs infer the 'returned' argument attribute

A function can have one argument with the 'returned' attribute, indicating that
the associated argument is always the return value of the function. Add
FuncAttrs inference logic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275678 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 07:21:28 +00:00
Hal Finkel
7f9e1e0b77 Revert r275027 - Let FuncAttrs infer the 'returned' argument attribute
Reverting r275027 and r275033. These seem to cause miscompiles on the AArch64 buildbot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 04:51:23 +00:00
Hal Finkel
8b01a2f64c Don't use a SmallSet for returned attribute inference
Suggested post-commit by David Majnemer on IRC (following-up on a pre-commit
review comment).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-11 01:14:21 +00:00
Hal Finkel
772d8502ad Let FuncAttrs infer the 'returned' argument attribute
A function can have one argument with the 'returned' attribute, indicating that
the associated argument is always the return value of the function. Add
FuncAttrs inference logic.

Differential Revision: http://reviews.llvm.org/D22202

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275027 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-10 22:02:55 +00:00
Sean Silva
ad5ea26278 [PM] Some preparatory refactoring to minimize the diff of D21921
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274456 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-03 03:35:03 +00:00
Sean Silva
e82e4ddb87 Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274455 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-02 23:47:27 +00:00
Benjamin Kramer
5288df58b7 Apply clang-tidy's modernize-loop-convert to most of lib/Transforms.
Only minor manual fixes. No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-26 12:28:59 +00:00
Sean Silva
cc691dd8da [PM] Port ReversePostOrderFunctionAttrs to the new PM
Below are my super rough notes when porting. They can probably serve as
a basic guide for porting other passes to the new PM. As I port more
passes I'll expand and generalize this and make a proper
docs/HowToPortToNewPassManager.rst document. There is also missing
documentation for general concepts and API's in the new PM which will
require some documentation.
Once there is proper documentation in place we can put up a list of
passes that have to be ported and game-ify/crowdsource the rest of the
porting (at least of the middle end; the backend is still unclear).

I will however be taking personal responsibility for ensuring that the
LLD/ELF LTO pipeline is ported in a timely fashion. The remaining passes
to be ported are (do something like
`git grep "<the string in the bullet point below>"` to find the pass):

General Scalar:
[ ] Simplify the CFG
[ ] Jump Threading
[ ] MemCpy Optimization
[ ] Promote Memory to Register
[ ] MergedLoadStoreMotion
[ ] Lazy Value Information Analysis

General IPO:
[ ] Dead Argument Elimination
[ ] Deduce function attributes in RPO

Loop stuff / vectorization stuff:
[ ] Alignment from assumptions
[ ] Canonicalize natural loops
[ ] Delete dead loops
[ ] Loop Access Analysis
[ ] Loop Invariant Code Motion
[ ] Loop Vectorization
[ ] SLP Vectorizer
[ ] Unroll loops

Devirtualization / CFI:
[ ] Cross-DSO CFI
[ ] Whole program devirtualization
[ ] Lower bitset metadata

CGSCC passes:
[ ] Function Integration/Inlining
[ ] Remove unused exception handling info
[ ] Promote 'by reference' arguments to scalars

Please let me know if you are interested in working on any of the passes
in the above list (e.g. reply to the post-commit thread for this patch).
I'll probably be tackling "General Scalar" and "General IPO" first FWIW.

Steps as I port "Deduce function attributes in RPO"
---------------------------------------------------

(note: if you are doing any work based on these notes, please leave a
note in the post-commit review thread for this commit with any
improvements / suggestions / incompleteness you ran into!)

Note: "Deduce function attributes in RPO" is a module pass.

1. Do preparatory refactoring.

Do preparatory factoring. In this case all I had to do was to pull out a static helper (r272503).
(TODO: give more advice here e.g. if pass holds state or something)

2. Rename the old pass class.

llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename class ReversePostOrderFunctionAttrs -> ReversePostOrderFunctionAttrsLegacyPass
in preparation for adding a class ReversePostOrderFunctionAttrs as the pass in the new PM.
(edit: actually wait what? The new class name will be
ReversePostOrderFunctionAttrsPass, so it doesn't conflict. So this step is
sort of useless churn).

llvm/include/llvm/InitializePasses.h
llvm/lib/LTO/LTOCodeGenerator.cpp
llvm/lib/Transforms/IPO/IPO.cpp
llvm/lib/Transforms/IPO/FunctionAttrs.cpp
Rename initializeReversePostOrderFunctionAttrsPass -> initializeReversePostOrderFunctionAttrsLegacyPassPass
(note that the "PassPass" thing falls out of `s/ReversePostOrderFunctionAttrs/ReversePostOrderFunctionAttrsLegacyPass/`)
Note that the INITIALIZE_PASS macro is what creates this identifier name, so renaming the class requires this renaming too.

Note that createReversePostOrderFunctionAttrsPass does not need to be
renamed since its name is not generated from the class name.

3. Add the new PM pass class.

In the new PM all passes need to have their
declaration in a header somewhere, so you will often need to add a header.
In this case
llvm/include/llvm/Transforms/IPO/FunctionAttrs.h is already there because
PostOrderFunctionAttrsPass was already ported.
The file-level comment from the .cpp file can be used as the file-level
comment for the new header. You may want to tweak the wording slightly
from "this file implements" to "this file provides" or similar.

Add declaration for the new PM pass in this header:

    class ReversePostOrderFunctionAttrsPass
        : public PassInfoMixin<ReversePostOrderFunctionAttrsPass> {
    public:
      PreservedAnalyses run(Module &M, AnalysisManager<Module> &AM);
    };

Its name should end with `Pass` for consistency (note that this doesn't
collide with the names of most old PM passes). E.g. call it
`<name of the old PM pass>Pass`.

Also, move the doxygen comment from the old PM pass to the declaration of
this class in the header.
Also, include the declaration for the new PM class
`llvm/Transforms/IPO/FunctionAttrs.h` at the top of the file (in this case,
it was already done when the other pass in this file was ported).

Now define the `run` method for the new class.
The main things here are:
a) Use AM.getResult<...>(M) to get results instead of `getAnalysis<...>()`

b) If the old PM pass would have returned "false" (i.e. `Changed ==
false`), then you should return PreservedAnalyses::all();

c) In the old PM getAnalysisUsage method, observe the calls
   `AU.addPreserved<...>();`.

   In the case `Changed == true`, for each preserved analysis you should do
   call `PA.preserve<...>()` on a PreservedAnalyses object and return it.
   E.g.:

       PreservedAnalyses PA;
       PA.preserve<CallGraphAnalysis>();
       return PA;

Note that calls to skipModule/skipFunction are not supported in the new PM
currently, so optnone and optimization bisect support do not work. You can
just drop those calls for now.

4. Add the pass to the new PM pass registry to make it available in opt.

In llvm/lib/Passes/PassBuilder.cpp add a #include for your header.
`#include "llvm/Transforms/IPO/FunctionAttrs.h"`
In this case there is already an include (from when
PostOrderFunctionAttrsPass was ported).

Add your pass to llvm/lib/Passes/PassRegistry.def
In this case, I added
`MODULE_PASS("rpo-functionattrs", ReversePostOrderFunctionAttrsPass())`
The string is from the `INITIALIZE_PASS*` macros used in the old pass
manager.

Then choose a test that uses the pass and use the new PM `-passes=...` to
run it.
E.g. in this case there is a test that does:
; RUN: opt < %s -basicaa -functionattrs -rpo-functionattrs -S | FileCheck %s
I have added the line:
; RUN: opt < %s -aa-pipeline=basic-aa -passes='require<targetlibinfo>,cgscc(function-attrs),rpo-functionattrs' -S | FileCheck %s
The `-aa-pipeline=basic-aa` and
`require<targetlibinfo>,cgscc(function-attrs)` are what is needed to run
functionattrs in the new PM (note that in the new PM "functionattrs"
becomes "function-attrs" for some reason). This is just pulled from
`readattrs.ll` which contains the change from when functionattrs was ported
to the new PM.
Adding rpo-functionattrs causes the pass that was just ported to run.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272505 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 07:48:51 +00:00
Sean Silva
5f6317b293 Factor out a helper. NFC
Prep for porting to new PM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272503 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 05:44:51 +00:00
David Majnemer
1f13752cbe [FunctionAttrs] Volatile loads should disable readonly
A volatile load has side effects beyond what callers expect readonly to
signify.  For example, it is not safe to reorder two function calls
which each perform a volatile load to the same memory location.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270671 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 05:53:04 +00:00
Mehdi Amini
a4a5ff8bb3 ReversePostOrderFunctionAttrs is not modifying the call graph, let's preserve it.
When running cc1 with -flto=thin, it is followed by GlobalOpt, which
requires the callgraph. This saves rebuilding one.

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268266 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-02 18:03:33 +00:00
Andrew Kaylor
1e455c5cfb Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267231 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 22:06:11 +00:00
Vedant Kumar
8866d94a61 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 06:51:37 +00:00
Andrew Kaylor
c852398cbc Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-21 17:58:54 +00:00
Sanjoy Das
c9e3e3cbfd Don't IPO over functions that can be de-refined
Summary:
Fixes PR26774.

If you're aware of the issue, feel free to skip the "Motivation"
section and jump directly to "This patch".

Motivation:

I define "refinement" as discarding behaviors from a program that the
optimizer has license to discard.  So transforming:

```
void f(unsigned x) {
  unsigned t = 5 / x;
  (void)t;
}
```

to

```
void f(unsigned x) { }
```

is refinement, since the behavior went from "if x == 0 then undefined
else nothing" to "nothing" (the optimizer has license to discard
undefined behavior).

Refinement is a fundamental aspect of many mid-level optimizations done
by LLVM.  For instance, transforming `x == (x + 1)` to `false` also
involves refinement since the expression's value went from "if x is
`undef` then { `true` or `false` } else { `false` }" to "`false`" (by
definition, the optimizer has license to fold `undef` to any non-`undef`
value).

Unfortunately, refinement implies that the optimizer cannot assume
that the implementation of a function it can see has all of the
behavior an unoptimized or a differently optimized version of the same
function can have.  This is a problem for functions with comdat
linkage, where a function can be replaced by an unoptimized or a
differently optimized version of the same source level function.

For instance, FunctionAttrs cannot assume a comdat function is
actually `readnone` even if it does not have any loads or stores in
it; since there may have been loads and stores in the "original
function" that were refined out in the currently visible variant, and
at the link step the linker may in fact choose an implementation with
a load or a store.  As an example, consider a function that does two
atomic loads from the same memory location, and writes to memory only
if the two values are not equal.  The optimizer is allowed to refine
this function by first CSE'ing the two loads, and the folding the
comparision to always report that the two values are equal.  Such a
refined variant will look like it is `readonly`.  However, the
unoptimized version of the function can still write to memory (since
the two loads //can// result in different values), and selecting the
unoptimized version at link time will retroactively invalidate
transforms we may have done under the assumption that the function
does not write to memory.

Note: this is not just a problem with atomics or with linking
differently optimized object files.  See PR26774 for more realistic
examples that involved neither.

This patch:

This change introduces a new set of linkage types, predicated as
`GlobalValue::mayBeDerefined` that returns true if the linkage type
allows a function to be replaced by a differently optimized variant at
link time.  It then changes a set of IPO passes to bail out if they see
such a function.

Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18634

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265762 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-08 00:48:30 +00:00
Justin Lebar
dd68c9c6c5 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263481 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-14 20:18:54 +00:00
Chandler Carruth
8e27cb2f34 [PM] Make the AnalysisManager parameter to run methods a reference.
This was originally a pointer to support pass managers which didn't use
AnalysisManagers. However, that doesn't realistically come up much and
the complexity of supporting it doesn't really make sense.

In fact, *many* parts of the pass manager were just assuming the pointer
was never null already. This at least makes it much more explicit and
clear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263219 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-11 11:05:24 +00:00
Chandler Carruth
cf88e9244e [AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.

Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Differential Revision: http://reviews.llvm.org/D17329

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262490 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-02 15:56:53 +00:00
Justin Lebar
0bbc549035 Revert "[attrs] Handle convergent CallSites."
This reverts r261544, which was causing a test failure in
Transforms/FunctionAttrs/readattrs.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261549 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-22 18:24:43 +00:00
Justin Lebar
4644890a51 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Reviewers: chandlerc, jingyue

Subscribers: hfinkel, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261544 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-22 17:51:35 +00:00
Duncan P. N. Exon Smith
8de6150816 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261498 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-21 20:39:50 +00:00
Chandler Carruth
e9afeb0bd1 [PM] Port the PostOrderFunctionAttrs pass to the new pass manager and
convert one test to use this.

This is a particularly significant milestone because it required
a working per-function AA framework which can be queried over each
function from within a CGSCC transform pass (and additionally a module
analysis to be accessible). This is essentially *the* point of the
entire pass manager rewrite. A CGSCC transform is able to query for
multiple different function's analysis results. It works. The whole
thing appears to actually work and accomplish the original goal. While
we were able to hack function attrs and basic-aa to "work" in the old
pass manager, this port doesn't use any of that, it directly leverages
the new fundamental functionality.

For this to work, the CGSCC framework also has to support SCC-based
behavior analysis, etc. The only part of the CGSCC pass infrastructure
not sorted out at this point are the updates in the face of inlining and
running function passes that mutate the call graph.

The changes are pretty boring and boiler-plate. Most of the work was
factored into more focused preperatory patches. But this is what wires
it all together.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-18 11:03:11 +00:00
Chandler Carruth
2c37b35a91 [attrs] Move the norecurse deduction to operate on the node set rather
than the SCC object, and have it scan the instruction stream directly
rather than relying on call records.

This makes the behavior of this routine consistent between libc routines
and LLVM intrinsics for libc routines. We can go and start teaching it
about those being norecurse, but we should behave the same for the
intrinsic and the libc routine rather than differently. I chatted with
James Molloy and the inconsistency doesn't seem intentional and likely
is due to intrinsic calls not being modelled in the call graph analyses.

This also fixes a bug where we would deduce norecurse on optnone
functions, when generally we try to handle optnone functions as-if they
were replaceable and thus unanalyzable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260813 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-13 08:47:51 +00:00
Chandler Carruth
5403bca6ae [attrs] Simplify the convergent removal to directly use the pre-built
node set rather than walking the SCC directly.

This directly exposes the functions and has already had null entries
filtered out. We also don't need need to handle optnone as it has
already been handled in the caller -- we never try to remove convergent
when there are optnone functions in the SCC.

With this change, the code for removing convergent should work with the
new pass manager and a different SCC analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260668 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 09:47:49 +00:00
Chandler Carruth
3f89873441 [attrs] Consolidate the test for a non-SCC, non-convergent function call
with the test for a non-convergent intrinsic call.

While it is possible to use the call records to search for function
calls, we're going to do an instruction scan anyways to find the
intrinsics, we can handle both cases while scanning instructions. This
will also make the logic more amenable to the new pass manager which
doesn't use the same call graph structure.

My next patch will remove use of CallGraphNode entirely and allow this
code to work with both the old and new pass manager. Fortunately, it
should also get strictly simpler without changing functionality.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260666 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 09:23:53 +00:00
Chandler Carruth
5dce6a869b [attrs] Run clang-format over a newly added routine in function-attrs
before I update it to be friendly with the new pass manager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260653 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 03:07:50 +00:00
Justin Lebar
c4f6eb8e3b Add convergent-removing bits to FunctionAttrs pass.
Summary:
Remove the convergent attribute on any functions which provably do not
contain or invoke any convergent functions.

After this change, we'll be able to modify clang to conservatively add
'convergent' to all functions when compiling CUDA.

Reviewers:  jingyue, joker.eph

Subscribers: llvm-commits, tra, jhen, hfinkel, resistor, chandlerc, arsenm

Differential Revision: http://reviews.llvm.org/D17013

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260319 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 23:03:22 +00:00
Sanjoy Das
23b546ba0d [FunctionAttrs] Fix SCC logic around operand bundles
FunctionAttrs does an "optimistic" analysis of SCCs as a unit, which
means normally it is able to disregard calls from an SCC into itself.
However, calls and invokes with operand bundles are allowed to have
memory effects not fully described by the memory effects on the call
target, so we can't be optimistic around operand-bundled calls from an
SCC into itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260244 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 18:40:40 +00:00
Sanjoy Das
81c2fc4c81 Add an "addUsedAAAnalyses" helper function
Summary:
Passes that call `getAnalysisIfAvailable<T>` also need to call
`addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the
legacy pass manager that it uses `T`.  This contract was being
violated by passes that used `createLegacyPMAAResults`.  This change
fixes this by exposing a helper in AliasAnalysis.h,
`addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults
and does the right thing when called from `getAnalysisUsage`.

Reviewers: chandlerc

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260183 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-09 01:21:57 +00:00
Chandler Carruth
e96fb9ab15 [attrs] Split the late-revisit pattern for deducing norecurse in
a top-down manner into a true top-down or RPO pass over the call graph.

There are specific patterns of function attributes, notably the
norecurse attribute, which are most effectively propagated top-down
because all they us caller information.

Walk in RPO over the call graph SCCs takes the form of a module pass run
immediately after the CGSCC pass managers postorder walk of the SCCs,
trying again to deduce norerucrse for each singular SCC in the call
graph.

This removes a very legacy pass manager specific trick of using a lazy
revisit list traversed during finalization of the CGSCC pass. There is
no analogous finalization step in the new pass manager, and a lazy
revisit list is just trying to produce an RPO iteration of the call
graph. We can do that more directly if more expensively. It seems
unlikely that this will be the expensive part of any compilation though
as we never examine the function bodies here. Even in an LTO run over
a very large module, this should be a reasonable fast set of operations
over a reasonably small working set -- the function call graph itself.

In the future, if this really is a compile time performance issue, we
can look at building support for both post order and RPO traversals
directly into a pass manager that builds and maintains the PO list of
SCCs.

Differential Revision: http://reviews.llvm.org/D15785

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257163 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 10:55:52 +00:00
Chandler Carruth
d97964253e [attrs] Extract the pure inference of function attributes into
a standalone pass.

There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.

In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.

The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.

Differential Revision: http://reviews.llvm.org/D15676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 08:41:34 +00:00
Chandler Carruth
6a1ce8ecd6 [attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FuncitonAttrs proper.

This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.

I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.

Differential Revision: http://reviews.llvm.org/D15668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 08:13:45 +00:00
Tilmann Scheller
73a482f4cc Revert "[FunctionAttrs] Remove redundant assignment."
This reverts r253661.

Turns out that the assignment is not redundant (despite the Clang static analyzer claiming the opposite).

The variable is being used by the lambda function AddUsersToWorklistIfCapturing().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253696 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-20 19:17:10 +00:00
Tilmann Scheller
e4b48cdafb [FunctionAttrs] Remove redundant assignment.
Identified by the Clang static analyzer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253661 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-20 12:51:58 +00:00
James Molloy
e30d17a4d7 [FunctionAttrs] Provide a mechanism for adding function attributes from the command line
This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc).

The syntax is -force-attribute=function_name:attribute_name

All function attributes are parsed except alignstack as it requires an argument.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253550 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 08:49:57 +00:00
Elena Demikhovsky
741a772213 Vector of pointers in function attributes calculation
While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers.
I added vector-of-pointers to the call arguments types that should be checked.

Differential Revision: http://reviews.llvm.org/D14693



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253363 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-17 19:30:51 +00:00
James Molloy
29020c0a61 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252871 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-12 10:55:20 +00:00
James Molloy
6cd07b45e1 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252863 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-12 09:05:43 +00:00
James Molloy
31d4df2fc6 [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252862 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-12 08:53:04 +00:00
Sanjoy Das
226fd8ff51 Unbreak the build
My code clashed with some ilist iterator changes upstream.  Fix by
adding an explicit "&*" coercion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252392 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-07 02:26:53 +00:00