Commit Graph

2675 Commits

Author SHA1 Message Date
Teresa Johnson
89aab30a42 [ThinLTO] Rename HasSection to NoRename (NFC)
Summary:
This is in preparation for a change to utilize this flag for symbols
referenced/defined in either inline or module level assembly.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26048

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285376 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 02:24:59 +00:00
Peter Collingbourne
2c3f1a7cd5 GlobalDCE: Restore a statement accidentally removed in r285048.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285052 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 02:57:27 +00:00
Peter Collingbourne
598979983f GlobalDCE: Deduplicate code. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285048 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 01:58:26 +00:00
Rong Xu
fe89a6bcf0 Conditionally eliminate library calls where the result value is not used
Summary:
This pass shrink-wraps a condition to some library calls where the call
result is not used. For example:
   sqrt(val);
 is transformed to
   if (val < 0)
     sqrt(val);
Even if the result of library call is not being used, the compiler cannot
safely delete the call because the function can set errno on error
conditions.
Note in many functions, the error condition solely depends on the incoming
parameter. In this optimization, we can generate the condition can lead to
the errno to shrink-wrap the call. Since the chances of hitting the error
condition is low, the runtime call is effectively eliminated.

These partially dead calls are usually results of C++ abstraction penalty
exposed by inlining. This optimization hits 108 times in 19 C/C++ programs
in SPEC2006.

Reviewers: hfinkel, mehdi_amini, davidxl

Subscribers: modocache, mgorny, mehdi_amini, xur, llvm-commits, beanz

Differential Revision: https://reviews.llvm.org/D24414

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284542 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 21:36:27 +00:00
Mehdi Amini
3ffe113e11 Turn cl::values() (for enum) from a vararg function to using C++ variadic template
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:

 va_start(ValueArgs, Desc);

with Desc being a StringRef.

Differential Revision: https://reviews.llvm.org/D25342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283671 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-08 19:41:06 +00:00
David Callahan
8be61a8c7e Modify df_iterator to support post-order actions
Summary: This makes a change to the state used to maintain visited information for depth first iterator. We know assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes.  It will however allow a client to distinguish back from cross edges in a DFS tree.

Reviewers: nadav, mehdi_amini, dberlin

Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits

Differential Revision: https://reviews.llvm.org/D25191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283391 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-05 21:36:16 +00:00
Sanjoy Das
dd99e950f7 [PruneEH] Be correct in the face IPO
This fixes one spot I had missed in r265762.  Credit goes to Philip
Reames for spotting this one!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283137 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 19:35:30 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Piotr Padlewski
a6db8552b8 [thinlto] Don't decay threshold for hot callsites
Summary:
We don't want to decay hot callsites to import chains of hot
callsites. The same mechanism is used in LIPO.

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24976

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282833 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 03:01:17 +00:00
Piotr Padlewski
584dba4143 [thinlto] Add cold-callsite import heuristic
Summary:
Not tunned up heuristic, but with this small heuristic there is about
+0.10% improvement on SPEC 2006

Reviewers: tejohnson, mehdi_amini, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282733 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-29 17:32:07 +00:00
Dehao Chen
b393d828c5 Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update.
Summary: This refactors the change in r282616

Reviewers: davidxl, eraman, mehdi_amini

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D25041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282630 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 21:00:58 +00:00
Adam Nemet
695f82f13a [Inliner] Port all opt remarks to new streaming API
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282559 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 23:47:03 +00:00
Adam Nemet
0552c8c45a Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFC
With the new streaming interface, these class names need to be typed a
lot and it's way too looong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282544 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 22:19:23 +00:00
Adam Nemet
8d5df95c2e [Inliner] Fold the analysis remark into the missed remark
There is really no reason for these to be separate.

The vectorizer started this pretty bad tradition that the text of the
missed remarks is pretty meaningless, i.e. vectorization failed.  There,
you have to query analysis to get the full picture.

I think we should just explain the reason for missing the optimization
in the missed remark when possible.  Analysis remarks should provide
information that the pass gathers regardless whether the optimization is
passing or not.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282542 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 21:58:17 +00:00
Adam Nemet
47c0d49055 Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282539 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 20:55:07 +00:00
Adam Nemet
2713e77a55 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282503 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 16:39:24 +00:00
Adam Nemet
3ecd7534da Output optimization remarks in YAML
This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282499 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 16:15:16 +00:00
Ivan Krasin
cf66db4f99 Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generation
Summary:
We don't currently need this facility for CFI. Disabling individual hot methods proved
to be a better strategy in Chrome.

Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne.

Reviewers: pcc

Subscribers: kcc

Differential Revision: https://reviews.llvm.org/D24948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282461 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 00:29:53 +00:00
Peter Collingbourne
e07092b299 LowerTypeTests: Remove unused variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282456 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 23:56:17 +00:00
Peter Collingbourne
5b66e90c52 LowerTypeTests: Create LowerTypeTestsModule class and move implementation there. Related simplifications.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282455 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 23:54:39 +00:00
Piotr Padlewski
fdf7354745 [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 20:37:32 +00:00
Dehao Chen
aa30d2cf71 Change the basic block weight calculation algorithm to use max instead of voting.
Summary: Now that we have more precise debug info, we should change back to use maximum to get basic block weight.

Reviewers: dnovillo

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D24788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282084 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-21 16:26:51 +00:00
Arnold Schwaighofer
e9993d5a21 DeadArgElim: Don't mark swifterror arguments as unused
Replacing swifterror arguments with undef creates invalid IR.

rdar://28300490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282075 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-21 15:29:08 +00:00
Dehao Chen
ddeacc711a Handle early inline for hot callsites that reside in the same basic block.
Summary: Callsites in the same basic block should share the same hotness. This patch checks for the hottest callsite in the same basic block, and use the hotness for all callsites in that basic block for early inline decisions. It also fixes the test to add "-S" so theat the "CHECK-NOT" is actually checking the content.

Reviewers: dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281927 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-19 18:38:14 +00:00
Dehao Chen
ba3d957955 Only set branch weight during sample pgo annotation when max_weight of the branch is non-zero. Otherwise use default static profile to set branch probability.
Summary: It does not make sense to set equal weights for all unkown branches as we have static branch prediction available.

Reviewers: dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24732

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281912 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-19 16:33:41 +00:00
Dehao Chen
625183db5a Use call target count to derive the call instruction weight
Summary: The call target count profile is directly derived from LBR branch->target data. This is more reliable than instruction frequency profiles that could be moved across basic block boundaries. This patches uses call target count profile to annotate call instructions.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24410

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281911 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-19 16:06:37 +00:00
Dehao Chen
60489f476b Handle Invoke during sample profiler annotation: make it inlinable.
Summary: Previously we reline on inst-combine to remove inlinable invoke instructions. This causes trouble because a few extra optimizations are schedule early that could introduce too much CFG change (e.g. simplifycfg removes too much control flow). This patch handles invoke instruction in-place during sample profile annotation, so that we do not rely on instcombine to remove those invoke instructions.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281870 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-18 23:11:37 +00:00
Teresa Johnson
b6d6773710 [ThinLTO] Ensure anonymous globals renamed even at -O0
Summary:
This fixes an issue when files are compiled with -flto=thin
at default -O0. We need to rename anonymous globals before attempting
to write the module summary because all values need names for
the summary. This was happening at -O1 and above, but not before
the early exit when constructing the pipeline for -O0.

Also add an internal -prepare-for-thinlto option to enable this
to be tested via opt.

Fixes PR30419.

Reviewers: mehdi_amini

Subscribers: probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24701

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281840 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-17 20:40:16 +00:00
Mehdi Amini
d1e3c5aaec Don't create a SymbolTable in Function when the LLVMContext discards value names (NFC)
The ValueSymbolTable is used to detect name conflict and rename
instructions automatically. This is not needed when the value
names are automatically discarded by the LLVMContext.
No functional change intended, just saving a little bit of memory.

This is a recommit of r281806 after fixing the accessor to return
a pointer instead of a reference and updating all the call-sites.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281813 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-17 06:00:02 +00:00
Mehdi Amini
8dff0d8b85 Rename NameAnonFunctions to NameAnonGlobals to match what it is doing (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281745 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 16:56:30 +00:00
Mehdi Amini
12be25f90c [GlobalOpt] Dead Eliminate declarations
GlobalOpt is already dead-code-eliminating global definitions. With
this change it also takes care of declarations.
Hopefully this should make it now a strict superset of GlobalDCE.
This is important for LTO/ThinLTO as we don't want the linker to see
"undefined reference" when it processes the input files: it could
prevent proper internalization (or even load an extra file from a
static archive, changing the behavior of the program!).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281653 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-15 20:26:27 +00:00
Peter Collingbourne
5420de3f15 DebugInfo: New metadata representation for global variables.
This patch reverses the edge from DIGlobalVariable to GlobalVariable.
This will allow us to more easily preserve debug info metadata when
manipulating global variables.

Fixes PR30362. A program for upgrading test cases is attached to that
bug.

Differential Revision: http://reviews.llvm.org/D20147

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281284 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-13 01:12:59 +00:00
David Majnemer
5db0b906e8 [FunctionAttrs] Don't try to infer returned if it is already on an argument
Trying to infer the 'returned' attribute if an argument is already
'returned' can lead to verification failure: inference might determine
that a different argument is passed through which would result in two
different arguments marked as 'returned'.

This fixes PR30350.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281221 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 16:04:59 +00:00
Tim Shen
b62ba77b89 s/static inline/static/ for headers I have changed in r279475. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280257 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 16:48:13 +00:00
Teresa Johnson
cf538716fb [ThinLTO] Indirect call promotion fixes for promoted local functions
Summary:
Fix a couple issues limiting the application of indirect call promotion
in ThinLTO mode:
- Invoke indirect call promotion before globalopt, since it may
  eliminate imported functions which appear unreferenced.
- Invoke indirect call promotion with InLTO=true so that the PGOFuncName
  metadata is used to get the name for locals which would have been
  renamed during promotion.

Reviewers: davidxl, mehdi_amini

Subscribers: Prazek, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24004

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280024 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-29 22:46:56 +00:00
Adam Nemet
c96b33c2f4 [Inliner] Report when inlining fails because callee's def is unavailable
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.

Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered.  I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf and quickly tailing off.  As before without -Rpass-with-hotness
there is no overhead.

Because this could be a pretty noisy diagnostics, it is currently
qualified as 'verbose'.  As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.

Reviewers: eraman, chandlerc, davidxl, hfinkel

Subscribers: tejohnson, Prazek, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D23415

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279860 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-26 20:21:05 +00:00
Chandler Carruth
377af8df2b [PM] Introduce basic update capabilities to the new PM's CGSCC pass
manager, including both plumbing and logic to handle function pass
updates.

There are three fundamentally tied changes here:
1) Plumbing *some* mechanism for updating the CGSCC pass manager as the
   CG changes while passes are running.
2) Changing the CGSCC pass manager infrastructure to have support for
   the underlying graph to mutate mid-pass run.
3) Actually updating the CG after function passes run.

I can separate them if necessary, but I think its really useful to have
them together as the needs of #3 drove #2, and that in turn drove #1.

The plumbing technique is to extend the "run" method signature with
extra arguments. We provide the call graph that intrinsically is
available as it is the basis of the pass manager's IR units, and an
output parameter that records the results of updating the call graph
during an SCC passes's run. Note that "...UpdateResult" isn't a *great*
name here... suggestions very welcome.

I tried a pretty frustrating number of different data structures and such
for the innards of the update result. Every other one failed for one
reason or another. Sometimes I just couldn't keep the layers of
complexity right in my head. The thing that really worked was to just
directly provide access to the underlying structures used to walk the
call graph so that their updates could be informed by the *particular*
nature of the change to the graph.

The technique for how to make the pass management infrastructure cope
with mutating graphs was also something that took a really, really large
number of iterations to get to a place where I was happy. Here are some
of the considerations that drove the design:

- We operate at three levels within the infrastructure: RefSCC, SCC, and
  Node. In each case, we are working bottom up and so we want to
  continue to iterate on the "lowest" node as the graph changes. Look at
  how we iterate over nodes in an SCC running function passes as those
  function passes mutate the CG. We continue to iterate on the "lowest"
  SCC, which is the one that continues to contain the function just
  processed.

- The call graph structure re-uses SCCs (and RefSCCs) during mutation
  events for the *highest* entry in the resulting new subgraph, not the
  lowest. This means that it is necessary to continually update the
  current SCC or RefSCC as it shifts. This is really surprising and
  subtle, and took a long time for me to work out. I actually tried
  changing the call graph to provide the opposite behavior, and it
  breaks *EVERYTHING*. The graph update algorithms are really deeply
  tied to this particualr pattern.

- When SCCs or RefSCCs are split apart and refined and we continually
  re-pin our processing to the bottom one in the subgraph, we need to
  enqueue the newly formed SCCs and RefSCCs for subsequent processing.
  Queuing them presents a few challenges:
  1) SCCs and RefSCCs use wildly different iteration strategies at
     a high level. We end up needing to converge them on worklist
     approaches that can be extended in order to be able to handle the
     mutations.
  2) The order of the enqueuing need to remain bottom-up post-order so
     that we don't get surprising order of visitation for things like
     the inliner.
  3) We need the worklists to have set semantics so we don't duplicate
     things endlessly. We don't need a *persistent* set though because
     we always keep processing the bottom node!!!! This is super, super
     surprising to me and took a long time to convince myself this is
     correct, but I'm pretty sure it is... Once we sink down to the
     bottom node, we can't re-split out the same node in any way, and
     the postorder of the current queue is fixed and unchanging.
  4) We need to make sure that the "current" SCC or RefSCC actually gets
     enqueued here such that we re-visit it because we continue
     processing a *new*, *bottom* SCC/RefSCC.

- We also need the ability to *skip* SCCs and RefSCCs that get merged
  into a larger component. We even need the ability to skip *nodes* from
  an SCC that are no longer part of that SCC.

This led to the design you see in the patch which uses SetVector-based
worklists. The RefSCC worklist is always empty until an update occurs
and is just used to handle those RefSCCs created by updates as the
others don't even exist yet and are formed on-demand during the
bottom-up walk. The SCC worklist is pre-populated from the RefSCC, and
we push new SCCs onto it and blacklist existing SCCs on it to get the
desired processing.

We then *directly* update these when updating the call graph as I was
never able to find a satisfactory abstraction around the update
strategy.

Finally, we need to compute the updates for function passes. This is
mostly used as an initial customer of all the update mechanisms to drive
their design to at least cover some real set of use cases. There are
a bunch of interesting things that came out of doing this:

- It is really nice to do this a function at a time because that
  function is likely hot in the cache. This means we want even the
  function pass adaptor to support online updates to the call graph!

- To update the call graph after arbitrary function pass mutations is
  quite hard. We have to build a fairly comprehensive set of
  data structures and then process them. Fortunately, some of this code
  is related to the code for building the cal graph in the first place.
  Unfortunately, very little of it makes any sense to share because the
  nature of what we're doing is so very different. I've factored out the
  one part that made sense at least.

- We need to transfer these updates into the various structures for the
  CGSCC pass manager. Once those were more sanely worked out, this
  became relatively easier. But some of those needs necessitated changes
  to the LazyCallGraph interface to make it significantly easier to
  extract the changed SCCs from an update operation.

- We also need to update the CGSCC analysis manager as the shape of the
  graph changes. When an SCC is merged away we need to clear analyses
  associated with it from the analysis manager which we didn't have
  support for in the analysis manager infrsatructure. New SCCs are easy!
  But then we have the case that the original SCC has its shape changed
  but remains in the call graph. There we need to *invalidate* the
  analyses associated with it.

- We also need to invalidate analyses after we *finish* processing an
  SCC. But the analyses we need to invalidate here are *only those for
  the newly updated SCC*!!! Because we only continue processing the
  bottom SCC, if we split SCCs apart the original one gets invalidated
  once when its shape changes and is not processed farther so its
  analyses will be correct. It is the bottom SCC which continues being
  processed and needs to have the "normal" invalidation done based on
  the preserved analyses set.

All of this is mostly background and context for the changes here.

Many thanks to all the reviewers who helped here. Especially Sanjoy who
caught several interesting bugs in the graph algorithms, David, Sean,
and others who all helped with feedback.

Differential Revision: http://reviews.llvm.org/D21464

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279618 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-24 09:37:14 +00:00
Tim Shen
22fca38c9c [GraphTraits] Replace all NodeType usage with NodeRef
This should finish the GraphTraits migration.

Differential Revision: http://reviews.llvm.org/D23730


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279475 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-22 21:09:30 +00:00
Justin Bogner
7d7a23e700 Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 20:30:52 +00:00
Chandler Carruth
b699f7b88f [PM] Port the always inliner to the new pass manager in a much more
minimal and boring form than the old pass manager's version.

This pass does the very minimal amount of work necessary to inline
functions declared as always-inline. It doesn't support a wide array of
things that the legacy pass manager did support, but is alse ... about
20 lines of code. So it has that going for it. Notably things this
doesn't support:

- Array alloca merging
  - To support the above, bottom-up inlining with careful history
    tracking and call graph updates
- DCE of the functions that become dead after this inlining.
- Inlining through call instructions with the always_inline attribute.
  Instead, it focuses on inlining functions with that attribute.

The first I've omitted because I'm hoping to just turn it off for the
primary pass manager. If that doesn't pan out, I can add it here but it
will be reasonably expensive to do so.

The second should really be handled by running global-dce after the
inliner. I don't want to re-implement the non-trivial logic necessary to
do comdat-correct DCE of functions. This means the -O0 pipeline will
have to be at least 'always-inline,global-dce', but that seems
reasonable to me. If others are seriously worried about this I'd like to
hear about it and understand why. Again, this is all solveable by
factoring that logic into a utility and calling it here, but I'd like to
wait to do that until there is a clear reason why the existing
pass-based factoring won't work.

The final point is a serious one. I can fairly easily add support for
this, but it seems both costly and a confusing construct for the use
case of the always inliner running at -O0. This attribute can of course
still impact the normal inliner easily (although I find that
a questionable re-use of the same attribute). I've started a discussion
to sort out what semantics we want here and based on that can figure out
if it makes sense ta have this complexity at O0 or not.

One other advantage of this design is that it should be quite a bit
faster due to checking for whether the function is a viable candidate
for inlining exactly once per function instead of doing it for each call
site.

Anyways, hopefully a reasonable starting point for this pass.

Differential Revision: https://reviews.llvm.org/D23299

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278896 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 02:56:20 +00:00
Chandler Carruth
bc17354b9a [Inliner] Add a flag to disable manual alloca merging in the Inliner.
This is off for now while testing can take place to make sure that in
fact we do sufficient stack coloring to fully obviate the manual alloca
array merging.

Some context on why we should be using stack coloring rather than
merging allocas in this way:

LLVM relies very heavily on analyzing pointers as coming from different
allocas in order to make aliasing decisions. These are some of the most
powerful aliasing signals available in LLVM. So merging allocas is an
extremely destructive operation on the LLVM IR -- it takes away highly
valuable and hard to reconstruct information.

As a consequence, inlined functions which happen to have array allocas
that this pattern matches will fail to be properly interleaved unless
SROA manages to hoist everything to an SSA register. Instead, the
inliner will have added an unnecessary dependence that one inlined
function execute after the other because they will have been rewritten
to refer to the same memory.

All that said, folks will reasonably want some time to experiment here
and make sure there are no significant regressions. A flag should give
us an easy knob to test.

For more context, see the thread here:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/103277.html
http://lists.llvm.org/pipermail/llvm-dev/2016-August/103285.html

Differential Revision: https://reviews.llvm.org/D23052

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278892 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 02:40:23 +00:00
Duncan P. N. Exon Smith
370879ff09 IPO: Swap || operands to avoid dereferencing end()
IsOperandBundleUse conveniently indicates  whether
std::next(F->arg_begin(),UseIndex) will get to (or past) end().  Check
it first to avoid dereferencing end().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278884 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 01:23:58 +00:00
Mehdi Amini
2a393a5758 FunctionImport: missed one occurence of ImportListForModule to rename (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278778 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-16 05:49:12 +00:00
Mehdi Amini
22277af930 FunctionImport: rename ImportsForModule to ImportList for consistency (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278777 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-16 05:47:12 +00:00
Mehdi Amini
055f556eae [LTO] Simplify APIs and constify (NFC)
Summary:
Multiple APIs were taking a StringMap for the ImportLists containing
the entries for for all the modules while operating on a single entry
for the current module. Instead we can pass the desired ModuleImport
directly. Also some of the APIs were not const, I believe just to be
able to use operator[] on the StringMap.

Reviewers: tejohnson

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278776 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-16 05:46:05 +00:00
Teresa Johnson
d73e1345aa [ThinLTO] Remove functions resolved to available_externally from comdats
Summary:
thinLTOResolveWeakForLinkerModule needs to drop any preempted weak symbols
that were converted to available_externally from comdats, otherwise we
will get a verification failure (since available_externally is a
declaration for the linker, and no declarations can be in a comdat).

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278739 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-15 21:00:04 +00:00
Dehao Chen
126e4c2d97 Fine tuning of sample profile propagation algorithm.
Summary: The refined propagation algorithm is more accurate and robust.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23224

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278522 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-12 16:22:12 +00:00
Ivan Krasin
d79f1f1720 WholeProgramDevirt: initialize WasDevirt in all constructors.
Summary: This is a follow up to r278389 and r278442.

Differential Revision: https://reviews.llvm.org/D23438

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278455 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-12 01:40:10 +00:00
Piotr Padlewski
fb2a7f990d Don't import variadic functions
Summary:
This patch adds IsVariadicFunction bit to summary in order
to not import variadic functions. Inliner doesn't inline
variadic functions because it is hard to reason about it.

This one small fix improves Importer by about 16%
(going from 86% to 100% of imported functions that are
inlined anywhere)
on some spec benchmarks like 'int' and others.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23339

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278432 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 22:13:57 +00:00
David Majnemer
dc9c737666 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278417 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 21:15:00 +00:00