Commit Graph

3332 Commits

Author SHA1 Message Date
Igor Laevsky
22eba5a2e2 [LCSSA] Perform LCSSA verification only for the current loop nest.
Now LPPassManager will run LCSSA verification only for the top-level loop
which was processed on the current iteration.

Differential Revision: https://reviews.llvm.org/D25873



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285394 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 12:57:20 +00:00
David L. Jones
f64117ce6e Fix map insertion that is elided in release build.
The assert() macro doesn't actually execute its body in Release builds, so using
it to check cache invariants requires that the insertion be outside of the
assert() statement. This change does that, and also makes sure to return the
actual map contents.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284898 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 23:30:39 +00:00
Li Huang
f0a5715382 [SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor.
Summary:
When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV
multiple times if this SCEV is referenced by multiple other SCEVs. This has
exponential time complexity in the worst case. Memoizing the results will
avoid re-visiting the same SCEV. Add a map to save the results, and override
the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each
SCEV once and thus returns the same result for the same input SCEV.

This patch fixes PR18606, PR18607.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25810


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284868 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 20:05:21 +00:00
Peter Collingbourne
7e2bd34085 Analysis: Move llvm::getConstantRangeFromMetadata to IR library.
We're about to start using it there.

Differential Revision: https://reviews.llvm.org/D25877

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284865 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 19:59:26 +00:00
Simon Pilgrim
aaafc1d16c Wdocumentation fix
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284822 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 12:51:16 +00:00
John Brawn
9e0c61cbeb [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 11:08:48 +00:00
Benjamin Kramer
06d5a1641d Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284721 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 12:20:28 +00:00
Chad Rosier
db638de2de [AliasSetTracker] Add support for memcpy and memmove.
Differential Revision: https://reviews.llvm.org/D25776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284630 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 19:09:03 +00:00
Chad Rosier
6d823ecec2 [AliasSetTracker] Return void for add() functions. NFC.
Differential Revision: https://reviews.llvm.org/D25748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284628 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 18:50:32 +00:00
Mandeep Singh Grang
a30faf0090 Remove unused typedef
Summary: Unused: typedef SmallSetVector<RegionT *, 4> RegionSet

Reviewers: MatzeB, grosser

Subscribers: zinob

Differential Revision: https://reviews.llvm.org/D25744

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284524 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 20:01:12 +00:00
Mandeep Singh Grang
0480c1bb9f Fix differences in codegen between Linux and Windows toolchains
Summary:
    There are differences in codegen between Linux and Windows due to:
    1. Using std::sort which uses quicksort which is a non-stable sort.

    2. Iterating over Set data structure where the iteration order is
       non deterministic.

Reviewers: arsenm, grosbach, junbuml, zinob, MatzeB

Subscribers: MatzeB, wdng

Differential Revision: https://reviews.llvm.org/D25695

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284441 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 00:11:19 +00:00
Justin Bogner
e1d1c98dc3 SCEV: Prefer the LLVM_NODISCARD spelling
Update functions annotated with LLVM_ATTRIBUTE_UNUSED_RESULT to use
LLVM_NODISCARD instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-16 21:17:29 +00:00
Haicheng Wu
b893afb0a5 Reapply "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
Reappy r284044 after revert in r284051. Krzysztof fixed the error in r284049.

The original summary:

This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284053 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 21:29:38 +00:00
Haicheng Wu
3d05abfa85 Revert "[LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop"
This reverts commit r284044.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284051 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 21:02:22 +00:00
Haicheng Wu
6ceda533e5 [LoopUnroll] Use the upper bound of the loop trip count to fullly unroll a loop
This patch tries to fully unroll loops having break statement like this

for (int i = 0; i < 8; i++) {
    if (a[i] == value) {
        found = true;
        break;
    }
}

GCC can fully unroll such loops, but currently LLVM cannot because LLVM only
supports loops having exact constant trip counts.

The upper bound of the trip count can be obtained from calling
ScalarEvolution::getMaxBackedgeTakenCount(). Part of the patch is the
refactoring work in SCEV to prevent duplicating code.

The feature of using the upper bound is enabled under the same circumstance
when runtime unrolling is enabled since both are used to unroll loops without
knowing the exact constant trip count.

Differential Revision: https://reviews.llvm.org/D24790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 20:24:32 +00:00
Chandler Carruth
32e2e9af9b [LCG] Cleanup various places where comments said SCC but meant
`RefSCC`.

Also improve the comments surrounding the lazy post-order iterator as
they had grown stale since the RefSCC/SCC split.

I'm sure there are more comments that need updating here, but I saw and
fixed these and didn't want to lose them. I've not gotten to doing
a really complete audit of every comment yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283987 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 08:40:51 +00:00
Chandler Carruth
94162a40bb [LCG] Add the necessary functionality to the LazyCallGraph to support inlining.
The basic inlining operation makes the following changes to the call graph:
1) Add edges that were previously transitive edges. This is always trivial and
   this patch gives the LCG helper methods to make this more convenient.
2) Remove the inlined edge. We had existing support for this, but it contained
   bugs that needed to be fixed. Testing in the same pattern as the inliner
   exposes these bugs very nicely.
3) Delete a function when it becomes dead because it is internal and all calls
   have been inlined. The LCG had no support at all for this operation, so this
   adds that support.

Two unittests have been added that exercise this specific mutation pattern to
the call graph. They were extremely effective in uncovering bugs. Sadly,
a large fraction of the code here is just to implement those unit tests, but
I think they're paying for themselves. =]

This was split out of a patch that actually uses the routines to
implement inlining in the new pass manager in order to isolate (with
unit tests) the logic that was entirely within the LCG.

Many thanks for the careful review from folks! There will be a few minor
follow-up patches based on the comments in the review as well.

Differential Revision: https://reviews.llvm.org/D24225

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283982 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 07:59:56 +00:00
Kyle Butt
2a18018c10 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Issue with early tail-duplication of blocks that branch to a fallthrough
predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll

Differential revision: https://reviews.llvm.org/D18226

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283934 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 20:36:43 +00:00
Igor Laevsky
00eb3c9237 [LCSSA] Implement linear algorithm for the isRecursivelyLCSSAForm
For each block check that it doesn't have any uses outside of it's innermost loop.

Differential Revision: https://reviews.llvm.org/D25364



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283877 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 13:37:22 +00:00
Daniel Jasper
ebc8a28377 Revert "Codegen: Tail-duplicate during placement."
This reverts commit r283842.

test/CodeGen/X86/tail-dup-repeat.ll causes and llc crash with our
internal testing. I'll share a link with you.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283857 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 07:36:11 +00:00
Kyle Butt
be53d7c9c4 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Issue with early tail-duplication of blocks that branch to a fallthrough
predecessor fixed with test case: tail-dup-branch-to-fallthrough.ll

Differential revision: https://reviews.llvm.org/D18226

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283842 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 01:20:33 +00:00
Dehao Chen
d9a6f657d8 Rename isHotFunction/isColdFunction to isFunctionEntryHot/isFunctionEntryCold. (NFC)
This is in preparation for https://reviews.llvm.org/D25048


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283805 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-10 21:47:28 +00:00
Kyle Butt
473ebca2dd Revert "Codegen: Tail-duplicate during placement."
This reverts commit 71c312652c.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283647 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-08 01:47:05 +00:00
Kyle Butt
71c312652c Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Differential revision: https://reviews.llvm.org/D18226

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283619 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-07 22:33:20 +00:00
Oliver Stannard
9dc8cca063 [ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.

Differential Revision: https://reviews.llvm.org/D24462



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283530 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-07 08:48:24 +00:00
David Callahan
8be61a8c7e Modify df_iterator to support post-order actions
Summary: This makes a change to the state used to maintain visited information for depth first iterator. We know assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes.  It will however allow a client to distinguish back from cross edges in a DFS tree.

Reviewers: nadav, mehdi_amini, dberlin

Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits

Differential Revision: https://reviews.llvm.org/D25191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283391 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-05 21:36:16 +00:00
Volkan Keles
9920e54e99 Add new target hooks for LoadStoreVectorizer
Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D24727

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283099 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 10:31:34 +00:00
Mehdi Amini
5e07fb3545 Use StringRef in TLI instead of raw pointer (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283005 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 03:10:48 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Piotr Padlewski
8948c796f5 NFC fix doxygen comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282950 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 21:05:49 +00:00
Adam Nemet
858cc931c4 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
(Recommit after making sure IsVerbose gets properly initialized in
DiagnosticInfoOptimizationBase.  See previous commit that takes care of
this.)

OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282813 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-30 00:01:30 +00:00
Adam Nemet
a9b1465978 Revert "[LAA, LV] Port to new streaming interface for opt remarks. Update LV"
This reverts commit r282758.

There are some clang failures I haven't seen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282759 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-29 20:17:37 +00:00
Adam Nemet
07beb3c041 [LAA, LV] Port to new streaming interface for opt remarks. Update LV
OptimizationRemarkAnalysis directly takes the role of the report that is
generated by LAA.

Then we need the magic to be able to turn an LAA remark into an LV
remark.  This is done via a new OptimizationRemark ctor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282758 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-29 20:12:18 +00:00
Dehao Chen
b393d828c5 Refactor the ProfileSummaryInfo to use doInitialization and doFinalization to handle Module update.
Summary: This refactors the change in r282616

Reviewers: davidxl, eraman, mehdi_amini

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D25041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282630 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 21:00:58 +00:00
Dehao Chen
bc9cf99b1a Fix the bug when -compile-twice is specified, the PSI will be invalidated.
Summary:
When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary.
The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll

Reviewers: eraman, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282616 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 18:41:14 +00:00
Sanjoy Das
2fd72fd595 [SCEV] Use a SmallPtrSet as a temporary union predicate; NFC
Summary:
Instead of creating and destroying SCEVUnionPredicate instances (which
internally creates and destroys a DenseMap), use temporary SmallPtrSet
instances of remember the set of predicates that will get reified into a
SCEVUnionPredicate.

Reviewers: silviu.baranga, sbaranga

Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D25000

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282606 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 17:14:58 +00:00
Jonas Paulsson
2db5109945 [SystemZ] Implementation of getUnrollingPreferences().
This commit enables more unrolling for SystemZ by implementing the
SystemZTargetTransformInfo::getUnrollingPreferences() method.

It has been found that it is better to only unroll moderately, so the
DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order
to set this to a lower value for SystemZ (4).

Reviewers: Evgeny Stupachenko, Ulrich Weigand.
https://reviews.llvm.org/D24451

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282570 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 09:41:38 +00:00
Adam Nemet
e555d1bef7 [LAA] Rename emitAnalysis to recordAnalys. NFC
Ever since LAA was split out into an analysis on its own, this function
stopped emitting the report directly.  Instead it stores it to be
retrieved by the client which can then emit it as its own report
(e.g. -Rpass-analysis=loop-vectorize).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282561 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 00:58:36 +00:00
Adam Nemet
47c0d49055 Output optimization remarks in YAML
(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282539 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 20:55:07 +00:00
Sanjoy Das
bedf4a9691 [SCEV] Make PendingLoopPredicates more frugal; NFCI
I don't expect `PendingLoopPredicates` to have very many
elements (e.g. when -O3'ing the sqlite3 amalgamation,
`PendingLoopPredicates` has at most 3 elements).  So now we use a
`SmallPtrSet` for it instead of the more heavyweight `DenseSet`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282511 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 18:01:38 +00:00
Adam Nemet
2713e77a55 Revert "Output optimization remarks in YAML"
This reverts commit r282499.

The GCC bots are failing

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282503 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 16:39:24 +00:00
Adam Nemet
3ecd7534da Output optimization remarks in YAML
This allows various presentation of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports and
asserts that we're writing only.

On the other hand, I did experiment that the class hierarchy starting at
DiagnosticInfoOptimizationBase can be mapped back from YAML generated
here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before hotness is computed in the analysis pass instead of
DiganosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282499 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-27 16:15:16 +00:00
Piotr Padlewski
fdf7354745 [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves thinlto importer
by importing 3x larger functions that are called from hot block.

I compared performance with the trunk on spec, and there
were about 2% on povray and 3.33% on milc. These results seems
to be consistant and match the results Teresa got with her simple
heuristic. Some benchmarks got slower but I think they are just
noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with
more iterations to confirm. Geomean of all benchmarks including the noisy ones
were about +0.02%.

I see much better improvement on google branch with Easwaran patch
for pgo callsite inlining (the inliner actually inline those big functions)
Over all I see +0.5% improvement, and I get +8.65% on povray.
So I guess we will see much bigger change when Easwaran patch will land
(it depends on new pass manager), but it is still worth putting this to trunk
before it.

Implementation details changes:
- Removed CallsiteCount.
- ProfileCount got replaced by Hotness
- hot-import-multiplier is set to 3.0 for now,
didn't have time to tune it up, but I see that we get most of the interesting
functions with 3, so there is no much performance difference with higher, and
binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 20:37:32 +00:00
Sanjoy Das
a80a7d1c30 [SCEV] Combine two predicates into one; NFC
Both `loopHasNoSideEffects` and `loopHasNoAbnormalExits` involve walking
the loop and maintaining similar sorts of caches.  This commit changes
SCEV to compute both the predicates via a single walk, and maintain a
single cache instead of two.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282375 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 02:44:07 +00:00
Sanjoy Das
98e898ce46 [SCEV] Make it obvious BackedgeTakenInfo's constructor steals storage
Specifically, it moves SCEVUnionPredicates from its input into its own
storage.  Make this obvious at the type level.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282374 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 01:10:27 +00:00
Sanjoy Das
643807ace9 [SCEV] Further isolate incidental data structure; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282373 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 01:10:25 +00:00
Sanjoy Das
fbc0b21d22 Appease MSVC
... by not default move constructors and operator= s. Defaulting these
works in clang, but not in MSVC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282370 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 00:22:18 +00:00
Sanjoy Das
e6b6f3a715 Attempt to appease MSVC
... by explicitly deleting the copy constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282369 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 00:00:51 +00:00
Sanjoy Das
bfa3983f1b [SCEV] Document a gotcha; NFC
We should re-consider the design decision that led to this gotcah, but
for now just document it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282367 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-25 23:12:06 +00:00
Sanjoy Das
347d5b6432 [SCEV] Have ExitNotTakenInfo keep a pointer to its predicate; NFC
SCEVUnionPredicate is a "heavyweight" structure, so it is beneficial to
store the (optional) data out of line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282366 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-25 23:12:04 +00:00