llvm/lib/Analysis
Piotr Padlewski fdf7354745 [thinlto] Basic thinlto fdo heuristic
Summary:
This patch improves the ThinLTO importer
by importing 3x larger functions that are called from hot blocks.

I compared performance against trunk on SPEC, and saw improvements
of about 2% on povray and 3.33% on milc. These results seem
to be consistent and match the results Teresa got with her simple
heuristic. Some benchmarks got slower, but I think they are just
noisy (mcf, xalancbmk, omnetpp); I am running the benchmarks again with
more iterations to confirm. The geomean over all benchmarks, including the
noisy ones, was about +0.02%.

I see a much better improvement on the google branch with Easwaran's patch
for pgo callsite inlining (the inliner actually inlines those big functions).
Overall I see a +0.5% improvement, and I get +8.65% on povray.
So I guess we will see a much bigger change when Easwaran's patch lands
(it depends on the new pass manager), but it is still worth putting this in
trunk before it.

Implementation detail changes:
- Removed CallsiteCount.
- ProfileCount was replaced by Hotness.
- hot-import-multiplier is set to 3.0 for now.
I didn't have time to tune it, but I see that we get most of the interesting
functions with 3, so there is not much performance difference with higher
values, and binary size doesn't grow as much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 20:37:32 +00:00
..
AliasAnalysis.cpp [AliasAnalysis] Give back AA results for fence instructions 2016-07-15 17:19:24 +00:00
AliasAnalysisEvaluator.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
AliasAnalysisSummary.cpp Update a comment. 2016-08-25 01:29:55 +00:00
AliasAnalysisSummary.h Make some LLVM_CONSTEXPR variables const. NFC. 2016-08-25 01:05:08 +00:00
AliasSetTracker.cpp [AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large. 2016-08-19 17:05:22 +00:00
Analysis.cpp [PM] Port CFGViewer and CFGPrinter to the new Pass Manager 2016-09-15 18:35:27 +00:00
AssumptionCache.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
BasicAliasAnalysis.cpp Replace a few more "fall through" comments with LLVM_FALLTHROUGH 2016-08-17 20:30:52 +00:00
BlockFrequencyInfo.cpp s/static inline/static/ for headers I have changed in r279475. NFC. 2016-08-31 16:48:13 +00:00
BlockFrequencyInfoImpl.cpp [GraphTraits] Replace all NodeType usage with NodeRef 2016-08-22 21:09:30 +00:00
BranchProbabilityInfo.cpp Enhance calcColdCallHeuristics for InvokeInst 2016-09-23 17:26:14 +00:00
CallGraph.cpp Consistently use ModuleAnalysisManager 2016-08-09 00:28:38 +00:00
CallGraphSCCPass.cpp RefreshCallGraph does not modify the SCC, adding "const" to make it clear (NFC) 2016-08-08 18:51:05 +00:00
CallPrinter.cpp [CG] Rename the DOT printing pass to actually reference "DOT". 2016-03-10 11:04:40 +00:00
CaptureTracking.cpp [CaptureTracking] Volatile operations capture their memory location 2016-05-26 17:36:22 +00:00
CFG.cpp Avoid overly large SmallPtrSet/SmallSet 2016-01-30 01:24:31 +00:00
CFGPrinter.cpp [PM] Port CFGViewer and CFGPrinter to the new Pass Manager 2016-09-15 18:35:27 +00:00
CFLAndersAliasAnalysis.cpp Make some LLVM_CONSTEXPR variables const. NFC. 2016-08-25 01:05:08 +00:00
CFLGraph.h [CFLAA] Check for pointer types in more places. 2016-07-29 01:23:45 +00:00
CFLSteensAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
CGSCCPassManager.cpp Fixup r279618, instantiate *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID. 2016-08-30 15:47:13 +00:00
CMakeLists.txt [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
CodeMetrics.cpp [Assumptions] Make collecting ephemeral values not quadratic in the 2016-08-18 17:51:24 +00:00
ConstantFolding.cpp [ConstantFold] Improve the bitcast folding logic for constant vectors. 2016-09-13 14:50:47 +00:00
CostModel.cpp [LV, X86] Be more optimistic about vectorizing shifts. 2016-08-04 22:48:03 +00:00
Delinearization.cpp [NFC] Header cleanup 2016-04-18 09:17:29 +00:00
DemandedBits.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
DependenceAnalysis.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
DivergenceAnalysis.cpp DivergenceAnalysis: Fix crash with no return blocks 2016-05-09 16:57:08 +00:00
DominanceFrontier.cpp [PM] Make the AnalysisManager parameter to run methods a reference. 2016-03-11 11:05:24 +00:00
DomPrinter.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
EHPersonalities.cpp Use the range variant of find instead of unpacking begin/end 2016-08-11 22:21:41 +00:00
GlobalsModRef.cpp Use the range variant of find instead of unpacking begin/end 2016-08-11 22:21:41 +00:00
IndirectCallPromotionAnalysis.cpp Remove another unused variable from r275216 2016-07-12 23:49:17 +00:00
InlineCost.cpp Fix a thinko in r278189. 2016-08-29 20:45:51 +00:00
InstCount.cpp
InstructionSimplify.cpp move variables closer to their uses; add FIXMEs; NFC 2016-09-20 14:36:14 +00:00
Interval.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
IntervalPartition.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
IteratedDominanceFrontier.cpp Normalize file docs. NFC. 2016-07-21 20:52:35 +00:00
IVUsers.cpp Consistently use LoopAnalysisManager 2016-08-09 00:28:52 +00:00
LazyBlockFrequencyInfo.cpp [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
LazyBranchProbabilityInfo.cpp [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
LazyCallGraph.cpp [LCG] Redesign the lazy post-order iteration mechanism for the 2016-09-16 10:20:17 +00:00
LazyValueInfo.cpp Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation. 2016-09-15 06:28:34 +00:00
Lint.cpp Fix some Clang-tidy modernize-use-using and Include What You Use warnings. 2016-08-13 00:50:41 +00:00
LLVMBuild.txt Refactor indirect call promotion profitability analysis (NFC) 2016-07-12 21:13:44 +00:00
Loads.cpp [Loads] Properly populate the visited set in isDereferenceableAndAlignedPointer 2016-08-31 03:22:32 +00:00
LoopAccessAnalysis.cpp [LV] When reporting about a specific instruction without debug location use loop's 2016-09-21 03:14:20 +00:00
LoopInfo.cpp [LoopInfo] Add verification by recomputation. 2016-08-31 19:26:19 +00:00
LoopPass.cpp Consistently use LoopAnalysisManager 2016-08-09 00:28:52 +00:00
LoopPassManager.cpp PM: Check that loop passes preserve a basic set of analyses 2016-05-03 21:35:08 +00:00
LoopUnrollAnalyzer.cpp [LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad 2016-07-23 02:56:49 +00:00
MemDepPrinter.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
MemDerefPrinter.cpp NFC. Move isDereferenceable to Loads.h/cpp 2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp Make some LLVM_CONSTEXPR variables const. NFC. 2016-08-25 01:05:08 +00:00
MemoryDependenceAnalysis.cpp Do not widen load for different variable in GVN. 2016-09-09 18:42:35 +00:00
MemoryLocation.cpp [TLI] Unify LibFunc signature checking. NFCI. 2016-04-27 19:04:35 +00:00
ModuleDebugInfoPrinter.cpp
ModuleSummaryAnalysis.cpp [thinlto] Basic thinlto fdo heuristic 2016-09-26 20:37:32 +00:00
ObjCARCAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp ObjCARC: Don't increment or dereference end() when scanning args 2016-08-17 01:02:18 +00:00
OptimizationDiagnosticInfo.cpp [Inliner] Report when inlining fails because callee's def is unavailable 2016-08-26 20:21:05 +00:00
OrderedBasicBlock.cpp
PHITransAddr.cpp Use the range variant of find instead of unpacking begin/end 2016-08-11 22:21:41 +00:00
PostDominators.cpp [PM] Remove support for omitting the AnalysisManager argument to new 2016-06-17 00:11:01 +00:00
ProfileSummaryInfo.cpp Consistently use ModuleAnalysisManager 2016-08-09 00:28:38 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
RegionPass.cpp [RegionPass] Some minor cleanups 2016-07-19 17:50:27 +00:00
RegionPrinter.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
ScalarEvolution.cpp [SCEV] Fix the order of members in the initializer list. 2016-09-26 04:49:58 +00:00
ScalarEvolutionAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
ScalarEvolutionExpander.cpp Create a getelementptr instead of sub expr for ValueOffsetPair if the 2016-09-14 04:39:50 +00:00
ScalarEvolutionNormalization.cpp Remove emacs mode markers from .cpp files. NFC 2016-04-24 17:55:41 +00:00
ScopedNoAliasAA.cpp [ScopedNoAliasAA] collectMDInDomain should be a free function 2016-08-15 03:56:06 +00:00
SparsePropagation.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
StratifiedSets.h [CFLAA] Simplify CFLGraphBuilder. NFC. 2016-07-11 22:59:09 +00:00
TargetLibraryInfo.cpp [TLI] isdigit / isascii / toascii param type should match return type (PR30484) 2016-09-23 18:44:09 +00:00
TargetTransformInfo.cpp [LoopStrenghtReduce] Refactoring and addition of a new target cost function. 2016-08-17 13:24:19 +00:00
Trace.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
TypeBasedAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
TypeMetadataUtils.cpp [IR] Make getIndexedOffsetInType return a signed result 2016-07-13 03:42:38 +00:00
ValueTracking.cpp Analysis: Return early for UndefValue in computeKnownBits 2016-09-24 20:42:02 +00:00
VectorUtils.cpp Add handling of !invariant.load to PropagateMetadata. 2016-09-11 01:39:08 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
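
The equivalence claimed above can be checked numerically. The following is
a minimal sketch, not LLVM code: it assumes the add recurrence {c0,+,c1,+,c2}
takes the value c0 + c1*C(i,1) + c2*C(i,2) at iteration i, and that the exit
value is taken at iteration %n - 1 (the reading under which the result is
%n * %n). The helper names chrec_at and scev_exit_value are hypothetical.

```python
from math import comb

MASK64 = (1 << 64) - 1  # i64 wraparound
MASK65 = (1 << 65) - 1  # i65 wraparound


def chrec_at(coeffs, i):
    """Value of the add recurrence {c0,+,c1,+,c2,...} at iteration i (0-based)."""
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def scev_exit_value(n):
    """The expanded form quoted above, modelled with i64/i65 wraparound."""
    a = (n - 2) & MASK64              # zext i64 (-2 + %n) to i65
    b = (n - 1) & MASK64              # zext i64 (-1 + %n) to i65
    half = ((a * b) & MASK65) // 2    # i65 multiply, then /u 2
    return (-2 + 2 * (half & MASK64) + 3 * n) & MASK64  # trunc to i64, rest in i64


# Assumption: the loop exits at iteration n - 1, for n >= 1.
for n in range(1, 50):
    assert chrec_at([1, 3, 2], n - 1) == n * n
    assert scev_exit_value(n) == (n * n) & MASK64
```

The i65 multiply is there so that the division by 2 is exact before
truncation; the simpler (%n * %n) form needs no widening at all.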

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
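
The fold works because truncation commutes with negation modulo 2^32, so the
first two terms cancel. The following is a minimal sketch that checks this
with plain integer arithmetic. It is not LLVM code: the helper names are
hypothetical, and undef is modelled as an arbitrary but fixed value u, which
is a simplifying assumption about undef semantics.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(x):
    """Keep the low 32 bits, as trunc i64 ... to i32 does."""
    return x & MASK32


def unfolded(arg5, u):
    """The three-term expression ScalarEvolution forms, evaluated in i32."""
    neg = (-1 * arg5) & MASK64  # (-1 * %arg5) in i64
    total = (trunc_i64_to_i32(neg)
             + trunc_i64_to_i32(arg5)
             + (-1 * trunc_i64_to_i32(u)))
    return total & MASK32


def folded(u):
    """The proposed folded form, evaluated in i32."""
    return (-1 * trunc_i64_to_i32(u)) & MASK32


# The %arg5 terms cancel for any i64 input, leaving only the undef term.
for arg5 in (0, 1, 0xFFFFFFFF, 1 << 32, (1 << 63) + 5, MASK64):
    for u in (0, 7, 1 << 40, MASK64):
        assert unfolded(arg5, u) == folded(u)
```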

//===---------------------------------------------------------------------===//