llvm/Analysis at 47c0d49055f1f7107715f1a76532005149d95c81 - llvm

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-01-22 04:05:05 +00:00

Adam Nemet 47c0d49055 Output optimization remarks in YAML

(Re-committed after moving the template specialization under the yaml
namespace.  GCC was complaining about this.)

This allows various presentations of this data using an external tool.
This was first recommended here[1].

As an example, consider this module:

  1 int foo();
  2 int bar();
  3
  4 int baz() {
  5   return foo() + bar();
  6 }

The inliner generates these missed-optimization remarks today (the
hotness information is pulled from PGO):

  remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30)
  remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30)

Now with -pass-remarks-output=<yaml-file>, we generate this YAML file:

  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 10 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: foo
    - String:  will not be inlined into
    - Caller: baz
  ...
  --- !Missed
  Pass:            inline
  Name:            NotInlined
  DebugLoc:        { File: /tmp/s.c, Line: 5, Column: 18 }
  Function:        baz
  Hotness:         30
  Args:
    - Callee: bar
    - String:  will not be inlined into
    - Caller: baz
  ...

This is a summary of the high-level decisions:

* There is a new streaming interface to emit optimization remarks.
E.g. for the inliner remark above:

   ORE.emit(DiagnosticInfoOptimizationRemarkMissed(
                DEBUG_TYPE, "NotInlined", &I)
            << NV("Callee", Callee) << " will not be inlined into "
            << NV("Caller", CS.getCaller()) << setIsVerbose());

NV stands for named value and allows the YAML client to process a remark
using its name (NotInlined) and the named arguments (Callee and Caller)
without parsing the text of the message.

Subsequent patches will update ORE users to use the new streaming API.

* I am using YAML I/O for writing the YAML file.  YAML I/O requires you
to specify reading and writing at once, but reading is highly non-trivial
for some of the more complex LLVM types.  Since it's not clear that we
(ever) want to use LLVM to parse this YAML file, the code supports
writing only and asserts that this is the case.

On the other hand, I did verify experimentally that the class hierarchy
starting at DiagnosticInfoOptimizationBase can be mapped back from the
YAML generated here (see D24479).

* The YAML stream is stored in the LLVM context.

* In the example, we can probably further specify the IR value used,
i.e. print "Function" rather than "Value".

* As before, hotness is computed in the analysis pass instead of
DiagnosticInfo.  This avoids the layering problem since BFI is in
Analysis while DiagnosticInfo is in IR.
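
The streaming interface from the first bullet can be illustrated with a
small standalone sketch.  This is not the LLVM implementation: the
NamedValue and Remark types below are hypothetical stand-ins, the
"!Missed" tag is hard-coded, and the YAML is written by hand rather than
through YAML I/O.  It only shows how streaming named values lets a client
read the arguments by key instead of parsing the message text.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for NV: a key plus the printed form of a value.
struct NamedValue {
  std::string Key;
  std::string Val;
};

// Hypothetical remark type.  Every streamed piece is kept as a (key, value)
// argument; plain text is stored under the key "String".
class Remark {
public:
  Remark(std::string PassName, std::string RemarkName)
      : Pass(std::move(PassName)), Name(std::move(RemarkName)) {
    assert(!Name.empty() && "a remark needs a name");
  }

  Remark &operator<<(const NamedValue &NV) {
    Args.push_back(NV);
    return *this;
  }

  Remark &operator<<(const std::string &Text) {
    Args.push_back({"String", Text});
    return *this;
  }

  // Human-readable message: the argument values concatenated in order.
  std::string message() const {
    std::string Msg;
    for (const NamedValue &A : Args)
      Msg += A.Val;
    return Msg;
  }

  // Hand-written YAML-like record with one list entry per argument.
  // The "!Missed" tag is fixed here purely for illustration.
  std::string yaml() const {
    std::ostringstream OS;
    OS << "--- !Missed\n"
       << "Pass: " << Pass << "\n"
       << "Name: " << Name << "\n"
       << "Args:\n";
    for (const NamedValue &A : Args)
      OS << "  - " << A.Key << ": " << A.Val << "\n";
    OS << "...\n";
    return OS.str();
  }

  const std::vector<NamedValue> &args() const { return Args; }

private:
  std::string Pass;
  std::string Name;
  std::vector<NamedValue> Args;
};
```

Streaming NamedValue{"Callee", "foo"}, the text " will not be inlined
into " and NamedValue{"Caller", "baz"} into Remark("inline", "NotInlined")
yields the same message text as the textual remark, while args() exposes
Callee and Caller by name.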

[1] https://reviews.llvm.org/D19678#419445

Differential Revision: https://reviews.llvm.org/D24587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282539 91177308-0d34-0410-b5e6-96231b3b80d8

2016-09-27 20:55:07 +00:00

AliasAnalysis.cpp

…

AliasAnalysisEvaluator.cpp

…

AliasAnalysisSummary.cpp

Update a comment.

2016-08-25 01:29:55 +00:00

AliasAnalysisSummary.h

Make some LLVM_CONSTEXPR variables const. NFC.

2016-08-25 01:05:08 +00:00

AliasSetTracker.cpp

…

Analysis.cpp

[PM] Port CFGViewer and CFGPrinter to the new Pass Manager

2016-09-15 18:35:27 +00:00

AssumptionCache.cpp

…

BasicAliasAnalysis.cpp

…

BlockFrequencyInfo.cpp

s/static inline/static/ for headers I have changed in r279475. NFC.

2016-08-31 16:48:13 +00:00

BlockFrequencyInfoImpl.cpp

…

BranchProbabilityInfo.cpp

Enhance calcColdCallHeuristics for InvokeInst

2016-09-23 17:26:14 +00:00

CallGraph.cpp

…

CallGraphSCCPass.cpp

…

CallPrinter.cpp

…

CaptureTracking.cpp

…

CFG.cpp

…

CFGPrinter.cpp

[PM] Port CFGViewer and CFGPrinter to the new Pass Manager

2016-09-15 18:35:27 +00:00

CFLAndersAliasAnalysis.cpp

Make some LLVM_CONSTEXPR variables const. NFC.

2016-08-25 01:05:08 +00:00

CFLGraph.h

…

CFLSteensAliasAnalysis.cpp

…

CGSCCPassManager.cpp

Fixup r279618, instantiate *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC>, instead of *AnalysisManagerProxy<*AnalysisManager,LazyCallGraph::SCC,LazyCallGraph&>, for PassID.

2016-08-30 15:47:13 +00:00

CMakeLists.txt

…

CodeMetrics.cpp

…

ConstantFolding.cpp

[ConstantFold] Improve the bitcast folding logic for constant vectors.

2016-09-13 14:50:47 +00:00

CostModel.cpp

…

Delinearization.cpp

…

DemandedBits.cpp

…

DependenceAnalysis.cpp

…

DivergenceAnalysis.cpp

…

DominanceFrontier.cpp

…

DomPrinter.cpp

…

EHPersonalities.cpp

…

GlobalsModRef.cpp

…

IndirectCallPromotionAnalysis.cpp

…

InlineCost.cpp

Fix a thinko in r278189.

2016-08-29 20:45:51 +00:00

InstCount.cpp

…

InstructionSimplify.cpp

move variables closer to their uses; add FIXMEs; NFC

2016-09-20 14:36:14 +00:00

Interval.cpp

…

IntervalPartition.cpp

…

IteratedDominanceFrontier.cpp

…

IVUsers.cpp

…

LazyBlockFrequencyInfo.cpp

…

LazyBranchProbabilityInfo.cpp

…

LazyCallGraph.cpp

[LCG] Redesign the lazy post-order iteration mechanism for the

2016-09-16 10:20:17 +00:00

LazyValueInfo.cpp

Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation.

2016-09-15 06:28:34 +00:00

Lint.cpp

…

LLVMBuild.txt

…

Loads.cpp

[Loads] Properly populate the visited set in isDereferenceableAndAlignedPointer

2016-08-31 03:22:32 +00:00

LoopAccessAnalysis.cpp

[LV] When reporting about a specific instruction without debug location use loop's

2016-09-21 03:14:20 +00:00

LoopInfo.cpp

[LoopInfo] Add verification by recomputation.

2016-08-31 19:26:19 +00:00

LoopPass.cpp

…

LoopPassManager.cpp

…

LoopUnrollAnalyzer.cpp

…

MemDepPrinter.cpp

…

MemDerefPrinter.cpp

…

MemoryBuiltins.cpp

Make some LLVM_CONSTEXPR variables const. NFC.

2016-08-25 01:05:08 +00:00

MemoryDependenceAnalysis.cpp

Do not widen load for different variable in GVN.

2016-09-09 18:42:35 +00:00

MemoryLocation.cpp

…

ModuleDebugInfoPrinter.cpp

…

ModuleSummaryAnalysis.cpp

[thinlto] Basic thinlto fdo heuristic

2016-09-26 20:37:32 +00:00

ObjCARCAliasAnalysis.cpp

…

ObjCARCAnalysisUtils.cpp

…

ObjCARCInstKind.cpp

…

OptimizationDiagnosticInfo.cpp

Output optimization remarks in YAML

2016-09-27 20:55:07 +00:00

OrderedBasicBlock.cpp

…

PHITransAddr.cpp

…

PostDominators.cpp

…

ProfileSummaryInfo.cpp

…

PtrUseVisitor.cpp

…

README.txt

…

RegionInfo.cpp

…

RegionPass.cpp

…

RegionPrinter.cpp

…

ScalarEvolution.cpp

[SCEV] Replace a struct with a function; NFC

2016-09-27 18:01:48 +00:00

ScalarEvolutionAliasAnalysis.cpp

…

ScalarEvolutionExpander.cpp

Create a getelementptr instead of sub expr for ValueOffsetPair if the

2016-09-14 04:39:50 +00:00

ScalarEvolutionNormalization.cpp

…

ScopedNoAliasAA.cpp

…

SparsePropagation.cpp

…

StratifiedSets.h

…

TargetLibraryInfo.cpp

[TLI] isdigit / isascii / toascii param type should match return type (PR30484)

2016-09-23 18:44:09 +00:00

TargetTransformInfo.cpp

…

Trace.cpp

…

TypeBasedAliasAnalysis.cpp

…

TypeMetadataUtils.cpp

…

ValueTracking.cpp

Analysis: Return early for UndefValue in computeKnownBits

2016-09-24 20:42:02 +00:00

VectorUtils.cpp

Add handling of !invariant.load to PropagateMetadata.

2016-09-11 01:39:08 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
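
The equivalence can be checked numerically with a small sketch.  It
assumes the recurrence is evaluated at iteration %n - 1, which is what
the printed expression corresponds to term by term, and it uses plain
64-bit arithmetic for small %n rather than the i65 intermediate.  The
function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Value of {1,+,3,+,2} at 0-based iteration I, found by stepping the
// recurrence: the value grows by the step, and the step grows by 2.
uint64_t evalByStepping(uint64_t I) {
  uint64_t Value = 1, Step = 3;
  for (uint64_t K = 0; K < I; ++K) {
    Value += Step;
    Step += 2;
  }
  return Value;
}

// Closed form: 1 + 3*I + 2*C(I,2) = I*I + 2*I + 1 = (I + 1)^2.
uint64_t evalClosedForm(uint64_t I) { return (I + 1) * (I + 1); }

// The expression as printed by ScalarEvolution, without the i65 widening:
// -2 + 2 * (((N - 2) * (N - 1)) / 2) + 3 * N.
// Only meaningful here for small N >= 1, where the product cannot overflow
// (for N == 1 the wrapped factor is multiplied by zero).
uint64_t evalAsPrinted(uint64_t N) {
  assert(N >= 1 && "sketch assumes at least one iteration");
  return 3 * N - 2 + 2 * (((N - 2) * (N - 1)) / 2);
}
```

For every small N, evalByStepping(N - 1), evalClosedForm(N - 1) and
evalAsPrinted(N) all equal N * N, which is the simpler form suggested
above.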

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
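
The fold is valid because truncation commutes with negation in modular
arithmetic, so the first two terms cancel.  A minimal sketch, modelling
i64 and i32 as uint64_t and uint32_t and assuming the single undef is
replaced by one fixed value U (all names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// trunc i64 -> i32: keep the low 32 bits.
uint32_t trunc32(uint64_t V) { return static_cast<uint32_t>(V); }

// (trunc (-1 * A)) + (trunc A) + (-1 * (trunc U)), modulo 2^32.
uint32_t unfolded(uint64_t A, uint64_t U) {
  return trunc32(0 - A) + trunc32(A) + (0u - trunc32(U));
}

// (-1 * (trunc U)), modulo 2^32.
uint32_t folded(uint64_t U) { return 0u - trunc32(U); }

// Checks that both forms agree for one choice of inputs.
void checkFold(uint64_t A, uint64_t U) {
  assert(unfolded(A, U) == folded(U));
}
```

Calling checkFold with arbitrary values of A and U should never trip the
assertion, since trunc32(0 - A) + trunc32(A) is always 0 modulo 2^32.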

//===---------------------------------------------------------------------===//