llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-15 09:51:00 +00:00


Zi Xuan Wu 5d5b98eb29 recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize

In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently,
register pressure is not estimated separately for different register classes (in particular, scalar float types
should not be counted together with scalar int types), so the estimate is inaccurate. Specifically,
it causes too much interleaving/unrolling, which results in too many register spills in the loop body and hurts performance.

So we need to classify the register classes at the IR level. Importantly, these are abstract register classes,
not the target register classes that the backend provides in its td file. They are used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types.

For example, on the POWER target the register count is special when VSX is enabled. With VSX enabled, there are 32 int scalar registers (GPR)
and 64 float scalar registers (VSR), while int and float vector registers both number 64 (VSR). So there should be 2 kinds of register class
when VSX is enabled, and 3 kinds of register class when VSX is NOT enabled.
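
The mapping described above can be pictured with a small sketch. It is only an illustration of the idea, not the code from the patch: the function name `register_class`, the class labels "FPR" and "VR" for the non-VSX case, and the table name `VSX_REGISTER_COUNT` are all hypothetical. Only the VSX register counts (32 GPR, 64 VSR) come from the text.

```python
def register_class(is_vector: bool, is_float: bool, has_vsx: bool) -> str:
    """Map an IR value kind to an abstract register class (illustrative labels)."""
    if has_vsx:
        # Two classes: scalar ints live in GPRs, everything else shares the VSRs.
        return "GPR" if (not is_vector and not is_float) else "VSR"
    # Three classes without VSX; "FPR" and "VR" are assumed names for
    # the scalar-float and vector classes.
    if is_vector:
        return "VR"
    return "FPR" if is_float else "GPR"


# Register counts stated in the text for the VSX-enabled case.
VSX_REGISTER_COUNT = {"GPR": 32, "VSR": 64}

if __name__ == "__main__":
    for vsx in (True, False):
        kinds = {register_class(v, f, vsx) for v in (False, True) for f in (False, True)}
        print("VSX" if vsx else "no VSX", sorted(kinds))
```

With such a mapping, live ranges can be counted per abstract class and each count compared against that class's own register budget, rather than against a single shared limit.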

On the POWER target, this gives a big (+~30%) performance improvement in one specific benchmark (503.bwaves_r) of SPEC2017 and no other obvious regressions.

Differential revision: https://reviews.llvm.org/D67148

llvm-svn: 374634

2019-10-12 02:53:04 +00:00

AliasAnalysis.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

AliasAnalysisEvaluator.cpp

…

AliasAnalysisSummary.cpp

Move CFLGraph and the AA summary code over to the new CallBase

2019-02-11 09:25:41 +00:00

AliasAnalysisSummary.h

Move CFLGraph and the AA summary code over to the new CallBase

2019-02-11 09:25:41 +00:00

AliasSetTracker.cpp

[LICM/AST] Check if the AliasAny set is removed from the tracker.

2019-09-12 18:09:47 +00:00

Analysis.cpp

[MustExec] Add a generic "must-be-executed-context" explorer

2019-08-23 15:17:27 +00:00

AssumptionCache.cpp

Fix: Actually erase remove the elements from AssumeHandles

2019-10-02 17:35:06 +00:00

BasicAliasAnalysis.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

BlockFrequencyInfo.cpp

Add optional arg to profile count getters to filter

2019-04-24 19:51:16 +00:00

BlockFrequencyInfoImpl.cpp

Add optional arg to profile count getters to filter

2019-04-24 19:51:16 +00:00

BranchProbabilityInfo.cpp

[BPI] Adjust the probability for floating point unordered comparison

2019-09-10 17:25:11 +00:00

CallGraph.cpp

Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata"

2019-08-16 10:59:18 +00:00

CallGraphSCCPass.cpp

[CallSite removal] Move the legacy PM, call graph, and some inliner

2019-04-19 05:59:42 +00:00

CallPrinter.cpp

…

CaptureTracking.cpp

[CaptureTracker] Let subclasses provide dereferenceability information

2019-08-19 21:56:38 +00:00

CFG.cpp

Recommit "[GVN] Preserve loop related analysis/canonical forms."

2019-07-31 09:27:54 +00:00

CFGPrinter.cpp

Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC

2019-08-05 05:43:48 +00:00

CFLAndersAliasAnalysis.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

CFLGraph.h

[CFLGraph] Add support for unary fneg instruction.

2019-06-06 19:21:23 +00:00

CFLSteensAliasAnalysis.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

CGSCCPassManager.cpp

Revert "[CallGraph] Refine call graph for indirect calls with !callees metadata"

2019-08-16 10:59:18 +00:00

CMakeLists.txt

[SVFS] Vector Function ABI demangling.

2019-09-19 17:47:32 +00:00

CmpInstAnalysis.cpp

…

CodeMetrics.cpp

Remove CallSite from the CodeMetrics analysis, moving it to the new

2019-02-11 09:03:32 +00:00

ConstantFolding.cpp

[ConstantFolding] Fold constant calls to log2()

2019-09-30 20:53:23 +00:00

CostModel.cpp

…

DDG.cpp

[DDG] Data Dependence Graph - Root Node

2019-10-01 19:32:42 +00:00

Delinearization.cpp

…

DemandedBits.cpp

[DemandedBits] Remove some redundancy in the work list

2019-03-03 14:50:01 +00:00

DependenceAnalysis.cpp

[llvm] Migrate llvm::make_unique to std::make_unique

2019-08-15 15:54:37 +00:00

DependenceGraphBuilder.cpp

[DDG] Data Dependence Graph - Root Node

2019-10-01 19:32:42 +00:00

DivergenceAnalysis.cpp

[DivergenceAnalysis] Add methods for querying divergence at use

2019-07-29 10:22:09 +00:00

DominanceFrontier.cpp

…

DomPrinter.cpp

…

DomTreeUpdater.cpp

[DTU] Refine the interface and logic of applyUpdates

2019-02-22 13:48:38 +00:00

EHPersonalities.cpp

…

GlobalsModRef.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

GuardUtils.cpp

[WideableCond] Fix a nasty bug in detection of "explicit guards"

2019-04-02 16:51:43 +00:00

IndirectCallPromotionAnalysis.cpp

[llvm] Migrate llvm::make_unique to std::make_unique

2019-08-15 15:54:37 +00:00

InlineCost.cpp

[SVE][IR] Scalable Vector size queries and IR instruction support

2019-10-08 12:53:54 +00:00

InstCount.cpp

…

InstructionPrecedenceTracking.cpp

Make widenable condition transparent for MemoryWriteTracking

2019-02-14 11:10:29 +00:00

InstructionSimplify.cpp

[InstCombine] Simplify fma multiplication to nan for undef or nan operands.

2019-10-02 12:32:52 +00:00

Interval.cpp

…

IntervalPartition.cpp

…

IVDescriptors.cpp

[IR] allow fast-math-flags on phi of FP values (2nd try)

2019-09-25 14:35:02 +00:00

IVUsers.cpp

…

LazyBlockFrequencyInfo.cpp

…

LazyBranchProbabilityInfo.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

LazyCallGraph.cpp

Second attempt to add iterator_range::empty()

2019-10-07 18:14:24 +00:00

LazyValueInfo.cpp

[LVI] Look through extractvalue of insertvalue

2019-09-07 12:03:59 +00:00

LegacyDivergenceAnalysis.cpp

Remove an unnecessary cast. NFC.

2019-10-02 08:56:33 +00:00

Lint.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

LLVMBuild.txt

…

Loads.cpp

[LV] Support invariant addresses in speculation logic

2019-09-12 16:49:10 +00:00

LoopAccessAnalysis.cpp

LoopAccessAnalysis isConsecutiveAccess() - silence static analyzer dyn_cast<SCEVConstant> null dereference warning. NFCI.

2019-10-02 13:08:56 +00:00

LoopAnalysisManager.cpp

[LoopPassManager + MemorySSA] Only enable use of MemorySSA for LPMs known to preserve it.

2019-08-21 17:00:57 +00:00

LoopCacheAnalysis.cpp

[llvm] Migrate llvm::make_unique to std::make_unique

2019-08-15 15:54:37 +00:00

LoopInfo.cpp

[LOOPGUARD] Remove asserts in getLoopGuardBranch

2019-10-06 16:39:43 +00:00

LoopPass.cpp

ftime-trace: Trace loop passes

2019-05-31 10:14:04 +00:00

LoopUnrollAnalyzer.cpp

[InstSimplify] Rename SimplifyFPUnOp and SimplifyFPBinOp

2019-07-24 12:50:10 +00:00

MemDepPrinter.cpp

…

MemDerefPrinter.cpp

OpaquePtr: add Type parameter to Loads analysis API.

2019-07-09 11:35:35 +00:00

MemoryBuiltins.cpp

[Alignment][NFC] Remove unneeded llvm:: scoping on Align types

2019-09-27 12:54:21 +00:00

MemoryDependenceAnalysis.cpp

Change TargetLibraryInfo analysis passes to always require Function

2019-09-07 03:09:36 +00:00

MemoryLocation.cpp

…

MemorySSA.cpp

MemorySSA tryOptimizePhi - assert that we've found a DefChainEnd. NFCI.

2019-10-02 13:09:04 +00:00

MemorySSAUpdater.cpp

[MemorySSA] Update Phi simplification.

2019-10-10 23:27:21 +00:00

ModuleDebugInfoPrinter.cpp

…

ModuleSummaryAnalysis.cpp

IR. Change strip* family of functions to not look through aliases.

2019-08-22 19:56:14 +00:00

MustExecute.cpp

[MustExec] Add a generic "must-be-executed-context" explorer

2019-08-23 15:17:27 +00:00

ObjCARCAliasAnalysis.cpp

[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.

2019-03-22 17:22:19 +00:00

ObjCARCAnalysisUtils.cpp

…

ObjCARCInstKind.cpp

[ObjC][ARC] Delete ObjC runtime calls on global variables annotated

2019-06-14 22:06:32 +00:00

OptimizationRemarkEmitter.cpp

[llvm] Migrate llvm::make_unique to std::make_unique

2019-08-15 15:54:37 +00:00

OrderedBasicBlock.cpp

Recommit "[DSE] Preserve basic block ordering using OrderedBasicBlock."

2019-03-29 14:10:24 +00:00

OrderedInstructions.cpp

[llvm] Migrate llvm::make_unique to std::make_unique

2019-08-15 15:54:37 +00:00

PHITransAddr.cpp

…

PhiValues.cpp

…

PostDominators.cpp

…

ProfileSummaryInfo.cpp

[PGO][PGSO] ProfileSummary changes.

2019-09-24 22:17:51 +00:00

PtrUseVisitor.cpp

SROA: Allow eliminating addrspacecasted allocas

2019-06-14 21:38:31 +00:00

README.txt

…

RegionInfo.cpp

…

RegionPass.cpp

[IR] Refactor attribute methods in Function class (NFC)

2019-04-04 22:40:06 +00:00

RegionPrinter.cpp

…

ScalarEvolution.cpp

[SCEV] Add stricter verification option.

2019-10-11 11:46:40 +00:00

ScalarEvolutionAliasAnalysis.cpp

[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.

2019-03-22 17:22:19 +00:00

ScalarEvolutionExpander.cpp

[PatternMatch] Make m_Br more flexible, add matchers for BB values.

2019-09-25 15:05:08 +00:00

ScalarEvolutionNormalization.cpp

…

ScopedNoAliasAA.cpp

[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.

2019-03-22 17:22:19 +00:00

StackSafetyAnalysis.cpp

IR. Change strip* family of functions to not look through aliases.

2019-08-22 19:56:14 +00:00

StratifiedSets.h

…

SyncDependenceAnalysis.cpp

[SDA] Don't stop divergence propagation at the IPD.

2019-09-18 13:40:22 +00:00

SyntheticCountsUtils.cpp

…

TargetLibraryInfo.cpp

[TLI][AMDGPU] AMDPAL does not have library functions

2019-09-11 07:26:39 +00:00

TargetTransformInfo.cpp

recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize

2019-10-12 02:53:04 +00:00

Trace.cpp

…

TypeBasedAliasAnalysis.cpp

[AliasAnalysis] Second prototype to cache BasicAA / anyAA state.

2019-03-22 17:22:19 +00:00

TypeMetadataUtils.cpp

Dead Virtual Function Elimination

2019-10-11 11:59:55 +00:00

ValueLattice.cpp

…

ValueLatticeUtils.cpp

…

ValueTracking.cpp

[ValueTracking] Improve pointer offset computation for cases of same base

2019-10-10 21:30:43 +00:00

VectorUtils.cpp

[Alignment][NFC] Make VectorUtils uas llvm::Align

2019-10-10 12:35:04 +00:00

VFABIDemangling.cpp

[SVFS] Vector Function ABI demangling.

2019-09-19 17:47:32 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
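
As a sanity check, the equivalence of the two forms can be tested numerically. The sketch below is not LLVM code: `eval_chrec` and `current_scev_form` are hypothetical helper names, the recurrence is evaluated with the usual binomial-coefficient formula, and it assumes the exit value is taken at iteration %n - 1 (the iteration at which the two forms agree).

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def eval_chrec(coeffs, i):
    """Evaluate the add recurrence {c0,+,c1,+,c2,...} at iteration i >= 0."""
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def current_scev_form(n):
    """Model the long expression above with explicit i64/i65 wrap-around."""
    a = (n - 2) & MASK64                 # zext i64 (-2 + %n) to i65
    b = (n - 1) & MASK64                 # zext i64 (-1 + %n) to i65
    half = ((a * b) & MASK65) // 2       # i65 multiply, then /u 2
    return (-2 + 2 * (half & MASK64) + 3 * n) & MASK64


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 1000):
        assert eval_chrec([1, 3, 2], n - 1) == n * n
        assert current_scev_form(n) == (n * n) & MASK64
    print("both forms agree with n * n")
```

The i65 intermediate is there only so that the product can be halved without losing its top bit; the simpler (%n * %n) form needs no widening at all.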

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
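
The fold is valid because truncation commutes with negation and addition in modular arithmetic, so the first two terms cancel. A minimal check, assuming `undef` is modelled as one fixed but arbitrary 64-bit value `u` (it appears only once in the expression); the helper names `trunc32`, `unfolded`, and `folded` are hypothetical:

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def trunc32(x):
    """Model 'trunc i64 x to i32' by keeping the low 32 bits."""
    return x & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution currently forms, evaluated in i32."""
    neg_arg5 = (-1 * arg5) & MASK64      # i64 negation with wrap-around
    return (trunc32(neg_arg5) + trunc32(arg5) + (-1 * trunc32(u))) & MASK32


def folded(u):
    """The simplified expression, evaluated in i32."""
    return (-1 * trunc32(u)) & MASK32


if __name__ == "__main__":
    for arg5 in (0, 1, 12345, MASK64, 1 << 40):
        for u in (0, 7, MASK64, 1 << 33):
            assert unfolded(arg5, u) == folded(u)
    print("folded and unfolded forms agree")
```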

//===---------------------------------------------------------------------===//