Mirror of https://github.com/RPCS3/llvm.git (synced 2024-12-14 15:39:06 +00:00)
[PM] Flesh out almost all of the late loop passes.
With this the per-module pass pipeline is *extremely* close to the legacy PM.
The missing pieces are:
- PruneEH (or some equivalent)
- ArgumentPromotion
- LoopLoadElimination
- LoopUnswitch

I'm going to work through those in essentially that order but this seems
like a worthwhile incremental step toward the end state.

One difference in what I have here from the legacy PM is that I've
consolidated some of the per-function passes at the very end of the
pipeline into the main optimization function pipeline. The intervening
passes are *really* uninteresting and so this seems very unlikely to have
any effect other than minor improvement to locality.

Note that there are still some failures in the test suite, but the compiler
doesn't crash or assert.

Differential Revision: https://reviews.llvm.org/D29114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293241 91177308-0d34-0410-b5e6-96231b3b80d8
parent 49c910dae1
commit f8e4cd6c85
@@ -491,32 +491,60 @@ PassBuilder::buildPerModuleDefaultPipeline(OptimizationLevel Level,
// Optimize the loop execution. These passes operate on entire loop nests
// rather than on each loop in an inside-out manner, and so they are actually
// function passes.

// First rotate loops that may have been un-rotated by prior passes.
OptimizePM.addPass(createFunctionToLoopPassAdaptor(LoopRotatePass()));

// Distribute loops to allow partial vectorization. I.e. isolate dependences
// into separate loop that would otherwise inhibit vectorization. This is
// currently only performed for loops marked with the metadata
// llvm.loop.distribute=true or when -enable-loop-distribute is specified.
OptimizePM.addPass(LoopDistributePass());

// Now run the core loop vectorizer.
OptimizePM.addPass(LoopVectorizePass());

// FIXME: Need to port Loop Load Elimination and add it here.

// Cleanup after the loop optimization passes.
OptimizePM.addPass(InstCombinePass());


// Now that we've formed fast to execute loop structures, we do further
// optimizations. These are run afterward as they might block doing complex
// analyses and transforms such as what are needed for loop vectorization.

// Optimize parallel scalar instruction chains into SIMD instructions.
OptimizePM.addPass(SLPVectorizerPass());

// Cleanup after vectorizers.
// Cleanup after all of the vectorizers.
OptimizePM.addPass(SimplifyCFGPass());
OptimizePM.addPass(InstCombinePass());

// Unroll small loops to hide loop backedge latency and saturate any parallel
// execution resources of an out-of-order processor.
// FIXME: Need to add once loop pass pipeline is available.

// FIXME: Add the loop sink pass when ported.

// FIXME: Add cleanup from the loop pass manager when we're forming LCSSA
// here.
// execution resources of an out-of-order processor. We also then need to
// clean up redundancies and loop invariant code.
// FIXME: It would be really good to use a loop-integrated instruction
// combiner for cleanup here so that the unrolling and LICM can be pipelined
// across the loop nests.
OptimizePM.addPass(createFunctionToLoopPassAdaptor(LoopUnrollPass::create()));
OptimizePM.addPass(InstCombinePass());
OptimizePM.addPass(createFunctionToLoopPassAdaptor(LICMPass()));

// Now that we've vectorized and unrolled loops, we may have more refined
// alignment information, try to re-derive it here.
OptimizePM.addPass(AlignmentFromAssumptionsPass());

// ADd the core optimizing pipeline.
// LoopSink pass sinks instructions hoisted by LICM, which serves as a
// canonicalization pass that enables other optimizations. As a result,
// LoopSink pass needs to be a very late IR pass to avoid undoing LICM
// result too early.
OptimizePM.addPass(LoopSinkPass());

// And finally clean up LCSSA form before generating code.
OptimizePM.addPass(InstSimplifierPass());

// Add the core optimizing pipeline.
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(OptimizePM)));

// Now we need to do some global optimization transforms.
@@ -129,6 +129,7 @@
; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
; CHECK-O-NEXT: Starting llvm::Function pass manager run.
; CHECK-O-NEXT: Running pass: Float2IntPass
; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopRotatePass
; CHECK-O-NEXT: Running pass: LoopDistributePass
; CHECK-O-NEXT: Running pass: LoopVectorizePass
; CHECK-O-NEXT: Running analysis: BlockFrequencyAnalysis
@@ -137,7 +138,12 @@
; CHECK-O-NEXT: Running pass: SLPVectorizerPass
; CHECK-O-NEXT: Running pass: SimplifyCFGPass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LoopUnrollPass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass
; CHECK-O-NEXT: Running pass: AlignmentFromAssumptionsPass
; CHECK-O-NEXT: Running pass: LoopSinkPass
; CHECK-O-NEXT: Running pass: InstSimplifierPass
; CHECK-O-NEXT: Finished llvm::Function pass manager run.
; CHECK-O-NEXT: Running pass: GlobalDCEPass
; CHECK-O-NEXT: Running pass: ConstantMergePass
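
As a rough illustration of the ordering this change sets up, the sketch below models the late function pipeline with toy stand-ins that only record pass names. `ToyFunctionPM`, `toyFunctionToLoopAdaptor`, `indexOf`, and `buildLateLoopPipeline` are hypothetical names invented for this example and are not LLVM APIs; the real code uses the pass manager and `createFunctionToLoopPassAdaptor` calls shown in the diff. The pass names and their order are copied from the `addPass` calls in the C++ hunk, and analysis runs are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a function pass manager (NOT the LLVM class): it only
// remembers the names of the passes in the order they were added.
struct ToyFunctionPM {
  std::vector<std::string> Passes;

  void addPass(std::string Name) {
    assert(!Name.empty() && "a pass needs a name");
    Passes.push_back(std::move(Name));
  }
};

// Hypothetical analogue of createFunctionToLoopPassAdaptor: wraps a loop
// pass name so that it can sit in a function-level pipeline.
std::string toyFunctionToLoopAdaptor(const std::string &LoopPass) {
  return "FunctionToLoopPassAdaptor<" + LoopPass + ">";
}

// Position of the first pass with the given name, or -1 if it is absent.
std::ptrdiff_t indexOf(const ToyFunctionPM &PM, const std::string &Name) {
  auto It = std::find(PM.Passes.begin(), PM.Passes.end(), Name);
  return It == PM.Passes.end() ? -1 : It - PM.Passes.begin();
}

// Mirrors the order of the addPass calls in the C++ hunk of this commit.
ToyFunctionPM buildLateLoopPipeline() {
  ToyFunctionPM PM;
  // Loop canonicalization and vectorization.
  PM.addPass(toyFunctionToLoopAdaptor("LoopRotatePass"));
  PM.addPass("LoopDistributePass");
  PM.addPass("LoopVectorizePass");
  PM.addPass("InstCombinePass");
  // SLP vectorization and cleanup after all of the vectorizers.
  PM.addPass("SLPVectorizerPass");
  PM.addPass("SimplifyCFGPass");
  PM.addPass("InstCombinePass");
  // Unrolling, then cleanup of redundancies and loop-invariant code.
  PM.addPass(toyFunctionToLoopAdaptor("LoopUnrollPass"));
  PM.addPass("InstCombinePass");
  PM.addPass(toyFunctionToLoopAdaptor("LICMPass"));
  PM.addPass("AlignmentFromAssumptionsPass");
  // LoopSink runs late so that it does not undo LICM's result too early.
  PM.addPass("LoopSinkPass");
  PM.addPass("InstSimplifierPass");
  return PM;
}
```

Calling `buildLateLoopPipeline()` and walking `Passes` gives the pass order from the diff in one place. The only ordering constraint spelled out in the code comments is that `LoopSinkPass` comes after `LICMPass`, which `indexOf` makes easy to check.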