llvm/lib/Analysis
Karthik Bhat e637d65af3 Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 04:56:54 +00:00
..
IPA
AliasAnalysis.cpp
AliasAnalysisCounter.cpp
AliasAnalysisEvaluator.cpp
AliasDebugger.cpp
AliasSetTracker.cpp
Analysis.cpp
BasicAliasAnalysis.cpp Use range based for loops to avoid needing to re-mention SmallPtrSet size. 2014-08-24 23:23:06 +00:00
BlockFrequencyInfo.cpp
BlockFrequencyInfoImpl.cpp
BranchProbabilityInfo.cpp
CaptureTracking.cpp
CFG.cpp
CFGPrinter.cpp
CGSCCPassManager.cpp
CMakeLists.txt
CodeMetrics.cpp
ConstantFolding.cpp Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. 2014-08-21 05:55:13 +00:00
CostModel.cpp
Delinearization.cpp
DependenceAnalysis.cpp Analysis: unique_ptr-ify DependenceAnalysis::collectCoeffInfo 2014-08-25 00:28:43 +00:00
DominanceFrontier.cpp
DomPrinter.cpp
InstCount.cpp
InstructionSimplify.cpp
Interval.cpp
IntervalPartition.cpp
IVUsers.cpp Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. 2014-08-21 05:55:13 +00:00
JumpInstrTableInfo.cpp
LazyCallGraph.cpp
LazyValueInfo.cpp In LVI(Lazy Value Info), originally value on a BB can only be caculated once, 2014-08-11 05:02:04 +00:00
LibCallAliasAnalysis.cpp
LibCallSemantics.cpp
Lint.cpp Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. 2014-08-21 05:55:13 +00:00
LLVMBuild.txt
Loads.cpp
LoopInfo.cpp
LoopPass.cpp
Makefile
MemDepPrinter.cpp
MemoryBuiltins.cpp
MemoryDependenceAnalysis.cpp Use range based for loops to avoid needing to re-mention SmallPtrSet size. 2014-08-24 23:23:06 +00:00
ModuleDebugInfoPrinter.cpp
NoAliasAnalysis.cpp
PHITransAddr.cpp
PostDominators.cpp
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp
ScalarEvolutionAliasAnalysis.cpp
ScalarEvolutionExpander.cpp
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp
SparsePropagation.cpp
TargetTransformInfo.cpp Allow vectorization of division by uniform power of 2. 2014-08-25 04:56:54 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp
ValueTracking.cpp ValueTracking: Figure out more bits when looking at add/sub 2014-08-22 00:40:43 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//