mirror of
https://github.com/RPCSX/llvm.git
synced 2025-01-10 22:46:20 +00:00
8fea35acad
This patch is a follow up from r240560 and is a step further into mitigating the compile time performance issues in CaptureTracker. By providing the CaptureTracker with a "cached ordered basic block" instead of computing it every time, MemDepAnalysis can use this cache throughout its calls to AA->callCapturesBefore, avoiding to recompute it for every scanned instruction. In the same testcase used in r240560, compile time is reduced from 2min to 30s. This also fixes PR22348. rdar://problem/19230319 Differential Revision: http://reviews.llvm.org/D11364 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243750 91177308-0d34-0410-b5e6-96231b3b80d8
Analysis Opportunities: //===---------------------------------------------------------------------===// In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the ScalarEvolution expression for %r is this: {1,+,3,+,2}<loop> Outside the loop, this could be evaluated simply as (%n * %n), however ScalarEvolution currently evaluates it as (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n)) In addition to being much more complicated, it involves i65 arithmetic, which is very inefficient when expanded into code. //===---------------------------------------------------------------------===// In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is forming this expression: ((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32))) This could be folded to (-1 * (trunc i64 undef to i32)) //===---------------------------------------------------------------------===//