llvm/lib/Analysis
Alexandre Isoard 3b88873b05 [SCEV] Add URem support to SCEV
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to

    %q = udiv <ty> %t, %b
    %s = mul <ty> nuw %q, %b
    %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented
with minimal effort using that relation:

    %r --> (-%b * (%t /u %b)) + %t

We implement two special cases:

  - if %b is 1, the result is always 0
  - if %b is a power-of-two, we produce a zext/trunc based expression instead

That is, the following code:

    %r = urem i32 %t, 65536

Produces:

    %r --> (zext i16 (trunc i32 %a to i16) to i32)

Note that while this helps get a tighter bound on the range analysis and the
known-bits analysis, this exposes some normalization shortcoming of SCEVs:

    %div = udim i32 %a, 65536
    %mul = mul i32 %div, 65536
    %rem = urem i32 %a, 65536
    %add = add i32 %mul, %rem

Will usually not be reduced.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312329 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 14:59:59 +00:00
..
2017-07-09 05:54:44 +00:00
2017-08-25 12:38:53 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//