sure that the loop we're promoting into a memcpy doesn't mutate the input
of the memcpy. Before we were just checking that the dest of the memcpy
wasn't mod/ref'd by the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122712 91177308-0d34-0410-b5e6-96231b3b80d8
blocks in a loop, instead of just the header block. This makes it more
aggressive, able to handle Duncan's Ada examples.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122704 91177308-0d34-0410-b5e6-96231b3b80d8
in the PR, the pass could break LCSSA form when inserting preheaders. It probably
would be easy enough to fix this, but since currently we always go into LCSSA form
after running this pass, doing so is not urgent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122695 91177308-0d34-0410-b5e6-96231b3b80d8
header for now for memset/memcpy opportunities. It turns out that loop-rotate
is successfully rotating loops, but *DOESN'T MERGE THE BLOCKS*, turning "for
loops" into 2 basic block loops that loop-idiom was ignoring.
With this fix, we form many *many* more memcpy and memsets than before, including
on the "history" loops in the viterbi benchmark, which look like this:
for (j=0; j<MAX_history; ++j) {
history_new[i][j+1] = history[2*i][j];
}
Transforming these loops into memcpy's speeds up the viterbi benchmark from
11.98s to 3.55s on my machine. Woo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122685 91177308-0d34-0410-b5e6-96231b3b80d8
numbering, in which it considers (for example) "%a = add i32 %x, %y" and
"%b = add i32 %x, %y" to be equal because the operands are equal and the
result of the instructions only depends on the values of the operands.
This has almost no effect (it removes 4 instructions from gcc-as-one-file),
and perhaps slows down compilation: I measured a 0.4% slowdown on the large
gcc-as-one-file testcase, but it wasn't statistically significant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122654 91177308-0d34-0410-b5e6-96231b3b80d8
the original instruction, half the cases were missed (making it not
wrong but suboptimal). Also correct a typo (A <-> B) in the second
chunk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122414 91177308-0d34-0410-b5e6-96231b3b80d8
if both A op B and A op C simplify. This fires fairly often but doesn't
make that much difference. On gcc-as-one-file it removes two "and"s and
turns one branch into a select.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122399 91177308-0d34-0410-b5e6-96231b3b80d8
I still think that LVI should be handling this, but that capability is some ways off in the future,
and this matters for some significant benchmarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122378 91177308-0d34-0410-b5e6-96231b3b80d8
not assume this (for example in case more transforms get added below
it). Suggested by Frits van Bommel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122332 91177308-0d34-0410-b5e6-96231b3b80d8
quite often, but don't make much difference in practice presumably because
instcombine also knows them and more.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122328 91177308-0d34-0410-b5e6-96231b3b80d8
a couple of existing transforms. This fires surprisingly often, for
example when compiling gcc "(X+(-1))+1->X" fires quite a lot as well
as various "and" simplifications (usually with a phi node operand).
Most of the time this doesn't make a real difference since the same
thing would have been done elsewhere anyway, eg: by instcombine, but
there are a few places where this results in simplifications that we
were not doing before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122326 91177308-0d34-0410-b5e6-96231b3b80d8
(they had just been forgotten before). Adding Xor causes "main" in the
existing testcase 2010-11-01-lshr-mask.ll to be hugely more simplified.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122245 91177308-0d34-0410-b5e6-96231b3b80d8
argument. The generated alloca has to have at least the alignment of the
byval, if not, the client may be making assumptions that the new alloca won't
satisfy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122234 91177308-0d34-0410-b5e6-96231b3b80d8
This resolves a README entry and technically resolves PR4916,
but we still get poor code for the testcase in that PR because
GVN isn't CSE'ing uadd with add, filed as PR8817.
Previously we got:
_test7: ## @test7
addq %rsi, %rdi
cmpq %rdi, %rsi
movl $42, %eax
cmovaq %rsi, %rax
ret
Now we get:
_test7: ## @test7
addq %rsi, %rdi
movl $42, %eax
cmovbq %rsi, %rax
ret
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122182 91177308-0d34-0410-b5e6-96231b3b80d8
sadd formed is half the size of the original type. We can
now compile this into a sadd.i8:
unsigned char X(char a, char b) {
int res = a+b;
if ((unsigned )(res+128) > 255U)
abort();
return res;
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122178 91177308-0d34-0410-b5e6-96231b3b80d8
checking to see if the high bits of the original add result were dead.
Inserting a smaller add and zexting back to that size is not good enough.
This is likely to be the fix for 8816.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122177 91177308-0d34-0410-b5e6-96231b3b80d8
which have trapping constant exprs in them due to PHI nodes.
Eliminating them can cause the constant expr to be evalutated
on new paths if the input edges are critical.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122164 91177308-0d34-0410-b5e6-96231b3b80d8