ReplaceNodeResults: rather than returning a node which
must have the same number of results as the original
node (which means mucking around with MERGE_VALUES,
and which is also easy to get wrong since SelectionDAG
folding may mean you don't get the node you expect),
return the results in a vector.
llvm-svn: 60348
instead of std::sort. This shrinks the release-asserts LSR.o file
by 1100 bytes of code on my system.
We should start using array_pod_sort where possible.
llvm-svn: 60335
buggy rewrite, this notifies ScalarEvolution of a pending instruction
about to be removed and then erases it, instead of erasing it then
notifying.
llvm-svn: 60329
new instructions it simplifies. Because we're threading jumps on edges
with constants coming in from PHI's, we inherently are exposing a lot more
constants to the new block. Folding them and deleting dead conditions
allows the cost model in jump threading to be more accurate as it iterates.
llvm-svn: 60327
elimination: when finding dependent load/stores, realize that
they are the same if aliasing claims must alias instead of relying
on the pointers to be exactly equal. This makes load elimination
more aggressive. For example, on 403.gcc, we had:
< 68 gvn - Number of instructions PRE'd
< 152718 gvn - Number of instructions deleted
< 49699 gvn - Number of loads deleted
< 6153 memdep - Number of dirty cached non-local responses
< 169336 memdep - Number of fully cached non-local responses
< 162428 memdep - Number of uncached non-local responses
now we have:
> 64 gvn - Number of instructions PRE'd
> 153623 gvn - Number of instructions deleted
> 49856 gvn - Number of loads deleted
> 5022 memdep - Number of dirty cached non-local responses
> 159030 memdep - Number of fully cached non-local responses
> 162443 memdep - Number of uncached non-local responses
That's an extra 157 loads deleted and extra 905 other instructions nuked.
This slows down GVN very slightly, from 3.91 to 3.96s.
llvm-svn: 60314
vector instead of a densemap. This shrinks the memory usage of this thing
substantially (the high water mark) as well as making operations like
scanning it faster. This speeds up memdep slightly, gvn goes from
3.9376 to 3.9118s on 403.gcc
This also splits out the statistics for the cached non-local case to
differentiate between the dirty and clean cached case. Here's the stats
for 403.gcc:
6153 memdep - Number of dirty cached non-local responses
169336 memdep - Number of fully cached non-local responses
162428 memdep - Number of uncached non-local responses
yay for caching :)
llvm-svn: 60313
Note that the FoldOpIntoPhi call is dead because it's impossible for the
first operand of a subtraction to be both a ConstantInt and a PHINode.
llvm-svn: 60306
multiplies.
Some more cleverness would be nice, though. It would be nice if we
could do this transformation on illegal types. Also, we would
prefer a narrower constant when possible so that we can use a narrower
multiply, which can be cheaper.
llvm-svn: 60283
"For signed integers, the determination of overflow of x*y is not so simple. If
x and y have the same sign, then overflow occurs iff xy > 2**31 - 1. If they
have opposite signs, then overflow occurs iff xy < -2**31."
In this case, x == -1.
llvm-svn: 60278
overflowed on negation. This commit checks to make sure that neithe C nor X
overflows. This requires that the RHS of X (a subtract instruction) be a
constant integer.
llvm-svn: 60275
ReverseLocalDeps when we update it. This fixes a regression test
failure from my last commit.
Second, for each non-local cached information structure, keep a bit that
indicates whether it is dirty or not. This saves us a scan over the whole
thing in the common case when it isn't dirty.
llvm-svn: 60274
instead of containing them by value. This increases the density
(!) of NonLocalDeps as well as making the reallocation case
faster. This speeds up gvn on 403.gcc by 2% and makes room for
future improvements.
I'm not super thrilled with having to explicitly manage the new/delete
of the map, but it is necesary for the next change.
llvm-svn: 60271
If we see that a load depends on the allocation of its memory with no
intervening stores, we now return a 'None' depedency instead of "Normal".
This tweaks GVN to do its optimization with the new result.
llvm-svn: 60267
dependencies. The basic situation was this: consider if we had:
store1
...
store2
...
store3
Where memdep thinks that store3 depends on store2 and store2 depends
on store1. The problem happens when we delete store2: The code in
question was updating dep info for store3 to be store1. This is a
spiffy optimization, but is not safe at all, because aliasing isn't
transitive. This bug isn't exposed today with DSE because DSE will only
zap store2 if it is identifical to store 3, and in this case, it is
safe to update it to depend on store1. However, memcpyopt is not so
fortunate, which is presumably why the "dropInstruction" code used to
exist.
Since this doesn't actually provide a speedup in practice, just rip the
code out.
llvm-svn: 60263