Commit Graph

5951 Commits

Author SHA1 Message Date
Chandler Carruth
3ac929c473 Fix a miscompile introduced in r220178.
The original code had an implicit assumption that if the test for
allocas or globals was reached, the two pointers were not equal. With my
changes to make the pointer analysis more powerful here, I also had to
guard against circumstances where the results weren't useful. That in
turn violated the assumption and gave rise to a circumstance in which we
could have a store with both the queried pointer and stored pointer
rooted at *the same* alloca. Clearly, we cannot ignore such a store.
There are other things we might do in this code to better handle the
case of both pointers ending up at the same alloca or global, but it
seems best to at least make the test explicit in what it intends to
check.

I've added tests for both the alloca and global case here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220190 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 10:03:01 +00:00
Chandler Carruth
080dfb5bda Fix a somewhat subtle pair of issues with JumpThreading I introduced in
r220178. First, the creation routine doesn't insert prior to the
terminator of the basic block provided, but really at the end of the
basic block. Instead, get the terminator and insert before that. The
next issue was that we need to ensure multiple PHI node entries for
a single predecessor re-use the same cast instruction rather than
creating new ones.

All of the logic here was without tests previously. I've reduced and
added a test case from the test suite that crashed without both of these
fixes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 05:34:36 +00:00
Chandler Carruth
35c4e071be Teach the load analysis driving core instcombine logic and other bits of
logic to look through pointer casts, making them trivially stronger in
the face of loads and stores with intervening pointer casts.

I've included a few test cases that demonstrate the kind of folding
instcombine can do without pointer casts and then variations which
obfuscate the logic through bitcasts. Without this patch, the variations
all fail to optimize fully.

This is more important now than it has been in the past as I've started
moving the load canonicalization to more closely follow the value type
requirements rather than the pointer type requirements and thus this
needs to be prepared for more pointer casts. When I made the same change
to stores several test cases regressed without logic along these lines
so I wanted to systematically improve matters first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220178 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 00:24:14 +00:00
Chandler Carruth
fc1c1ec435 Add a datalayout string to this test so that it exercises the full gamut
of InstCombine rather than just the bits enabled when datalayout is
optional.

The primary fixes here are because now things are little endian.

In good news, silliness like this seems like it will be going away as
we've got pretty strong consensus on dropping optional datalayout
entirely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220176 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-20 00:11:31 +00:00
Chandler Carruth
63276ccdbd Do a better and more complete job of preserving metadata when combining
loads.

This handles many more cases than just the AA metadata, some of them
suggested by Hal in his review of the AA metadata handling patch. I've
tried to test this behavior where tractable to do so.

I'll point out that I have specifically *not* included a test for
debuginfo because it was going to require 2 or 3 times as much work to
craft some input which would survive the "helpful" stripping of debug
info metadata that doesn't match the desired schema. This is another
good example of why the current state of write-ability for our debug
info metadata is unacceptable. I spent over 30 minutes trying to conjure
some test case that would survive, even copying from other debug info
tests, but it always failed to survive with no explanation of why or how
I might fix it. =[

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220165 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 10:46:46 +00:00
Chandler Carruth
4d2a706176 Move previously dead code to handle computing the known bits of an alias
up to where it actually works as intended. The problem is that
a GlobalAlias isa GlobalValue and so the prior block handled all of the
cases.

This allows us to constant fold based on the actual constant expression
in the global alias. As an example, see the last function in the newly
added test case which explicitly aligns an unaligned pointer using
constant expression math. Without this change, we fail to see that and
fold an alignment test to zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220164 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 09:06:56 +00:00
David Majnemer
0fd4e2e5a1 InstCombine: (sub (or A B) (xor A B)) --> (and A B)
The following implements the transformation:
(sub (or A B) (xor A B)) --> (and A B).

Patch by Ankur Garg!

Differential Revision: http://reviews.llvm.org/D5719

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220163 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 08:32:32 +00:00
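The identity behind r220163 above can be checked by brute force at a small bit width. The following is an illustrative Python sketch, not LLVM code; the helper names and the 8-bit width are assumptions made for the example.

```python
def sub_or_xor(a: int, b: int, bits: int = 8) -> int:
    """Evaluate (sub (or a b) (xor a b)) with wrap-around at `bits` bits."""
    mask = (1 << bits) - 1
    return ((a | b) - (a ^ b)) & mask


def identity_holds(bits: int = 8) -> bool:
    """Compare against (and a b) for every pair of `bits`-wide values."""
    return all(
        sub_or_xor(a, b, bits) == (a & b)
        for a in range(1 << bits)
        for b in range(1 << bits)
    )
```

The intuition is that `or` contains every bit of `xor` plus the bits common to both operands, so the subtraction never borrows and leaves exactly the common bits.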
David Majnemer
242aeb9d84 InstCombine: Optimize icmp eq/ne (shl Const2, A), Const1
The following implements the optimization for sequences of the form:
icmp eq/ne (shl Const2, A), Const1

Such sequences can be transformed to:
icmp eq/ne A, (TrailingZeros(Const1) - TrailingZeros(Const2))

This handles only the equality operators for now. Other operators need
to be handled.

Patch by Ankur Garg!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220162 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 08:23:08 +00:00
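The r220162 rewrite above can be sanity-checked in a few lines of Python. This sketch assumes both constants are powers of two and that the shift amount is below the bit width; the commit message does not spell out its preconditions, so treat those restrictions (and the helper names) as assumptions of the example rather than a description of the patch.

```python
def trailing_zeros(value: int) -> int:
    """Count the trailing zero bits of a non-zero integer."""
    return (value & -value).bit_length() - 1


def shl_compare(const2: int, a: int, const1: int, bits: int = 8) -> bool:
    """Original form: icmp eq (shl Const2, A), Const1 at `bits` bits."""
    mask = (1 << bits) - 1
    return ((const2 << a) & mask) == const1


def rewritten_compare(const2: int, a: int, const1: int) -> bool:
    """Rewritten form: icmp eq A, TrailingZeros(Const1) - TrailingZeros(Const2)."""
    return a == trailing_zeros(const1) - trailing_zeros(const2)


def agree_on_powers_of_two(bits: int = 8) -> bool:
    """Check both forms for every power-of-two constant pair and shift amount."""
    powers = [1 << k for k in range(bits)]
    return all(
        shl_compare(c2, a, c1, bits) == rewritten_compare(c2, a, c1)
        for c2 in powers
        for c1 in powers
        for a in range(bits)
    )
```

The `ne` case follows by negating both sides.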
Chandler Carruth
908d4514f6 Fix a long-standing miscompile in the load analysis that was uncovered
by my refactoring of this code.

The method isSafeToLoadUnconditionally assumes that the load will
proceed with the preferred type alignment. Given that, it has to ensure
that the alloca or global is at least that aligned. It has always done
this historically when a datalayout is present, but has never checked it
when the datalayout is absent. When I refactored the code in r220156,
I exposed this path when datalayout was present and that turned the
latent bug into a patent bug.

This fixes the issue by just removing the special case which allows
folding things without datalayout. This isn't worth the complexity of
trying to tease apart when it is or isn't safe without actually knowing
the preferred alignment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-19 08:17:50 +00:00
Chandler Carruth
797e9b812e Preserve AA metadata when combining (cast (load (...))) -> (load (cast
(...))).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 11:00:12 +00:00
Chandler Carruth
9b2d091a9c [InstCombine] Do an about-face on how LLVM canonicalizes (cast (load
...)) and (load (cast ...)): canonicalize toward the former.

Historically, we've tried to load using the type of the *pointer*, and
tried to match that type as closely as possible removing as many pointer
casts as we could and trading them for bitcasts of the loaded value.
This is deeply and fundamentally wrong.

Repeat after me: memory does not have a type! This was a hard lesson for
me to learn working on SROA.

There is only one thing that should actually drive the type used for
a pointer, and that is the type which we need to use to load from that
pointer. Matching up pointer types to the loaded value types is very
useful because it minimizes the physical size of the IR required for
no-op casts. Similarly, the only thing that should drive the type used
for a loaded value is *how that value is used*! Again, this minimizes
casts. And in fact, the *only* thing motivating types in any part of
LLVM's IR are the types used by the operations in the IR. We should
match them as closely as possible.

I've ended up removing some tests here as they were testing bugs or
behavior that is no longer present. Mostly though, this is just cleanup
to let the tests continue to function as intended.

The only fallout I've found so far from this change was SROA and I have
fixed it to not be impeded by the different type of load. If you find
more places where this change causes optimizations not to fire, those
too are likely bugs where we are assuming that the type of pointers is
"significant" for optimization purposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 06:36:22 +00:00
Chandler Carruth
a45d0c8bd1 Remove a test that was ported from the old llvm-gcc frontend test suite.
This test is pretty awesome. It is claiming to test devirtualization.
However, the code in question is not in fact devirtualized by LLVM. If
you take the original C++ test case and run it through Clang at -O3 we
fail to devirtualize it completely. It also isn't a sufficiently focused
test case.

The *reason* we fail to devirtualize it isn't because of any missing
instcombine though. Instead, it is because we fail to emit an
available_externally vtable and thus the vtable is just an external and completely
opaque. If I cause the vtable to be emitted, we successfully
devirtualize things.

Anyways, I'm just removing it because it is providing negative value at
this point: it isn't representative of the output of Clang really, LLVM
isn't doing the transform it claims to be testing, LLVM's failure to do
the transform isn't actually an LLVM bug at all and we shouldn't be
testing for it here, and finally the test is written in such a way that
it will trivially pass even when the point of the test is failing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220137 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 06:36:18 +00:00
Chandler Carruth
2402e6315d [SROA] Change how SROA does vector-based promotion of allocas to handle
cases where the alloca type, the load types, and the store types used
all disagree.

Previously, the only way that vector-based promotion occurred was if the
alloca type was a vector type. This was one of the *very* few remaining
uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns
out it was a bad idea.

The alloca type can change very easily based on the mixture of types
loaded and stored to that alloca. We shouldn't be relying on it as
a signal for very much. Instead, the source of truth should be loads and
stores. We should canonicalize the loads and stores as much as possible
and then rely on them exclusively in SROA.

When looking at loads and stores, we may find many different candidate
vector types. This change will let SROA try all of them to find a vector
type which is a viable way to promote the entire alloca to a vector
register.

With this change, it becomes possible to do better canonicalization and
optimization of loads and stores without breaking SROA in random ways,
and that should allow fixing a core source of performance loss in hot
numerical loops such as those in Eigen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220116 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-18 00:44:02 +00:00
Rafael Espindola
ec51f45338 Revert "TRE: make TRE a bit more aggressive"
This reverts commit r219899.

This also updates byval-tail-call.ll to make it clear what was breaking.
Adding r219899 again will cause the load/store to disappear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220093 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 21:25:48 +00:00
Hal Finkel
9d85eff56a [DSE] Remove no-data-layout-only type-based overlap checking
DSE's overlap checking contained special logic, used only when no DataLayout
was available, which inferred a complete overwrite when the pointee types were
equal. This logic seems fine for regular loads/stores, but does not work for
memcpy and friends. Instead of fixing this, I'm just removing it.
Philosophically, transformations should not contain enhanced behavior used only
when data layout is lacking (data layout should be strictly additive), and
maintaining these rarely-tested code paths seems not worthwhile at this stage.

Credit to Aliaksei Zasenka for the bug report and the diagnosis. The test case
(slightly reduced from that provided by Aliaksei) replaces the original
contents of test/Transforms/DeadStoreElimination/no-targetdata.ll -- a few
other tests have been updated to have a data layout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220035 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-17 11:56:00 +00:00
Rafael Espindola
2f8f1d34e3 Delete -std-compile-opts.
These days -std-compile-opts was just a silly alias for -O3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 20:00:02 +00:00
Bjorn Steinbrink
6eaa62af77 Allow call-slot optimization for destinations with a suitable dereferenceable attribute
Summary:
Currently, call slot optimization requires that if the destination is an
argument, the argument has the sret attribute. This is to ensure that
the memory access won't trap. In addition to sret, we can also allow the
optimization to happen for arguments that have the new dereferenceable
attribute, which gives the same guarantee.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219950 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 19:43:08 +00:00
Sanjay Patel
d8214db086 fold: sqrt(x * x * y) -> fabs(x) * sqrt(y)
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().

In the simplest case, this:

   y = sqrt(x * x);

becomes this:

   y = fabs(x);

This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.

Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.

Differential Revision: http://reviews.llvm.org/D5787



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 18:48:17 +00:00
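The r219944 fold above can be explored numerically. The Python sketch below uses hypothetical helper names (`sqrt_before`, `sqrt_after`) and shows the two forms agreeing on an ordinary input but diverging when `x * x` overflows, which is presumably part of why the fold is only applied when reassociation is permitted.

```python
import math


def sqrt_before(x: float, y: float) -> float:
    """Unoptimized form: sqrt(x * x * y)."""
    return math.sqrt(x * x * y)


def sqrt_after(x: float, y: float) -> float:
    """Folded form: fabs(x) * sqrt(y)."""
    return math.fabs(x) * math.sqrt(y)
```

For `x = -3.0, y = 4.0` both forms return `6.0`. For `x = 1e200, y = 1e-300` the intermediate `x * x` overflows to infinity in the original form, while the folded form stays finite.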
Akira Hatanaka
4eb03123df Reapply r219832 - InstCombine: Narrow switch instructions using known bits.
The code committed in r219832 asserted when it attempted to shrink a switch
statement whose type was larger than 64-bit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219902 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 06:00:46 +00:00
Saleem Abdulrasool
ebe6584c32 TRE: make TRE a bit more aggressive
Make tail recursion elimination a bit more aggressive.  This allows us to get
tail recursion on functions that are just branches to a different function.  The
fact that the function takes a byval argument does not restrict it from being
optimised into just a tail call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219899 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 03:27:30 +00:00
Akira Hatanaka
608d59f535 Revert r219832.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-16 01:17:02 +00:00
Sanjoy Das
a0b0184b33 Revert "r219834 - Teach ScalarEvolution to sharpen range information"
This change breaks the asan buildbots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/13468



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219878 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:46:04 +00:00
Hal Finkel
43141a0764 Preserve non-byval pointer alignment attributes using @llvm.assume when inlining
For pointer-typed function arguments, enhanced alignment can be asserted using
the 'align' attribute. When inlining, if this enhanced alignment information is
not otherwise available, preserve it using @llvm.assume-based alignment
assumptions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219876 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 23:44:41 +00:00
Sanjoy Das
40edbf130e Teach ScalarEvolution to sharpen range information.
If x is known to have the range [a, b) in a loop predicated by (icmp
ne x, a), its range can be sharpened to [a + 1, b).  Get
ScalarEvolution and hence IndVars to exploit this fact.
    
This change triggers an optimization to widen-loop-comp.ll, so it had
to be edited to get it to pass.

phabricator: http://reviews.llvm.org/D5639



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219834 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:25:28 +00:00
Akira Hatanaka
38537634e2 InstCombine: Narrow switch instructions using known bits.
Truncate the operands of a switch instruction to a narrower type if the upper
bits are known to be all ones or zeros.

rdar://problem/17720004


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 19:05:50 +00:00
Hal Finkel
6c15862fd3 [SLPVectorize] Basic ephemeral-value awareness
The SLP vectorizer should not vectorize ephemeral values. These are used to
express information to the optimizer, and vectorizing them does not lead to
faster code (because the ephemeral values are dropped prior to code generation,
vectorized or not), and obscures the information the instructions are
attempting to communicate (the logic that interprets the arguments to
@llvm.assume generically does not understand vectorized conditions).

Also, uses by ephemeral values are free (because they, and the necessary
extractelement instructions, will be dropped prior to code generation).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219816 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-15 17:35:01 +00:00
Hal Finkel
75277b9f70 [LoopVectorize] Ignore @llvm.assume for cost estimates and legality
A few minor changes to prevent @llvm.assume from interfering with loop
vectorization. First, treat @llvm.assume like the lifetime intrinsics, which
are scalarized (but don't otherwise interfere with the legality checking).
Second, ignore the cost of ephemeral instructions in the loop (these will go
away anyway during CodeGen).

Alignment assumptions and other uses of @llvm.assume can often end up inside of
loops that should be vectorized (this is not uncommon for assumptions generated
by __attribute__((align_value(n))), for example).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219741 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 22:59:49 +00:00
Sanjay Patel
3f349b2ba8 Optimize away fabs() calls when input is squared (known positive).
Eliminate library calls and intrinsic calls to fabs when the input 
is a squared value.

Note that no unsafe-math / fast-math assumptions are needed for
this optimization.

Differential Revision: http://reviews.llvm.org/D5777



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219717 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:43:11 +00:00
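The claim in r219717 above, that no fast-math assumptions are needed, can be spot-checked in Python, whose floats are IEEE-754 doubles on common platforms. The helper name below is an assumption of this sketch; it confirms that a squared value is already non-negative (or NaN), so `fabs` leaves it unchanged.

```python
import math


def fabs_of_square_is_redundant(x: float) -> bool:
    """Return True if fabs(x * x) yields the same value as x * x."""
    square = x * x
    if math.isnan(square):
        # NaN never compares equal, so only check that fabs keeps it NaN.
        return math.isnan(math.fabs(square))
    # The sign bit must already be clear, including for zero results.
    return math.fabs(square) == square and math.copysign(1.0, square) == 1.0
```

Edge cases worth trying include `-0.0`, infinities, NaN, and values small enough that the square underflows to zero.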
David Majnemer
505187a9bd InstCombine: Don't miscompile X % ((Pow2 << A) >>u B)
We assumed that A must be greater than B because the right hand side of
a remainder operator must be nonzero.

However, it is possible for A to be less than B if Pow2 is a power of
two greater than 1.

Take for example:
i32 %A = 0
i32 %B = 31
i32 Pow2 = 2147483648

((Pow2 << 0) >>u 31) is non-zero but A is less than B.

This fixes PR21274.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219713 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 20:28:40 +00:00
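The counterexample from r219713 above can be reproduced with plain integer arithmetic. This is a small Python sketch with a hypothetical helper name; it models 32-bit wrap-around with a mask.

```python
def shifted_divisor(pow2: int, a: int, b: int, bits: int = 32) -> int:
    """Compute (pow2 << a) >>u b at the given bit width."""
    mask = (1 << bits) - 1
    return ((pow2 << a) & mask) >> b
```

With `pow2 = 2147483648`, `a = 0`, and `b = 31` the divisor is non-zero even though `a` is less than `b`, which is exactly the case the old assumption ruled out.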
Hal Finkel
f0f98417ca Revert "r216914 - Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'"
Reapply r216913, a fix for PR20832 by Andrea Di Biagio. The commit was reverted
because of buildbot failures, and credit goes to Ulrich Weigand for isolating
the underlying issue (which can be confirmed by Valgrind, which does helpfully
light up like the fourth of July). Uli explained the problem with the original
patch as:

  It seems the problem is calling multiplySignificand with an addend of category
  fcZero; that is not expected by this routine.  Note that for fcZero, the
  significand parts are simply uninitialized, but the code in (or rather, called
  from) multiplySignificand will unconditionally access them -- in effect using
  uninitialized contents.

This version avoids using a category == fcZero addend within
multiplySignificand, which avoids this problem (the Valgrind output is also now
clean).

Original commit message:

[APFloat] Fixed a bug in method 'fusedMultiplyAdd'.

When folding a fused multiply-add builtin call, make sure that we propagate the
correct result in the case where the addend is zero, and the two other operands
are finite non-zero.

Example:
  define double @test() {
    %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0)
    ret double %1
  }

Before this patch, the instruction simplifier wrongly folded the builtin call
in function @test to constant 'double 7.0'.
With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and
propagates the expected result (i.e. 56.0).

Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra
test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence
of NaN/Inf operands.

This fixes PR20832.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219708 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 19:23:07 +00:00
Hal Finkel
2993617e41 [LVI] Check for @llvm.assume dominating the edge branch
When LazyValueInfo uses @llvm.assume intrinsics to provide edge-value
constraints, we should check for intrinsics that dominate the edge's branch,
not just any potential context instructions. An assumption that dominates the
edge's branch represents a truth on that edge. This is specifically useful, for
example, if multiple predecessors assume a pointer to be nonnull, allowing us
to simplify a later null comparison.

The test case, and an initial patch, were provided by Philip Reames. Thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219688 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 16:04:49 +00:00
Marcello Maggioni
db9fed93fa Switch to select optimization for two-case switches
This is the same optimization as r219233, with modifications to support PHIs
with multiple incoming edges from the same block and a test to check that this
condition is handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219656 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-14 01:58:26 +00:00
David Majnemer
af6be11a60 InstCombine: Fix miscompile in X % -Y -> X % Y transform
We assumed that negation operations of the form (0 - %Z) resulted in a
negative number.  This isn't true if %Z was originally negative.
Substituting the negative number into the remainder operation may result
in undefined behavior because the dividend might be INT_MIN.

This fixes PR21256.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219639 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 22:37:51 +00:00
David Majnemer
dfe81adbce InstCombine: Don't miscompile (x lshr C1) udiv C2
We have a transform that changes:
  (x lshr C1) udiv C2
into:
  x udiv (C2 << C1)

However, it is unsafe to do so if C2 << C1 discards any of C2's bits.

This fixes PR21255.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219634 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-13 21:48:30 +00:00
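The hazard described in r219634 above is easy to see at a small bit width. The Python sketch below (helper names and the 8-bit width are assumptions of the example) compares the two forms and checks that they agree whenever `C2 << C1` keeps all of C2's bits.

```python
def udiv_before(x: int, c1: int, c2: int) -> int:
    """Original form: (x lshr C1) udiv C2."""
    return (x >> c1) // c2


def udiv_after(x: int, c1: int, c2: int, bits: int = 8) -> int:
    """Transformed form: x udiv (C2 << C1), with the shift wrapping at `bits`.

    Raises ZeroDivisionError if the wrapped shift result is zero.
    """
    mask = (1 << bits) - 1
    return x // ((c2 << c1) & mask)


def shift_is_lossless(c1: int, c2: int, bits: int = 8) -> bool:
    """True when C2 << C1 does not discard any of C2's bits."""
    return (c2 << c1) < (1 << bits)


def agree_when_lossless(bits: int = 8) -> bool:
    """Compare both forms for every lossless (C1, C2) pair and every x."""
    return all(
        udiv_before(x, c1, c2) == udiv_after(x, c1, c2, bits)
        for c1 in range(bits)
        for c2 in range(1, 1 << bits)
        if shift_is_lossless(c1, c2, bits)
        for x in range(1 << bits)
    )
```

At 8 bits, `x = 255`, `C1 = 1`, `C2 = 129` gives 0 before the transform and 127 after it, because `129 << 1` wraps to 2.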
Joerg Sonnenberger
c6133c17e0 Revert r219223, it creates invalid PHI nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219587 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-12 17:16:04 +00:00
Benjamin Kramer
2b7b804fcc InstCombine: Turn (x != 0 & x <u C) into the canonical range check form (x-1 <u C-1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219585 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-12 14:02:34 +00:00
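The canonical range check from r219585 above can be verified exhaustively at 8 bits. In this Python sketch the helper names are hypothetical, and restricting C to non-zero values is an assumption of the example: with C equal to 0 the original comparison `x <u 0` is always false, while `C - 1` wraps to the maximum value.

```python
def two_comparisons(x: int, c: int) -> bool:
    """Original form: x != 0 & x <u C."""
    return x != 0 and x < c


def range_check(x: int, c: int, bits: int = 8) -> bool:
    """Canonical form: (x - 1) <u (C - 1), with wrap-around at `bits` bits."""
    mask = (1 << bits) - 1
    return ((x - 1) & mask) < ((c - 1) & mask)


def forms_agree(bits: int = 8) -> bool:
    """Compare both forms for every x and every non-zero C."""
    return all(
        two_comparisons(x, c) == range_check(x, c, bits)
        for c in range(1, 1 << bits)
        for x in range(1 << bits)
    )
```

The trick is that subtracting 1 wraps `x = 0` around to the largest unsigned value, so a single unsigned comparison rejects it.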
David Majnemer
171825a8ce InstCombine: Don't fold (X <<s log(INT_MIN)) /s INT_MIN to X
Consider the case where X is 2.  (2 <<s 31) /s -2147483648 is zero but we
would fold to X.  Note that this is valid when we are in the unsigned
domain because we require NUW: 2 <<u 31 results in poison.

This fixes PR21245.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219568 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 10:20:04 +00:00
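The X = 2 case from r219568 above can be replayed in Python by modelling 32-bit two's-complement wrap-around. The helpers below are hypothetical and only sketch the arithmetic; they do not model poison.

```python
def to_signed(value: int, bits: int = 32) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def sdiv(a: int, b: int) -> int:
    """Signed division that truncates toward zero, like LLVM's sdiv."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def shl_then_sdiv_int_min(x: int, bits: int = 32) -> int:
    """Compute (x << (bits - 1)) /s INT_MIN with wrap-around."""
    int_min = -(1 << (bits - 1))
    shifted = to_signed(x << (bits - 1), bits)
    return sdiv(shifted, int_min)
```

For `x = 1` the result is 1, matching the fold, but for `x = 2` the shift wraps to 0 and the result is 0 rather than 2.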
David Majnemer
9043f74acb InstCombine, InstSimplify: (%X /s C1) /s C2 isn't always 0 when C1 * C2 overflow
consider:
C1 = INT_MIN
C2 = -1

C1 * C2 overflows without a doubt but consider the following:
%x = i32 INT_MIN

This means that (%X /s C1) is 1 and (%X /s C1) /s C2 is -1.

N. B.  Move the unsigned version of this transform to InstSimplify, it
doesn't create any new instructions.

This fixes PR21243.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 10:20:01 +00:00
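The example in r219567 above works out as follows in Python. The constants and helper names are assumptions of this sketch, which models 32-bit signed division that truncates toward zero.

```python
INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1


def sdiv(a: int, b: int) -> int:
    """Signed division that truncates toward zero, like LLVM's sdiv."""
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


def nested_sdiv(x: int, c1: int, c2: int) -> int:
    """Compute (x /s c1) /s c2."""
    return sdiv(sdiv(x, c1), c2)


def product_overflows(c1: int, c2: int) -> bool:
    """True when c1 * c2 is not representable as a signed 32-bit value."""
    return not INT_MIN <= c1 * c2 <= INT_MAX
```

Here `product_overflows(INT_MIN, -1)` is true, yet `nested_sdiv(INT_MIN, INT_MIN, -1)` is -1, so folding the expression to 0 on overflow is wrong for this input.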
David Majnemer
2af441e26e InstCombine: mul to shl shouldn't preserve nsw
consider:
mul i32 nsw %x, -2147483648

this instruction will not result in poison if %x is 1

however, if we transform this into:
shl i32 nsw %x, 31

then we will be generating poison because we just shifted into the sign
bit.

This fixes PR21242.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219566 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-11 10:19:52 +00:00
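The nsw mismatch in r219566 above can be modelled by asking whether the mathematically exact result fits in a signed 32-bit integer. This Python sketch treats "does not fit" as the condition under which an nsw operation yields poison; that model and the helper names are assumptions of the example.

```python
def fits_signed(value: int, bits: int = 32) -> bool:
    """True when `value` is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def mul_nsw_ok(x: int, c: int, bits: int = 32) -> bool:
    """True when `mul nsw x, c` does not overflow in the signed sense."""
    return fits_signed(x * c, bits)


def shl_nsw_ok(x: int, amount: int, bits: int = 32) -> bool:
    """True when `shl nsw x, amount` does not overflow in the signed sense."""
    return fits_signed(x << amount, bits)
```

With `x = 1`, the multiply by -2147483648 is fine because the exact product is INT_MIN, but the shift by 31 computes positive 2147483648, which does not fit.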
Sanjay Patel
add74ff5ff Return undef on FP <-> Int conversions that overflow (PR21330).
The LLVM Lang Ref states for signed/unsigned int to float conversions:
"If the value cannot fit in the floating point value, the results are undefined."

And for FP to signed/unsigned int:
"If the value cannot fit in ty2, the results are undefined."

This matches the C definitions.

The existing behavior pins to infinity or a max int value, but that may just
lead to more confusion as seen in:
http://llvm.org/bugs/show_bug.cgi?id=21130

Returning undef will hopefully lead to a less silent failure.

Differential Revision: http://reviews.llvm.org/D5603



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219542 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 23:00:21 +00:00
Sanjoy Das
65f2077c62 This patch teaches ScalarEvolution to pick up and use !range metadata.
It also makes it more aggressive in querying range information by
adding a call to isKnownPredicateWithRanges to
isLoopBackedgeGuardedByCond and isLoopEntryGuardedByCond.

phabricator: http://reviews.llvm.org/D5638

Reviewed by: atrick, hfinkel



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 21:22:34 +00:00
Mark Heffernan
ed05e3703e This patch de-pessimizes the calculation of loop trip counts in
ScalarEvolution in the presence of multiple exits. Previously all
loop exits had to have identical counts for a loop trip count to be
considered computable. This pessimization was implemented by calling
getBackedgeTakenCount(L) rather than getExitCount(L, ExitingBlock)
inside of ScalarEvolution::getSmallConstantTripCount() (see the FIXME
in the comments of that function). The pessimization was added to fix
a corner case involving undefined behavior (pr/16130). This patch more
precisely handles the undefined behavior case allowing the pessimization
to be removed.

ControlsExit replaces IsSubExpr to more precisely track the case where
undefined behavior is expected to occur. Because undefined behavior is
tracked more precisely we can remove MustExit from ExitLimit. MustExit
was used to track the case where the limit was computed potentially
assuming undefined behavior even if undefined behavior didn't necessarily
occur.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219517 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 17:39:11 +00:00
Arnold Schwaighofer
dd8c386432 SimplifyCFG: Don't convert phis into selects if we could remove undef behavior
instead

We used to transform this:

  define void @test6(i1 %cond, i8* %ptr) {
  entry:
    br i1 %cond, label %bb1, label %bb2

  bb1:
    br label %bb2

  bb2:
    %ptr.2 = phi i8* [ %ptr, %entry ], [ null, %bb1 ]
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

into this:

  define void @test6(i1 %cond, i8* %ptr) {
    %ptr.2 = select i1 %cond, i8* null, i8* %ptr
    store i8 2, i8* %ptr.2, align 8
    ret void
  }

because the simplifycfg transformation into selects happened to run before
the simplifycfg transformation that removes unreachable control flow (we have
'unreachable control flow' due to the store to null, which is undefined
behavior).

The existing transformation that removes unreachable control flow in simplifycfg
is:

  /// If BB has an incoming value that will always trigger undefined behavior
  /// (eg. null pointer dereference), remove the branch leading here.
  static bool removeUndefIntroducingPredecessor(BasicBlock *BB)

Now we generate:

  define void @test6(i1 %cond, i8* %ptr) {
    store i8 2, i8* %ptr, align 8
    ret void
  }

I did not see any impact on the test-suite + externals.

rdar://18596215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219462 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-10 01:27:02 +00:00
Chad Rosier
c2bf8fbdf4 [Reassociate] Don't canonicalize X - undef to X + (-undef).
Phabricator Revision: http://reviews.llvm.org/D5674
PR21205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219434 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 20:06:29 +00:00
Andrea Di Biagio
c53401ee91 [InstCombine] Fix wrong folding of constant comparisons involving ashr and negative values.
This patch fixes a bug in method InstCombiner::FoldCmpCstShrCst where we
wrongly computed the distance between the highest bits set of two negative
values.

This fixes PR21222.

Differential Revision: http://reviews.llvm.org/D5700


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219406 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-09 12:41:49 +00:00
David Majnemer
b5b306e6a9 Inliner: Non-local functions in COMDATs shouldn't be dropped
A function with discardable linkage cannot be discarded if it's a member
of a COMDAT group without considering all the other COMDAT members as
well.  This sort of thing is already handled by GlobalOpt/GlobalDCE.

This fixes PR21206.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219335 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 19:32:32 +00:00
Justin Bogner
41c4eb79d8 Revert "[InstCombine] re-commit r218721 with fix for pr21199"
This seems to cause a miscompile when building clang, which causes a
bootstrapped clang to fail or crash in several of its tests.

See:
  http://lab.llvm.org:8013/builders/clang-x86_64-darwin11-RA/builds/1184
  http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/7813

This reverts commit r219282.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219317 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 16:30:22 +00:00
David Majnemer
b07b0970b1 GlobalOpt: Don't drop unused members of a Comdat
A linkonce_odr member of a COMDAT shouldn't be dropped if we need to
keep the entire COMDAT group.

This fixes PR21191.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219283 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 07:23:31 +00:00
Gerolf Hoflehner
f8b5847bc4 [InstCombine] re-commit r218721 with fix for pr21199
The icmp-select-icmp optimization targets select-icmp.eq
only. This is now ensured by testing the branch predicate
explicitly. This commit also includes the test case for pr21199.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219282 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 06:42:19 +00:00
Hans Wennborg
8315bd8ab0 Revert r219175 - [InstCombine] re-commit r218721 icmp-select-icmp optimization
This seems to have caused PR21199.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219264 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-08 01:05:57 +00:00