2287 Commits

Author SHA1 Message Date
Silviu Baranga
5f064bbdab Revert r257164 - it has caused spec2k6 failures in LTO mode
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 16:19:38 +00:00
Silviu Baranga
2bb04c9abe Re-commit r257064, this time with a fixed assert
In setInsertionPoint if the value is not a PHI, Instruction or
Argument it should be a Constant, not a ConstantExpr.

Original commit message:

[InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs

Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.

InstCombine was already doing this when comparing two GEPs if the base
pointers were the same. However, in the case where we have complex
pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to
or from integers, etc) the value of the original base pointer will be
hidden to the optimizer and this transformation will be disabled.

This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the relevant
uses of GEPs to GEPs with a common base pointer. The GEP comparison
will be converted to a comparison done on indices.

Reviewers: majnemer, jmolloy

Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits

Differential Revision: http://reviews.llvm.org/D15146


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257164 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 11:11:04 +00:00
Sanjay Patel
810605370d [InstCombine] insert a new shuffle in a safe place (PR25999)
Limit this transform to a basic block and guard against PHIs.
Hopefully, this fixes the remaining failures in PR25999:
https://llvm.org/bugs/show_bug.cgi?id=25999



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257133 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 01:39:16 +00:00
David Majnemer
5ad98810fd Add test for r256912
I forgot to add this with the rest of r256912.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257088 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 19:27:16 +00:00
Silviu Baranga
866ddc01c3 Revert r257064. It caused failures in some sanitizer tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257069 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 15:46:43 +00:00
Silviu Baranga
f3ba9f9b6a [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPs
Summary:
When comparing two GEP instructions which have the same base pointer
and one of them has a constant index, it is possible to only compare
indices, transforming it to a compare with a constant. This removes
one use for the GEP instruction with the constant index, can reduce
register pressure and can sometimes lead to removing the comparisson
entirely.

InstCombine was already doing this when comparing two GEPs if the
base pointers were the same. However, in the case where we have
complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs,
conversions to or from integers, etc) the value of the original
base pointer will be hidden to the optimizer and this transformation
will be disabled.

This change detects when the two sides of the comparison can be
expressed as GEPs with the same base pointer, even if they don't
appear as such in the IR. The transformation will convert all the
pointer arithmetic to arithmetic done on indices and all the
relevant uses of GEPs to GEPs with a common base pointer. The
GEP comparison will be converted to a comparison done on indices.

Reviewers: majnemer, jmolloy

Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits

Differential Revision: http://reviews.llvm.org/D15146

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257064 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 14:56:08 +00:00
Sanjay Patel
78a42b0707 [LibCallSimplifier] use instruction-level fast-math-flags for tan/atan transform
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256964 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-06 19:23:35 +00:00
Sanjay Patel
d79ef021fe [LibCallSimplfier] use instruction-level fast-math-flags for fmin/fmax transforms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256871 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 20:46:19 +00:00
Sanjay Patel
7a3b692c47 [InstCombine] insert a new shuffle before its uses (PR26015)
Although this solves the test case in PR26015:
https://llvm.org/bugs/show_bug.cgi?id=26015

And may solve PR25999:
https://llvm.org/bugs/show_bug.cgi?id=25999

...I suspect this is not the best solution. I think we want to insert the new shuffle
just ahead of the earliest ExtractElementInst that we're replacing, but I don't know 
how that should be implemented.

Differential Revision: http://reviews.llvm.org/D15878



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256857 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-05 19:09:47 +00:00
Chen Li
bee229d85d [InstructionCombining] prepareICWorklistFromFunction halts in infinite loop with instructions of token type
Summary: This patch fixes a bug in prepareICWorklistFromFunction, where the loop becomes infinite with instructions of token type. The patch checks if the instruction is token type, and if so it updates EndInst with the current instruction.

Reviewers: reames, majnemer

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15859

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256792 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-04 23:28:57 +00:00
Sanjay Patel
e193ff319c [LibCallSimplifier] propagate FMF when shrinking binary calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256682 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-31 23:40:59 +00:00
Sanjay Patel
2e3f468f86 [LibCallSimplifier] propagate FMF when shrinking unary calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256679 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-31 21:52:31 +00:00
Sanjay Patel
36bfd300cd change function names to avoid accidentally matching the substring
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256678 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-31 21:25:25 +00:00
Sanjay Patel
f6e2baf532 add 'fast' attribute to calls to show that the flag isn't being propagated
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256677 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-31 21:12:19 +00:00
Chandler Carruth
d97964253e [attrs] Extract the pure inference of function attributes into
a standalone pass.

There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.

In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.

The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.

Differential Revision: http://reviews.llvm.org/D15676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 08:41:34 +00:00
Chen Li
955318d58d [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types could prevent LLVM to merge different gc.statepoint nodes into PHI nodes and cause further problems with gc relocations. The patch also changes the way on how gc.relocate and gc.result look for their corresponding gc.statepoint on unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint. 

Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob

Subscribers: reames, mjacob, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256443 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 07:54:32 +00:00
Sanjay Patel
75759ab3e9 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)
This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the
existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

%1 = bitcast <2 x i32>* %P to <2 x float>*
%ld1 = load <2 x float>, <2 x float>* %1, align 8
%2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
%i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
ret <4 x float> %i2

And x86 SSE output improves from:

movq	(%rdi), %xmm1           ## xmm1 = mem[0],zero
movdqa	%xmm1, %xmm2
shufps	$229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]
shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
shufps	$132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]
shufps	$32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]
shufps	$36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]
retq

To the almost optimal:

movhpd	(%rdi), %xmm0

Note: There's a tension in the existing transform related to generating
arbitrary shufflevector masks. We avoid that in other places in InstCombine
because we're scared that codegen can't handle strange masks, but it looks
like we're ok with producing those here. I purposely chose weird insert/extract
indexes for the regression tests to see the effect in these cases. 
For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or
better for these examples.

Differential Revision: http://reviews.llvm.org/D15096



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 21:17:56 +00:00
David Majnemer
ec185d074d [OperandBundles] Have InstCombine play nice with operand bundles
Don't assume a call's use corresponds to an argument operand, it might
correspond to a bundle operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256327 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 09:58:41 +00:00
Philip Reames
b96976e30d [InstCombine] Extend peephole DSE to handle unordered atomics
This extends the same line of reasoning used in EarlyCSE w/http://reviews.llvm.org/D15352 to the DSE implementation in InstCombine.

Key points:
 * We only remove unordered or simple stores.
 * The loads producing values consumed by dead stores don't influence whether the store is dead.

Differential Revision: http://reviews.llvm.org/D15354



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255932 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-17 22:19:27 +00:00
Nicolai Hahnle
44c3e26b95 AMDGPU: mark ldexp LibCalls as unavailable
Summary:
The LibCallSimplifier will turn llvm.exp2.* intrinsics into ldexp* libcalls
which do not make sense with the AMDGPU backend.

In the long run, we'll want an llvm.ldexp.* intrinsic to properly make use of
this optimization, but this works around the problem for now.

See also: http://reviews.llvm.org/D14327 (suggested llvm.ldexp.* implementation)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709

Reviewers: arsenm, tstellarAMD

Differential Revision: http://reviews.llvm.org/D14990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255658 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15 17:24:15 +00:00
Mehdi Amini
257fee7c73 Instcombine: destructor loads of structs that do not contains padding
For non padded structs, we can just proceed and deaggregate them.
We don't want ot do this when there is padding in the struct as to not
lose information about this padding (the subsequents passes would then
try hard to preserve the padding, which is undesirable).

Also update extractvalue.ll and cast.ll so that they use structs with padding.

Remove the FIXME in the extractvalue of laod case as the non padded case is
handled when processing the load, and we don't want to do it on the padded
case.

Patch by: Amaury SECHET <deadalnix@gmail.com>

Differential Revision: http://reviews.llvm.org/D14483

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255600 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15 01:44:07 +00:00
Sanjay Patel
a3a48d96c9 add fast-math-flags to 'call' instructions (PR21290)
This patch adds optional fast-math-flags (the same that apply to fmul/fadd/fsub/fdiv/frem/fcmp)
to call instructions in IR. Follow-up patches would use these flags in LibCallSimplifier, add 
support to clang, and extend FMF to the DAG for calls.

Motivating example:

%y = fmul fast float %x, %x
%z = tail call float @sqrtf(float %y)

We'd like to be able to optimize sqrt(x*x) into fabs(x). We do this today using a function-wide
attribute for unsafe-math, but we really want to trigger on the instructions themselves:

%z = tail call fast float @sqrtf(float %y)

because in an LTO build it's possible that calls with fast semantics have been inlined into a
function with non-fast semantics.

The code changes and tests are based on the recent commits that added "notail":
http://reviews.llvm.org/rL252368

and added FMF to fcmp:
http://reviews.llvm.org/rL241901

Differential Revision: http://reviews.llvm.org/D14707



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255555 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 21:59:03 +00:00
Sanjay Patel
2dab6252a4 [InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543)
This is a fix for PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543

The idea is to take the existing fold of:
bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X)
( http://reviews.llvm.org/rL112232 )

And break it into less specific transforms so we'll catch more cases such as
the example in the bug report:
bitcast ( trunc ( lshr ( bitcast X))) -->
bitcast ( extractelement (bitcast X)) -->
extractelement (bitcast X)

Enabling patches for this change:
http://reviews.llvm.org/rL255399 (combine bitcasts)
http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X))

Differential Revision: http://reviews.llvm.org/D15392



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255504 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-14 16:16:54 +00:00
Sanjay Patel
3720185c28 [InstCombine] canonicalize (bitcast (extractelement X)) --> (extractelement(bitcast X))
This change was discussed in D15392. It allows us to remove the fold that was added
in:
http://reviews.llvm.org/r255261

...and it will allow us to generalize this fold:
http://reviews.llvm.org/rL112232

while preserving the order of bitcast + extract that it produces and testing shows
is better handled by the backend.

Note that the existing check for "isVectorTy()" wasn't strong enough in general
and specifically because: x86_mmx. It's not a vector, but it's not vectorizable
either. So here we check VectorType::isValidElementType() directly before 
proceeding with the transform.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255433 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 16:44:48 +00:00
David Majnemer
8cec2f2816 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255422 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 05:38:55 +00:00
Sanjay Patel
63b29821f9 [InstCombine] allow any pair of bitcasts to be combined
This change is discussed in D15392 and should allow us to effectively
revert:
http://llvm.org/viewvc/llvm-project?view=revision&revision=255261
if we canonicalize bitcasts ahead of extracts.

It should be safe to convert any pair of bitcasts into a single bitcast, 
however, it was mentioned here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html
that we're not allowed to bitcast from an x86_mmx to some other types, but I'm 
not seeing any failures from that, and we have regression tests in CodeGen/X86
that appear to cover all of those cases. 

Some day we'll get to remove that MMX wart from LLVM IR completely?

Differential Revision: http://reviews.llvm.org/D15468



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 00:33:36 +00:00
Sanjay Patel
9827670e39 use FileCheck for better checking
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-12 00:01:10 +00:00
Sanjay Patel
fce620f39a Add tests for bitcast-bitcast sequences for all scalar/vector permutations
As noted in http://reviews.llvm.org/D15392 , we should be able to improve this.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255370 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 20:26:30 +00:00
James Molloy
e153879648 [InstCombine] Make MatchBSwap also match bit reversals
MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value. Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255334 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 10:04:51 +00:00
Sanjay Patel
2bce431402 [InstCombine] fold bitcasts around an extractelement (3rd try)
This is a redo of r255137 (reverted at r255227) which was a redo of 
r255124 (reverted at r255126) with a fixed check for a scalar source 
type and an added test for the failure that caused the revert.

Original commit message:

Example:
  bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
    --->
  extractelement <2 x float> %X, i32 1

This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543

The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)

Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232

with 2 less specific transforms to catch the case in the bug report.

Differential Revision: http://reviews.llvm.org/D14879



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 17:09:28 +00:00
Akira Hatanaka
342c5a6432 Revert r255137.
This commit broke apple's internal bot.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255227 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 08:00:52 +00:00
Sanjay Patel
1e3aaa8bef [InstCombine] fold bitcasts around an extractelement (2nd try)
This is a redo of r255124 (reverted at r255126) with an added check for a
scalar destination type and an added test for the failure seen in Clang's
test/CodeGen/vector.c. The extra test shows a different missing optimization.

Original commit message:

Example:
  bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
    --->
  extractelement <2 x float> %X, i32 1

This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543

The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)

Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232

with 2 less specific transforms to catch the case in the bug report.

Differential Revision: http://reviews.llvm.org/D14879



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 18:57:16 +00:00
Mehdi Amini
cf1e58c002 Revert "[InstCombine] fold bitcasts around an extractelement"
This reverts commit r255124.

Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 16:31:39 +00:00
Sanjay Patel
eb103602da [InstCombine] fold bitcasts around an extractelement
Example:
  bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
    --->
  extractelement <2 x float> %X, i32 1

This is part of fixing PR25543:
https://llvm.org/bugs/show_bug.cgi?id=25543

The next step will be to generalize this fold:
trunc ( lshr ( bitcast X) ) -> extractelement (X)

Ie, I'm hoping to replace the existing transform of:
bitcast ( trunc ( lshr ( bitcast X)))
added by:
http://reviews.llvm.org/rL112232

with 2 less specific transforms to catch the case in the bug report.

Differential Revision: http://reviews.llvm.org/D14879



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255124 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 16:17:20 +00:00
Sanjoy Das
a8231e7f59 [InstCombine] Call getCmpPredicateForMinMax only with a valid SPF
Summary:
There are `SelectPatternFlavor`s that don't represent min or max idioms,
and we should not be passing those to `getCmpPredicateForMinMax`.

Fixes PR25745.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254869 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-05 23:44:22 +00:00
Weiming Zhao
cc87069c31 [SimplifyLibCalls] Optimization for pow(x, n) where n is some constant
Summary:
    In order to avoid calling pow function we generate repeated fmul when n is a
    positive or negative whole number.
    
    For each exponent we pre-compute Addition Chains in order to minimize the no.
    of fmuls.
    Refer: http://wwwhomes.uni-bielefeld.de/achim/addition_chain.html
    
    We pre-compute addition chains for exponents upto 32 (which results in a max of
    7 fmuls).

    For eg:
    4 = 2+2
    5 = 2+3
    6 = 3+3 and so on
    
    Hence,
    pow(x, 4.0) ==> y = fmul x, x
                    x = fmul y, y
                    ret x

    For negative exponents, we simply compute the reciprocal of the final result.
    
    Note: This transformation is only enabled under fast-math.
    
    Patch by Mandeep Singh Grang <mgrang@codeaurora.org>

Reviewers: weimingz, majnemer, escha, davide, scanon, joerg

Subscribers: probinson, escha, llvm-commits

Differential Revision: http://reviews.llvm.org/D13994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254776 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-04 22:00:47 +00:00
David Majnemer
68c9f5ec88 [Analysis] Become aware of MSVC's new/delete functions
The compiler can take advantage of the allocation/deallocation
function's properties.  We knew how to do this for Itanium but had no
support for MSVC-style functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254656 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-03 22:45:19 +00:00
David Majnemer
26a5db075f Do (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1 rather than (A == C1 || A == C2) -> (A | (C1 ^ C2)) == C2 when C1 ^ C2 is a power of 2.
Differential Revision: http://reviews.llvm.org/D14223

Patch by Amaury SECHET!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254518 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-02 16:15:07 +00:00
Sanjay Patel
7cb4a767f8 [InstCombine] add tests to show potential vector IR shuffle transforms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254342 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30 22:39:36 +00:00
Davide Italiano
76755680f6 [SimplifyLibCalls] Remove useless bits of this tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254318 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30 19:38:35 +00:00
Davide Italiano
24c91af18b [SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254317 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30 19:36:35 +00:00
Davide Italiano
401b67d4eb [SimplifyLibCalls] Don't crash if the function doesn't have a name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254265 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-29 21:58:56 +00:00
Davide Italiano
0f019d6283 [SimplifyLibCalls] Tranform log(pow(x, y)) -> y*log(x).
This one is enabled only under -ffast-math. There are cases where the
difference between the value computed and the correct value is huge
even for ffast-math, e.g. as Steven pointed out:

x = -1, y = -4
log(pow(-1), 4) = 0
4*log(-1) = NaN

I checked what GCC does and apparently they do the same optimization
(which result in the dramatic difference). Future work might try to
make this (slightly) less worse.

Differential Revision:	http://reviews.llvm.org/D14400


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254263 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-29 20:58:04 +00:00
Benjamin Kramer
7c7d1ee4c2 [SimplifyLibCalls] Don't depend on a called function having a name, it might be an indirect call.
Fixes the crasher in PR25651 and related crashers using the same pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254145 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-26 09:51:17 +00:00
Sanjoy Das
8a44ac7412 [InstCombine] Don't drop operand bundles
Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14857

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254046 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-25 00:42:19 +00:00
Sanjay Patel
4da18f10ae [InstCombine] fix propagation of fast-math-flags
Noticed while working on D4583:
http://reviews.llvm.org/D4583



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253997 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-24 17:51:20 +00:00
Rafael Espindola
a2197f8f51 Have a single way for creating unique value names.
We had two code paths. One would create names like "foo.1" and the other
names like "foo1".

For globals it is important to use "foo.1" to help C++ name demangling.
For locals there is no strong reason to go one way or the other so I
kept the most common mangling (foo1).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253804 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-22 00:16:24 +00:00
Sanjay Patel
549121305a move a single test case to where most other instcombine shuffle bug test cases exist
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253784 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-21 16:12:58 +00:00
Sanjay Patel
272898a425 [InstCombine] add tests to show missing trunc optimizations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253609 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 22:11:52 +00:00
Sanjay Patel
634e1cb482 [InstCombine] add tests to show missing bitcast optimizations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253602 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 21:32:25 +00:00