Commit Graph

10183 Commits

Author SHA1 Message Date
Anat Shemer
86dc3f3739 In the function InstCombiner::visitExtractElementInst() removed the limitation that extract is promoted over a cast only if the cast has only one use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179786 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 19:56:44 +00:00
Anat Shemer
77e95d04c4 Added a function scalarizePHI() that sclarizes a vector phi instruction if it has only 2 uses: one to promote the vector phi in a loop and the other use is an extract operation of one element at a constant location.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179783 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 19:35:39 +00:00
Chris Lattner
77327fd652 Fix a comment, PR15777.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179775 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 17:42:14 +00:00
Arnold Schwaighofer
a3fb330d05 LoopVectorizer: Recognize min/max reductions
A min/max operation is represented by a select(cmp(lt/le/gt/ge, X, Y), X, Y)
sequence in LLVM. If we see such a sequence we can treat it just as any other
commutative binary instruction and reduce it.

This appears to help bzip2 by about 1.5% on an imac12,2.

radar://12960601

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179773 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 17:22:34 +00:00
Benjamin Kramer
403fc14370 LoopVectorize: Use a set to avoid longer cycles in the reduction chain too.
Fixes PR15748.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179757 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 14:29:13 +00:00
David Majnemer
7754276c4c Revert "Combine bit test + conditional or into simple math"
It is causing stage2 builds to fail, let's get them running again.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179750 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 08:42:33 +00:00
David Majnemer
a40a3a5981 Combine bit test + conditional or into simple math
Simplify:
(select (icmp eq (and X, C1), 0), Y, (or Y, C2))

Into:
(or (shl (and X, C1), C3), y)

Where:
C3 = Log(C2) - Log(C1)

If:
C1 and C2 are both powers of two


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179748 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 07:30:07 +00:00
Michael Gottesman
0556900b26 [objc-arc] Do not mismatch up retains inside a for loop with releases outside said for loop in the presense of differing provenance caused by escaping blocks.
This occurs due to an alloca representing a separate ownership from the
original pointer. Thus consider the following pseudo-IR:

  objc_retain(%a)
  for (...) {
    objc_retain(%a)
    %block <- %a
    F(%block)
    objc_release(%block)
  }
  objc_release(%a)

From the perspective of the optimizer, the %block is a separate
provenance from the original %a. Thus the optimizer pairs up the inner
retain for %a and the outer release from %a, resulting in segfaults.

This is fixed by noting that the signature of a mismatch of
retain/releases inside the for loop is a Use/CanRelease top down with an
None bottom up (since bottom up the Retain-CanRelease-Use-Release
sequence is completed by the inner objc_retain, but top down due to the
differing provenance from the objc_release said sequence is not
completed). In said case in CheckForCFGHazards, we now clear the state
of %a implying that no pairing will occur.

Additionally a test case is included.

rdar://12969722

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179747 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 05:39:45 +00:00
Michael Gottesman
8a709208ed Removed trailing whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179746 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-18 04:34:11 +00:00
Michael Gottesman
f92bf40ced [objc-arc] Added annotation option to only emit annotations for a specific ssa identifier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179729 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 21:59:41 +00:00
Michael Gottesman
9739b65264 Fixed typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179721 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 21:03:53 +00:00
Michael Gottesman
b271b120d0 [objc-arc] Added descriptions for EnableARCAnnotations, EnableCheckForCFGHazards, EnableARCOptimizations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179718 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 20:48:03 +00:00
Michael Gottesman
ba5d950518 [objc-arc] Added an option to arc-annotations for turning off CheckForCFGHazard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179717 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 20:48:01 +00:00
Peter Collingbourne
c7ab4f99be Do not optimise fprintf() calls if its return value is used.
Differential Revision: http://llvm-reviews.chandlerc.com/D620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179661 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-17 02:01:10 +00:00
Hans Wennborg
a121e24c54 simplifycfg: Fix integer overflow converting switch into icmp.
If a switch instruction has a case for every possible value of its type,
with the same successor, SimplifyCFG would replace it with an icmp ult,
but the computation of the bound overflows in that case, which inverts
the test.

Patch by Jed Davis!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179587 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-16 08:35:36 +00:00
Bill Wendling
23e00ae631 We are not able to bitcast a pointer to an integral value.
Two return types are not equivalent if one is a pointer and the other is an
integral. This is because we cannot bitcast a pointer to an integral value.
PR15185


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179569 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 22:33:50 +00:00
Nadav Rotem
e9a4411db4 SLPVectorizer: Make it a function pass and add code for hoisting the vector-gather sequence out of loops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179562 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 22:00:26 +00:00
Jim Grosbach
467116a1c8 Fix a typo in comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179542 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 17:40:48 +00:00
Nadav Rotem
1129a832e6 Add an option -vectorize-slp-aggressive for running the BB vectorizer. Make -fslp-vectorize run the slp-vectorizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179508 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 05:39:58 +00:00
Nadav Rotem
8849838965 Rename the slp-vectorizer clang/llvm flags. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179505 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 04:54:42 +00:00
Nadav Rotem
09616565dd SLPVectorizer: Add support for vectorizing trees that start at compare instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179504 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-15 04:25:27 +00:00
David Majnemer
024d943bca Reorders two transforms that collide with each other
One performs: (X == 13 | X == 14) -> X-13 <u 2
The other: (A == C1 || A == C2) -> (A & ~(C1 ^ C2)) == C1

The problem is that there are certain values of C1 and C2 that
trigger both transforms but the first one blocks out the second,
this generates suboptimal code.

Reordering the transforms should be better in every case and
allows us to do interesting stuff like turn:
  %shr = lshr i32 %X, 4
  %and = and i32 %shr, 15
  %add = add i32 %and, -14
  %tobool = icmp ne i32 %add, 0

into:
  %and = and i32 %X, 240
  %tobool = icmp ne i32 %and, 224


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179493 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14 21:15:43 +00:00
Benjamin Kramer
e197486908 Miscellaneous cleanups for VecUtils.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179483 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14 09:33:08 +00:00
Nadav Rotem
0774629936 SLP: Document the scalarization cost method.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179479 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14 07:22:22 +00:00
Nadav Rotem
ab105ae95f SLPVectorizer: Add support for trees that don't start at binary operators, and add the cost of extracting values from the roots of the tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179475 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14 05:15:53 +00:00
Nadav Rotem
f7eaf29cf7 SLPVectorizer: add initial support for reduction variable vectorization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179470 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-14 03:22:20 +00:00
Benjamin Kramer
9cbee63b1a GlobalDCE: Fix an oversight in my last commit that could lead to crashes.
There is a Constant with non-constant operands: blockaddress.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179460 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-13 16:11:14 +00:00
Benjamin Kramer
8848680ce0 Fix a scalability issue with complex ConstantExprs.
This is basically the same fix in three different places. We use a set to avoid
walking the whole tree of a big ConstantExprs multiple times.

For example: (select cmp, (add big_expr 1), (add big_expr 2))
We don't want to visit big_expr twice here, it may consist of thousands of
nodes.

The testcase exercises this by creating an insanely large ConstantExprs out of
a loop. It's questionable if the optimizer should ever create those, but this
can be triggered with real C code. Fixes PR15714.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179458 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-13 12:53:18 +00:00
Benjamin Kramer
6ac9278606 InstCombine: Check the operand types before merging fcmp ord & fcmp ord.
Fixes PR15737.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179417 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12 21:56:23 +00:00
Nadav Rotem
a74f91e44c SLPVectorizer: add support for vectorization of diamond shaped trees. We now perform a preliminary traversal of the graph to collect values with multiple users and check where the users came from.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179414 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12 21:16:54 +00:00
Nadav Rotem
196ee11f85 Add debug prints.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179412 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12 21:11:14 +00:00
David Majnemer
fb1cd69b90 Simplify (A & ~B) in icmp if A is a power of 2
The transform will execute like so:
(A & ~B) == 0 --> (A & B) != 0
(A & ~B) != 0 --> (A & B) == 0


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179386 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12 17:25:07 +00:00
Arnold Schwaighofer
08a0e8f8db LoopVectorizer: integer division is not a reduction operation
Don't classify idiv/udiv as a reduction operation. Integer division is lossy.
For example : (1 / 2) * 4 != 4/2.

Example:

int a[] = { 2, 5, 2, 2}
int x = 80;

for()
  x /= a[i];

Scalar:
  x /= 2 // = 40
  x /= 5 // = 8
  x /= 2 // = 4
  x /= 2 // = 2

Vectorized:

 <80, 1> / <2,5> //= <40,0>
 <40, 0> / <2,2> //= <20,0>

 20*0 = 0

radar://13640654

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179381 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-12 15:15:19 +00:00
David Majnemer
59b11c415e Optimize icmp involving addition better
Allows LLVM to optimize sequences like the following:

%add = add nsw i32 %x, 1
%cmp = icmp sgt i32 %add, %y

into:

%cmp = icmp sge i32 %x, %y

as well as:

%add1 = add nsw i32 %x, 20
%add2 = add nsw i32 %y, 57
%cmp = icmp sge i32 %add1, %add2

into:

%add = add nsw i32 %y, 37
%cmp = icmp sle i32 %cmp, %x


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179316 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11 20:05:46 +00:00
Benjamin Kramer
c37cb66e6e Fix for wrong instcombine on vector insert/extract
When trying to collapse sequences of insertelement/extractelement
instructions into single shuffle instructions, there is one specific
case where the Instruction Combiner wrongly updates the resulting
Mask of shuffle indexes.

The problem is in function CollectShuffleElments.

If we have a sequence of insert/extract element instructions
like the one below:

  %tmp1 = extractelement <4 x float> %LHS, i32 0
  %tmp2 = insertelement <4 x float> %RHS, float %tmp1, i32 1
  %tmp3 = extractelement <4 x float> %RHS, i32 2
  %tmp4 = insertelement <4 x float> %tmp2, float %tmp3, i32 3

Where:
  . %RHS will have a mask of [4,5,6,7]
  . %LHS will have a mask of [0,1,2,3]

The Mask of shuffle indexes is wrongly computed to [4,1,6,7]
instead of [4,0,6,7].
When analyzing %tmp2 in order to compute the Mask for the
resulting shuffle instruction, the algorithm forgets to update
the mask index at position 1 with the index associated to the
element extracted from %LHS by instruction %tmp1.

Patch by Andrea DiBiagio!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179291 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11 15:10:09 +00:00
Alexey Samsonov
305e3b277c [ASan] Allow disabling init-order checks for globals by source file name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179280 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11 13:20:00 +00:00
Benjamin Kramer
acc897a5e1 Rename the C function to create a SLPVectorizerPass to something sane and expose it in the header file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179272 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-11 11:36:36 +00:00
Nadav Rotem
4b924d3a61 Make the SLP store-merger less paranoid about function calls. We check for function calls when we check if it is safe to sink instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179207 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 19:41:36 +00:00
Nadav Rotem
20cd5e6862 We require DataLayout for analyzing the size of stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179206 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 18:57:27 +00:00
Joey Gouly
0f57a98a65 Change CloneFunctionInto to always clone Argument attributes induvidually,
rather than checking if the source and destination have the same number of
arguments and copying the attributes over directly.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179169 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-10 10:37:38 +00:00
Bob Wilson
58ddf52892 Fix some comment typos.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179132 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 22:15:51 +00:00
Nadav Rotem
8383b539ff Add support for bottom-up SLP vectorization infrastructure.
This commit adds the infrastructure for performing bottom-up SLP vectorization (and other optimizations) on parallel computations.
The infrastructure has three potential users:

  1. The loop vectorizer needs to be able to vectorize AOS data structures such as (sum += A[i] + A[i+1]).

  2. The BB-vectorizer needs this infrastructure for bottom-up SLP vectorization, because bottom-up vectorization is faster to compute.

  3. A loop-roller needs to be able to analyze consecutive chains and roll them into a loop, in order to reduce code size. A loop roller does not need to create vector instructions, and this infrastructure separates the chain analysis from the vectorization.

This patch also includes a simple (100 LOC) bottom up SLP vectorizer that uses the infrastructure, and can vectorize this code:

void SAXPY(int *x, int *y, int a, int i) {
  x[i]   = a * x[i]   + y[i];
  x[i+1] = a * x[i+1] + y[i+1];
  x[i+2] = a * x[i+2] + y[i+2];
  x[i+3] = a * x[i+3] + y[i+3];
}



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179117 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-09 19:44:35 +00:00
Shuxin Yang
4fd00c55d0 Redo the fix Benjamin Kramer committed in r178793 about iterator invalidation in Reassociate.
I brazenly think this change is slightly simpler than r178793 because: 
  - no "state" in functor
  - "OpndPtrs[i]" looks simpler than "&Opnds[OpndIndices[i]]" 

  While I can reproduce the probelm in Valgrind, it is rather difficult to come up
a standalone testing case. The reason is that when an iterator is invalidated,
the stale invalidated elements are not yet clobbered by nonsense data, so the
optimizer can still proceed successfully. 

  Thank Benjamin for fixing this bug and generously providing the test case.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179062 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-08 22:00:43 +00:00
Chandler Carruth
05c7e7f99d Fix PR15674 (and PR15603): a SROA think-o.
The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.

All of the existing tests continue to pass, and I've added a test from
the PR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178974 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-07 11:47:54 +00:00
Michael Gottesman
310f2665e8 Removed trailing whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178932 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 23:46:45 +00:00
Michael Gottesman
e8b3c2e48a An objc_retain can serve as a use for a different pointer.
This is the counterpart to commit r160637, except it performs the action
in the bottomup portion of the data flow analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178922 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 22:54:32 +00:00
Michael Gottesman
e7ce2b3f75 Properly model precise lifetime when given an incomplete dataflow sequence.
The normal dataflow sequence in the ARC optimizer consists of the following
states:

    Retain -> CanRelease -> Use -> Release

The optimizer before this patch stored the uses that determine the lifetime of
the retainable object pointer when it bottom up hits a retain or when top down
it hits a release. This is correct for an imprecise lifetime scenario since what
we are trying to do is remove retains/releases while making sure that no
``CanRelease'' (which is usually a call) deallocates the given pointer before we
get to the ``Use'' (since that would cause a segfault).

If we are considering the precise lifetime scenario though, this is not
correct. In such a situation, we *DO* care about the previous sequence, but
additionally, we wish to track the uses resulting from the following incomplete
sequences:

  Retain -> CanRelease -> Release   (TopDown)
  Retain <- Use <- Release          (BottomUp)

*NOTE* This patch looks large but the most of it consists of updating
test cases. Additionally this fix exposed an additional bug. I removed
the test case that expressed said bug and will recommit it with the fix
in a little bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178921 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 22:54:28 +00:00
Jim Grosbach
03fceff6f6 Tidy up a bit. No functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178915 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 21:20:12 +00:00
Shuxin Yang
2da70d1792 Disable the optimization about promoting vector-element-access with symbolic index.
This optimization is unstable at this moment; it 
  1) block us on a very important application
  2) PR15200
  3) test6 and test7 in test/Transforms/ScalarRepl/dynamic-vector-gep.ll
     (the CHECK command compare the output against wrong result)

   I personally believe this optimization should not have any impact on the
autovectorized code, as auto-vectorizer is supposed to put gather/scatter
in a "right" way.  Although in theory downstream optimizaters might reveal 
some gather/scatter optimization opportunities, the chance is quite slim.

   For the hand-crafted vectorizing code, in term of redundancy elimination,
load-CSE, copy-propagation and DSE can collectively achieve the same result,
but in much simpler way. On the other hand, these optimizers are able to 
improve the code in a incremental way; in contrast, SROA is sort of all-or-none
approach. However, SROA might slighly win in stack size, as it tries to figure 
out a stretch of memory tightenly cover the area accessed by the dynamic index.

 rdar://13174884
 PR15200



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178912 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 21:07:08 +00:00
Michael Gottesman
5c762e0c25 Added two debug logging messages to VisitInstructionsTopDown to match VisitInstructionsBottomUp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178895 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 18:26:08 +00:00