16126 Commits

Author SHA1 Message Date
Joseph Tremoulet
37c54c60a5 Fix inliner funclet unwind memoization
Summary:
The inliner may need to determine where a given funclet unwinds to,
and this determination may depend on other funclets throughout the
funclet tree.  The code that performs this walk in getUnwindDestToken
memoizes results to avoid redundant computations.  In the case that
a funclet's unwind destination is derived from its ancestor, there's
code to walk back down the tree from the ancestor updating the memo
map of its descendants to record the unwind destination.  This change
fixes that code to account for the case that some descendant has a
different unwind destination, which can happen if that unwind dest
is a descendant of the EHPad being queried and thus didn't determine
its unwind destination.

Also update test inline-funclets.ll, which is supposed to cover such
scenarios, to include a case that fails an assertion without this fix
but passes with it.

Fixes PR29151.


Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280610 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-04 01:23:20 +00:00
Xinliang David Li
1f44212e7e Cleanup : Use metadata preserving API for branch creation
Use the wrapper API in IRBuilder that does meta data copy
to create new branch in LoopUnswitch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280602 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 22:26:11 +00:00
Matt Arsenault
6acb49abca AMDGPU: Do basic folding of class intrinsic
This allows more of the OCML builtin library to be
constant folded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280586 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 07:06:58 +00:00
Duncan P. N. Exon Smith
0d65f1da0e ADT: Do not inherit from std::iterator in ilist_iterator
Inheriting from std::iterator uses more boiler-plate than manual
typedefs.  Avoid that in both ilist_iterator and
MachineInstrBundleIterator.

This has the side effect of removing ilist_iterator from certain ADL
lookups in namespace std; calls to std::next need to be qualified by
"std::" that didn't have to before.  The one case of this in-tree was
operating on a temporary, so I used the more compact operator++.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280570 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 02:27:35 +00:00
Xinliang David Li
aa13c4773f [Profile] handle select instruction in 'expect' lowering
Builtin expect lowering currently ignores select. This patch
fixes the issue

Differential Revision: http://reviews.llvm.org/D24166


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280547 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 22:03:40 +00:00
Chad Rosier
454a60a86c [SLP] Don't pass a global CL option as an argument. NFC.
Differential Revision: https://reviews.llvm.org/D24199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280527 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 19:09:50 +00:00
Sanjay Patel
ef2c802039 [InsttCombine] fold insertelement of constant into shuffle with constant operand (PR29126)
The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards
shrinking that to a single shufflevector.

Note that the transform is intentionally limited to shuffles that are equivalent to vector
selects to avoid creating arbitrary shuffle masks that may not lower well.

This should solve PR29126:
https://llvm.org/bugs/show_bug.cgi?id=29126

Differential Revision: https://reviews.llvm.org/D23886


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280504 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 17:05:43 +00:00
Matthew Simpson
d768ea4620 [LV] Ensure reverse interleaved group GEPs remain uniform
For uniform instructions, we're only required to generate a scalar value for
the first vector lane of each unroll iteration. Thus, if we have a reverse
interleaved group, computing the member index off the scalar GEP corresponding
to the last vector lane of its pointer operand technically makes the GEP
non-uniform. We should compute the member index off the first scalar GEP
instead.

I've added the updated member index computation to the existing reverse
interleaved group test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280497 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 16:19:22 +00:00
James Molloy
87afa50931 [SimplifyCFG] Add a workaround to fix PR30188
We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280470 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 07:29:00 +00:00
Dehao Chen
fa2c5e7ef9 revert r280429 and r280425:
r280425 | dehao | 2016-09-01 16:15:50 -0700 (Thu, 01 Sep 2016) | 9 lines

Refactor LICM pass in preparation for LoopSink pass.

Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).

r280429 | dehao | 2016-09-01 16:31:25 -0700 (Thu, 01 Sep 2016) | 9 lines

Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.

Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280453 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 01:59:27 +00:00
Dehao Chen
6cae2f1a3e revert r280432:
r280432 | dehao | 2016-09-01 16:51:37 -0700 (Thu, 01 Sep 2016) | 9 lines

Explicitly require DominatorTreeAnalysis pass for instsimplify pass.

Summary: DominatorTreeAnalysis is always required by instsimplify.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280452 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-02 01:47:13 +00:00
Dehao Chen
7946689952 Explicitly require DominatorTreeAnalysis pass for instsimplify pass.
Summary: DominatorTreeAnalysis is always required by instsimplify.

Reviewers: davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280432 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 23:51:37 +00:00
Dehao Chen
7bb9af1901 Refactor LICM to expose canSinkOrHoistInst to LoopSink pass.
Summary: LoopSink pass shares the same canSinkOrHoistInst functionality with LICM pass. This patch exposes this function in preparation of https://reviews.llvm.org/D22778

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280429 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 23:31:25 +00:00
Dehao Chen
d21744e2b8 Refactor replaceDominatedUsesWith to have a flag to control whether to replace uses in BB itself.
Summary: This is in preparation for LoopSink pass which calls replaceDominatedUsesWith to update after sinking.

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24170

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280427 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 23:26:48 +00:00
Dehao Chen
910602a540 Refactor LICM pass in preparation for LoopSink pass.
Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).

Reviewers: chandlerc, davidxl, danielcdh

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24168

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280425 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 23:15:50 +00:00
Matthew Simpson
af1a999e07 [LV] Use ScalarParts for ad-hoc pointer IV scalarization (NFCI)
We can now maintain scalar values in VectorLoopValueMap. Thus, we no longer
have to create temporary vectors with insertelement instructions when handling
pointer induction variables. This case was mistakenly missed from r279649 when
refactoring the other scalarization code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280405 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 19:40:19 +00:00
Matthew Simpson
8dabfb7c14 [LV] Move VectorParts allocation and mapping into PHI widening (NFC)
This patch moves the allocation of VectorParts for PHI nodes into the actual
PHI widening code. Previously, we allocated these VectorParts in
vectorizeBlockInLoop, and passed them by reference to widenPHIInstruction. Upon
returning, we would then map the VectorParts in VectorLoopValueMap. This
behavior is problematic for the cases where we only want to generate a scalar
version of a PHI node. For example, if in the future we only generate a scalar
version of an induction variable, we would end up inserting an empty vector
entry into the map once we return to vectorizeBlockInLoop. We now no longer
need to pass VectorParts to the various PHI widening functions, and we can keep
VectorParts allocation as close as possible to the point at which they are
actually mapped in VectorLoopValueMap.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280390 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 18:14:27 +00:00
Geoff Berry
6f45ebf800 [EarlyCSE] Change C API pass interface for EarlyCSE w/ MemorySSA
Previous change broke the C API for creating an EarlyCSE pass w/
MemorySSA by adding a bool parameter to control whether MemorySSA was
used or not.  This broke the OCaml bindings.  Instead, change the old C
API entry point back and add a new one to request an EarlyCSE pass with
MemorySSA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280379 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 15:07:46 +00:00
Sanjay Patel
36ff41aa37 [InstCombine] remove fold of an icmp pattern that should never happen
While removing a scalar shackle from an icmp fold, I noticed that I couldn't find any tests to trigger
this code path.

The 'and' shrinking transform should be handled by InstCombiner::foldCastedBitwiseLogic()
or eliminated with InstSimplify. The icmp narrowing is part of InstCombiner::foldICmpWithCastAndCast().

Differential Revision: https://reviews.llvm.org/D24031 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280370 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 14:20:43 +00:00
James Molloy
f37c8a6b19 [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280364 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 12:58:13 +00:00
James Molloy
f991e38d15 [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280351 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 10:44:35 +00:00
James Molloy
16a76ce5f3 [SimplifyCFG] Fix nondeterministic iteration order
We iterate over the result from SafeToMergeTerminators, so make it a SmallSetVector instead of a SmallPtrSet.

Should fix stage3 convergence builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 09:01:34 +00:00
James Molloy
f4e1029357 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280338 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-01 07:45:25 +00:00
Nick Lewycky
46b443d899 Add cast to appease windows builder. Fixes build break introduced in r280306.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280311 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 23:24:43 +00:00
Nick Lewycky
0b94abfb45 Add -fprofile-dir= to clang.
-fprofile-dir=path allows the user to specify where .gcda files should be
emitted when the program is run. In particular, this is the first flag that
causes the .gcno and .o files to have different paths, LLVM is extended to
support this. -fprofile-dir= does not change the file name in the .gcno (and
thus where lcov looks for the source) but it does change the name in the .gcda
(and thus where the runtime library writes the .gcda file). It's different from
a GCOV_PREFIX because a user can observe that the GCOV_PREFIX_STRIP will strip
paths off of -fprofile-dir= but not off of a supplied GCOV_PREFIX.

To implement this we split -coverage-file into -coverage-data-file and
-coverage-notes-file to specify the two different names. The !llvm.gcov
metadata node grows from a 2-element form {string coverage-file, node dbg.cu}
to 3-elements, {string coverage-notes-file, string coverage-data-file, node
dbg.cu}. In the 3-element form, the file name is already "mangled" with
.gcno/.gcda suffixes, while the 2-element form left that to the middle end
pass.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280306 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 23:04:32 +00:00
Sanjay Patel
8aee946406 [InstCombine] allow icmp (shr exact X, C2), C fold for splat constant vectors
The enhancement to foldICmpDivConstant ( http://llvm.org/viewvc/llvm-project?view=revision&revision=280299 )
allows us to remove the ConstantInt check; no other changes needed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280300 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 22:18:43 +00:00
Sanjay Patel
bfcd22b81a [InstCombine] allow icmp (div X, Y), C folds for splat constant vectors
Converting all of the overflow ops to APInt looked risky, so I've left that as a TODO.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280299 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 21:57:21 +00:00
Sanjay Patel
ebc9efbb36 [InstCombine] change insertRangeTest() to use APInt instead of Constant; NFCI
This is prep work before changing the callers to also use APInt which will
allow folds for splat vectors. Currently, the callers have ConstantInt
guards in place, so no functional change intended with this commit.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280282 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 19:49:56 +00:00
Michael Zolotukhin
dbd67f6f6c [LoopInfo] Add verification by recomputation.
Summary:
Current implementation of LI verifier isn't ideal and fails to detect
some cases when LI is incorrect. For instance, it checks that all
recorded loops are in a correct form, but it has no way to check if
there are no more other (unrecorded in LI) loops in the function. This
patch adds a way to detect such bugs.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits, silvas, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23437

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280280 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 19:26:19 +00:00
Geoff Berry
aa61209f48 [EarlyCSE] Optionally use MemorySSA. NFC.
Summary:
Use MemorySSA, if requested, to do less conservative memory dependency
checking.

This change doesn't enable the MemorySSA enhanced EarlyCSE in the
default pipelines, so should be NFC.

Reviewers: dberlin, sanjoy, reames, majnemer

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19821

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280279 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 19:24:10 +00:00
Geoff Berry
5fc0cc8bf5 [EarlyCSE] Allow forwarding a non-invariant load into an invariant load.
Reviewers: sanjoy

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280265 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 17:45:31 +00:00
Chad Rosier
8f1c5752a9 [SLP] Update the debug based on Michael's suggestion.
Passing the types/opcode check still doesn't guarantee we'll actually vectorize.
Therefore, just make it clear we're attempting to vectorize.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280263 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 17:41:12 +00:00
Chad Rosier
71255dfcef [SLP] Sink debug after checking for matching types/opcode.
Differential Revision: https://reviews.llvm.org/D24090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280260 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 17:31:09 +00:00
Tim Shen
b62ba77b89 s/static inline/static/ for headers I have changed in r279475. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280257 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 16:48:13 +00:00
Philip Reames
327ae5898b [statepoints][experimental] Add support for live-in semantics of values in deopt bundles
This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics so I've chosen to start there. For those curious, my use cases is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes.

The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't along some code path, we could end up with more vregs than physical registers and fail to allocate.

Today, we *only* fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well, more investigation needed.)

My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics. Note: I don't have any *known* miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus, the "experimental" tag and the fact it's off by default.

Differential Revision: https://reviews.llvm.org/D24000



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280250 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 15:12:17 +00:00
Chad Rosier
5f960b8b75 [SLP] Arguments should be camel case, and start with an upper case letter. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280248 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 15:06:58 +00:00
James Molloy
7ae397d16e Revert "[SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases"
This reverts commit r280218. This *also* causes buildbot errors. Sigh. Not a successful day all around!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280239 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 13:32:28 +00:00
James Molloy
8f65479ccc Revert "[SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd"
This reverts commit r280216 - it caused buildbot failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280234 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 13:16:52 +00:00
James Molloy
85dac9a06d Revert "[SimplifyCFG] Handle tail-sinking of more than 2 incoming branches"
This reverts commit r280217. r280216 caused buildbot failures - backing out the entire chain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280233 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 13:16:45 +00:00
James Molloy
3d40b2a633 Revert "[SimplifyCFG] Add a workaround to fix PR30188"
This reverts commit r280219. r280216 caused buildbot failures - backing out the entire chain.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280232 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 13:16:36 +00:00
James Molloy
0ba92ab006 Revert "[SimplifyCFG] Fix bootstrap failure after r280220"
This reverts commit r280228. r280216 caused buildbot failures - backing out the entire sequence.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280231 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 13:16:30 +00:00
James Molloy
d5c06d12b3 [SimplifyCFG] Fix bootstrap failure after r280220
We check that a sinking candidate is used by only one PHI node during our legality checks. However for instructions that are used by other sinking candidates our heuristic is less conservative. This can result in a candidate actually being illegal when we come to sink it because of how we sunk a predecessor. Do the used-by-only-one-PHI checks again during sinking to ensure we don't crash.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280228 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 12:33:48 +00:00
James Molloy
1c6ea1a949 [SimplifyCFG] Add a workaround to fix PR30188
We're sinking stores, which is a good thing, but in the process creating selects for the store address operand, which SROA/Mem2Reg can't look through, which caused serious regressions.

The real fix is in SROA, which I'll be looking into.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280219 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 10:46:45 +00:00
James Molloy
87aeaacc31 [SimplifyCFG] Improve FoldValueComparisonIntoPredecessors to handle more cases
A very important case is not handled here: multiple arcs to a single block with a PHI. Consider:

    a:
      %1 = icmp %b, 1
      br %1, label %c, label %e
    c:
      %2 = icmp %b, 2
      br %2, label %d, label %e
    d:
      br %e
    e:
      phi [0, %a], [1, %c], [2, %d]

FoldValueComparisonIntoPredecessors will refuse to fold this, as it doesn't know how to deal with two arcs to a common destination with different PHI values. The answer is obvious - just split all conflicting arcs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280218 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 10:46:39 +00:00
James Molloy
e786823705 [SimplifyCFG] Handle tail-sinking of more than 2 incoming branches
This was a real restriction in the original version of SinkIfThenCodeToEnd. Now it's been rewritten, the restriction can be lifted.

As part of this, we handle a very common and useful case where one of the incoming branches is actually conditional. Consider:

   if (a)
     x(1);
   else if (b)
     x(2);

This produces the following CFG:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \    |  /
          [ end ]

[end] has two unconditional predecessor arcs and one conditional. The conditional refers to the implicit empty 'else' arc. This same pattern can also be caused by an empty default block in a switch.

We can't sink the call to x() down to end because no call to x() happens on the third incoming arc (assume that x() has sideeffects for the sake of argument; if something is safe to speculate we could indeed sink nevertheless but this cannot happen in the general case and causes many extra selects).

We are now able to detect this case and split off the unconditional arcs to a common successor:

         [if]
        /    \
      [x(1)] [if]
        |     | \
        |     |  \
        |  [x(2)] |
         \   /    |
     [sink.split] |
           \     /
           [ end ]

Now we can sink the call to x() into %sink.split. This can cause significant code simplification in many testcases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280217 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 10:46:33 +00:00
James Molloy
d308206dce [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd
r279460 rewrote this function to be able to handle more than two incoming edges and took pains to ensure this didn't regress anything.

This time we change the logic for determining if an instruction should be sunk. Previously we used a single pass greedy algorithm - sink instructions until one requires more than one PHI node or we run out of instructions to sink.

This had the problem that sinking instructions that had non-identical but trivially the same operands needed extra logic so we sunk them aggressively. For example:

    %a = load i32* %b          %d = load i32* %b
    %c = gep i32* %a, i32 0    %e = gep i32* %d, i32 1

Sinking %c and %e would naively require two PHI merges as %a != %d. But the loads are obviously equivalent (and maybe can't be hoisted because there is no common predecessor).

This is why we implemented the fairly complex function areValuesTriviallySame(), to look through trivial differences like this. However it's just not clever enough.

Instead, throw areValuesTriviallySame away, use pointer equality to check equivalence of operands and switch to a two-stage algorithm.

In the "scan" stage, we look at every sinkable instruction in isolation from end of block to front. If it's sinkable, we keep track of all operands that required PHI merging.

In the "sink" stage, we iteratively sink the last non-terminator in the source blocks. But when calculating how many PHIs are actually required to be inserted (to work out if we should stop or not) we remove any values that have already been sunk from the set of PHI-merges required, which allows us to be more aggressive.

This turns an algorithm with potentially recursive lookahead (looking through GEPs, casts, loads and any other instruction potentially not CSE'd) to two linear scans.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280216 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 10:46:23 +00:00
James Molloy
5ae34477c2 [SimplifyCFG] Tail-merge calls with sideeffects
This was deliberately disabled during my rewrite of SinkIfThenToEnd to keep behaviour
at least vaguely consistent with the previous version and keep it as close to NFC as
I could.

There's no real reason not to merge sideeffect calls though, so let's do it! Small fixup
along the way to ensure we don't create indirect calls.

Should fix PR28964.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280215 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 10:46:16 +00:00
Gor Nishanov
b6a139826b [Coroutines] Part 10: Add coroutine promise support.
Summary:
1) CoroEarly now lowers llvm.coro.promise intrinsic that allows to obtain
a coroutine promise pointer from a coroutine frame and vice versa.

2) CoroFrame now interprets Promise argument of llvm.coro.begin to
place CoroutinPromise alloca at a deterministic offset from the coroutine frame.

Now, the coroutine promise example from docs\Coroutines.rst compiles and produces expected result (see test/Transform/Coroutines/ex4.ll).

Reviewers: majnemer

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280184 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 00:35:41 +00:00
Sanjay Patel
1a31aefb30 [InstCombine] clean up InsertRangeTest; NFCI
It's much less code and easier to read if we don't duplicate
everything between the 'Inside' and not 'Inside' cases.

As noted with the FIXME, the goal is to make this vector-friendly
in a follow-up patch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280183 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-31 00:19:35 +00:00
Alina Sbirlea
74a597b31a [LoadStoreVectorizer] Change VectorSet to Vector to match head and tail positions. Resolves PR29148.
Summary:
LSV was using two vector sets (heads and tails) to track pairs of adjiacent position to vectorize.
A recent optimization is trying to obtain the longest chain to vectorize and assumes the positions
in heads(H) and tails(T) match, which is not the case is there are multiple tails for the same head.

e.g.:
i1: store a[0]
i2: store a[1]
i3: store a[1]
Leads to:
H: i1
T: i2 i3
Instead of:
H: i1 i1
T: i2 i3
So the positions for instructions that follow i3 will have different indexes in H/T.
This patch resolves PR29148.

This issue also surfaced the fact that if the chain is too long, and TLI
returns a "not-fast" answer, the whole chain will be abandoned for
vectorization, even though a smaller one would be beneficial.
Added a testcase and FIXME for this.

Reviewers: tstellarAMD, arsenm, jlebar

Subscribers: mzolotukhin, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280179 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-30 23:53:59 +00:00