17606 Commits

Author SHA1 Message Date
Aditya Kumar
945e057a95 Refactor SimplifyCFG:canSinkInstructions [NFC]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297839 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 14:26:45 +00:00
Fiona Glaser
54dc1ed088 MemCpyOptimizer: don't create new addrspace casts
This isn't safe on all targets, and since we don't have a way
to know it's safe, avoid doing it for now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297788 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 22:37:38 +00:00
Dehao Chen
cd2a5b62d1 SamplePGO ThinLTO ICP fix for local functions.
Summary:
In SamplePGO, if the profile is collected from a non-LTO binary and used to drive ThinLTO, indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places where the mismatch can happen:

1. thin-link prepends SourceFileName to the front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName.
2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName.

This patch makes a best-effort attempt to find the GUID from the original local function name (in the profile), and uses it in ICP promotion and in the SamplePGO matching that happens in the backend after importing/inlining:

1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import.
2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName.
3. in backend compiler, we build a symbol table entry for OriginalFunctionName that points to the same symbol as PromotedFunctionName, so that ICP can find the correct target to promote.
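
The fallback lookup in step 2 can be sketched roughly as follows. This is a standalone illustration, not the code from the patch: `originalName` and `findSampleCount` are hypothetical helpers, the profile is modelled as a plain map from name to sample count, and the only assumption taken from the text is that a promoted name has the form `<name>.llvm.<module hash>`.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not an LLVM API): recover the original local function
// name from a promoted name of the form "<name>.llvm.<module hash>".
std::string originalName(const std::string &promoted) {
  static const std::string marker = ".llvm.";
  // Search from the end so only the trailing promotion suffix is stripped.
  std::string::size_type pos = promoted.rfind(marker);
  if (pos == std::string::npos)
    return promoted; // not a promoted name; nothing to strip
  return promoted.substr(0, pos);
}

// Hypothetical lookup: try the (possibly promoted) name first, then fall back
// to the original name. Returns nullptr when neither is in the profile.
const unsigned long long *
findSampleCount(const std::unordered_map<std::string, unsigned long long> &profile,
                const std::string &name) {
  auto it = profile.find(name);
  if (it == profile.end())
    it = profile.find(originalName(name));
  return it == profile.end() ? nullptr : &it->second;
}
```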

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297757 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 17:33:01 +00:00
Sanjay Patel
77960c2b12 [InstCombine] improve readability; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297755 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 17:27:27 +00:00
Gil Rapaport
a27a1f7795 [LV] Refactor cross-iteration phi's back-patching; NFC
This patch refactors the PHIsToFix loop as follows:

- The loop itself now resides in its own method.
- The new method iterates on scalar-loop's header; the PHIsToFix map formerly
  propagated as an output parameter and filled during phi widening is removed.
- The code handling reductions is moved into its own method, similar to the
  existing fixFirstOrderRecurrence().

Differential Revision: https://reviews.llvm.org/D30755


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297740 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 13:50:47 +00:00
Ayal Zaks
1a739b3383 [LV] Refactor Cost Model's selectVectorizationFactor(); NFC
Refactoring Cost Model's selectVectorizationFactor() so that it handles only the
selection of the best VF from a pre-computed range of candidate VF's, extracting
early-exit criteria and the computation of a MaxVF upper-bound to other methods,
all driven by a newly introduced LoopVectorizationPlanner.

Differential Revision: https://reviews.llvm.org/D30653


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297737 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 13:07:04 +00:00
Tobias Grosser
78b8364919 Fix typos in ADCE comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297726 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 10:18:11 +00:00
Jonas Paulsson
85dd82a95b [TargetTransformInfo] getIntrinsicInstrCost() scalarization estimation improved
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.
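
As a rough illustration of "only unique non-constant operands" (a standalone sketch, not the TargetTransformInfo code; the `Operand` struct and `countOperandsToExtract` are hypothetical names), the count of operands that contribute to the scalarization overhead could be computed like this:

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR operand: a name plus a constant flag.
struct Operand {
  std::string name;
  bool isConstant;
};

// Count the operands that would need to be extracted when scalarizing:
// constants need no extract, and a repeated operand is counted only once.
unsigned countOperandsToExtract(const std::vector<Operand> &args) {
  std::set<std::string> seen;
  unsigned count = 0;
  for (const Operand &op : args) {
    if (op.isConstant)
      continue;
    if (seen.insert(op.name).second) // true only for the first occurrence
      ++count;
  }
  return count;
}
```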

Test updates:

Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll

The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.

Review: Hal Finkel
https://reviews.llvm.org/D29540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 06:35:36 +00:00
Matt Arsenault
8187a9b9ca AMDGPU: Fold icmp/fcmp into icmp intrinsic
The typical use is a library vote function which
compares to 0. Fold the user condition into the intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297650 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-13 18:14:02 +00:00
Adrian Prantl
5a7c5f7c89 API gardening: Rename FindAllocaDbgValue to findDbgValue (NFC)
and have it use SmallVectorImpl.

There is nothing specific about allocas in this function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297643 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-13 17:20:47 +00:00
Gil Rapaport
710c3a271b [LV] Set memcheck metadata also for VF==1
This commit is a follow-up on r297580. It fixes the FIXME added temporarily
by that commit to keep the removal of Unroller's specialized version of
scalarizeInstruction() an NFC. See https://reviews.llvm.org/D30715 for details.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297610 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-13 10:23:46 +00:00
Gil Rapaport
dd86c5c182 [LV] A unified scalarizeInstruction() for Vectorizer and Unroller; NFC
Unroller's specialized scalarizeInstruction() is mostly duplicating Vectorizer's
variant. OTOH Vectorizer's scalarizeInstruction() already supports the special
case of VF==1 except for avoiding mask-bit extraction in that case. This patch
removes Unroller's specialized version in favor of a unified method.

The only functional difference between the two variants seems to be setting
memcheck metadata for loads and stores only in Vectorizer's variant, which is a
bug in Unroller. To keep this patch an NFC the unified method doesn't set
memcheck metadata for VF==1.

Differential Revision: https://reviews.llvm.org/D30715


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297580 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-12 12:31:38 +00:00
Ayal Zaks
33175e7038 Test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297579 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-12 09:48:06 +00:00
Daniel Berlin
6b544010f4 Split NewGVN class into a legacy pass and an impl, instead of a merged class.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297576 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-12 04:46:45 +00:00
Daniel Berlin
5b9a142f57 VNCoercion: Make the function signatures all consistent
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297537 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-11 00:51:01 +00:00
Peter Collingbourne
bfecb4640b WholeProgramDevirt: Implement export/import support for VCP.
Differential Revision: https://reviews.llvm.org/D30017

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297503 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 20:13:58 +00:00
Peter Collingbourne
eb3c7034f3 WholeProgramDevirt: Implement export/import support for unique ret val opt.
Differential Revision: https://reviews.llvm.org/D29917

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297502 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 20:09:11 +00:00
Daniel Berlin
e879f96940 NewGVN: Rename InitialClass to TOP, which is what most people would expect it to be called
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297494 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 19:05:04 +00:00
Michael Kuperstein
70d8785561 [SLP] Revert everything that has to do with memory access sorting.
This reverts r293386, r294027, r294029 and r296411.

Turns out the SLP tree isn't actually a "tree" and we don't handle
accessing the same packet of loads in several different orders well,
causing miscompiles.

Revert until we can fix this properly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297493 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 18:59:07 +00:00
George Rimar
a312eded5f WholeProgramDevirt: Fixed compilation error under MSVS2015.
It was introduced in:

r296945
WholeProgramDevirt: Implement exporting for single-impl devirtualization.
---------------------
r296939
WholeProgramDevirt: Add any unsuccessful llvm.type.checked.load devirtualizations to the list of llvm.type.test users.
---------------------

Microsoft Visual Studio Community 2015
Version 14.0.23107.0 D14REL
does not compile that code without additional brackets, showing multiple errors like the ones below:

WholeProgramDevirt.cpp(1216): error C2958: the left bracket '[' found at 'c:\access_softek\llvm\lib\transforms\ipo\wholeprogramdevirt.cpp(1216)' was not matched correctly
WholeProgramDevirt.cpp(1216): error C2143: syntax error: missing ']' before '}'
WholeProgramDevirt.cpp(1216): error C2143: syntax error: missing ';' before '}'
WholeProgramDevirt.cpp(1216): error C2059: syntax error: ']'


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297451 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 10:31:56 +00:00
Matt Arsenault
931794f288 AMDGPU: Fix insertion point when reducing load intrinsics
The insertion point may be later than the next instruction,
so it is necessary to set it when replacing the call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297439 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 05:25:49 +00:00
Daniel Berlin
7ff9661670 Move memory coercion functions from GVN.cpp to VNCoercion.cpp so they can be shared between GVN and NewGVN.
Summary:
These are the functions used to determine when values of loads can be
extracted from stores, etc, and to perform the necessary insertions to
do this.  There are no changes to the functions themselves except
reformatting, and one case where memdep was informed of a removed load
(which was pushed into the caller).

Reviewers: davide

Subscribers: mgorny, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30478

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297438 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 04:54:10 +00:00
Daniel Berlin
d7d7be8b3c NewGVN: Rewrite DCE during elimination so we do it as well as old GVN did.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297428 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 00:32:33 +00:00
Daniel Berlin
43009168c6 NewGVN: Rename a few things for clarity
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297427 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 00:32:26 +00:00
Daniel Berlin
5e2cfa2e2d Add support for DenseMap/DenseSet count and find using const pointers
Summary:
Similar to SmallPtrSet, this makes find and count work with both const
references and const pointers.
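
The effect can be illustrated outside LLVM with a standard container. This is only an analogy under stated assumptions, not the DenseMap/DenseSet change itself: `IntPtrSet` and `containsPtr` are hypothetical names, and the heterogeneous lookup comes from the transparent comparator `std::less<>` (C++14).

```cpp
#include <functional>
#include <set>

// A set keyed by non-const pointers. std::less<> is a transparent comparator,
// so find() and count() accept a const pointer without a const_cast.
using IntPtrSet = std::set<int *, std::less<>>;

// Query with a const pointer even though the key type is `int *`.
bool containsPtr(const IntPtrSet &s, const int *p) {
  return s.count(p) != 0;
}
```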

Reviewers: dblaikie

Subscribers: llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297424 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 00:25:26 +00:00
Matt Arsenault
f90265a1b0 AMDGPU: Support for SimplifyDemandedVectorElts for load intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297408 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 20:34:27 +00:00
Rong Xu
1bf6ca1df1 Minor format change. nfc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297400 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 19:08:55 +00:00
Chandler Carruth
4fea871248 [PM/Inliner] Make the new PM's inliner process call edges across an
entire SCC before iterating on newly-introduced call edges resulting
from any inlined function bodies.

This more closely matches the behavior of the old PM's inliner. While it
wasn't really clear to me initially, this behavior is actually essential
to the inliner behaving reasonably in its current design.

Because the inliner is fundamentally a bottom-up inliner and all of its
cost modeling is designed around that, it often runs into trouble within
an SCC where we don't have any meaningful bottom-up ordering to use. In
addition to potentially cyclic, infinite inlining that we block with the
inline history mechanism, it can also take seemingly simple call graph
patterns within an SCC and turn them into *insanely* large functions by
accidentally working top-down across the SCC without any of the
threshold limitations that traditional top-down inliners use.

Consider this diabolical monster.cpp file that Richard Smith came up
with to help demonstrate this issue:
```
template <int N> extern const char *str;

void g(const char *);

template <bool K, int N> void f(bool *B, bool *E) {
  if (K)
    g(str<N>);
  if (B == E)
    return;
  if (*B)
    f<true, N + 1>(B + 1, E);
  else
    f<false, N + 1>(B + 1, E);
}
template <> void f<false, MAX>(bool *B, bool *E) { return f<false, 0>(B, E); }
template <> void f<true, MAX>(bool *B, bool *E) { return f<true, 0>(B, E); }

extern bool *arr, *end;
void test() { f<false, 0>(arr, end); }
```

When compiled with '-DMAX=N' for various values of N, this will create an SCC
with a reasonably large number of functions. Previously, the inliner would try
to exhaust the inlining candidates in a single function before moving on. This,
unfortunately, turns it into a top-down inliner within the SCC. Because our
thresholds were never built for that, we will incrementally decide that it is
always worth inlining and proceed to flatten the entire SCC into that one
function.

What's worse, we'll then proceed to the next function, and do the exact same
thing except we'll skip the first function, and so on. And at each step, we'll
also make some of the constant factors larger, which is awesome.

The fix in this patch is the obvious one which makes the new PM's inliner use
the same technique used by the old PM: consider all the call edges across the
entire SCC before beginning to process call edges introduced by inlining. The
result of this is essentially to distribute the inlining across the SCC so that
every function incrementally grows toward the inline thresholds rather than
allowing the inliner to grow one of the functions vastly beyond the threshold.
The code for this is a bit awkward, but it works out OK.
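
The ordering described above can be sketched with a toy worklist. This is a standalone illustration, not the inliner's code: `Edge`, `visitOrder` and the `inlineEdge` callback are hypothetical, and the callback simply reports which new call edges an inlining step exposes.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>; // caller -> callee

// Returns the order in which call edges are visited. Every edge already
// present in the SCC is queued up front; edges exposed by inlining go to the
// back of the queue, so they are only reached after all original edges.
std::vector<Edge> visitOrder(
    const std::vector<Edge> &initial,
    // Hypothetical callback: edges newly exposed in the caller by inlining.
    const std::function<std::vector<Edge>(const Edge &)> &inlineEdge) {
  std::deque<Edge> worklist(initial.begin(), initial.end());
  std::vector<Edge> order;
  while (!worklist.empty()) {
    Edge e = worklist.front();
    worklist.pop_front();
    order.push_back(e);
    for (const Edge &exposed : inlineEdge(e))
      worklist.push_back(exposed);
  }
  return order;
}
```

With initial edges f->g and g->h, where inlining f->g exposes f->h, the sketch visits g->h before f->h, so g is considered before f keeps growing.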

We could consider in the future doing something more powerful here such as
prioritized order (via lowest cost and/or profile info) and/or a code-growth
budget per SCC. However, both of those would require really substantial work
both to design the system in a way that wouldn't break really useful
abstraction decomposition properties of the current inliner and to be tuned
across a reasonably diverse set of code and workloads. It also seems really
risky in many ways. I have only found a single real-world file that triggers
the bad behavior here and it is generated code that has a pretty pathological
pattern. I'm not worried about the inliner not doing an *awesome* job here as
long as it does *ok*. On the other hand, the cases that will be tricky to get
right in a prioritized scheme with a budget will be more common and idiomatic
for at least some frontends (C++ and Rust at least). So while these approaches
are still really interesting, I'm not in a huge rush to go after them. Staying
even closer to the existing PM's behavior, especially when it is this easy to do,
seems like the right short to medium term approach.

I don't really have a test case that makes sense yet... I'll try to find a
variant of the IR produced by the monster template metaprogram that is both
small enough to be sane and large enough to clearly show when we get this wrong
in the future. But I'm not confident this exists. And the behavior change here
*should* be unobservable without snooping on debug logging. So there isn't
really much to test.

The test case updates come from two incidental changes:
1) We now visit functions in an SCC in the opposite order. I don't think there
   really is a "right" order here, so I just update the test cases.
2) We no longer compute some analyses when an SCC has no call instructions that
   we consider for inlining.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297374 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 11:35:40 +00:00
Adam Nemet
bd8dfcd1a2 [SLP] Mark values in Dot that need to be extracted
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297361 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 05:48:03 +00:00
Peter Collingbourne
1166572ca3 WholeProgramDevirt: Implement importing for uniform ret val opt.
Differential Revision: https://reviews.llvm.org/D29854

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297350 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 01:11:15 +00:00
Peter Collingbourne
0e534009f0 WholeProgramDevirt: Implement importing for single-impl devirtualization.
Differential Revision: https://reviews.llvm.org/D29844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297333 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:21:25 +00:00
Teresa Johnson
611bafa4c4 Perform symbol binding for .symver versioned symbols
Summary:
In a .symver assembler directive like:
.symver name, name2@@nodename
"name2@@nodename" should get the same symbol binding as "name".

While the ELF object writer is updating the symbol binding for .symver
aliases before emitting the object file, not doing so when the module
inline assembly is handled by the RecordStreamer is causing the wrong
behavior in *LTO mode.

E.g. when "name" is global, "name2@@nodename" must also be marked as
global. Otherwise, the symbol is skipped when iterating over the LTO
InputFile symbols (InputFile::Symbol::shouldSkip). So, for example,
when performing any *LTO via the gold-plugin, the versioned symbol
definition is not recorded by the plugin and passed back to the
linker. If the object was in an archive, and there were no other symbols
needed from that object, the object would not be included in the final
link and references to the versioned symbol are undefined.
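
The binding rule can be sketched in isolation as follows. This is not the RecordStreamer code: `Binding`, `applySymverBindings` and the two maps are hypothetical stand-ins for the symbol state and the recorded `.symver` directives.

```cpp
#include <map>
#include <string>

enum class Binding { Local, Global, Weak };

// Hypothetical helper: give every .symver alias the binding of the symbol it
// aliases. `aliasToName` maps e.g. "name2@@nodename" to "name".
void applySymverBindings(std::map<std::string, Binding> &bindings,
                         const std::map<std::string, std::string> &aliasToName) {
  for (const auto &entry : aliasToName) {
    auto it = bindings.find(entry.second);
    if (it == bindings.end())
      continue; // aliased symbol has no recorded binding; leave alias alone
    Binding b = it->second; // copy before inserting the alias entry
    bindings[entry.first] = b;
  }
}
```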

The llvm-lto2 tests added will give an error about an unused symbol
resolution without the fix.

Reviewers: rafael, pcc

Reviewed By: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D30485

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:19:49 +00:00
Evgeniy Stepanov
43935601f1 Don't merge global constants with non-dbg metadata.
!type metadata can not be dropped. An alternative to this is adding
!type metadata from the replaced globals to the replacement, but that
may weaken type tests and make them slower at the same time.

The merged global gets !dbg metadata from replaced globals, and can
end up with multiple debug locations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297327 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:03:37 +00:00
George Burgess IV
d52e1fa55c [MemCpyOpt] clang-format + trim the legacy pass. NFC.
None of the declarations below `// Helper functions` seem to have
definitions anymore.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297309 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 21:28:19 +00:00
Adam Nemet
773c83e777 [SLP] Visualize SLP trees with -view-slp-tree
Analyzing larger trees is extremely difficult with the current debug output so
this adds GraphTraits and DOTGraphTraits on top of the VectorizableTree data
structure.  We can now display the SLP trees with Graphviz as in
https://reviews.llvm.org/F3132765.

I decorated the graph where a value needs to be gathered for one reason or
another.  These are the red nodes.

There are other improvements I am planning to make as I work through my case
here.  For example, I would also like to mark nodes that need to be extracted.

Differential Revision: https://reviews.llvm.org/D30731

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297303 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 18:47:50 +00:00
Matthew Simpson
8ca15c2f22 [LV] Select legal insert point when fixing first-order recurrences
Because IRBuilder performs constant-folding, it's not guaranteed that an
instruction in the original loop maps to an instruction in the vector loop. It
could map to a constant vector instead. The handling of first-order recurrences
was incorrectly making this assumption when setting the IRBuilder's insert
point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 18:18:20 +00:00
Jun Bum Lim
f45aefe37d [JumpThread] Use AA in SimplifyPartiallyRedundantLoad()
Summary: Use AA when scanning to find an available load value.

Reviewers: rengolin, mcrosier, hfinkel, trentxintong, dberlin

Reviewed By: rengolin, dberlin

Subscribers: aemerson, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D30352

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297284 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 15:22:30 +00:00
Sanjay Patel
48a60dc29b [InstCombine] avoid crashing on shuffle shrinkage when input type is not same as result type
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297280 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 15:02:23 +00:00
Sam Parker
0bf7b18461 [LoopRotate] Propagate dbg.value intrinsics
Recommitting patch which was previously reverted in r297159. These
changes should address the casting issues.

The original patch enables dbg.value intrinsics to be attached to
newly inserted PHI nodes.

Differential Review: https://reviews.llvm.org/D30701


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297269 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 09:56:22 +00:00
Davide Italiano
22615bf638 [SCCP] Merge markOverdefined and markAnythingOverdefined.
There's no need to have two separate APIs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297253 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 01:26:37 +00:00
Sanjay Patel
70bc85f677 [InstCombine] shrink truncated insertelement into undef vector
This is the 2nd part of solving:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

D30123 moves the trunc ahead of the shuffle, and this moves the trunc ahead of the insertelement.
We're limiting this transform to undef rather than any constant to avoid backend problems.

Differential Revision: https://reviews.llvm.org/D30137


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 23:27:14 +00:00
Evgeniy Stepanov
9616c18c37 Fix one-after-the-end type metadata handling in globalsplit.
Itanium ABI may have an address point one byte after the end of a
vtable. When such vtable global is split, the !type metadata needs to
follow the right vtable.
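
One way to picture the rule is the sketch below. It is a simplified model and an assumption, not the globalsplit implementation: `partForAddressPoint` is a hypothetical helper, the split parts are described only by their end offsets, and an address point is assumed never to sit at the first byte of a part other than offset 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// `ends[i]` is the byte offset one past the end of part i, in ascending order.
// Offset 0 belongs to part 0. Any other offset is attributed to the part that
// contains the byte just before it, so an address point equal to the end of
// part i stays with part i instead of moving to part i + 1.
std::size_t partForAddressPoint(const std::vector<std::uint64_t> &ends,
                                std::uint64_t offset) {
  std::uint64_t attached = offset == 0 ? 0 : offset - 1;
  for (std::size_t i = 0; i < ends.size(); ++i)
    if (attached < ends[i])
      return i;
  return ends.size(); // offset lies beyond all parts
}
```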

Differential Revision: https://reviews.llvm.org/D30716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297236 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 22:18:48 +00:00
Sanjay Patel
b1dd2b4312 [InstCombine] shrink truncated splat shuffle (2nd try)
This was committed at r297155 and reverted at r297166 because of an
over-reaching clang test. That should be fixed with r297189.

This is one part of solving a recent bug report:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

This keeps with our general approach: changing arbitrary shuffles is off-limits,
but changing splat is ok. The transform is very similar to the existing
shrinkBitwiseLogic() canonicalization.

Differential Revision: https://reviews.llvm.org/D30123


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297232 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 21:45:16 +00:00
Gor Nishanov
2d33745172 [coroutines] Add handling for unwind coro.ends
Summary:
The purpose of coro.end intrinsic is to allow frontends to mark the cleanup and
other code that is only relevant during the initial invocation of the coroutine
and should not be present in resume and destroy parts.

In landing pads coro.end is replaced with an appropriate instruction to unwind to
caller. The handling of coro.end differs depending on whether the target is
using landingpad or WinEH exception model.

For landingpad based exception model, it is expected that frontend uses the
`coro.end`_ intrinsic as follows:

```
    ehcleanup:
      %InResumePart = call i1 @llvm.coro.end(i8* null, i1 true)
      br i1 %InResumePart, label %eh.resume, label %cleanup.cont

    cleanup.cont:
      ; rest of the cleanup

    eh.resume:
      %exn = load i8*, i8** %exn.slot, align 8
      %sel = load i32, i32* %ehselector.slot, align 4
      %lpad.val = insertvalue { i8*, i32 } undef, i8* %exn, 0
      %lpad.val29 = insertvalue { i8*, i32 } %lpad.val, i32 %sel, 1
      resume { i8*, i32 } %lpad.val29

```
The `CoroSplit` pass replaces `coro.end` with ``True`` in the resume functions,
thus leading to immediate unwind to the caller, whereas in the start function it
is replaced with ``False``, thus allowing control to proceed to the rest of the
cleanup code that is only needed during the initial invocation of the coroutine.

For Windows Exception handling model, a frontend should attach a funclet bundle
referring to an enclosing cleanuppad as follows:

```
    ehcleanup:
      %tok = cleanuppad within none []
      %unused = call i1 @llvm.coro.end(i8* null, i1 true) [ "funclet"(token %tok) ]
      cleanupret from %tok unwind label %RestOfTheCleanup
```

The `CoroSplit` pass, if the funclet bundle is present, will insert
``cleanupret from %tok unwind to caller`` before
the `coro.end`_ intrinsic and will remove the rest of the block.

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297223 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 21:00:54 +00:00
Xin Tong
928a5d4eba [JumpThread] Simplify CmpInst-as-Condition branch-folding a bit.
Summary: Simplify CmpInst-as-Condition branch-folding a bit.

Reviewers: sanjoy, efriedma

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297186 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 18:59:09 +00:00
Matthew Simpson
30bc56bd3c [LV] Consider users that are memory accesses in uniforms expansion step
When expanding the set of uniform instructions beyond the seed instructions
(e.g., consecutive pointers), we mark a new instruction uniform if all its
loop-varying users are uniform. We should also allow users that are consecutive
or interleaved memory accesses. This fixes cases where we have an instruction
that is used as the pointer operand of a consecutive access but also used by a
non-memory instruction that later becomes uniform as part of the expansion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297179 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 18:47:30 +00:00
Sanjay Patel
5eee35ff8f revert r297155 because there's a clang test that depends on InstCombine:
tools/clang/test/CodeGen/zvector.c



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297166 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 17:41:45 +00:00
Adrian Prantl
378f96b2ce Revert "Strip debug info when inlining into a nodebug function."
This reverts commit r296488.

As noted by David Blaikie on llvm-commits, I overlooked the case of a
debug function being inlined into a nodebug function being inlined
into a debug function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297163 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 17:28:57 +00:00
Nico Weber
cf4eefa8f1 Revert r297132, it caused PR32171
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297159 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 17:23:52 +00:00
Sanjay Patel
8b2302f3d1 [InstCombine] shrink truncated splat shuffle
This is one part of solving a recent bug report:
http://lists.llvm.org/pipermail/llvm-dev/2017-February/110293.html

This keeps with our general approach: changing arbitrary shuffles is off-limits,
but changing splat is ok. The transform is very similar to the existing
shrinkBitwiseLogic() canonicalization.

Differential Revision: https://reviews.llvm.org/D30123


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297155 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 16:10:36 +00:00