2879 Commits

Author SHA1 Message Date
Keno Fischer
1a173576c9 [StripDeadDebug/DIFinder] Track inlined SPs
Summary:
In rL299692 I improved strip-dead-debug-info's ability to drop CUs that are not
referenced from the current module. However, in doing so I neglected to realize
that some SPs could be referenced entirely from inlined functions. It appears
I was not the only one to make this mistake, because DebugInfoFinder, doesn't
find those SPs either. Fix this in DebugInfoFinder and then use that to make
sure not to drop those CUs in strip-dead-debug-info.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31904

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299936 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 13:32:11 +00:00
Diana Picus
1d02724c71 Revert "Turn some C-style vararg into variadic templates"
This reverts commit r299925 because it broke the buildbots. See e.g.
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299928 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 10:07:12 +00:00
Serge Guelton
ec124b3a6f Turn some C-style vararg into variadic templates
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299925 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 08:36:52 +00:00
Reid Kleckner
7dde8e89fc Reland "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies"
This re-lands r299875.

I introduced a bug in Clang code responsible for replacing K&R, no
prototype declarations with a real function definition with a prototype.
The bug was here:

       // Collect any return attributes from the call.
  -    if (oldAttrs.hasAttributes(llvm::AttributeList::ReturnIndex))
  -      newAttrs.push_back(llvm::AttributeList::get(newFn->getContext(),
  -                                                  oldAttrs.getRetAttributes()));
  +    newAttrs.push_back(oldAttrs.getRetAttributes());

Previously getRetAttributes() carried AttributeList::ReturnIndex in its
AttributeList. Now that we return the AttributeSetNode* directly, it no
longer carries that index, and we call this overload with a single node:
  AttributeList::get(LLVMContext&, ArrayRef<AttributeSetNode*>)

That aborted with an assertion on x86_32 targets. I added an explicit
triple to the test and added CHECKs to help find issues like this in the
future sooner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299899 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-10 23:31:05 +00:00
Matt Arsenault
e0b3c335a2 Allow DataLayout to specify addrspace for allocas.
LLVM makes several assumptions about address space 0. However,
alloca is presently constrained to always return this address space.
There's no real way to avoid using alloca, so without this
there is no way to opt out of these assumptions.

The problematic assumptions include:
- That the pointer size used for the stack is the same size as
  the code size pointer, which is also the maximum sized pointer.

- That 0 is an invalid, non-dereferencable pointer value.

These are problems for AMDGPU because alloca is used to
implement the private address space, which uses a 32-bit
index as the pointer value. Other pointers are 64-bit
and behave more like LLVM's notion of generic address
space. By changing the address space used for allocas,
we can change our generic pointer type to be LLVM's generic
pointer type which does have similar properties.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299888 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-10 22:27:50 +00:00
Dehao Chen
5c741f7437 Emit less compiler optimization remarks in samplepgo to reduce a call to findCalleeFunctionSamples which is going to be refactored.
Summary: Now the SamplePGO support is more stable, we do not need so many verbose optimization remarks emitted.

Reviewers: dnovillo, davidxl

Reviewed By: davidxl

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D31826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299883 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-10 20:49:16 +00:00
Evgeniy Stepanov
9cf476f311 Revert "[asan] Fix dead stripping of globals on Linux."
This reverts commit r299697, which caused a big increase in object file size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299879 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-10 20:36:30 +00:00
Reid Kleckner
d5d5d0b80e Revert "[IR] Make AttributeSetNode public, avoid temporary AttributeList copies"
This reverts r299875. A Linux bot came back with a test failure:
http://bb.pgr.jp/builders/test-clang-i686-linux-RA/builds/741/steps/test_clang/logs/Clang%20%3A%3A%20CodeGen__2006-05-19-SingleEltReturn.c

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299878 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-10 20:34:19 +00:00
Reid Kleckner
a8b5a980c8 [IR] Make AttributeSetNode public, avoid temporary AttributeList copies
Summary:
AttributeList::get(Fn|Ret|Param)Attributes no longer creates a temporary
AttributeList just to hide the AttributeSetNode type.

I've also added a factory method to create AttributeLists from a
parallel array of AttributeSetNodes. I think this simplifies
construction of AttributeLists when rewriting function prototypes.
Previously we would test if a particular index had attributes, and
conditionally add a temporary attribute list to a vector. Now the
attribute set vector is parallel to the argument vector already that
these passes already construct.

My long term vision is to wrap AttributeSetNode* inside an AttributeSet
type that holds the enum attributes, but that will come in a follow up
change.

I haven't done any performance measurements for this change because
profiling hasn't shown that any of the affected code is hot.

Reviewers: pete, chandlerc, sanjoy, hfinkel

Reviewed By: pete

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299875 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-10 20:18:10 +00:00
Evgeniy Stepanov
212b2d57cb [cfi] Take over existing __cfi_check in CrossDSOCFI.
https://reviews.llvm.org/D31796 will emit a dummy __cfi_check in the
frontend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299805 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-07 23:00:20 +00:00
Mehdi Amini
8701bbc75d Revert "Turn some C-style vararg into variadic templates"
This reverts commit r299699, the examples needs to be updated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299702 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 20:23:57 +00:00
Mehdi Amini
753bd2a772 Turn some C-style vararg into variadic templates
Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user prospective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's not type checking on the
arguments. The variadic template is an obvious solution to both
issues.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D31070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299699 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 20:09:31 +00:00
Evgeniy Stepanov
7bfedca6fc [asan] Fix dead stripping of globals on Linux.
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.

Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.

This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.

At this moment it does not work on Gold (as in the globals are never
stripped).

This is a re-land of r298158 rebased on D31358. This time,
asan.module_ctor is put in a comdat as well to avoid quadratic
behavior in Gold.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299697 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 19:55:17 +00:00
Keno Fischer
93e3e80c72 [StripDeadDebugInfo] Drop dead CUs entirely
Summary:
Prior to this while it would delete the dead DIGlobalVariables, it would
leave dead DICompileUnits and everything referenced therefrom. For a bit
bitcode file with thousands of compile units those dead nodes easily
outnumbered the real ones. Clean that up.

Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D31720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299692 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-06 19:26:22 +00:00
Bob Haarman
ed735da23f ThinLTOBitcodeWriter: handle aliases first in filterModule
Summary: This change fixes a "local linkage requires default visibility" assert when attempting to build LLVM with ThinLTO on Windows.

Reviewers: pcc, tejohnson, mehdi_amini

Reviewed By: pcc

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D31632


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299491 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-05 00:42:07 +00:00
Rong Xu
9e52f8ee8b [PGO] Memory intrinsic calls optimization based on profiled size
This patch optimizes two memory intrinsic operations: memset and memcpy based
on the profiled size of the operation. The high level transformation is like:
  mem_op(..., size)
  ==>
  switch (size) {
    case s1:
       mem_op(..., s1);
       goto merge_bb;
    case s2:
       mem_op(..., s2);
       goto merge_bb;
    ...
    default:
       mem_op(..., size);
       goto merge_bb;
    }
  merge_bb:

Differential Revision: http://reviews.llvm.org/D28966


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299446 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-04 16:42:20 +00:00
Peter Collingbourne
0b40dade41 ThinLTOBitcodeWriter: Use Module::global_values(). NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299132 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-30 23:43:08 +00:00
Joerg Sonnenberger
3d72b70842 Split the SimplifyCFG pass into two variants.
The first variant contains all current transformations except
transforming switches into lookup tables. The second variant
contains all current transformations.

The switch-to-lookup-table conversion results in code that is more
difficult to analyze and optimize by other passes. Most importantly,
it can inhibit Dead Code Elimination. As such it is often beneficial to
only apply this transformation very late. A common example is inlining,
which can often result in range restrictions for the switch expression.

Changes in execution time according to LNT:
SingleSource/Benchmarks/Misc/fp-convert +3.03%
MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20%
MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43%
and a couple of smaller changes. For perimeter it also results 2.6%
a smaller binary.

Differential Revision: https://reviews.llvm.org/D30333


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298799 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-26 06:44:08 +00:00
Dehao Chen
f62637a837 Set the prof weight correctly for call instructions in DeadArgumentElimination.
Summary: In DeadArgumentElimination, the call instructions will be replaced. We also need to set the prof weights so that function inlining can find the correct profile.

Reviewers: eraman

Reviewed By: eraman

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298660 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 23:26:00 +00:00
Dehao Chen
42740ced92 Disable loop unrolling and icp in SamplePGO ThinLTO compile phase
Summary:
loop unrolling and icp will make the sample profile annotation much harder in the backend. So disable these 2 optimization in the ThinLTO compile phase.
Will add a test in cfe in a separate patch.

Reviewers: tejohnson

Reviewed By: tejohnson

Subscribers: mehdi_amini, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D31217

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298646 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 21:20:05 +00:00
Teresa Johnson
08d0e94685 [ThinLTO] Add support for emitting minimized bitcode for thin link
Summary:
The cumulative size of the bitcode files for a very large application
can be huge, particularly with -g. In a distributed build environment,
all of these files must be sent to the remote build node that performs
the thin link step, and this can exceed size limits.

The thin link actually only needs the summary along with a bitcode
symbol table. Until we have a proper bitcode symbol table, simply
stripping the debug metadata results in significant size reduction.

Add support for an option to additionally emit minimized bitcode
modules, just for use in the thin link step, which for now just strips
all debug metadata. I plan to add a cc1 option so this can be invoked
easily during the compile step.

However, care must be taken to ensure that these minimized thin link
bitcode files produce the same index as with the original bitcode files,
as these original bitcode files will be used in the backends.

Specifically:
1) The module hash used for caching is typically produced by hashing the
written bitcode, and we want to include the hash that would correspond
to the original bitcode file. This is because we want to ensure that
changes in the stripped portions affect caching. Added plumbing to emit
the same module hash in the minimized thin link bitcode file.
2) The module paths in the index are constructed from the module ID of
each thin linked bitcode, and typically is automatically generated from
the input file path. This is the path used for finding the modules to
import from, and obviously we need this to point to the original bitcode
files. Added gold-plugin support to take a suffix replacement during the
thin link that is used to override the identifier on the MemoryBufferRef
constructed from the loaded thin link bitcode file. The assumption is
that the build system can specify that the minimized bitcode file has a
name that is similar but uses a different suffix (e.g. out.thinlink.bc
instead of out.o).

Added various tests to ensure that we get identical index files out of
the thin link step.

Reviewers: mehdi_amini, pcc

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298638 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 19:47:39 +00:00
Dehao Chen
b9056a6799 Do not set branch weight if the branch weight annotation is present.
Summary: ThinLTO will annotate the CFG twice. If the branch weight is set by the first annotation, we should not set the branch weight again in the second annotation because the first annotation is more accurate as there is less optimization that could affect debug info accuracy.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: mehdi_amini, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D31228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298602 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-23 14:43:10 +00:00
Peter Collingbourne
e53e585ce9 IPO: Const correctness for summaries passed into passes.
Pass const qualified summaries into importers and unqualified summaries into
exporters. This lets us const-qualify the summary argument to thinBackend.

Differential Revision: https://reviews.llvm.org/D31230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298534 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-22 18:22:59 +00:00
Peter Collingbourne
49df148534 IR: Fix a race condition in type id clients of ModuleSummaryIndex.
Add a const version of the getTypeIdSummary accessor that avoids
mutating the TypeIdMap.

Differential Revision: https://reviews.llvm.org/D31226

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298531 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-22 18:04:39 +00:00
Evgeny Astigeevich
ef253e2f6e r286814 resulted that CallPenalty can be subtracted twice:
- First time, during calculation of the cost in InlineCost.cpp
- Second time, during calculation of the cost in Inliner.cpp

This patches fixes this.

Differential Revision: https://reviews.llvm.org/D31137


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298496 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-22 12:01:57 +00:00
Dehao Chen
287fe25641 Do not inline hot callsites for samplepgo in thinlto compile phase.
Summary: Because SamplePGO passes will be invoked twice in ThinLTO build: once at compile phase, the other at backend. We want to make sure the IR at the 2nd phase matches the hot part in profile, thus we do not want to inline hot callsites in the first phase.

Reviewers: tejohnson, eraman

Reviewed By: tejohnson

Subscribers: mehdi_amini, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D31201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298428 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 19:55:36 +00:00
Reid Kleckner
6707770d48 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298393 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 16:57:19 +00:00
Evgeniy Stepanov
d3ffac83c0 Revert r298158.
Revert "[asan] Fix dead stripping of globals on Linux."

OOM in gold linker.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298288 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-20 18:45:34 +00:00
Evgeniy Stepanov
a82d7906c2 [asan] Fix dead stripping of globals on Linux.
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.

Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.

This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.

At this moment it does not work on Gold (as in the globals are never
stripped).

Differential Revision: https://reviews.llvm.org/D30121

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298158 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 22:17:29 +00:00
Stanislav Mekhanoshin
17dcd3dc69 Only unswitch loops with uniform conditions
Loop unswitching can be extremely harmful for a SIMT target. In case
if hoisted condition is not uniform a SIMT machine will execute both
clones of a loop sequentially. Therefor LoopUnswitch checks if the
condition is non-divergent.

Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
not needed for non-SIMT targets a new option is added to avoid unneded
analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available and we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior and set this field from a target.

Differential Revision: https://reviews.llvm.org/D30796

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298104 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 17:13:41 +00:00
Chandler Carruth
c177883098 [PM/Inliner] Fix a bug in r297374 where we would leave stale calls in
the work queue and crash when trying to visit them after deleting the
function containing those calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-16 10:45:42 +00:00
Dehao Chen
cd2a5b62d1 SamplePGO ThinLTO ICP fix for local functions.
Summary:
In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places of where the mismatch can happen:

1. thin-link prepends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName.
2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName

This patch tries at the best effort to find the GUID from the original local function name (in profile), and use that in ICP promotion, and in SamplePGO matching that happens in the backend after importing/inlining:

1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import.
2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName.
3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of PromotedFunctionName, so that ICP can find the correct target to promote.

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D30754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297757 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-14 17:33:01 +00:00
Peter Collingbourne
bfecb4640b WholeProgramDevirt: Implement export/import support for VCP.
Differential Revision: https://reviews.llvm.org/D30017

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297503 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 20:13:58 +00:00
Peter Collingbourne
eb3c7034f3 WholeProgramDevirt: Implement export/import support for unique ret val opt.
Differential Revision: https://reviews.llvm.org/D29917

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297502 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 20:09:11 +00:00
George Rimar
a312eded5f WholeProgramDevirt: Fixed compilation error under MSVS2015.
It was introduced in:

r296945
WholeProgramDevirt: Implement exporting for single-impl devirtualization.
---------------------
r296939
WholeProgramDevirt: Add any unsuccessful llvm.type.checked.load devirtualizations to the list of llvm.type.test users.
---------------------

Microsoft Visual Studio Community 2015
Version 14.0.23107.0 D14REL
Does not compile that code without additional brackets, showing multiple error like below:

WholeProgramDevirt.cpp(1216): error C2958: the left bracket '[' found at 'c:\access_softek\llvm\lib\transforms\ipo\wholeprogramdevirt.cpp(1216)' was not matched correctly
WholeProgramDevirt.cpp(1216): error C2143: syntax error: missing ']' before '}'
WholeProgramDevirt.cpp(1216): error C2143: syntax error: missing ';' before '}'
WholeProgramDevirt.cpp(1216): error C2059: syntax error: ']'


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297451 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 10:31:56 +00:00
Chandler Carruth
4fea871248 [PM/Inliner] Make the new PM's inliner process call edges across an
entire SCC before iterating on newly-introduced call edges resulting
from any inlined function bodies.

This more closely matches the behavior of the old PM's inliner. While it
wasn't really clear to me initially, this behavior is actually essential
to the inliner behaving reasonably in its current design.

Because the inliner is fundamentally a bottom-up inliner and all of its
cost modeling is designed around that it often runs into trouble within
an SCC where we don't have any meaningful bottom-up ordering to use. In
addition to potentially cyclic, infinite inlining that we block with the
inline history mechanism, it can also take seemingly simple call graph
patterns within an SCC and turn them into *insanely* large functions by
accidentally working top-down across the SCC without any of the
threshold limitations that traditional top-down inliners use.

Consider this diabolical monster.cpp file that Richard Smith came up
with to help demonstrate this issue:
```
template <int N> extern const char *str;

void g(const char *);

template <bool K, int N> void f(bool *B, bool *E) {
  if (K)
    g(str<N>);
  if (B == E)
    return;
  if (*B)
    f<true, N + 1>(B + 1, E);
  else
    f<false, N + 1>(B + 1, E);
}
template <> void f<false, MAX>(bool *B, bool *E) { return f<false, 0>(B, E); }
template <> void f<true, MAX>(bool *B, bool *E) { return f<true, 0>(B, E); }

extern bool *arr, *end;
void test() { f<false, 0>(arr, end); }
```

When compiled with '-DMAX=N' for various values of N, this will create an SCC
with a reasonably large number of functions. Previously, the inliner would try
to exhaust the inlining candidates in a single function before moving on. This,
unfortunately, turns it into a top-down inliner within the SCC. Because our
thresholds were never built for that, we will incrementally decide that it is
always worth inlining and proceed to flatten the entire SCC into that one
function.

What's worse, we'll then proceed to the next function, and do the exact same
thing except we'll skip the first function, and so on. And at each step, we'll
also make some of the constant factors larger, which is awesome.

The fix in this patch is the obvious one which makes the new PM's inliner use
the same technique used by the old PM: consider all the call edges across the
entire SCC before beginning to process call edges introduced by inlining. The
result of this is essentially to distribute the inlining across the SCC so that
every function incrementally grows toward the inline thresholds rather than
allowing the inliner to grow one of the functions vastly beyond the threshold.
The code for this is a bit awkward, but it works out OK.

We could consider in the future doing something more powerful here such as
prioritized order (via lowest cost and/or profile info) and/or a code-growth
budget per SCC. However, both of those would require really substantial work
both to design the system in a way that wouldn't break really useful
abstraction decomposition properties of the current inliner and to be tuned
across a reasonably diverse set of code and workloads. It also seems really
risky in many ways. I have only found a single real-world file that triggers
the bad behavior here and it is generated code that has a pretty pathological
pattern. I'm not worried about the inliner not doing an *awesome* job here as
long as it does *ok*. On the other hand, the cases that will be tricky to get
right in a prioritized scheme with a budget will be more common and idiomatic
for at least some frontends (C++ and Rust at least). So while these approaches
are still really interesting, I'm not in a huge rush to go after them. Staying
even closer to the existing PM's behavior, especially when this easy to do,
seems like the right short to medium term approach.

I don't really have a test case that makes sense yet... I'll try to find a
variant of the IR produced by the monster template metaprogram that is both
small enough to be sane and large enough to clearly show when we get this wrong
in the future. But I'm not confident this exists. And the behavior change here
*should* be unobservable without snooping on debug logging. So there isn't
really much to test.

The test case updates come from two incidental changes:
1) We now visit functions in an SCC in the opposite order. I don't think there
   really is a "right" order here, so I just update the test cases.
2) We no longer compute some analyses when an SCC has no call instructions that
   we consider for inlining.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297374 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 11:35:40 +00:00
Peter Collingbourne
1166572ca3 WholeProgramDevirt: Implement importing for uniform ret val opt.
Differential Revision: https://reviews.llvm.org/D29854

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297350 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 01:11:15 +00:00
Peter Collingbourne
0e534009f0 WholeProgramDevirt: Implement importing for single-impl devirtualization.
Differential Revision: https://reviews.llvm.org/D29844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297333 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:21:25 +00:00
Teresa Johnson
611bafa4c4 Perform symbol binding for .symver versioned symbols
Summary:
In a .symver assembler directive like:
.symver name, name2@@nodename
"name2@@nodename" should get the same symbol binding as "name".

While the ELF object writer is updating the symbol binding for .symver
aliases before emitting the object file, not doing so when the module
inline assembly is handled by the RecordStreamer is causing the wrong
behavior in *LTO mode.

E.g. when "name" is global, "name2@@nodename" must also be marked as
global. Otherwise, the symbol is skipped when iterating over the LTO
InputFile symbols (InputFile::Symbol::shouldSkip). So, for example,
when performing any *LTO via the gold-plugin, the versioned symbol
definition is not recorded by the plugin and passed back to the
linker. If the object was in an archive, and there were no other symbols
needed from that object, the object would not be included in the final
link and references to the versioned symbol are undefined.

The llvm-lto2 tests added will give an error about an unused symbol
resolution without the fix.

Reviewers: rafael, pcc

Reviewed By: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D30485

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:19:49 +00:00
Evgeniy Stepanov
43935601f1 Don't merge global constants with non-dbg metadata.
!type metadata can not be dropped. An alternative to this is adding
!type metadata from the replaced globals to the replacement, but that
may weaken type tests and make them slower at the same time.

The merged global gets !dbg metadata from replaced globals, and can
end up with multiple debug locations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297327 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:03:37 +00:00
Evgeniy Stepanov
9616c18c37 Fix one-after-the-end type metadata handling in globalsplit.
Itanium ABI may have an address point one byte after the end of a
vtable. When such vtable global is split, the !type metadata needs to
follow the right vtable.

Differential Revision: https://reviews.llvm.org/D30716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297236 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 22:18:48 +00:00
Hans Wennborg
0de969bf83 Disable gvn-hoist (PR32153)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297075 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-06 21:10:40 +00:00
Dehao Chen
08b8679347 Remove the sample pgo annotation heuristic that uses call count to annotate basic block count.
Summary: We do not need that special handling because the debug info is more accurate now. Performance testing shows no regression on google internal benchmarks.

Reviewers: davidxl, aprantl

Reviewed By: aprantl

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D30658

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-06 17:49:59 +00:00
Peter Collingbourne
708709f2dc Fix build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296949 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-04 01:38:05 +00:00
Peter Collingbourne
3f44bcbdcb WholeProgramDevirt: Implement exporting for uniform ret val opt.
Differential Revision: https://reviews.llvm.org/D29846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296948 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-04 01:34:53 +00:00
Peter Collingbourne
d1e011ddda WholeProgramDevirt: Implement exporting for single-impl devirtualization.
Differential Revision: https://reviews.llvm.org/D29811

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296945 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-04 01:31:01 +00:00
Peter Collingbourne
d31e3edb6f WholeProgramDevirt: Add any unsuccessful llvm.type.checked.load devirtualizations to the list of llvm.type.test users.
Any unsuccessful llvm.type.checked.load devirtualizations will be translated
into uses of llvm.type.test, so we need to add the resulting llvm.type.test
intrinsics to the function summaries so that the LowerTypeTests pass will
export them.

Differential Revision: https://reviews.llvm.org/D29808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296939 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-04 01:23:30 +00:00
Benjamin Kramer
9ee375bd99 Revert "Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline.""
This reverts commit r296759. Miscompiles bash.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296872 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-03 14:27:53 +00:00
Peter Collingbourne
d8035f7f14 ThinLTOBitcodeWriter: Do not follow operand edges of type GlobalValue when looking for virtual functions.
Such edges may otherwise result in infinite recursion if a pointer to a vtable
is reachable from the vtable itself. This can happen in practice if a TU
defines the ABI types used to implement RTTI, and is itself compiled with RTTI.

Fixes PR32121.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296839 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-02 23:10:17 +00:00
Geoff Berry
7bc404756c Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline."
This re-applies r289696, which caused TSan perf regression, which has
since been addressed in separate changes (see PR for details).

See PR31382.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296759 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-02 16:16:47 +00:00