3427 Commits

Author SHA1 Message Date
Robert Lougher
d1ec87fb2f Resubmit r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"
Failing LLD tests have been fixed in r356593.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356594 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 19:08:18 +00:00
Robert Lougher
6364ead02b Revert r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"
Due to buildbot failures (LLD tests).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356516 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 20:54:20 +00:00
Robert Lougher
e9a0d4ae07 [TailCallElim] Add tailcall elimination pass to LTO pipelines
LTO provides additional opportunities for tailcall elimination due to
link-time inlining and visibility of nocapture attribute. Testing showed
negligible impact on compilation times.

Differential Revision: https://reviews.llvm.org/D58391



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356511 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 20:24:28 +00:00
Teresa Johnson
fdb5c57576 [ThinLTO] Restructure AliasSummary to contain ValueInfo of Aliasee
Summary:
The AliasSummary previously contained the AliaseeGUID, which was only
populated when reading the summary from bitcode. This patch changes it
to instead hold the ValueInfo of the aliasee, and always populates it.
This enables more efficient access to the ValueInfo (specifically in the
recent patch r352438 which needed to perform an index hash table lookup
using the aliasee GUID).

As noted in the comments in AliasSummary, we no longer technically need
to keep a pointer to the corresponding aliasee summary, since it could
be obtained by walking the list of summaries on the ValueInfo looking
for the summary in the same module. However, I am concerned that this
would be inefficient when walking through the index during the thin
link for various analyses. That can be reevaluated in the future.

By always populating this new field, we can remove the guard and special
handling for a 0 aliasee GUID when dumping the dot graph of the summary.

An additional improvement in this patch is when reading the summaries
from LLVM assembly we now set the AliaseeSummary field to the aliasee
summary in that same module, which makes it consistent with the behavior
when reading the summary from bitcode.

Reviewers: pcc, mehdi_amini

Subscribers: inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D57470

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356268 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-15 15:11:38 +00:00
Rong Xu
0d5bf69c9c [PGO] Context sensitive PGO (part 3)
Part 3 of CSPGO changes (mostly related to PassMananger).

Differential Revision: https://reviews.llvm.org/D54175


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355330 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-04 20:21:27 +00:00
Manman Ren
0e8eb4baf2 Add a module pass for order file instrumentation
The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name.

In this pass, we add three global variables:
(1) an order file buffer: a circular buffer at its own llvm section.
(2) a bitmap for each module: one byte for each function to say if the function is already executed.
(3) a global index to the order file buffer.

At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index.

Differential Revision:  https://reviews.llvm.org/D57463



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355133 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-28 20:13:38 +00:00
Rong Xu
5af7fe31f4 [PGO] Context sensitive PGO (part 2)
Part 2 of CSPGO changes (mostly related to ProfileSummary).
Note that I use a default parameter in setProfileSummary() and getSummary().
This is to break the dependency in clang. I will make the parameter explicit
after changing clang in a separated patch.

Differential Revision: https://reviews.llvm.org/D54175


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355131 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-28 19:55:07 +00:00
Eric Christopher
2d97abd893 Temporarily revert "ArgumentPromotion should copy all metadata to new Function" and the dependent patch "Refine ArgPromotion metadata handling" as they're causing segfaults in argument promotion.
This reverts commits r354032 and r353537.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355060 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-28 01:11:12 +00:00
Vedant Kumar
3b8055a11c [HotColdSplit] Disable splitting for sanitized functions
Splitting can make sanitizer errors harder to understand, as the
trapping instruction may not be in the function where the bug was
detected.

rdar://48142697

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354931 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-26 22:55:46 +00:00
Wei Mi
818f20e3da [Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled
is false.

Right now for inliner and partial inliner, we always pass the address of a
valid ORE object to getInlineCost even if RemarkEnabled is false because of
no -Rpass is specified. Since ComputeFullInlineCost will be set to true if
ORE is non-null in getInlineCost, this introduces the problem that in
getInlineCost we cannot return early even if we already know the cost is
definitely higher than the threshold. It is a general problem for compile
time.

This patch fixes that by pass nullptr as the ORE argument if RemarkEnabled is
false.

Differential Revision: https://reviews.llvm.org/D58399



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354542 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-21 02:57:52 +00:00
Vedant Kumar
45c9fb9375 [HotColdSplit] Schedule splitting late to fix perf regression
With or without PGO data applied, splitting early in the pipeline
(either before the inliner or shortly after it) regresses performance
across SPEC variants. The cause appears to be that splitting hides
context for subsequent optimizations.

Schedule splitting late again, in effect reversing r352080, which
scheduled the splitting pass early for code size benefits (documented in
https://reviews.llvm.org/D57082).

Differential Revision: https://reviews.llvm.org/D58258

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354158 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-15 18:46:44 +00:00
Teresa Johnson
c415237177 [ThinLTO] Detect partially split modules during the thin link
Summary:
The changes to disable LTO unit splitting by default (r350949) and
detect inconsistently split LTO units (r350948) are causing some crashes
when the inconsistency is detected in multiple threads simultaneously.
Fix that by having the code always look for the inconsistently split
LTO units during the thin link, by checking for the presence of type
tests recorded in the summaries.

Modify test added in r350948 to remove single threading required to fix
a bot failure due to this issue (and some debugging options added in the
process of diagnosing it).

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57561

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354062 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-14 21:22:50 +00:00
Teresa Johnson
978f4b006b Refine ArgPromotion metadata handling
Summary:
In r353537 we now copy all metadata to the new function, with the old
being removed when the old function is eliminated. In some cases the old
function is dropped to a declaration (seems to only occur with the old
PM). Go ahead and clear all metadata from the old function to handle that
case, since verification will complain otherwise. This is consistent
with what was being done for debug metadata before r353537.

Reviewers: davidxl, uabelho

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354032 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-14 14:14:24 +00:00
Fangrui Song
a7866b1030 [GlobalOpt] Simplify __cxa_atexit elimination
cxxDtorIsEmpty checks callers recursively to determine if the
__cxa_atexit-registered function is empty, and eliminates the
__cxa_atexit call accordingly.

This recursive check is unnecessary as redundant instructions and
function calls can be removed by early-cse and inliner. In addition,
cxxDtorIsEmpty does not mark visited function and it may visit a
function exponential times (multiplication principle).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353603 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-09 09:18:37 +00:00
Sergey Dmitriev
2c7198dbe4 [NFC] Avoid passing blocks vector to the OutlineRegionInfo constructor by value.
Reviewers: vsk, fhahn, davidxl

Reviewed By: vsk

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57957


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353582 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 23:52:15 +00:00
Teresa Johnson
a54c8af4c3 ArgumentPromotion should copy all metadata to new Function
Summary:
ArgumentPromotion had code to specifically move the dbg metadata over to
the new function, but other metadata such as the function_entry_count
!prof metadata was not. Replace code that moved dbg metadata with a call
to copyMetadata. The old metadata is automatically removed when the old
Function is removed.

Reviewers: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353537 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 17:08:27 +00:00
Sergey Dmitriev
9a22bfd50b [CodeExtractor] Update function's assumption cache after extracting blocks from it
Summary: Assumption cache's self-updating mechanism does not correctly handle the case when blocks are extracted from the function by the CodeExtractor. As a result function's assumption cache may have stale references to the llvm.assume calls that were moved to the outlined function. This patch fixes this problem by removing extracted llvm.assume calls from the function’s assumption cache.

Reviewers: hfinkel, vsk, fhahn, davidxl, sanjoy

Reviewed By: hfinkel, vsk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353500 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-08 06:55:18 +00:00
Teresa Johnson
d850ba3107 [HotColdSplit] With PGO add profile entry metadata to split cold function
Summary:
When compiling with profile data, ensure the split cold function gets
cold function_entry_count metadata (just use 0 since it should be cold).
Otherwise with function sections it will not be placed in the unlikely
text section with other cold code.

Reviewers: vsk

Subscribers: sebpop, hiraditya, davidxl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353434 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-07 17:50:35 +00:00
Teresa Johnson
760d33e77d [HotColdSplit] Move splitting after instrumented PGO use
Summary:
Follow up to D57082 which moved splitting earlier in the pipeline, in
order to perform it before inlining. However, it was moved too early,
before the IR is annotated with instrumented PGO data. This caused the
splitting to incorrectly determine cold functions.

Move it to just after PGO annotation (still before inlining), in both
pass managers.

Reviewers: vsk, hiraditya, sebpop

Subscribers: mehdi_amini, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353270 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-06 04:29:39 +00:00
Vedant Kumar
a74459cebd [HotColdSplit] Do not split out resume instructions
Resumes that are not reachable from a cleanup landing pad are considered
to be unreachable. It’s not safe to split them out.

rdar://47808235

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353242 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-05 23:39:02 +00:00
Richard Trieu
f9a06f6926 Fix narrowing issue from r353129
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353134 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-05 02:26:03 +00:00
Wei Mi
19b99c3cc4 [SamplePGO][NFC] Minor improvement to replace a temporary vector with a
brace-enclosed init list.

Differential Revision: https://reviews.llvm.org/D57726


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353129 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-05 00:57:50 +00:00
James Y Knight
3bab951f0f [opaque pointer types] Pass value type to GetElementPtr creation.
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352913 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-01 20:44:47 +00:00
James Y Knight
6c00b3f35f [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352911 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-01 20:44:24 +00:00
James Y Knight
e84538e816 [opaque pointer types] Pass function types to InvokeInst creation.
This cleans up all InvokeInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352910 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-01 20:43:34 +00:00
James Y Knight
6029aa8149 [opaque pointer types] Pass function types to CallInst creation.
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57170

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352909 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-01 20:43:25 +00:00
Yevgeny Rouban
a53c9ac0aa Provide reason messages for unviable inlining
InlineCost's isInlineViable() is changed to return InlineResult
instead of bool. This provides messages for failure reasons and
allows to get more specific messages for cases where callsites
are not viable for inlining.

Reviewed By: xbolva00, anemet

Differential Revision: https://reviews.llvm.org/D57089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352849 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-01 10:44:43 +00:00
James Y Knight
9ec60d7d8f [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.

Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352827 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-01 02:28:03 +00:00
James Y Knight
5be828a949 Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it."
This reverts commit f47d6b38c7a61d50db4566b02719de05492dcef1 (r352791).

Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352800 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-31 21:51:58 +00:00
James Y Knight
8e4d96d7af [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352791 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-31 20:35:56 +00:00
Bjorn Pettersson
d0b3c5133d [IPCP] Don't crash due to arg count/type mismatch between caller/callee
Summary:
This patch avoids an assert in IPConstantPropagation when
there is a argument count/type mismatch between the caller and
the callee.

While this is actually UB on C-level (clang emits a warning),
the IR verifier seems to accept it. I'm not sure what other
frontends/languages might think about this, so simply bailing out
to avoid hitting an assert (in CallSiteBase<>::getArgOperand or
Value::doRAUW) seems like a simple solution.

The problem is exposed by the fact that AbstractCallSites will look
through a bitcast at the callee position of a call/invoke.

Reviewers: jdoerfert, reames, efriedma

Reviewed By: jdoerfert, efriedma

Subscribers: eli.friedman, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D57052

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352469 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-29 10:19:44 +00:00
Teresa Johnson
e5aed2f2a4 [ThinLTO] Refine reachability check to fix compile time increase
Summary:
A recent fix to the ThinLTO whole program dead code elimination (D56117)
increased the thin link time on a large MSAN'ed binary by 2x.
It's likely that the time increased elsewhere, but was more noticeable
here since it was already large and ended up timing out.

That change made it so we would repeatedly scan all copies of linkonce
symbols for liveness every time they were encountered during the graph
traversal. This was needed since we only mark one copy of an aliasee as
live when we encounter a live alias. This patch fixes the issue in a
more efficient manner by simply proactively visiting the aliasee (thus
marking all copies live) when we encounter a live alias.

Two notes: One, this requires a hash table lookup (finding the aliasee
summary in the index based on aliasee GUID). However, the impact of this
seems to be small compared to the original pre-D56117 thin link time. It
could be addressed if we keep the aliasee ValueInfo in the alias summary
instead of the aliasee GUID, which I am exploring in a separate patch.

Second, we only populate the aliasee GUID field when reading summaries
from bitcode (whether we are reading individual summaries and merging on
the fly to form the compiled index, or reading in a serialized combined
index). Thankfully, that's currently the only way we can get to this
code as we don't yet support reading summaries from LLVM assembly
directly into a tool that performs the thin link (they must be converted
to bitcode first). I added a FIXME, however I have the fix under test
already. The easiest fix is to simply populate this field always, which
isn't hard, but more likely the change I am exploring to store the
ValueInfo instead as described above will subsume this. I don't want to
hold up the regression fix for this though.

Reviewers: trentxintong

Subscribers: mehdi_amini, inglorion, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D57203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352438 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-28 22:27:05 +00:00
Vedant Kumar
083b7c74a1 [HotColdSplit] Introduce a cost model to control splitting behavior
The main goal of the model is to avoid *increasing* function size, as
that would eradicate any memory locality benefits from splitting. This
happens when:

  - There are too many inputs or outputs to the cold region. Argument
    materialization and reloads of outputs have a cost.

  - The cold region has too many distinct exit blocks, causing a large
    switch to be formed in the caller.

  - The code size cost of the split code is less than the cost of a
    set-up call.

A secondary goal is to prevent excessive overall binary size growth.

With the cost model in place, I experimented to find a splitting
threshold that works well in practice. To make warm & cold code easily
separable for analysis purposes, I moved split functions to a "cold"
section. I experimented with thresholds between [0, 4] and set the
default to the threshold which minimized geomean __text size.

Experiment data from building LNT+externals for X86 (N = 639 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| **-Os**       | 1736.3           | 0, n=0           | 10961.6        |
| -Os, thresh=0 | 1740.53          | 124.482, n=134   | 11014          |
| -Os, thresh=1 | 1734.79          | 57.8781, n=90    | 10978.6        |
| -Os, thresh=2 | ** 1733.85 **    | 65.6604, n=61    | 10977.6        |
| -Os, thresh=3 | 1733.85          | 65.3071, n=61    | 10977.6        |
| -Os, thresh=4 | 1735.08          | 67.5156, n=54    | 10965.7        |
| **-Oz**       | 1554.4           | 0, n=0           | 10153          |
| -Oz, thresh=2 | ** 1552.2 **     | 65.633, n=61     | 10176          |
| **-O3**       | 2563.37          | 0, n=0           | 13105.4        |
| -O3, thresh=2 | ** 2559.49 **    | 71.1072, n=61    | 13162.4        |

Picking thresh=2 reduces the geomean __text section size by 0.14% at
-Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that
TEXT size is page-aligned, whereas section sizes are byte-aligned.

Experiment data from building LNT+externals for ARM64 (N = 558 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| **-Os**       | 1763.96          | 0, n=0           | 42934.9        |
| -Os, thresh=2 | ** 1760.9 **     | 76.6755, n=61    | 42934.9        |

Picking thresh=2 reduces the geomean __text section size by 0.17% at
-Os and causes no growth in the TEXT segment.

Measurements were done with D57082 (r352080) applied.

Differential Revision: https://reviews.llvm.org/D57125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352228 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-25 18:30:37 +00:00
Vedant Kumar
03b0e5dcc5 [HotColdSplit] Describe the pass in more detail, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352161 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-25 03:22:38 +00:00
Vedant Kumar
7216dfe791 [HotColdSplit] Split more aggressively before/after cold invokes
While a cold invoke itself and its unwind destination can't be
extracted, code which unconditionally executes before/after the invoke
may still be profitable to extract.

With cost model changes from D57125 applied, this gives a 3.5% increase
in split text across LNT+externals on arm64 at -Os.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352160 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-25 03:22:23 +00:00
Vedant Kumar
6d1f753c09 [HotColdSplit] Move splitting earlier in the pipeline
Performing splitting early has several advantages:

  - Inhibiting inlining of cold code early improves code size. Compared
    to scheduling splitting at the end of the pipeline, this cuts code
    size growth in half within the iOS shared cache (0.69% to 0.34%).

  - Inhibiting inlining of cold code improves compile time. There's no
    need to inline split cold functions, or to inline as much *within*
    those split functions as they are marked `minsize`.

  - During LTO, extra work is only done in the pre-link step. Less code
    must be inlined during cross-module inlining.

An additional motivation here is that the most common cold regions
identified by the static/conservative splitting heuristic can (a) be
found before inlining and (b) do not grow after inlining. E.g.
__assert_fail, os_log_error.

The disadvantages are:

  - Some opportunities for splitting out cold code may be missed. This
    gap can potentially be narrowed by adding a worklist algorithm to the
    splitting pass.

  - Some opportunities to reduce code size may be lost (e.g. store
    sinking, when one side of the CFG diamond is split). This does not
    outweigh the code size benefits of splitting earlier.

On net, splitting early in the pipeline has substantial code size
benefits, and no major effects on memory locality or performance. We
measured memory locality using ktrace data, and consistently found that
10% fewer pages were needed to capture 95% of text page faults in key
iOS benchmarks. We measured performance on frequency-stabilized iOS
devices using LNT+externals.

This reverses course on the decision made to schedule splitting late in
r344869 (D53437).

Differential Revision: https://reviews.llvm.org/D57082

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352080 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-24 18:55:49 +00:00
Julian Lettner
a1e328df4a Revert "[Sanitizers] UBSan unreachable incompatible with ASan in the presence of noreturn calls"
This reverts commit cea84ab93aeb079a358ab1c8aeba6d9140ef8b47.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352069 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-24 18:04:21 +00:00
Florian Hahn
603b29b882 Revert "[HotColdSplitting] Get DT and PDT from the pass manager."
This reverts commit a6982414edf315c39ae93f3c3322476217119e99 (llvm-svn: 352036),
because it causes a memory leak in the pass manager. Failing bot

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/10351/steps/check-llvm%20asan/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352041 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-24 11:22:08 +00:00
Florian Hahn
442573a8e6 [HotColdSplitting] Get DT and PDT from the pass manager.
Instead of manually computing DT and PDT, we can get the from the pass
manager, which ideally has them already cached. With the new pass
manager, we could even preserve DT/PDT on a per function basis in a
module pass.

I think this also addresses the TODO about re-using the computed DTs for
BFI. IIUC, GetBFI will fetch the DT from the pass manager and when we
will fetch the cached version later.

Reviewers: vsk, hiraditya, tejohnson, thegameg, sebpop

Reviewed By: vsk

Differential Revision: https://reviews.llvm.org/D57092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352036 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-24 09:44:52 +00:00
Julian Lettner
2e1beed270 [Sanitizers] UBSan unreachable incompatible with ASan in the presence of noreturn calls
Summary:
UBSan wants to detect when unreachable code is actually reached, so it
adds instrumentation before every `unreachable` instruction. However,
the optimizer will remove code after calls to functions marked with
`noreturn`. To avoid this UBSan removes `noreturn` from both the call
instruction as well as from the function itself. Unfortunately, ASan
relies on this annotation to unpoison the stack by inserting calls to
`_asan_handle_no_return` before `noreturn` functions. This is important
for functions that do not return but access the the stack memory, e.g.,
unwinder functions *like* `longjmp` (`longjmp` itself is actually
"double-proofed" via its interceptor). The result is that when ASan and
UBSan are combined, the `noreturn` attributes are missing and ASan
cannot unpoison the stack, so it has false positives when stack
unwinding is used.

Changes:
  # UBSan now adds the `expect_noreturn` attribute whenever it removes
    the `noreturn` attribute from a function
  # ASan additionally checks for the presence of this attribute

Generated code:
```
call void @__asan_handle_no_return    // Additionally inserted to avoid false positives
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
unreachable
```

The second call to `__asan_handle_no_return` is redundant. This will be
cleaned up in a follow-up patch.

rdar://problem/40723397

Reviewers: delcypher, eugenis

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D56624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352003 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-24 01:06:19 +00:00
David Callahan
26d8013491 Update entry count for cold calls
Summary:
Profile sample files include the number of times each entry or inlined
call site is sampled. This is translated into the entry count metadta
on functions.

When sample data is being read, if a call site that was inlined
in the sample program is considered cold and not inlined, then
the entry count of the out-of-line functions does not reflect
the current compilation.

In this patch, we note call sites where the function was not inlined
and as a last action of the sample profile loading, we update the
called function's entry count to reflect the calls from these
call sites which are not included in the profile file.

Reviewers: danielcdh, wmi, Kader, modocache

Reviewed By: wmi

Subscribers: davidxl, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D52845

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352001 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-24 00:55:23 +00:00
Florian Hahn
f060050105 [HotColdSplitting] Remove unused SSAUpdater.h include (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351945 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-23 11:51:38 +00:00
Vedant Kumar
4c9ad6d8f1 [HotColdSplit] Calculate BFI lazily to reduce compile-time, NFC
The splitting pass does not need BFI unless the Module actually has a profile
summary. Do not calcualte BFI unless the summary is present.

For the sqlite3 amalgamation, this reduces time spent in the splitting pass
from 0.4% of the total to under 0.1%.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351894 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-22 22:49:22 +00:00
Vedant Kumar
4631696c98 [HotColdSplit] Calculate domtrees lazily to reduce compile-time, NFC
The splitting pass does not need (post)domtrees until after it's found a
cold block. Defer domtree calculation until a cold block is found.

For the sqlite3 amalgamation, this reduces time spent in the splitting
pass from 0.8% of the total to 0.4%.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351892 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-22 22:49:08 +00:00
Vedant Kumar
4a94358e03 [ConstantMerge] Factor out check for un-mergeable globals, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351671 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-20 02:44:43 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Johannes Doerfert
3afa698404 Enable IPConstantPropagation to work with abstract call sites
This modification of the currently unused inter-procedural constant
propagation pass (IPConstantPropagation) shows how abstract call sites
enable optimization of callback calls alongside direct and indirect
calls. Through minimal changes, mostly dealing with the partial mapping
of callbacks, inter-procedural constant propagation was enabled for
callbacks, e.g., OpenMP runtime calls or pthreads_create.

Differential Revision: https://reviews.llvm.org/D56447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351628 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 05:19:12 +00:00
Vedant Kumar
341781f38d [MergeFunc] Allow merging identical vararg functions using aliases
Thanks to Nikita Popov for pointing out this missed case.

This is a follow-up to r351411, which disabled function merging for
vararg functions outright due to a miscompile (see llvm.org/PR40345).

Differential Revision: https://reviews.llvm.org/D56865

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351624 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 02:46:22 +00:00
Vedant Kumar
3dfba4afc6 [HotColdSplit] Mark inherently cold functions as such
If an inherently cold function is found, mark it as cold. For now this
means applying the `cold` and `minsize` attributes.

As a drive-by, revisit and clean up the criteria for considering a
function for splitting. Add tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351623 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 02:38:47 +00:00
Vedant Kumar
1054bdf355 [HotColdSplit] Remove a set which tracked split functions (NFC)
Use the begin/end iterator idiom to avoid visiting split functions,
instead of doing a set lookup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351622 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 02:38:17 +00:00