.bss section emitted by llvm-bolt (e.g. with instrumentation) is not a
real BSS section, i.e. it takes space in the output file. Hence the
order with respect to .data is not defined. Remove .bss from the test
and fix the buildbot failure.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D135475
Add a new testcase that shows a bug in BOLT when writing out
dynamic relocations. This is currently marked as XFAIL as we work on
solving it. This bug happens when the current strategy fails to
recognize that the original dynamic relocation in the input should
reference the original .bolt.org.rodata section instead of the new one
.rodata created by BOLT after moving jump tables. This bug started
happening after 729d29e167.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D125941
While the order of new sections in the output binary was deterministic
in the past (i.e. there was no run-to-run variation), it wasn't always
rational as we used size to define the precedence of allocatable
sections within "code" or "data" groups (probably unintentionally).
Fix that by defining stricter section-ordering rules.
Other than the order of sections, this should be NFC.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135235
When BOLT updates .eh_frame section, it concatenates newly-generated
contents (from CFI directives) with the original .eh_frame that has
relocations applied to it. However, if no new content is generated,
the original .eh_frame has to be left intact. In that case, BOLT was
still writing out the relocatable copy of the original .eh_frame section
to the new segment, even though this copy was never used and was not
even marked in the section header table.
Detect the scenario above and skip allocating extra space for .eh_frame.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135223
To properly set the "_end" symbol, we need to track the last allocatable
address. Simply emitting "_end" at the end of some section is not
sufficient since the order of section allocation is unknown during the
emission step.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135121
We can emit a binary without a new text section. Hence, the text section
assertion is not needed.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D135120
I went over the output of the following mess of a command:
`(ulimit -m 2000000; ulimit -v 2000000; git ls-files -z | parallel --xargs -0 cat | aspell list --mode=none --ignore-case | grep -E '^[A-Za-z][a-z]*$' | sort | uniq -c | sort -n | grep -vE '.{25}' | aspell pipe -W3 | grep : | cut -d' ' -f2 | less)`
and proceeded to spend a few days looking at it to find probable typos
and fixed a few hundred of them in all of the llvm project (note, the
ones I found are not anywhere near all of them, but it seems like a
good start).
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D130824
In weird entries we were issueing a parse error. For example, in line 5 here:
6862acc063b0aa86595f52ff81628577df4296ff a.so
6862acc063b0aa86595f52ff81628577df4296ff a.so
6862acc063b0aa86595f52ff81628577df4296ff a.so
db758cb3c970044e78d5a4c99b011708a9995636 bin1
60326683eab31acfd03435d9ed4ff9a8 bin2
7d448e51851b4bdb33eac84f90e74628a14a5f00 b.so
742aa26e0211794356cc25f415c25230a26aa045 c.so
Error reading BOLT data input file: line 89, column 33: malformed field
Fix that.
Reviewed By: #bolt, Amir
Differential Revision: https://reviews.llvm.org/D134822
In lite mode, BOLT only transforms a subset of functions, leave the
remaining functions intact.
For NoPIC, it is fine. BOLT can scan relocations and fix-up all refs
that point to any function body in the subset.
For no-split function PIC, it is fine. Since jump tables are intra-
procedural transfer, BOLT can find both the jump table base and the
target within same function. Thus, BOLT can update and/or move jump
tables.
However, it is wrong to process a subset of functions in split function
PIC. This is because BOLT does not know if functions in the subset are
isolated, i.e., cannot be accessed by functions out of the subset,
especially via split jump table.
For example, BOLT only process three functions A, B and C. Suppose that
A is reached via jump table from A.cold, which is not processed. When
A is moved (due to optimization), the jump table in A.cold is invalid.
We cannot fix-up this jump table since it is only recognized in A.cold,
which BOLT does not process.
Solution: Disable lite mode if split function is present.
Future improvement: In lite mode, if split function is found, BOLT
processes both functions in the subset and all of their sibling
fragments.
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir, maksfb
Differential Revision: https://reviews.llvm.org/D131283
This does *not* link with libLLVM, but with static archives instead. Not
super-great, but at least the build works, which is probably better than
failing.
Related to #57551
Differential Revision: https://reviews.llvm.org/D134434
In perf2bolt and `-aggregate-only` BOLT mode, the output profile file is written
in fdata format by default. Provide a knob `-profile-format=[fdata,yaml]` to
control the format.
Note that `-w` option still dumps in YAML format.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D133995
This patch fixes warnings during a release build:
mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp:198:52: error:
lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
bolt/lib/Rewrite/RewriteInstance.cpp:5318:18: error: unused variable
'HasNoAddress' [-Werror,-Wunused-variable]
After BOLT's merge to LLVM, there are two (almost identical) versions of the
code layout algorithm. The diff unifies the implementations by keeping the one
in LLVM.
There are mild changes in the resulting block orders. I tested the changes
extensively both on the clang binary and on prod services. Didn't see stat sig
differences on average.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D129895
I'm planning to deprecate and eventually remove llvm::empty.
Note that no use of llvm::empty requires the ability of llvm::empty to
determine the emptiness from begin/end only.
When we derive EFMM from SectionMemoryManager, it brings into EFMM extra
functionality, such as the registry of exception handling sections,
page permission management, etc. Such functionality is of no use to
llvm-bolt and can even be detrimental (see
https://github.com/llvm/llvm-project/issues/56726).
Change the base class of ExecutableFileMemoryManager to MemoryManager,
avoid registering EH sections, and skip memory finalization.
Fixes#56726
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133994
In non-relocation mode, every function is emitted in its own section. If
a function is empty, RuntimeDyld will still allocate 1-byte section
for the function and initialize it with zero. As a result, we will
overwrite the first byte of the original function contents with zero.
Such scenario can happen when the input function had only NOP
instructions which BOLT removes by default. Even though such functions
likely cause undefined behavior, it's better to preserve their contents.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D133978
For functions with references to internal offsets from data, verify externally
referenced blocks against the set of jump table targets. Mark the function
as non-simple if there are any unclaimed data to code references.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D132495
In non-pie binaries BOLT unconditionally converted type encoding
from indirect to absptr, which broke std exceptions since pointers
to their typeinfo were only assigned at runtime in .data section.
In this patch we preserve original encoding so that indirect
remains indirect and can be resolved at runtime, and absolute remains absolute.
Reviewed By: rafauler, maksfb
Differential Revision: https://reviews.llvm.org/D132484
Without this patch, I get warnings like:
bolt/include/bolt/Core/BinaryContext.h:108:19: error:
'iterator<std::bidirectional_iterator_tag,
llvm::bolt::BinarySection>' is deprecated
[-Werror,-Wdeprecated-declarations]
This patch fixes those warnings by defining iterator_category,
value_type, etc.
This patch intentionally leaves duplicate types like FilterIterator::T
and FilterIterator::PointerT intact to avoid mixing the fix and the
cleanup.
Differential Revision: https://reviews.llvm.org/D133650
This adds a test to verify that when splitting all blocks, landing pad
trampolines are inserted in all blocks.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132426
For exception handling, LSDA call sites have to be emitted for each
fragment individually. With this patch, call sites and respective LSDA
symbols are generated and associated with each fragment of their
function, such that they can be used by the emitter.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132052
To enable split strategies that require view of the entire CFG (e.g. to
estimate cost of path from entry block), with this patch, all blocks of
a function are passed to `SplitStrategy::fragment`. Because this might
move non-outlineable blocks into a split fragment, these blocks are
moved back into the main fragment after fragmenting. This also gives
strategies the option to specify whether empty fragments should be
kept or removed.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132423
ICP has two modes: jump table promotion and indirect call promotion.
The selection is based on whether an instruction has a jump table or not.
An instruction with unknown control flow doesn't have a jump table and will
fall under indirect call promotion policy which might be incorrect/unsafe
(if an instruction is not a tail call, i.e. has local jump targets).
Prevent ICP for functions containing instructions with unknown control flow.
Follow-up to https://reviews.llvm.org/D128870.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132882
This introduces an abstract base class for splitting strategies to
document the interface a strategy needs to implement, and also to avoid
code bloat of the `splitFunction` method.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132054
Output different letters for different sections in the heatmap to
visually separate sections.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D133068
The command to detect whether the stash is needed is `git status --porcelain`
which includes untracked files by default. We want to stash untracked files
as well as they may affect compilation (LLVM CMake checks that all source files
should be included in CMakeLists).
Update the stash command to include untracked files as well.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D132610
We were trying to process .debug_addr for CU that doesn't have it. This resulted
in assert. Example came from GCC that also doesn't use DW_OP_addrx in
DW_FORM_exprloc.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D132422
To generate all symbols correctly, it is necessary to record the address
of each fragment. This patch moves the address info for the main and
cold fragments from BinaryFunction to FunctionFragment, where this data
is recorded for all fragments.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132051
This changes `FunctionFragment` from being used as a temporary proxy
object to access basic block ranges to a heap-allocated object that can
store fragment-specific information.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132050
A const-qualified reference to function layout allows accessing
non-const qualified basic blocks on a const-qualified function. This
patch adds or removes const-qualifiers where necessary to indicate where
basic blocks are used in a non-const manner.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D132049