Summary:
The lower_bound might return the end iterator, the ignoring of which will
cause memory corruption.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD33307803)
Summary:
Fix missing string header file inclusion and link_fdata find
problem in lit tests. Change root-level tests to require
linux. Re-enable Windows in our root CMakeLists.txt.
(cherry picked from FBD33296290)
Summary:
Fix according to Coding Standards doc, section Don't Use
Braces on Simple Single-Statement Bodies of if/else/loop Statements.
This set of changes applies to lib Core only.
(cherry picked from FBD33240028)
Summary:
Since nops are now removed in a separate pass, the profile is consumed
on a CFG with nops. If previously a profile was generated without nops,
the offsets in the profile could be different if branches included nops
either as a source or a destination.
This diff adjust offsets to make the profile reading backwards
compatible.
(cherry picked from FBD33231254)
Summary:
The patch moves the shortenInstructions and nop remove to separate binary
passes. As a result when llvm-bolt optimizations stage will begin the
instructions of the binary functions will be absolutely the same as it
was in the binary. This is needed for the golang support by llvm-bolt.
Some of the tests must be changed, since bb alignment nops might create
unreachable BBs in original functions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD32896517)
Summary:
This patch adds AArch64 relocations handling in case updating of
debug sections is enabled
Elvina Yakubova,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD33077609)
Summary:
Create a new high-level target named bolt that builds all
BOLT artifacts, as well as a install-bolt target that installs them.
(cherry picked from FBD33133002)
Summary:
Gracefully handle binaries with split functions where two fragments are folded
into one, resulting in a fragment with two parent functions.
This behavior is expected in GCC8+ with -O2 optimization level, where both
function splitting and ICF are enabled by default.
On the BOLT side, the changes are:
- BinaryFunction: allow multiple parent fragments:
- `ParentFragment` --> `ParentFragments`,
- `setParentFragment` --> `addParentFragment`.
- BinaryContext:
- `populateJumpTables`: mark fragments to be skipped later,
- `registerFragment`: add a name heuristic check, return false if it failed,
- `processInterproceduralReferences`: check if `registerFragment`
succeeded, otherwise issue a warning,
- `skipMarkedFragments`: move out fragment traversal and skipping from
`populateJumpTables` into a separate function.
This change fixes an issue where unrelated functions might be registered
as fragments:
```
BOLT-WARNING: interprocedural reference between unrelated fragments:
bad_gs/1(*2) and amd_decode_mce.cold.27/1(*2)
```
(Linux kernel binary)
(cherry picked from FBD32786688)
Summary:
Refactor members of BinaryBasicBlock. Replace some std containers with
ADT equivalents. The size of BinaryBasicBlock on x86-64 Linux is reduced
from 232 bytes to 192 bytes.
(cherry picked from FBD33081850)
Summary:
Build is already broken because VS fails to locate
llvm-boltdiff when running tests, and VS also complains that
include/bolt/Passes/InstrumentationSummary.h is lacking an include
string header. Disable this until we have a Windows buildbot to make
sure this build is sane.
(cherry picked from FBD33039972)
Summary:
Switched members of BinaryFunction to ADT where it was possible and
made sense. As a result, the size of BinaryFunction on x86-64 Linux
reduced from 1624 bytes to 1448.
(cherry picked from FBD32981555)
Summary:
For DWP case the AbbreviationsOffset is the offset of the abbrev
contribution in the DWP file, so can be none zero.
(cherry picked from FBD32961240)
Summary:
Currently, RuntimeDyld will not allocate a section without relocations
even if such a section is marked allocatable and defines symbols.
When we emit .debug_line for compile units with unchanged code, we
output original (input) data, without relocations. If all units are
emitted in this way, we will have no relocations in the emitted
.debug_line. RuntimeDyld will not allocate the section and as a result
we will write an empty .debug_line section.
To workaround the issue, always emit a relocation of RELOC_NONE type
when emitting raw contents to debug_line.
(cherry picked from FBD32909869)
Summary:
Some optimizations may remove all instructions in a basic block.
The pass will cleanup the CFG afterwards by removing empty basic
blocks and merging duplicate CFG edges.
The normalized CFG is printed under '-print-normalized' option.
(cherry picked from FBD32774360)
Summary:
The debug message for the last fall-through block was printed under the
reverse condition, i.e. when the block was not a fall-through. Remove
the debug message. If we'll need such information, we can add a pass
with more analysis, i.e. checking the last instruction, if the block is
reachable, etc.
(cherry picked from FBD32670816)
Summary:
TailDuplication::isInCacheLine makes the assumption that the block
has a valid layout index, which is not the case for unreachable blocks.
Add a check for a valid layout index.
(cherry picked from FBD32659755)
Summary:
In some of the system stack protection is enabled by default, which will
lead in extra symbols dependencies, which we want to avoid.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD32478426)
Summary:
As pointed out by Vladislav in issue 217, if our RTDyld-based
linker fails to locate a symbol, it will crash with segfault. Fix that.
(cherry picked from FBD32481543)
Summary:
Currently there are two issues rendering the use of bughunter/BOLT on a binary
with a large number of functions (100k) impossible:
1) `selectFunctionsToProcess` has O(binary_fn * force_fn) run-time, which is up
to quadratic with the number of functions in the binary.
2) It unnecessarily treats supplied function names as regexes.
This diff proposes the following changes to address the issue:
1. Add two options that treat function names as is, not as regexes, matching
bughunter usage model: `-funcs-no-regex`/`-funcs-file-no-regex`.
These options are complementary to `-funcs`/`-funcs-file` and `-skip-funcs`/
`-skip-funcs-file`. `funcs` takes precedence over `funcs-no-regex`.
2. Use string set to speed up function eligibility checking with
`-funcs-file-no-regex` to O(binary_fn * log force_fn).
(cherry picked from FBD28917225)
Summary:
The push and pop instructions might have wrong reorder due to this
error. Thanks rafaelauler for the provided test case.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD32478348)
Summary:
We were not tracking -DBUILD_SHARED_LIBS=ON and we introduced
some commits that break that by not specifying the correct dependencies.
Fix that.
(cherry picked from FBD32453377)
Summary: In some cases __open() is returning 0 for me. The open
syscall will return a negative number (-1) on error so 0 should be
valid.
This happens when one compiles a BOLT instrumented executable of
pyston (python implementation) and afterwards pip installs a package
which needs to be compiled and sets CFLAGS=-pipe. I could not reduce
it down to a small testcase but I guess it one can trigger when
manually closing std{in, out, err}. Everything seems to work
normally when disabling the assert for 0 in getBinaryPath() - I
decided to also modify the second case in readDescriptions() even
though I did not run into that one yet.
(cherry picked from FBD32409548)
Summary:
Make BOLT build in VisualStudio compiler and run without
crashing on a simple test. Other tests are not running.
(cherry picked from FBD32378736)