Summary:
Currently, most warnings are printed only in debug mode. Since
relocations are critical for the binary to work correctly, I suggest
printing the number of relocations that failed to process, to draw extra
attention to any problems encountered with them.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30500629)
Summary:
Added a function in TailDuplication
that performs constant and copy propagation for blocks that
we duplicated as part of tail duplication. Added supporting
functions to MCPlusBuilder to find source registers and replace
registers.
(cherry picked from FBD30231907)
Summary:
This patch is part of the preparation for golang support. Golang symbols
might have spaces in the name (for example "type..eq.[10]interface {}").
Since fdata uses spaces as a field separator, such names break the fdata
format, so we need to escape whitespace and backslashes in symbol names
using the backslash character.
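The escaping rule can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper names `escapeFdataName` and `unescapeFdataName` are hypothetical, and the sketch assumes every whitespace character and every backslash is prefixed with a single backslash.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: prefix each whitespace character and each
// backslash with '\' so the name no longer contains a bare separator.
std::string escapeFdataName(const std::string &Name) {
  std::string Out;
  Out.reserve(Name.size());
  for (char C : Name) {
    if (C == '\\' || std::isspace(static_cast<unsigned char>(C)))
      Out.push_back('\\');
    Out.push_back(C);
  }
  return Out;
}

// Hypothetical inverse: drop each escaping backslash and keep the
// character that follows it.
std::string unescapeFdataName(const std::string &Escaped) {
  std::string Out;
  for (std::size_t I = 0; I < Escaped.size(); ++I) {
    if (Escaped[I] == '\\' && I + 1 < Escaped.size())
      ++I;
    Out.push_back(Escaped[I]);
  }
  return Out;
}
```

Escaping the backslash itself is what keeps the transformation reversible: a reader can always tell an escaping backslash from one that was part of the original name.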
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD29999491)
Summary:
Remove unused code introduced a while ago (2016), with its use removed
since then.
PR facebookincubator/BOLT#198
Author: Amir Aupov <aaupov@fb.com>
(cherry picked from FBD30376537)
Summary:
The ADRP instruction has a 21-bit immediate that stores the page offset,
and the 12 lowest bits are zero, which gives a total of 33 bits (32 bits
for the address plus 1 sign bit, to address +-4GB).
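The arithmetic above can be checked with a small range test. This is a sketch under stated assumptions: `fitsInADRP` is a hypothetical name, pages are 4 KiB, and the immediate is treated as a signed 21-bit page count.

```cpp
#include <cstdint>

constexpr uint64_t PageSize = 4096; // the 12 low bits are always zero

// Hypothetical check: true if the page holding Target is reachable from
// the page holding PC with a signed 21-bit page delta, i.e. +-4GB.
bool fitsInADRP(uint64_t PC, uint64_t Target) {
  const int64_t PCPage = static_cast<int64_t>(PC / PageSize);
  const int64_t TargetPage = static_cast<int64_t>(Target / PageSize);
  const int64_t PageDelta = TargetPage - PCPage;
  return PageDelta >= -(INT64_C(1) << 20) && PageDelta < (INT64_C(1) << 20);
}
```

A delta of 2^20 pages (exactly 4GB forward) is already out of range, while 2^20 pages backward still fits, as usual for a two's complement field.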
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30283044)
Summary:
This commit adds dummy tests for checking instrumentation
support for PIE executables and shared libraries.
Vasily Leonenko,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092729)
Summary:
To avoid RELATIVE relocations, avoid using the GOT
by giving all symbols in the library hidden visibility.
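One way to give a group of symbols hidden visibility with GCC or Clang is shown below. This is only an illustration: the message does not say whether the library uses a pragma, an attribute, or the -fvisibility=hidden compiler flag, and the names `Counter` and `bump` are made up.

```cpp
// Everything declared between push and pop gets hidden visibility, so
// references from within the same library do not need to go through
// the GOT.
#pragma GCC visibility push(hidden)

int Counter = 0;

int bump() { return ++Counter; }

#pragma GCC visibility pop
```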
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092712)
Summary:
The trampolines are no longer pointers to the functions. For
proper name resolution by bolt, use extern "C" for all external symbols
in instr.cpp.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092698)
Summary:
This commit introduces the -instrumentation-binpath argument, used
to point to the instrumented binary at runtime in case the
/proc/self/map_files path is not accessible due to access restrictions.
Vasily Leonenko,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092681)
Summary:
This commit introduces instrumentation support for static
binaries. Note that the current implementation does not support profile
output at finalization of the instrumented binary, so it requires the
-instrumentation-sleep-time=N (N>0) option. Note: there is an
unhandled case with static PIE executables, which might have a dynamic
header.
Vasily Leonenko,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092471)
Summary:
This commit adds support for opening libs based on the links in
/proc/self/map_files. To do this, we take the current virtual address
and search the directory for the entry whose address range contains it.
After that, we get the full path to the binary by using the readlink
function. Reading directly through the links in /proc/self/map_files is
not possible because of a lack of permissions.
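The lookup described above can be sketched with ordinary POSIX calls. This is an illustration only: `rangeContains` and `findMappedPath` are hypothetical names, the entry names are assumed to have the form `<start>-<end>` in hexadecimal, and how the runtime library itself issues these calls is not shown here.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

#include <dirent.h>
#include <limits.h>
#include <unistd.h>

// Hypothetical helper: true if Name looks like "<start>-<end>" (hex)
// and Addr lies in the half-open range [start, end).
bool rangeContains(const char *Name, uint64_t Addr) {
  char *End = nullptr;
  const uint64_t Start = std::strtoull(Name, &End, 16);
  if (End == Name || *End != '-')
    return false;
  const char *Second = End + 1;
  const uint64_t Stop = std::strtoull(Second, &End, 16);
  if (End == Second || *End != '\0')
    return false;
  return Addr >= Start && Addr < Stop;
}

// Hypothetical helper: resolve the file mapped at Addr by reading the
// target of the matching link. Returns an empty string on failure.
std::string findMappedPath(uint64_t Addr) {
  const std::string Dir = "/proc/self/map_files";
  DIR *D = opendir(Dir.c_str());
  if (!D)
    return "";
  std::string Result;
  while (dirent *Entry = readdir(D)) {
    if (!rangeContains(Entry->d_name, Addr))
      continue;
    char Buf[PATH_MAX];
    const std::string Link = Dir + "/" + Entry->d_name;
    const ssize_t Len = readlink(Link.c_str(), Buf, sizeof(Buf));
    if (Len > 0)
      Result.assign(Buf, static_cast<std::size_t>(Len));
    break;
  }
  closedir(D);
  return Result;
}
```

The sketch only reads the link target and never opens the link itself, which matches the restriction noted above.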
Elvina Yakubova,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092422)
Summary:
This commit adds support for getting directory entries and
reading the value of a symbolic link in the instrumentation runtime library.
Elvina Yakubova,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092362)
Summary:
This commit implements a new method of hooking the _start & _fini functions
that allows relative jumps to be used, for future PIE & .so library support.
Instead of using the absolute addresses of the _start & _fini functions known
at link time, we use dynamically created trampoline functions and the
corresponding symbols in the instrumentation runtime library.
Since we would like to use instrumentation for dynamically loaded binaries
(PIE & .so), we need to compile the instrumentation library with the
"-fPIC" flag to support relative address resolution for functions and
data.
For shared libraries, we need to handle initialization of the
instrumentation library by using the DT_INIT entry point.
This commit also adds detection of whether the binary is an executable or a
shared library, based on the existence of the PT_INTERP header. For a shared
library, we save the address of the real library init function for later use
when creating the instrumentation library init trampoline function, and also
update DT_INIT to point to the instrumentation library init function.
Functions called from the init/fini functions should be called with forced
stack alignment to avoid issues with instructions that rely on it,
e.g. optimized string operations.
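The executable-versus-shared-library check mentioned above amounts to scanning the program headers for PT_INTERP. A minimal sketch for 64-bit ELF, using the definitions from `<elf.h>`; the function name `hasInterpreter` is hypothetical and this is not the code from the commit.

```cpp
#include <cstddef>

#include <elf.h>

// Hypothetical helper: a binary with a PT_INTERP program header is
// treated as an executable, one without it as a shared library.
bool hasInterpreter(const Elf64_Phdr *Phdrs, std::size_t Count) {
  for (std::size_t I = 0; I < Count; ++I)
    if (Phdrs[I].p_type == PT_INTERP)
      return true;
  return false;
}
```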
Vasily Leonenko,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD30092316)
Summary:
Move the common code into MCPlusBuilder.h.
Use group 1 `kTailCall` MCAnnotation instead of dynamically allocated
annotation.
This diff reduces the processing time overhead to 1.5% vs using
TAILJMP opcode.
(cherry picked from FBD30055585)
Summary:
The linker can generate 8- or 16-byte entries in .plt.got and .plt.sec
sections. On X86, the main differentiator is the presence of endbr64
instruction at the beginning of the entry. Detect the instruction and
adjust the size accordingly.
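The detection can be sketched as a byte comparison against the endbr64 encoding (F3 0F 1E FA). The function names are hypothetical, and the mapping "endbr64 present means a 16-byte entry, otherwise 8 bytes" is an assumption drawn from the description, not a quote of the implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// True if the buffer begins with the 4-byte endbr64 encoding.
bool startsWithEndbr64(const uint8_t *Data, std::size_t Size) {
  static const uint8_t Endbr64[] = {0xF3, 0x0F, 0x1E, 0xFA};
  return Size >= sizeof(Endbr64) &&
         std::memcmp(Data, Endbr64, sizeof(Endbr64)) == 0;
}

// Assumption: entries that begin with endbr64 are 16 bytes long and
// all other entries are 8 bytes long.
std::size_t guessPLTEntrySize(const uint8_t *Data, std::size_t Size) {
  return startsWithEndbr64(Data, Size) ? 16 : 8;
}
```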
(cherry picked from FBD29847639)
Summary:
.stab and .stabstr are special sections containing debugging
information and strings associated with the debugging information.
This commit adds them to the list of debugging sections, so
these sections can be removed from the output binary.
Vasily Leonenko,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD29746153)
Summary:
Match the new direct call generated during ICP to the correct pseudo probe.
The new call is matched to the probes of the original call instruction.
(cherry picked from FBD29591662)
Summary:
Created a binary pass that records how many
times tail duplication would be used and how many cache
misses it would theoretically prevent.
(cherry picked from FBD29619858)
Summary:
We extended DynoStats to dump the histogram per instruction opcode. By
default the dump is turned off. Use '-print-dyno-opcode-stats' to enable
the dump.
BOLT also dumps for each instruction opcode the maximum execution count and
corresponding function name and basic block offsets where the instruction
occurs. Below is a sample of the dump:
Opcode, Execution Count, Max Exec Count, Function Name:Offset
SHR8rCL, 232, 232, _ZNK5folly14AsyncSSLSocket4goodEv:53
VPADDDYrr, 13956, 388, chacha20_encrypt_bytes.part.0/3:736
PMOVSXBWrr, 4, 2, ares_expand_name/1:264
VMOVAPSmr, 1082, 43, chacha20_encrypt_bytes.part.0/3:2864
VPSHUFBrr, 9540, 1667, chacha20_encrypt_bytes.part.0/3:4416
VPUNPCKLDQYrr, 1102, 188, jsimd_ycc_rgb_convert_avx2/1:125
VPBROADCASTQYrm, 39, 39, chacha20_encrypt_bytes.part.0/3:400
PMOVSXWDrr, 8, 2, ares_expand_name/1:264
VPORrr, 817, 129, jsimd_idct_islow_avx2/1:41
PSLLDri, 8690752, 65644, blockmix_salsa8_xor/1:1424
(cherry picked from FBD28859624)
Summary:
A binary can contain multiple PLT sections with different names and
attributes (such as the entry size). Extend the support to .plt.sec and
refactor the code to make future extensions simpler.
(cherry picked from FBD29502107)
Summary:
clang-12 now compiles bolt without warnings.
Some warnings were fixed where possible, while others were suppressed by
adding (void)variable for unused-variable warnings or by moving code inside
assert statements or LLVM_DEBUG blocks.
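The (void)variable idiom looks like this in practice. The function below is a made-up example, not code from bolt: the variable is only read inside an assert, so a release build (with NDEBUG defined) would otherwise warn that it is unused.

```cpp
#include <cassert>
#include <vector>

// Hypothetical example: NonEmpty is used only by the assert, which
// disappears when NDEBUG is defined.
int sumChecked(const std::vector<int> &Values) {
  const bool NonEmpty = !Values.empty();
  assert(NonEmpty && "expected at least one value");
  (void)NonEmpty; // silences the unused-variable warning in release builds
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```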
(cherry picked from FBD29469054)
Summary:
Add code to read more dynamic relocations (DT_JMPREL) and enforce strict
checks that the corresponding section sizes match the .dynamic entry
descriptions.
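A size check of this kind might look like the sketch below. It assumes 64-bit ELF and compares the DT_PLTRELSZ value (the size that accompanies DT_JMPREL) with the size of the relocation section. The function name and the handling of a missing entry are illustrative choices, not taken from the patch.

```cpp
#include <cstddef>
#include <cstdint>

#include <elf.h>

// Hypothetical check: true if the DT_PLTRELSZ entry in .dynamic matches
// the size of the section holding the DT_JMPREL relocations. With no
// such entry, only an empty section is accepted.
bool pltRelocSizeMatches(const Elf64_Dyn *Dyn, std::size_t Count,
                         uint64_t SectionSize) {
  for (std::size_t I = 0; I < Count; ++I) {
    if (Dyn[I].d_tag == DT_NULL)
      break; // DT_NULL terminates the .dynamic array
    if (Dyn[I].d_tag == DT_PLTRELSZ)
      return Dyn[I].d_un.d_val == SectionSize;
  }
  return SectionSize == 0;
}
```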
(cherry picked from FBD29502109)
Summary:
The code for writing out dwo files wasn't handling the case where a DWP is the input, because all the sections are part of the same binary.
One note on the current implementation: .debug_str.dwo will have strings for all the dwo objects.
This is because llvm-dwp de-duplicates strings and combines them into one section. It then rewrites .debug_str_offsets.dwo to point to the new .debug_str.dwo section.
(cherry picked from FBD29244835)
Summary:
Our YAML objects contain references to dynamic relocations via .dynamic,
but there are no corresponding relocation sections. Change .dynamic
contents to specify no dynamic relocations.
(cherry picked from FBD29502108)
Summary:
Move the code that handles true external references (non-unreachable)
out of a for-loop in `BinaryFunction::disassemble`.
(cherry picked from FBD29411345)
Summary:
Handle R_X86_64_64 the same way as R_X86_64_32;
`getSizeForType` takes care of the size:
```x86_64 ABI relocation types
Name         Value  Field   Calculation
R_X86_64_64  1      word64  S + A
R_X86_64_32  10     word32  S + A
```
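The table shows why the two types can share one code path: the calculation is the same and only the field width differs. A minimal sketch of that idea follows. It is not the real `getSizeForType`; the function and constant names are stand-ins that mirror the ABI values in the table.

```cpp
#include <cstddef>
#include <cstdint>

// Values from the x86_64 ABI table above.
constexpr uint32_t RelocX86_64_64 = 1;  // R_X86_64_64, word64
constexpr uint32_t RelocX86_64_32 = 10; // R_X86_64_32, word32

// Hypothetical stand-in for a size lookup: field width in bytes, or 0
// for types this sketch does not cover.
std::size_t sizeForType(uint32_t Type) {
  switch (Type) {
  case RelocX86_64_64:
    return 8;
  case RelocX86_64_32:
    return 4;
  default:
    return 0;
  }
}

// Both types compute S + A; only the number of bytes written differs.
uint64_t absoluteValue(uint64_t S, int64_t A) {
  return S + static_cast<uint64_t>(A);
}
```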
(cherry picked from FBD29370417)
Summary:
When we fold a function in relocation mode, make sure to clear its state
to avoid emitting relocations against undefined symbols.
(cherry picked from FBD29245320)
Summary:
After digging further into the DWARF APIs and llvm-symbolizer, this is a more streamlined way of doing it, and the address base gets set properly.
Writing out dwo files with dwp input will be a separate patch.
(cherry picked from FBD31361529)
Summary:
When an indirect call is instrumented, it locks SimpleHashTable's mutex on the get() call.
If we receive a signal while the mutex is locked and the signal handler also
calls an indirect function, we will end up with a deadlock.
PR facebookincubator/BOLT#167
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
(cherry picked from FBD28909921)
Summary:
Suppresses the warning
```
src/DebugData.h:338:20: warning: 'addList' overrides a member function but is not marked 'override' [-Wsuggest-override]
```
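For reference, the usual way to resolve a -Wsuggest-override warning is to mark the overriding member with `override`. The classes below are hypothetical and only illustrate the pattern. Whether this commit adds the keyword or silences the warning some other way is not stated.

```cpp
// Hypothetical base class with a virtual member named like the one in
// the warning.
struct ListWriterBase {
  virtual ~ListWriterBase() = default;
  virtual int addList(int Count) { return Count; }
};

// Marking the overriding member 'override' satisfies -Wsuggest-override
// and lets the compiler reject accidental signature mismatches.
struct DerivedListWriter : ListWriterBase {
  int addList(int Count) override { return Count * 2; }
};
```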
(cherry picked from FBD28858201)