Commit Graph

1614 Commits

Author SHA1 Message Date
Nathan Sidwell
f84ac48f1e [BOLT] Add BOLT_TARGETS_TO_BUILD
Adds BOLT_TARGETS_TO_BUILD, which defaults to the intersection of
X86;AArch64 and LLVM_TARGETS_TO_BUILD, but allows configuration to
alter that -- for instance omitting one of those two targets even if
llvm supports both.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148847
2023-04-21 13:07:04 -04:00
Nathan Sidwell
1c3653df08 [BOLT] Robustify compile-time config check
The BOLT runtime is specifically hard coded for x86_64 linux or x86_64
darwin. (Using x86_64 syscalls, hardcoding syscall numbers.)

Make it very clear this is for those specific pair of systems.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148825
2023-04-21 12:37:54 -04:00
Nathan Sidwell
06b8057cc2 [BOLT] Make BOLT_ENABLE_RUNTIME user-configurable
Defaults to ON for x86_64 && (Linux | Darwin).

If enabled, checks that /proc/self/map_files is readable. Some systems are configured so that getdents fails with EPERM.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148742
2023-04-20 14:41:32 -04:00
Nathan Sidwell
c3368fbfe8 [BOLT][NFC] Remove exec permission from some tests
These files unnecessarily had execute permission.
2023-04-19 19:42:01 -04:00
Nathan Sidwell
0044647fdc [BOLT] Add bolt-runtime requirement to tests
These tests rely on the	X86 runtime, add the REQUIRES.

Differential Revision: https://reviews.llvm.org/D148737
2023-04-19 19:42:01 -04:00
Nathan Sidwell
9c92b023da [BOLT][NFC] Move phdr typedef to cpp file
This typedef is only used inside the RewriteInstance source file, let's not
expose it in the header file -- even if private.

Differential Revision: https://reviews.llvm.org/D148667
2023-04-19 15:51:17 -04:00
Nathan Sidwell
f2f0411924 [BOLT] Adjust Shdr alignment
Shdr's are not necesarily size 2^n, and there is no reason to align to
that boundary if they are.

Differential Revision: https://reviews.llvm.org/D148666
2023-04-19 15:51:12 -04:00
Nathan Sidwell
3c8757a863 [BOLT] Don't enable runtime when not building X86 2023-04-18 18:19:55 -04:00
Alexander Yermolovich
125df67421 [BOLT][DWARF] Fix handling of CUs without TU reference
When input is DWP with DWARF5 bolt wasn't handling correctly CUs that didn't
have TU references. Which resulted in a crash.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D148589
2023-04-17 17:56:08 -07:00
Amir Ayupov
286b5071f1 [BOLT][test] Update AArch64/r_aarch64_prelxx.s test
Update section flags and type after https://reviews.llvm.org/D148386

Reviewed By: #bolt, rafauler, MaskRay

Differential Revision: https://reviews.llvm.org/D148511
2023-04-17 23:33:42 +02:00
Job Noorman
48ad4296f7 [BOLT] Fix use-after-free in RewriteInstance::mapCodeSections
When a cold function is too large, its section gets deregistered.
However, the section is still dereferenced later to get its RuntimeDyld
ID. This patch moves the deregistration to after the last dereference.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D148427
2023-04-17 16:16:49 +02:00
Mark de Wever
44d38022ab Revert "Revert "Revert "[CMake] Bumps minimum version to 3.20.0."""
This reverts commit 1ef4c3c859.

Two buildbots still haven't been updated.
2023-04-15 20:12:24 +02:00
Mark de Wever
1ef4c3c859 Revert "Revert "[CMake] Bumps minimum version to 3.20.0.""
This reverts commit 92523a35a8.

Reland to see whether CIs are updated.
2023-04-15 13:12:04 +02:00
Job Noorman
8bbfac7be1 [BOLT][NFC] Fix UB due to unaligned load in DebugStrOffsetsWriter
The following tests fail when enabling UBSan due to an unaligned memory
load:

> runtime error: load of misaligned address 0x620000000643 for type
> 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte
> alignment

  BOLT :: AArch64/asm-func-debug.test
  BOLT :: AArch64/update-debug-reloc.test
  BOLT :: X86/asm-func-debug.test
  BOLT :: X86/dwarf5-df-dualcu.test
  BOLT :: X86/dwarf5-df-mono-dualcu.test
  BOLT :: X86/dwarf5-ftypes-dwp-input-dwo-output.test
  BOLT :: X86/dwarf5-locaddrx.test
  BOLT :: X86/dwarf5-split-dwarf4-monolithic.test
  BOLT :: X86/inlined-function-mixed.test
  BOLT :: non-empty-debug-line.test

This patch fixes this by using read32le for the load.

Reviewed By: ayermolo

Differential Revision: https://reviews.llvm.org/D148217
2023-04-13 16:39:22 +02:00
Job Noorman
df3f1e2f31 [BOLT][NFC] Fix UB due to left shift of negative value
The following test fails when enabling UBSan due to a left shift of a
negative value:

> runtime error: left shift of negative value -2

  BOLT :: AArch64/ext-island-ref.s

This patch fixes this by using a multiplication instead of a shift.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D148218
2023-04-13 14:29:19 +02:00
Muhammad Omair Javaid
1a8f3f96c2 [BOLT] Fix section-end-sym.s test to only run x86/Linux
section-end-sym.s contains x86_64 assembly instruction execution on target.
I have changed REQURIES: field system-linux --> x86_64-linux
This came up while testing LLVM 16.0.1 release on AArch64 Linux.
2023-04-12 17:14:30 +05:00
Rafael Auler
b87bf74428 [BOLT] Fix creation of invalid CFG in presence of dead code
When there is a direct jump right after an indirect one, in
the absence of code jumpting to this direct jump, this is obviously
dead code. However, BOLT was failing to recognize that by mistakenly
placing both jmp instructions in the same basic block, and creating
wrong successor edges. Fix that, so we can safely run UCE on
that. This bug also causes validateCFG to fail and BOLT to crash if it
is running ICP on that function.

Reviewed By: #bolt, Amir

Differential Revision: https://reviews.llvm.org/D148055
2023-04-11 17:19:39 -07:00
Alexis Engelke
0c049ea60a [MC] Always encode instruction into SmallVector
All users of MCCodeEmitter::encodeInstruction use a raw_svector_ostream
to encode the instruction into a SmallVector. The raw_ostream however
incurs some overhead for the actual encoding.

This change allows an MCCodeEmitter to directly emit an instruction into
a SmallVector without using a raw_ostream and therefore allow for
performance improvments in encoding. A default path that uses existing
raw_ostream implementations is provided.

Reviewed By: MaskRay, Amir

Differential Revision: https://reviews.llvm.org/D145791
2023-04-06 16:21:49 +02:00
Yi Kong
d788db3d19 [BOLT][NFC] Simplify code using std::optional
Use std::optional instead of tracking if it is the first profile seen.

Differential Revision: https://reviews.llvm.org/D147308
2023-04-01 13:47:36 +08:00
spupyrev
92758a99c3 [BOLT] computing raw branch count for yaml profiles
`Function.RawBranchCount` is initialized for fdata profile but not for yaml one.
The diff adds the computation of the field for yaml profiles

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D144211
2023-03-28 11:09:21 -07:00
Denis Revunov
22a4aaf2b0 [BOLT] Don't use section relocations when computing hash for data from other section
When computing symbol hashes in BinarySection::hash, we try to find relocations
in the section which reference the passed BinaryData. We do so by doing
lower_bound on data begin offset and upper_bound on data end offset. Since
offsets are relative to the current section, if it is a data from the previous
section, we get underflow when computing offset and lower_bound returns
Relocations.end(). If this data also ends where current section begins,
upper_bound on zero offset will return some valid iterator if we have any
relocations after the first byte. Then we'll try to iterate from lower_bound to
upper_bound, since they're not equal, which in that case means we'll dereference
Relocations.end(), increment it, and try to do so until we reach the second
valid iterator. Of course we reach segfault earlier. In this patch we stop BOLT
from searching relocations for symbols outside of the current section.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D146620
2023-03-24 21:59:50 +03:00
Amy Huang
fd47ab05e5 Add "REQUIRES: asserts" to test that uses --debug-only flag 2023-03-22 17:05:11 -07:00
Job Noorman
54ab954149 [BOLT] Reject symbols pointing to section end
Sometimes, symbols are present that point to the end of a section (i.e.,
one-past the highest valid address). Currently, BOLT either rejects
those symbols when they don't point to another existing section, or errs
when they do and the other section is not executable. I suppose BOLT
would accept the symbol when it points to an executable section.

In any case, these symbols should not be considered while discovering
functions and should not result in an error. This patch implements that.

Note that this patch checks explicitly for symbols whose value equals
the end of their section. It might make more sense to verify that the
symbol's value is within [section start, section end). However, I'm not
sure if this could every happen *and* its value does not equal the end.

Another way to implement this is to verify that the BinarySection we
find at the symbol's address actually corresponds to the symbol's
section. I'm not sure what the best approach is so feedback is welcome.

Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D146215
2023-03-21 13:59:39 +04:00
Mark de Wever
d0398d3593 Revert "Reland "[CMake] Bumps minimum version to 3.20.0.""
This reverts commit a72165e5df.

Some buildbots have not been updated yet.
2023-03-18 20:32:43 +01:00
Mark de Wever
a72165e5df Reland "[CMake] Bumps minimum version to 3.20.0."
This reverts commit 92523a35a8.

Test whether all CI runners are updated.
2023-03-18 13:33:42 +01:00
Vladislav Khmelevsky
f9bf9f925e [BOLT] Add .relr.dyn section support
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D146085
2023-03-17 17:24:19 +04:00
Kazu Hirata
4e585e51c1 Use *{Map,Set}::contains (NFC) 2023-03-15 22:55:35 -07:00
Amir Ayupov
c7af4f383d [BOLT][NFC] Simplify preprocessProfile
Move out prepareToParse lambda, generalize it to handle mem events perf process.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D146002
2023-03-15 12:56:06 -07:00
Kazu Hirata
8bdf387858 Use *{Map,Set}::contains (NFC)
Differential Revision: https://reviews.llvm.org/D146104
2023-03-15 08:46:32 -07:00
Amir Ayupov
edda85771a [BOLT][NFC] Move addRelocation{X86,AArch64} into MCPlusBuilder
The two methods don't belong in BinaryFunction methods.
Move the dispatch tables into target-specific MCPlusBuilder methods.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D131813
2023-03-14 17:34:25 -07:00
Amir Ayupov
ce1061074d [BOLT][NFC] Simplify MCPlusBuilder::getRegSize
Pre-calculate the register size table in MCPlusBuilder constructor,
similar to `AliasMap`/`SmallerAliasMap` in `initAliases`.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D145828
2023-03-14 17:26:36 -07:00
Amir Ayupov
4e99891e70 [BOLT][NFC] Provide default impl for MIB methods that are only overridden on X86
Simplifies D145687

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D145972
2023-03-14 17:19:24 -07:00
Amir Ayupov
2eae9d8eb2 [BOLT][NFC] Use llvm::is_contained
Apply the replacement throughout BOLT.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D145464
2023-03-14 15:37:03 -07:00
Amir Ayupov
16e67e6932 [BOLT][NFC] Remove BB::getBranchInfo accepting MCSymbol ptr
Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D144924
2023-03-14 15:35:05 -07:00
Vladislav Khmelevsky
207ea5f2e4 [BOLT] Add writable segment for allocatable sections
The golang support creates 2 new data segments, one of them contains
relocations in PIC binaries, so the section must have writable rights.
Currently BOLT creates only one new segment that contains new sections
with RX rights, now also create RW segment if there are any new writable
sections were allocated during BOLT binary processing.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D143390
2023-03-15 00:06:55 +04:00
Amir Ayupov
223ec28da4 [BOLT][NFC] Return instruction list from createInstrIncMemory
Leverage move semantics for `std::vector`.

This also makes it consistent with `createInstrumentationSnippet`.

Reviewed By: Elvina

Differential Revision: https://reviews.llvm.org/D145465
2023-03-13 12:56:39 -07:00
Job Noorman
4875e06709 [BOLT][NFC] Improve performance of MCPlusBuilder::initAliases
It was using a redundant iteration over super regs to build
SmallerAliasMap. Removing this results in exactly the same alias maps
and a noticeable performance gain on targets with a large number of
registers.

Just anecdotally: on my machine, processing a small AArch64 binary went
from 2.7s down to 80ms.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D145779
2023-03-13 11:51:12 -07:00
Vladislav Khmelevsky
40b273998a [BOLT] Pass instrumentation-file arg for X86 xmm test
Differential Revision: https://reviews.llvm.org/D144865
2023-03-13 13:37:28 +04:00
Vladislav Khmelevsky
7117af529e [BOLT] Improve dynamic relocations support for CI
This patch fixes few problems with supporting dynamic relocations in CI.
1. After dynamic relocations and functions were read search for dynamic
relocations located in functions. Currently we expected them only to be
relative and only to be in constant island. Mark islands of such
functions to have dynamic relocations and create CI access symbol on the
relocation offset, so the BD would be created for such place.
2. During function disassemble and handling address reference for
constant island check if the referred external CI has dynamic
relocation. And if it has one we would continue to refer original CI
rather then creating a local copy.
3. After function disassembly stage mark function that has dynamic reloc
in CI as non-simple. We don't want such functions to be optimized, since
such passes as split function would create 2 copies of CI which we
unable to support currently.
4. During updating output values for BF search for BD located in CI and
update their output locations.
5. On dynamic relocation patching stage search for binary data located
on relocation offset. If it was moved use new relocation offset value
rather then an old one.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D143748
2023-03-13 13:37:28 +04:00
Mark de Wever
92523a35a8 Revert "[CMake] Bumps minimum version to 3.20.0."
Some build bots have not been updated to the new minimal CMake version.
Reverting for now and ping the buildbot owners.

This reverts commit 44c6b905f8.
2023-03-04 18:28:13 +01:00
Mark de Wever
44c6b905f8 [CMake] Bumps minimum version to 3.20.0.
This partly undoes D137724.

This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193

Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.

Reviewed By: mehdi_amini, MaskRay, ChuanqiXu, to268, thieta, tschuett, phosek, #libunwind, #libc_vendors, #libc, #libc_abi, sivachandra, philnik, zibi

Differential Revision: https://reviews.llvm.org/D144509
2023-03-04 12:40:57 +01:00
Amir Ayupov
1e1dfbb94a [BOLT][Instrumentation] Preserve red zone for functions with tail calls only
Allow a function with tail calls only to clobber its red zone.

Fixes https://github.com/llvm/llvm-project/issues/61114.

Reviewed By: #bolt, yota9

Differential Revision: https://reviews.llvm.org/D145202
2023-03-03 12:02:17 -08:00
Maksim Panchenko
73b89e3f38 [BOLT] Remove dependency on StringMap iteration order
Remove the usage of StringMap in places where the iteration order
affects the output since the iteration over StringMap is
non-deterministic.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D145194
2023-03-03 09:21:26 -08:00
Amir Ayupov
d77f96a9e0 [BOLT][NFC] Simplify BinaryFunction::setTrapOnEntry
Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D144758
2023-02-27 15:41:40 -08:00
Amir Ayupov
c30ab9dcff [BOLT][NFC] Log reversing splitting decision
Expose log for testing purposes.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D144674
2023-02-27 15:28:06 -08:00
Amir Ayupov
1c286acfb8 [BOLT] Prevent unsetting unknown control flow for split jump table
In case of a function with unknown control flow but with a single jump
table and a single jump table site, we attempt to match the jump table
and a site and update block successors using jump table targets.
Restrict this behavior for split jump tables which have targets in a
fragment function.

Fixes https://github.com/llvm/llvm-project/issues/60795.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D144602
2023-02-27 15:22:43 -08:00
Amir Ayupov
08ab4faf1a [BOLT][NFC] Const-ify analyzeJumpTable
Avoid modifying `BF`, instead set extra output parameter and modify BF in caller
scope.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D144598
2023-02-27 15:22:35 -08:00
Maksim Panchenko
03e94f6608 [BOLT] Change call count output for ICF
ICF optimization runs multiple passes and the order in which functions
are folded could be dependent on the order they are being processed.
This order is indeterministic as functions are intermediately stored in
std::unordered_map<>. Note that this order is mostly stable, but is not
guaranteed to be and can change e.g. after switching to a different C++
library implementation.

Because the processing (and folding) order is indeterministic, the
previous way of calculating merged function call count could produce
different results.

Change the way we calculate the ICF call count to make it independent of
the function folding/processing order.

Mostly NFC as the output binary should remain the same, the change
affects only the console output.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D144807
2023-02-27 15:21:16 -08:00
Maksim Panchenko
fb28196a64 [BOLT] Fix intermittent crash with instrumentation
When createInstrumentedIndirectCall() was invoked for tail calls, we
attached annotation instruction twice to the new call instruction.
First in createDirectCall(), and then again while copying over the
metadata operands.

As a result, the annotations were not properly stripped for such calls
before the call to freeAnnotations() in LowerAnnotations pass. That lead
to use-after-free while restoring the offsets with setOffset() call.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D144806
2023-02-27 14:11:10 -08:00
Sebastian Pop
c63c185e72 [BOLT] fix tests on arm64
The two tests were failing on arm64-linux with:
BOLT-ERROR: invalid target 'x86-64'.

Differential Revision: https://reviews.llvm.org/D144593
2023-02-25 16:46:50 +00:00