1657 Commits

Author SHA1 Message Date
Amir Ayupov
bce889c8df [BOLT] Align BranchInfo and FuncBranchData in DataAggregator::recordTrace
`DataAggregator::recordTrace` serves two purposes:
  - Attaching LBR fallthrough ("trace") information to CFG (`getBranchInfo`),
    which eventually gets emitted as YAML profile.
  - Populating vector of offsets that gets added to `FuncBranchData`, which
    eventually gets emitted as fdata profile.

`recordTrace` is invoked from `getFallthroughsInTrace` which checks its return
status and passes on the collected vector of offsets to `doTrace`.

However, if a malformed trace is passed to `recordTrace` it might partially
attach the profile to CFG and exit with false, not propagating the vector of
offsets to `doTrace`. This leads to a difference between fdata and yaml profile
collected from the same binary and the same perf file.

(Skylake LBR errata might produce such malformed traces where the last entry
is duplicated, resulting in invalid fallthrough path between the last two
entries).

There are two ways to handle this mismatch: conservative (aligned with fdata),
or aggressive (aligned with yaml). Conservative approach would discard the
trace entirely, buffering the CFG updates until all fallthroughs are confirmed.
Aggressive approach would apply CFG updates and return the matching
fallthroughs in the vector even if the trace is invalid (doesn't correspond to
a valid fallthrough path). I chose to go with the former (conservative/fdata)
approach which produces more accurate profile.

We can't rely on pre-filtering such traces early (in LBR sample processing) as
DataAggregator is used for both perf samples and pre-aggregated perf information
which loses branch stack information.

Test Plan: https://github.com/rafaelauler/bolt-tests/pull/22

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D151614
2023-05-30 18:03:45 -07:00
Amir Ayupov
c03e6511cf [BOLT] Add skip-non-simple for boltdiff
Extra filtering for boltdiff, excluding non-simple functions from comparison.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D151510
2023-05-30 15:24:18 -07:00
Petr Hosek
99a1aeefb3 Revert "[BOLT][CMake] Use LLVM macros for install targets"
This reverts commit 627d5e16127bd8034b893e66ab0c86eacf2d939a.
2023-05-30 19:28:14 +00:00
Petr Hosek
627d5e1612 [BOLT][CMake] Use LLVM macros for install targets
The existing BOLT install targets are broken on Windows becase they
don't properly handle output extension. Rather than reimplementing
this logic in BOLT, reuse the existing LLVM macros which already
handle this aspect correctly.

Differential Revision: https://reviews.llvm.org/D151595
2023-05-30 19:23:11 +00:00
Mark de Wever
cbaa3597aa Reland "[CMake] Bumps minimum version to 3.20.0.
This reverts commit d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6.

Adds the patch by @hans from
https://github.com/llvm/llvm-project/issues/62719
This patch fixes the Windows build.

d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6 reverted the reviews

D144509 [CMake] Bumps minimum version to 3.20.0.

This partly undoes D137724.

This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193

Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.

D150532 [OpenMP] Compile assembly files as ASM, not C

Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent)
when compiling a file which has been set as having the language
C. This behaviour change only takes place if "cmake_minimum_required"
is set to 3.20 or newer, or if the policy CMP0119 is set to new.

Attempting to compile assembly files with "-x c" fails, however
this is workarounded in many cases, as OpenMP overrides this with
"-x assembler-with-cpp", however this is only added for non-Windows
targets.

Thus, after increasing cmake_minimum_required to 3.20, this breaks
compiling the GNU assembly for Windows targets; the GNU assembly is
used for ARM and AArch64 Windows targets when building with Clang.
This patch unbreaks that.

D150688 [cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump

The build uses other mechanism to select the runtime.

Fixes #62719

Reviewed By: #libc, Mordante

Differential Revision: https://reviews.llvm.org/D151344
2023-05-27 12:51:21 +02:00
Tobias Hieta
f98ee40f4b
[NFC][Py Reformat] Reformat python files in the rest of the dirs
This is an ongoing series of commits that are reformatting our
Python code. This catches the last of the python files to
reformat. Since they where so few I bunched them together.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Reviewed By: jhenderson, #libc, Mordante, sivachandra

Differential Revision: https://reviews.llvm.org/D150784
2023-05-25 11:17:05 +02:00
Sergei Barannikov
ee1d5f6372 [MC] Check if register is non-null before calling isSubRegisterEq (NFCI)
D151036 adds an assertions that prohibits iterating over sub- and
super-registers of a null register. This is already the case when
iterating over register units of a null register, and worked by
accident for sub- and super-registers.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D151285
2023-05-25 08:53:15 +03:00
Yi Kong
67cf01bd37 Reland^2 "[BOLT] Parallelize legacy profile merging"
Resovled the issue that when number of tasks is fewer than cores, we end
up creating as many threads as the number of cores, making the
performance worse than the single thread version.
2023-05-22 13:37:41 -07:00
Shengchen Kan
3f1e9468f6 [X86][MC][bolt] Share code between encoding optimization and assembler relaxation, NFCI
PUSH[16|32|64]i[8|32] are not arithmetic instructions, so I renamed the
functions.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D151028
2023-05-21 09:31:50 +08:00
Shengchen Kan
89ca4eb002 [X86][NFC] Correct the instruction names for PUSH16i, PUSH32i
Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D151012
2023-05-20 17:33:42 +08:00
Amir Ayupov
068e9889b1 [BOLT] Add isParentOf and isParentOrChildOf BF checks
Add helper methods and simplify cases where we want to check if two functions
are parent-child of each other (function-fragment relationship).

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D142668
2023-05-19 17:51:54 -07:00
Amir Ayupov
860543d96e [BOLT][NFC] Extract DataAggregator::parseLBRSample
Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D150986
2023-05-19 17:50:02 -07:00
Petr Hosek
9e6e3375f1 [BOLT][CMake] Use correct output paths and passthrough necessary options
This addresses https://github.com/llvm/llvm-project/issues/62748.

Differential Revision: https://reviews.llvm.org/D150752
2023-05-19 17:43:27 +00:00
Yi Kong
65404e51bf Revert "Reland "[BOLT] Parallelize legacy profile merging""
This reverts commit 611fb179b19857ffb87df81c926902fc7e3412ab.

Broken tests
2023-05-18 16:26:43 -07:00
Yi Kong
611fb179b1 Reland "[BOLT] Parallelize legacy profile merging"
This reverts commit 78d8d016490909ac759c6f76c5f8679bc7a58b2e.
2023-05-18 16:06:46 -07:00
Amir Ayupov
b6f07d3ae8 [BOLT][NFC] Add MCPlusBuilder defOperands/useOperands helpers
Make intent more explicit with the use of new helper methods.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D150810
2023-05-17 21:52:33 -07:00
Amir Ayupov
17f3cbe3af [BOLT][NFC] Use llvm::make_range
Use `llvm::make_range` convenience wrapper from ADT.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D145887
2023-05-17 10:50:56 -07:00
Nico Weber
d763c6e5e2 Revert "Reland "[CMake] Bumps minimum version to 3.20.0.""
This reverts commit 65429b9af6a2c99d340ab2dcddd41dab201f399c.

Broke several projects, see https://reviews.llvm.org/D144509#4347562 onwards.

Also reverts follow-up commit "[OpenMP] Compile assembly files as ASM, not C"

This reverts commit 4072c8aee4c89c4457f4f30d01dc9bb4dfa52559.

Also reverts fix attempt  "[cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump"

This reverts commit 7d47dac5f828efd1d378ba44a97559114f00fb64.
2023-05-17 10:53:33 -04:00
Rafael Auler
62a2feff57 [BOLT] Fix state of MCSymbols in lowering pass
We have mostly harmless data races when running
BinaryContext::calculateEmittedSize() in parallel, while performing
split function pass.  However, it is possible to end up in a state
where some MCSymbols are still registered and our clean up
failed. This happens rarely but it does happen, and when it happens,
it is a difficult to diagnose heisenbug. To avoid this, add a new
clean pass to perform a last check on MCSymbols, before they
undergo our final emission pass, to verify that they are in a sane
state. If we fail to do this, we might resolve some symbols to zero
and crash the output binary.

Reviewed By: #bolt, Amir

Differential Revision: https://reviews.llvm.org/D137984
2023-05-16 14:54:16 -07:00
Job Noorman
8a5a12057e [BOLT][Wrapper] Fix off-by-one in find_section upper limit
find_section used to match offsets equal to file_offset + size causing
offsets to sometimes be attributed to the wrong section.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D149047
2023-05-16 09:25:06 +02:00
Mark de Wever
65429b9af6 Reland "[CMake] Bumps minimum version to 3.20.0."
The owner of the last two failing buildbots updated CMake.

This reverts commit e8e8707b4aa6e4cc04c0cffb2de01d2de71165fc.
2023-05-13 11:42:25 +02:00
Shengchen Kan
db39d47928 [X86][AsmParser] Reapply "Refactor code and optimize more instructions from VEX3 to VEX2"
This was reverted in d4994d0e7922 b/c a bolt test failed after the
encoding changed.

Relanded the patch with the updated test.
2023-05-13 09:26:29 +08:00
Rafael Auler
77811752e3 [BOLT] Fix flush pending relocs
https://github.com/facebookincubator/BOLT/pull/255 accidentally
omitted a relocation type when refactoring the code. Add this type back
and change function name so its intent is more clear.

Reviewed By: #bolt, Amir

Differential Revision: https://reviews.llvm.org/D150335
2023-05-11 11:52:32 -07:00
Alexander Yermolovich
640e07c490 [BOLT][DWARF][NFC] Fixed an assertion check
Spotted this one while working on new DWARF Rewriter. We were using wrong check
in assertion.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D150167
2023-05-09 11:37:40 -07:00
Amir Aupov
52e4f9e386 [BOLT][test] Fix retpoline-synthetic.test
Fix test on BOLT's buildbot, e.g.
https://lab.llvm.org/buildbot/#/builders/244/builds/10885
2023-05-08 20:17:03 -07:00
Amir Ayupov
6fcb91b2f7 [BOLT] Use opcode name in hashBlock
Use MCInst opcode name instead of opcode value in hashing.

Opcode values are unstable wrt changes to target tablegen definitions,
and we notice that as output mismatches in NFC testing. This makes BOLT YAML
profile tied to a particular LLVM revision which is less portable than
offset-based fdata profile.

Switch to using opcode names which have 1:1 mapping with opcode values for any
given LLVM revision, and are stable wrt modifications to .td files (except of
course modifications to names themselves).

Test Plan:
D150154 is a test commit adding new X86 instruction which shifts opcode values.
With current change, pre-aggregated-perf.test passes in nfc check mode.
Without current change, pre-aggregated-perf.test expectedly fails.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D150005
2023-05-08 18:54:29 -07:00
Amir Ayupov
19941b0468 [BOLT] Use MCInstPrinter in createRetpolineFunctionTag
Make retpoline functions invariant of X86 register numbers.
retpoline-synthetic.test is known to fail NFC testing due to shifting
register numbers. Use canonical register names instead of tablegen
numbers.

Before:
```
__retpoline_r51_
__retpoline_mem_r58+DATAat0x200fe8
__retpoline_mem_r51+0
__retpoline_mem_r132+0+8*53
```

After:
```
__retpoline_%rax_
__retpoline_mem_%rip+DATAat0x200fe8
__retpoline_mem_%rax+0
__retpoline_mem_%r12+0+8*%rbx
```

Test Plan:
- Revert 67bd3c58c0c7389e39c5a2f4d3b1a30459ccf5b7 that touches X86RegisterInfo.td.
- retpoline-synthetic.test passes in NFC mode with this diff, fails without it.

Reviewed By: #bolt, rafauler

Differential Revision: https://reviews.llvm.org/D150138
2023-05-08 18:50:49 -07:00
Alexander Yermolovich
69520fc771 [BOLT][DWARF] Fix dwarf5-one-loclists-two-bases test
Fix assembly for the helper file to work with the new DWARF rewriter.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D150147
2023-05-08 15:39:10 -07:00
Mark de Wever
e8e8707b4a Revert "Reland "[CMake] Bumps minimum version to 3.20.0.""
Unfortunatly not all buildbots are updated.

This reverts commit ffb807ab5375b3f78df198dc5d4302b3b552242f.
2023-05-06 17:03:56 +02:00
Mark de Wever
ffb807ab53 Reland "[CMake] Bumps minimum version to 3.20.0."
All build bots should be updated now.

This reverts commit 44d38022ab29a3156349602733b3459df5beef93.
2023-05-06 11:43:02 +02:00
Amir Ayupov
f7643f8da3 [BOLT] Remove redundant dumps in AsmDump
Dumping jump table and tail call fdata is covered by subsequent iteration over
successors.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D149799
2023-05-04 10:30:48 -07:00
Timm Bäder
eadf6db585 [docs] Hide collaboration and include graphs in doxygen docs
They don't convey any useful information and make the documentation
unnecessarily hard to read.

Differential Revision: https://reviews.llvm.org/D149641
2023-05-04 12:26:51 +02:00
Alexander Yermolovich
93ce096502 [BOLT][DWARF] Fix handling of loclists_base without location accesses
There are CUs that have DW_AT_loclists_base, but no DW_AT_location in children
DIEs. Pre-bolt it points to a valid offset. We were not updating it, so it ended
up pointing in the middle of a list and caused LLDB to print out errors. Changed
it to point to first location list. I don't think it should matter since there
are no accesses to it anyway.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D149798
2023-05-03 20:50:37 -07:00
spupyrev
3e3a926be8 [BOLT][NFC] Add hash computation for basic blocks
Extending yaml profile format with block hashes, which are used for stale
profile matching. To avoid duplication of the code, created a new class with a
collection of utilities for computing hashes.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D144306
2023-05-02 14:03:47 -07:00
Job Noorman
d755e10e7a [BOLT] Make sure Mach-O binaries are actually linked
Note that this issue is also solved by D147544.

Reviewed By: alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D149244
2023-05-02 16:22:49 +02:00
Job Noorman
f3ea4228fd [BOLT] Make sure all section allocations have deterministic contents
For empty sections, RuntimeDyld always allocates 1 byte but leaves it
uninitialized. This causes the contents of some output sections to be
non-deterministic.

Note that this issue is also solved by D147544.

Fixes #59008

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D149243
2023-05-02 16:18:01 +02:00
Yi Kong
78d8d01649 Revert "[BOLT] Parallelize legacy profile merging"
This reverts commit 35af20d9e036deeed250b73fd3ae86d6455173c5.

The patch caused a test failure.
2023-04-28 21:24:52 +09:00
Yi Kong
35af20d9e0 [BOLT] Parallelize legacy profile merging
Merging profiles is quite expensive, but easily paralleizable.

8359 profiles on n2d-standard-128:
single-thread: 808s
multi-thread: 200s (~75% speed up)

Differential Revision: https://reviews.llvm.org/D149014
2023-04-27 15:37:14 +09:00
Job Noorman
8421c7ad30 [BOLT][Wrapper] Fix off-by-one when parsing 'cmp' output
The byte offsets in the output of 'cmp' start from 1, not from 0 as the
current parser assumes. This caused mismatched bytes to sometimes be
attributed to the wrong section.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D149046
2023-04-24 20:54:56 +02:00
Job Noorman
b3780af3b3 [BOLT] Fix many tests detected as unsupported
Since D148847, many tests are detected as being unsupported. This is
caused by BOLT_TARGETS_TO_BUILD being ;-separated whereas the previously
used TARGETS_TO_BUILD is space-separated.

This patch fixes this by creating config.targets lit.cfg.py by splitting
on ';'.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D149026
2023-04-24 11:39:02 -07:00
Christian Ulmann
f5425c128a [LoopInfo] Move generic LoopInfo into own files
This commit splits the generic part of `LoopInfo` into separate files.
These new `GenericLoopInfo` files are located in `llvm/Support` to be inline
with `GenericDomTree`.

Furthermore, this change ensures that MLIR's Bazel build does not have
to link against `LLVMAnalysis` just to use these template headers.

Depends on D148219

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D148235
2023-04-24 06:07:05 +00:00
Nathan Sidwell
5b9f0309d6 [BOLT] Remove unsupported ELF type reloc handling
Drop unsupported ELF format reloc handling -- RewriteInstance lacks
this flexibility elsewhere.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148946
2023-04-23 13:09:37 -04:00
Nathan Sidwell
ffb42e313d [BOLT] Remove unneeded dyncasts
These checks are unnecessary -- we've already bailed if the format was wrong.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148848
2023-04-21 13:40:54 -04:00
Nathan Sidwell
f84ac48f1e [BOLT] Add BOLT_TARGETS_TO_BUILD
Adds BOLT_TARGETS_TO_BUILD, which defaults to the intersection of
X86;AArch64 and LLVM_TARGETS_TO_BUILD, but allows configuration to
alter that -- for instance omitting one of those two targets even if
llvm supports both.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148847
2023-04-21 13:07:04 -04:00
Nathan Sidwell
1c3653df08 [BOLT] Robustify compile-time config check
The BOLT runtime is specifically hard coded for x86_64 linux or x86_64
darwin. (Using x86_64 syscalls, hardcoding syscall numbers.)

Make it very clear this is for those specific pair of systems.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148825
2023-04-21 12:37:54 -04:00
Nathan Sidwell
06b8057cc2 [BOLT] Make BOLT_ENABLE_RUNTIME user-configurable
Defaults to ON for x86_64 && (Linux | Darwin).

If enabled, checks that /proc/self/map_files is readable. Some systems are configured so that getdents fails with EPERM.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D148742
2023-04-20 14:41:32 -04:00
Nathan Sidwell
c3368fbfe8 [BOLT][NFC] Remove exec permission from some tests
These files unnecessarily had execute permission.
2023-04-19 19:42:01 -04:00
Nathan Sidwell
0044647fdc [BOLT] Add bolt-runtime requirement to tests
These tests rely on the	X86 runtime, add the REQUIRES.

Differential Revision: https://reviews.llvm.org/D148737
2023-04-19 19:42:01 -04:00
Nathan Sidwell
9c92b023da [BOLT][NFC] Move phdr typedef to cpp file
This typedef is only used inside the RewriteInstance source file, let's not
expose it in the header file -- even if private.

Differential Revision: https://reviews.llvm.org/D148667
2023-04-19 15:51:17 -04:00
Nathan Sidwell
f2f0411924 [BOLT] Adjust Shdr alignment
Shdr's are not necesarily size 2^n, and there is no reason to align to
that boundary if they are.

Differential Revision: https://reviews.llvm.org/D148666
2023-04-19 15:51:12 -04:00