Commit Graph

1173 Commits

Author SHA1 Message Date
Amir Ayupov
9b02dc631d [BOLT] Check MCContext errors
Abort on emission errors to prevent a malformed binary being written.
Example:
```
<unknown>:0: error: Undefined temporary symbol .Ltmp26310
<unknown>:0: error: Undefined temporary symbol .Ltmp26311
<unknown>:0: error: Undefined temporary symbol .Ltmp26312
<unknown>:0: error: Undefined temporary symbol .Ltmp26313
<unknown>:0: error: Undefined temporary symbol .Ltmp26314
<unknown>:0: error: Undefined temporary symbol .Ltmp26315
BOLT-ERROR: Emission failed.
```

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D123263
2022-04-08 21:08:39 -07:00
Vladislav Khmelevsky
87a57aada3 [BOLT][test] Fix X86 tests
Differential Revision: https://reviews.llvm.org/D123133
2022-04-06 21:16:42 +03:00
Argyrios Kyrtzidis
330268ba34 [Support/Hash functions] Change the final() and result() of the hashing functions to return an array of bytes
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:

* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`

As part of this patch also:

* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.

Differential Revision: https://reviews.llvm.org/D123100
2022-04-05 21:38:06 -07:00
Amir Ayupov
f99398fe0e [BOLT][NFC] Move isADD64rr and isADDri out of MCPlusBuilder class
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D123077
2022-04-05 14:32:07 -07:00
Vladislav Khmelevsky
2e51a32219 [BOLT] Check for !isTailCall in isUnconditionalBranch
Add !isTailCall in isUnconditionalBranch check in order to sync the x86
and aarch64 and fix the fixDoubleJumps pass on aarch64.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122929
2022-04-05 23:39:34 +03:00
Vladislav Khmelevsky
4956e0e197 [BOLT] Fix plt relocations symbol match
The bfd linker adds the symbol versioning string to the symbol name in symtab.
Skip the versioning part in order to find the registered PLT function.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122039
2022-04-05 15:57:26 +03:00
Maksim Panchenko
163e188e3e [BOLT][test] Fix AArch64 test
Remove header dependency from cross-platform test.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D123107
2022-04-04 23:28:47 -07:00
Maksim Panchenko
f927106e10 [BOLT][test] Enable cross-target testing
Check for supported target architecture instead of the host arch when
deciding to execute non-runtime tests.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D122498
2022-04-04 23:00:54 -07:00
Maksim Panchenko
f0f5d19a36 [BOLT][test] Fix X86 cross-platform tests
Use target-specific flags for building X86 non-runnable tests.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D123072
2022-04-04 22:25:42 -07:00
Amir Ayupov
686406a006 [BOLT][NFC] Use X86 mnemonic checks
Remove switches in X86MCPlusBuilder.cpp, use mnemonic checks instead

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D122853
2022-04-04 14:05:46 -07:00
Vladislav Khmelevsky
3b1314f4de [BOLT] AArch64: Read all static relocations
Read static relocs on the same address, as dynamic in order to update
constant island data address properly.

Differential Revision: https://reviews.llvm.org/D122100
2022-04-03 19:03:35 +03:00
Maksim Panchenko
5679a3ce87 [BOLT][test] Fix AArch64 cross-platform tests
Use target-specific flags for building AArch64 non-runnable tests.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D122520
2022-04-01 17:10:47 -07:00
Vladislav Khmelevsky
4c14519ecb [BOLT] LongJmp: Check for shouldEmit
Check that the function will be emitted in the final binary. Preserving
old function address is needed in case it is PLT trampiline, that is
currently not moved by the BOLT.

Differential Revision: https://reviews.llvm.org/D122098
2022-03-31 22:33:09 +03:00
Vladislav Khmelevsky
fed958c6cc [BOLT] AArch64: Emit text objects
BOLT treats aarch64 objects located in text as empty functions with
contant islands. Emit them with at least 8-byte alignment to the new
text section.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D122097
2022-03-31 22:28:50 +03:00
Amir Ayupov
c31af7cfe3 [MC][BOLT] Add setter for AllowAtInName
Use the setter in BOLT to allow printing names with variant kind in the name
(e.g. "func@PLT").
Fixes BOLT buildbot tests that broke after D122516:
https://lab.llvm.org/buildbot/#/builders/215/builds/3595

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D122694
2022-03-30 13:04:28 -07:00
Vladislav Khmelevsky
af9bdcfc46 [BOLT] Align constant islands to 8 bytes
AArch64 requires CI to be aligned to 8 bytes due to access instructions
restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes.

Differential Revision: https://reviews.llvm.org/D122065
2022-03-27 22:30:42 +03:00
spupyrev
4609f60ebc [BOLT] Avoid pointless loop rotation
It seems the earlier implementation does not follow the description
in LoopRotationPass.h: It rotates loops even if they are already laid out
correctly. The diff adjusts the behaviour.

Given that the impact of LoopInversionPass is minor, this change won't
yield significant perf differences. Tested on clang-10: there seems to be a
0.1%-0.3% cpu win and a small reduction of branch misses.

**Before:**
BOLT-INFO: 120 Functions were reordered by LoopInversionPass

**After:**
BOLT-INFO: 79 Functions were reordered by LoopInversionPass

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D121921
2022-03-22 12:42:42 -07:00
Vladislav Khmelevsky
5be5d0f56e [BOLT] LongJmp speedup refactoring
Run tentativeLayoutRelocMode twice only if UseOldText option was passed.
Refactor BF loop to break on condtition met.

Differential Revision: https://reviews.llvm.org/D121825
2022-03-18 16:16:47 +03:00
Amir Ayupov
42e8e00189 [BOLT][NFC] Use X86 mnemonic tables
Remove tables from X86MCPlusBuilder, make use of llvm::X86 mnemonic tables.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121573
2022-03-18 01:52:11 -07:00
Amir Ayupov
dc1cf838a5 [BOLT] Strip redundant AdSize override prefix
Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it
in BOLT explicitly (-x86-strip-redundant-adsize, on by default).

Test Plan:
`bin/llvm-lit -a bolt/test/X86/addr32.s`

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120975
2022-03-16 09:38:17 -07:00
Amir Ayupov
698127df51 [BOLT][NFC] Move isMOVSX64rm32 out of MCPlusBuilder
Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121669
2022-03-16 08:18:56 -07:00
Vladislav Khmelevsky
62a289d85c [BOLT] LongJmp: Fix hot text section alignment
The BinaryEmitter uses opts::AlignText value to align the hot text
section. Also check that the opts::AlignText is at least
equal opts::AlignFunctions for the same reason, as described in D121392.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D121728
2022-03-16 15:57:46 +03:00
Sam McCall
75acad41bc Use lit_config.substitute instead of foo % lit_config.params everywhere
This mechanically applies the same changes from D121427 everywhere.

Differential Revision: https://reviews.llvm.org/D121746
2022-03-16 09:57:41 +01:00
Maksim Panchenko
57f03db195 [BOLT][NFC] Remove unused function
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D121729
2022-03-15 12:39:14 -07:00
Vladislav Khmelevsky
8ab69baad5 [BOLT] Set cold sections alignment explicitly
The cold text section alignment is set using the maximum alignment value
passed to the emitCodeAlignment. In order to calculate tentetive layout
right we will set the minimum alignment of such sections to the maximum
possible function alignment explicitly.

Differential Revision: https://reviews.llvm.org/D121392
2022-03-15 22:12:17 +03:00
Amir Ayupov
5790441c45 [BOLT][NFC] Use getShortOpcodeArith in X86MCPlusBuilder
Unify `llvm::X86::getRelaxedOpcodeArith` and `getShortArithOpcode` in
X86MCPlusBuilder.cpp.

Addresses https://lists.llvm.org/pipermail/llvm-dev/2022-January/154526.html

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121404
2022-03-12 09:07:28 -08:00
Petr Hosek
0c0f6cfb7b [CMake] Rename TARGET_TRIPLE to LLVM_TARGET_TRIPLE
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.

Differential Revision: https://reviews.llvm.org/D119918
2022-03-11 15:43:01 -08:00
Elvina Yakubova
db65429db5 [BOLT] Divide RegularPageSize for X86 and AArch64 cases
For AArch64 in some cases/some distributions ld uses 64K alignment of LOAD segments by default.

Reviewed By: yota9, maksfb

Differential Revision: https://reviews.llvm.org/D119267
2022-03-10 23:09:50 +03:00
Vladislav Khmelevsky
04b87cf0e7 [BOLT] LongJmp: Use per-function alignment values
The per-function alignment values must be used in order to create
tentative layout.

Differential Revision: https://reviews.llvm.org/D121298
2022-03-10 19:48:48 +03:00
Amir Ayupov
d1638cb0b5 [BOLT][NFC] Fix print-cfg data race
Addresses ThreadSanitizer warning

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121337
2022-03-09 20:28:06 -08:00
Amir Ayupov
d16bbc5340 [BOLT][NFC] Check errors from Obj.dynamicEntries
Addresses fuzzer crash

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121336
2022-03-09 20:24:44 -08:00
Vladislav Khmelevsky
506a91c089 [BOLT] Move some of the tests to common directory
Some of the tests are not x86-specific, move them to common directory.

Differential Revision: https://reviews.llvm.org/D121261
2022-03-09 14:51:17 +03:00
Vladislav Khmelevsky
8bdbcfe7d8 [BOLT] Handle ifuncs trampolines for aarch64
The aarch64 uses the trampolines located in .iplt section, which
contains plt-like trampolines on the value stored in .got. In this case
we don't have JUMP_SLOT relocation, but we have a symbol that belongs to
ifunc trampoline, so use it and set set plt symbol for such functions.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D120850
2022-03-09 01:38:19 +03:00
Amir Ayupov
1e016c3bd5 [BOLT][NFC] Handle "dynamic section sizes should match"
Address fuzzer crash on malformed input

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121070
2022-03-08 13:03:05 -08:00
Amir Ayupov
687e4af1c0 [BOLT] CMOVConversion pass
Convert simple hammocks into cmov based on misprediction rate.

Test Plan:
- Assembly test: `cmov-conversion.s`
- Testing on a binary:
  # Bootstrap clang with `-x86-cmov-converter-force-all` and `-Wl,--emit-relocs`
  (Release build)
  # Collect perf.data:

    - `clang++ <opts> bolt/lib/Core/BinaryFunction.cpp -E > bf.cpp`
    - `perf record -e cycles:u -j any,u -- clang-15 bf.cpp -O2 -std=c++14 -c -o bf.o`
  # Optimize clang-15 with and w/o -cmov-conversion:
    - `llvm-bolt clang-15 -p perf.data -o clang-15.bolt`
    - `llvm-bolt clang-15 -p perf.data -cmov-conversion -o clang-15.bolt.cmovconv`
  # Run perf experiment:
    - test: `clang-15.bolt.cmovconv`,
    - control: `clang-15.bolt`,
    - workload (clang options): `bf.cpp -O2 -std=c++14 -c -o bf.o`
Results:
```
  task-clock [delta: -360.21 ± 356.75, delta(%): -1.7760 ± 1.7589, p-value: 0.047951, balance: -6]
  instructions  [delta: 44061118 ± 13246382, delta(%): 0.0690 ± 0.0207, p-value: 0.000001, balance: 50]
  icache-misses [delta: -5534468 ± 2779620, delta(%): -0.4331 ± 0.2175, p-value: 0.028014, balance: -28]
  branch-misses [delta: -1624270 ± 1113244, delta(%): -0.3456 ± 0.2368, p-value: 0.030300, balance: -22]
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120177
2022-03-08 10:44:31 -08:00
Amir Ayupov
ced5472e09 [BOLT][NFC] Check section contents before registering it
Address fuzzer crash on malformed input:
```
BOLT-ERROR: cannot get section contents for .dynsym: The end of the file was unexpectedly encountered.
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121068
2022-03-08 09:13:01 -08:00
Amir Ayupov
018ad03efa [BOLT][CMAKE] Remove CMake 3.13.4 incompatible parameter
Remove `TYPE BIN` parameter that is introduced in CMake 3.14 and revert back to
the equivalent compatible form `DESTINATION ${CMAKE_INSTALL_BINDIR}`.

Addresses https://github.com/llvm/llvm-project/issues/54099

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D121012
2022-03-07 18:42:00 -08:00
Yi Kong
8142ace0a7 Revert "Add CMake option not to build BOLT tests"
This reverts commit d8f4d54664.

Merged by accident.
2022-03-08 02:01:15 +08:00
Yi Kong
d8f4d54664 Add CMake option not to build BOLT tests 2022-03-08 01:59:44 +08:00
Maksim Panchenko
fada230920 [BOLT][NFC] Return MCRegister::NoRegister from MCPlusBuilder::getNoRegister()
Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120863
2022-03-03 13:25:13 -08:00
Vladislav Khmelevsky
00b6efc830 [BOLT] Enable PLT analysis for aarch64
This patch enables PLT analysis for aarch64. It is used by the static
relocations in order to provide final symbol address of PLT entry for some
instructions like ADRP.

Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei

Differential Revision: https://reviews.llvm.org/D118088
2022-03-02 22:14:48 +03:00
Maksim Panchenko
ae87445c25 [BOLT][test] Fix function size in test case 2022-03-01 17:53:41 -08:00
Amir Ayupov
08dcbed92f [BOLT] Fix X86MCPlusBuilder::replaceRegWithImm
Reassigning the operand didn't update the operand type which resulted in an
assertion (`Assertion `isReg() && "This is not a register operand!"' failed.`)
Reset the instruction instead.

Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.ReplaceRegWithImm/0 (90 of 136)
```

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120263
2022-02-28 19:24:46 -08:00
Alexander Yermolovich
a44fe31977 [BOLT][DWARF] Fix how DW_AT_high_pc [DW_FORM_udata] is handled
We were not handling correctly conversion from DW_AT_high_pc into DW_AT_ranges,
when size of DW_AT_high_pc is not 4/8 bytes.

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D120528
2022-02-25 10:32:05 -08:00
Maksim Panchenko
4101aa130a [BOLT] Support PC-relative relocations with addends
PC-relative memory operand could reference a different object from
the one located at the target address, e.g. when a negative offset
is used. Check relocations for the real referenced object.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120379
2022-02-23 22:54:42 -08:00
Amir Ayupov
af6e66f44c [BOLT][NFC] Report errors from RewriteInstance discoverStorage and run
Further improve error handling in BOLT by reporting `RewriteInstance` errors in
a library and fuzzer-friendly way instead of exiting.

Follow-up to D119658

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D120224
2022-02-23 20:42:39 -08:00
Amir Ayupov
454c149898 [BOLT][NFC] Fix undefined behavior in encodeAnnotationImm
Fix UBSan-reported issue in MCPlusBuilder::encodeAnnotationImm (left shift of a
negative value).

Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/AArch64/MCPlusBuilderTester.Annotation/0 (1 of 140)
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.Annotation/0 (131 of 134)
```

Reviewed By: maksfb, yota9

Differential Revision: https://reviews.llvm.org/D120260
2022-02-23 16:02:49 -08:00
Alexander Yermolovich
210bb04e23 [BOLT][DWARF] Remove patchLowHigh unused function.
Cleanup after removing caching mechanims for ranges/abbrevs.

Reviewed By: rafauler, yota9

Differential Revision: https://reviews.llvm.org/D120174
2022-02-22 13:28:15 -08:00
Amir Ayupov
d44f99c748 [BOLT] Added fuzzer target (llvm-bolt-fuzzer)
This adds a target that would consume random binary as an
input ELF file.
TBD: add structured input support (ELF).

Build:
```
cmake /path/to/llvm-project/llvm -GNinja \
-DLLVM_TARGETS_TO_BUILD="X86;AArch64" \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_ASSERTIONS=1 \
-DCMAKE_C_COMPILER=<sanitizer-capable clang> \
-DCMAKE_CXX_COMPILER=<sanitizer-capable clang++> \
-DLLVM_ENABLE_PROJECTS="bolt"  \
-DLLVM_USE_SANITIZER=Address \
-DLLVM_USE_SANITIZE_COVERAGE=On
ninja llvm-bolt-fuzzer
```

Test Plan: ninja llvm-bolt-fuzzer

Reviewed By: maksfb

Differential Revision: https://reviews.llvm.org/D120016
2022-02-20 17:24:16 -08:00
Amir Ayupov
36ada32727 [BOLT][NFC] Fix data race in ShrinkWrapping stats
Fix data race reported by ThreadSanitizer in clang.test:
```
ThreadSanitizer: data race /data/llvm-project/bolt/lib/Passes/ShrinkWrapping.cpp:1359:28
in llvm::bolt::ShrinkWrapping::moveSaveRestores()
```

The issue is with incrementing global counters from multiple threads.

Reviewed By: yota9

Differential Revision: https://reviews.llvm.org/D120218
2022-02-20 17:21:58 -08:00