.eh_frame pieces may be dropped due to GC/ICF. When --emit-relocs adds
relocations against .eh_frame, the offsets need to be adjusted. Use the same
way as MergeInputSection with a special case handling outSecOff==-1 for an
invalid piece (see eh-frame-marker.s).
This exposes an issue in mips64-eh-abs-reloc.s that we don't reliably
handle anyway. Just add --no-check-dynamic-relocations to paper over it.
Original patch by Ayrton Muñoz
Differential Revision: https://reviews.llvm.org/D122459
CLANG_TOOLS_DIR holds the the current bin/ directory, maybe with a %(build_mode)
placeholder. It is used to add the just-built binaries to $PATH for lit tests.
In most cases it equals LLVM_TOOLS_DIR, which is used for the same purpose.
But for a standalone build of clang, CLANG_TOOLS_DIR points at the build tree
and LLVM_TOOLS_DIR points at the provided LLVM binaries.
Currently CLANG_TOOLS_DIR is set in clang/test/, clang-tools-extra/test/, and
other things always built with clang. This is a few cryptic lines of CMake in
each place. Meanwhile LLVM_TOOLS_DIR is provided by configure_site_lit_cfg().
This patch moves CLANG_TOOLS_DIR to configure_site_lit_cfg() and renames it:
- there's nothing clang-specific about the value
- it will also replace LLD_TOOLS_DIR, LLDB_TOOLS_DIR etc (not in this patch)
It also defines CURRENT_LIBS_DIR. While I removed the last usage of
CLANG_LIBS_DIR in e4cab4e24d1, there are LLD_LIBS_DIR usages etc that
may be live, and I'd like to mechanically update them in a followup patch.
Differential Revision: https://reviews.llvm.org/D121763
--build-id was introduced as "approximation of true uniqueness across all
binaries that might be used by overlapping sets of people". It does not require
the some resistance mentioned below. In practice, people just use --build-id=md5
for 16-byte build ID and --build-id=sha1 for 20-byte build ID.
BLAKE3 has 256-bit key length, which provides 128-bit security against
(second-)preimage, collision, and differentiability attacks. Its portable
implementation is fast. It additionally provides Arm Neon/AVX2/AVX-512. Just
implement --build-id={md5,sha1} with truncated BLAKE3.
Linking clang 14 RelWithDebInfo with --threads=8 on a Skylake CPU:
* 1.13x as fast with --build-id=md5
* 1.15x as fast with --build-id=sha1
--threads=4 on Apple m1:
* 1.25x as fast with --build-id=md5
* 1.17x as fast with --build-id=sha1
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D121531
This is the orignal patch + a check that LLVM_BUILD_EXAMPLES is enabled before
adding a dependency on the 'Bye' example pass.
Original summary:
Add cli options for new passmanager plugin support to lld.
Currently it is not possible to load dynamic NewPM plugins with lld. This is an
incremental update to D76866. While that patch only added cli options for
llvm-lto2, this adds them for lld as well. This is especially useful for running
dynamic plugins on the linux kernel with LTO.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D120490
Add cli options for new passmanager plugin support to lld.
Currently it is not possible to load dynamic NewPM plugins with lld. This is an
incremental update to D76866. While that patch only added cli options for
llvm-lto2, this adds them for lld as well. This is especially useful for running
dynamic plugins on the linux kernel with LTO.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D120490
`config->priorities` has been used to hold the intermediate state during the construction of the order in which sections should be laid out. This is not a good place to hold this state since the intermediate state is not a "configuration" for LLD. It should be encapsulated in a class for building a mapping from section to priority (which I created in this diff as the `PriorityBuilder` class).
The same thing is being done for `config->callGraphProfile`.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D122156
Code object version 5 will use the same EFlags as version 4, so we only need to add an additional case
Differential Revision: https://reviews.llvm.org/D122190
Update DataInCode's calculation of `endAddr` to use `getSize()` instead
of `getFileSize()` -- while in practice they're the same for
non-zerofill sections (which code sections are), we still should treat
address sizes / offsets as distinct from file sizes / offsets.
In programs that don't otherwise depend on `__tls_base` it won't
be marked as live. However this symbol is used internally in
a couple of places do we need to mark it as live explictily in
those places.
Fixes: #54386
Differential Revision: https://reviews.llvm.org/D121931
* Test the case where a symbol is sometimes linkonce_odr and sometimes weak_odr
* Test the visibility of the symbols at the IR level, after the internalize
stage of LTO is done. (Previously we only checked the visibility of
symbols in the final output binary.)
Reviewed By: modimo
Differential Revision: https://reviews.llvm.org/D121428
https://discourse.llvm.org/t/parallel-input-file-parsing/60164
initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.
Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).
* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
is not ideal but we report a diagnostic to inform that this is unsupported.
(See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
section error) makes more sense
* i386-comdat.s: switched to a better approach working around
.gnu.linkonce.t.__x86.get_pc_thunk.bx in glibc<2.32 for x86-32.
Drop the ancient no-longer-relevant workaround for __i686.get_pc_thunk.bx
Depends on D120640
Differential Revision: https://reviews.llvm.org/D120626
This reverts commit c30e6447c0225f675773d07f2b6d4e3c2b962155.
It exposed brittle support for __x86.get_pc_thunk.bx.
Need to think a bit how to support __x86.get_pc_thunk.bx.
In particular we use these in two places:
1. When building PIC code we no longer need to combine output segments
into a single segment that can be initialized at `__memory_base`.
Instead each segment can encode its offset from `__memory_base` in
its initializer. e.g.
```
(i32.add (global.get __memory_base) (i32.const offset)
```
2. When building PIC code we no longer need to relocation internalized
global addresses. We can just initialize them with their correct
offsets.
Differential Revision: https://reviews.llvm.org/D121420
Since Mach-O has a two-level namespace (unlike ELF), we can usually set
this property to true.
(I believe this setting is only available in the new LTO backend, so I
can't really use ld64 / libLTO's behavior as a reference here... I'm
just doing what I think is correct.)
See {D119294} for the work done to calculate the `interposable` used in
this diff.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D119506
This is a new mode for handling unresolved symbols that allows all
symbols to be imported in the same that they would be in the case of
`-fpie` or `-shared`, but generting an otherwise fixed/non-relocatable
binary.
Code linked in this way should still be compiled with `-fPIC` so that
data symbols can be resolved via imports.
This essentially allows the building of static binaries that have
dynamic imports. See:
https://github.com/emscripten-core/emscripten/issues/12682
As with other uses of the experimental dynamic linking ABI, this
behaviour will produce a warning unless run with `--experimental-pic`.
Differential Revision: https://reviews.llvm.org/D91577
All references to interposable symbols can be redirected at runtime to
point to a different symbol definition (with the same name). For
example, if both dylib A and B define symbol _foo, and we load A before
B at runtime, then all references to _foo within dylib B will point to
the definition in dylib A.
ld64 makes all extern symbols interposable when linking with
`-flat_namespace`.
TODO 1: Support `-interposable` and `-interposable_list`, which should
just be a matter of parsing those CLI flags and setting the
`Defined::interposable` bit.
TODO 2: Set Reloc::FinalDefinitionInLinkageUnit correctly with this info
(we are currently not setting it at all, so we're erring on the
conservative side, but we should help the LTO backend generate more
optimal code.)
Reviewed By: modimo, MaskRay
Differential Revision: https://reviews.llvm.org/D119294
Previously, we only allowed this for DylibSymbols. However, in order to
properly support `-flat_namespace` as well as `-interposable`, we need
to allow this for Defined symbols too. Therefore we hoist the
`lazyBindOffset` and the `stubsHelperIndex` into the parent Symbol
class.
The actual change to support interposition under `-flat_namespace` is in
{D119294}; the NFC changes here have been split out for easier review.
Perf regression isn't stat sig on my 3.2 GHz 16-Core Intel Xeon W linking
chromium_framework:
base diff difference (95% CI)
sys_time 1.227 ± 0.021 1.234 ± 0.031 [ -0.3% .. +1.5%]
user_time 3.665 ± 0.036 3.674 ± 0.035 [ -0.2% .. +0.7%]
wall_time 4.596 ± 0.055 4.609 ± 0.064 [ -0.3% .. +0.9%]
samples 34 47
Max RSS regression is barely stat sig:
base diff difference (95% CI)
time 1003664356.324 ± 15404053.912 1010380403.613 ± 10578309.455 [ +0.0% .. +1.3%]
samples 37 31
Reviewed By: modimo
Differential Revision: https://reviews.llvm.org/D121351
This reverts commit e049a87f04cff8e81b4177097a6b2fdf37c7b148.
That commit breaks the build with errors of the form:
/usr/local/google/home/saugustine/llvm/llvm-project/lld/MachO/ExportTrie.cpp:148:11: error: definition of implicitly declared destructor
TrieNode::~TrieNode() {
The code can be used in multi-threads and the allocator is not thread safe.
fixes PR/54378
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D121638
https://discourse.llvm.org/t/parallel-input-file-parsing/60164
initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.
Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).
* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
is not ideal but we report a diagnostic to inform that this is unsupported.
(See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
section error) makes more sense
Depends on D120640
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D120626
In GCC -fgnu-unique output, STB_GNU_UNIQUE symbols are always defined relative
to a section in a COMDAT group. Currently `other` cannot be STB_GNU_UNIQUE for
valid input, so this patch is NFC.
If we switch to the model that ignores COMDAT resolution when performing symbol
resolution (D120626), this will fix bogus `relocation refers to a symbol in a
discarded section` errors when mixing -fno-gnu-unique objects with -fgnu-unique
objects.
Differential Revision: https://reviews.llvm.org/D120640
This change continues to lay the ground work for supporting extended
const expressions in the linker.
The included test covers object file reading and writing and the YAML
representation.
Differential Revision: https://reviews.llvm.org/D121349
Previously, the test checked for a "undefined symbol" error
(instead of the "could not open std*.lib" which would happen without
the flag).
Instead, use /entry: so that the link succeeds.
No behavior change, but maybe makes the test a bit easier to understand.
Differential Revision: https://reviews.llvm.org/D121553
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.
Differential Revision: https://reviews.llvm.org/D119918
GNU ld 2.38 added -z pack-relative-relocs which is similar to
--pack-dyn-relocs=relr but synthesizes the `GLIBC_ABI_DT_RELR` version
dependency if a shared object named `libc.so.*` has a `GLIBC_2.*` version
dependency.
This is used to implement the (as some glibc folks call) version lockout
mechanism. Add this option, because glibc does not want to support
--pack-dyn-relocs=relr which does not add `GLIBC_ABI_DT_RELR`.
See https://maskray.me/blog/2021-10-31-relative-relocations-and-relr for
detail.
Close https://github.com/llvm/llvm-project/issues/53775
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D120701
Previously, we aligned every cstring to 16 bytes as a temporary hack to
deal with https://github.com/llvm/llvm-project/issues/50135. However, it
was highly wasteful in terms of binary size.
To recap, in contrast to ELF, which puts strings that need different
alignments into different sections, `clang`'s Mach-O backend puts them
all in one section. Strings that need to be aligned have the .p2align
directive emitted before them, which simply translates into zero padding
in the object file. In other words, we have to infer the alignment of
the cstrings from their addresses.
We differ slightly from ld64 in how we've chosen to align these
cstrings. Both LLD and ld64 preserve the number of trailing zeros in
each cstring's address in the input object files. When deduplicating
identical cstrings, both linkers pick the cstring whose address has more
trailing zeros, and preserve the alignment of that address in the final
binary. However, ld64 goes a step further and also preserves the offset
of the cstring from the last section-aligned address. I.e. if a cstring
is at offset 18 in the input, with a section alignment of 16, then both
LLD and ld64 will ensure the final address is 2-byte aligned (since
`18 == 16 + 2`). But ld64 will also ensure that the final address is of
the form 16 * k + 2 for some k (which implies 2-byte alignment).
Note that ld64's heuristic means that a dedup'ed cstring's final address is
dependent on the order of the input object files. E.g. if in addition to the
cstring at offset 18 above, we have a duplicate one in another file with a
`.cstring` section alignment of 2 and an offset of zero, then ld64 will pick
the cstring from the object file earlier on the command line (since both have
the same number of trailing zeros in their address). So the final cstring may
either be at some address `16 * k + 2` or at some address `2 * k`.
I've opted not to follow this behavior primarily for implementation
simplicity, and secondarily to save a few more bytes. It's not clear to me
that preserving the section alignment + offset is ever necessary, and there
are many cases that are clearly redundant. In particular, if an x86_64 object
file contains some strings that are accessed via SIMD instructions, then the
.cstring section in the object file will be 16-byte-aligned (since SIMD
requires its operand addresses to be 16-byte aligned). However, there will
typically also be other cstrings in the same file that aren't used via SIMD
and don't need this alignment. They will be emitted at some arbitrary address
`A`, but ld64 will treat them as being 16-byte aligned with an offset of
`16 % A`.
I have verified that the two repros in https://github.com/llvm/llvm-project/issues/50135
work well with the new alignment behavior.
Fixes https://github.com/llvm/llvm-project/issues/54036.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D121342