15197 Commits

Author SHA1 Message Date
Fangrui Song
48e251b1d6 Revert D122459 "[ELF] --emit-relocs: adjust offsets of .rel[a].eh_frame relocations"
This reverts commit 6faba31e0d88ce71e87567ddb51d2444524b8a81.

It may cause "offset is outside the section".
2022-03-28 20:26:21 -07:00
Fangrui Song
6faba31e0d [ELF] --emit-relocs: adjust offsets of .rel[a].eh_frame relocations
.eh_frame pieces may be dropped due to GC/ICF. When --emit-relocs adds
relocations against .eh_frame, the offsets need to be adjusted. Use the same
way as MergeInputSection with a special case handling outSecOff==-1 for an
invalid piece (see eh-frame-marker.s).

This exposes an issue in mips64-eh-abs-reloc.s that we don't reliably
handle anyway. Just add --no-check-dynamic-relocations to paper over it.

Original patch by Ayrton Muñoz

Differential Revision: https://reviews.llvm.org/D122459
2022-03-28 16:23:13 -07:00
Fangrui Song
27ef7494b1 [ELF][test] Refactor some .eh_frame tests
* Improve eh-frame-merge.s
* Delete invalid .eh_frame+5 test in ehframe-relocation.s
2022-03-28 15:55:46 -07:00
Fangrui Song
1db59dc8e2 [ELF] Fix llvm_unreachable failure when COMMON is placed in SHT_PROGBITS output section
Fix a regression in aa27bab5a1a17e9c4168a741a6298ecaa92c1ecb: COMMON in an
SHT_PROGBITS output section caused llvm_unreachable failure.
2022-03-28 11:05:52 -07:00
Fangrui Song
8565a87fd4 [ELF] Simplify MergeInputSection::getParentOffset. NFC
and remove overly verbose comments.
2022-03-28 10:02:35 -07:00
Fangrui Song
c37accf0a2 [Option] Avoid using the default argument for the 3-argument hasFlag. NFC
The default argument true is error-prone: I think many would think the
default is false.
2022-03-26 00:57:06 -07:00
Sam McCall
57ee624d79 [cmake] Provide CURRENT_TOOLS_DIR centrally, replacing CLANG_TOOLS_DIR
CLANG_TOOLS_DIR holds the the current bin/ directory, maybe with a %(build_mode)
placeholder. It is used to add the just-built binaries to $PATH for lit tests.
In most cases it equals LLVM_TOOLS_DIR, which is used for the same purpose.
But for a standalone build of clang, CLANG_TOOLS_DIR points at the build tree
and LLVM_TOOLS_DIR points at the provided LLVM binaries.

Currently CLANG_TOOLS_DIR is set in clang/test/, clang-tools-extra/test/, and
other things always built with clang. This is a few cryptic lines of CMake in
each place. Meanwhile LLVM_TOOLS_DIR is provided by configure_site_lit_cfg().

This patch moves CLANG_TOOLS_DIR to configure_site_lit_cfg() and renames it:
 - there's nothing clang-specific about the value
 - it will also replace LLD_TOOLS_DIR, LLDB_TOOLS_DIR etc (not in this patch)

It also defines CURRENT_LIBS_DIR. While I removed the last usage of
CLANG_LIBS_DIR in e4cab4e24d1, there are LLD_LIBS_DIR usages etc that
may be live, and I'd like to mechanically update them in a followup patch.

Differential Revision: https://reviews.llvm.org/D121763
2022-03-25 20:22:01 +01:00
Fangrui Song
940bd4c771 [ELF] addSectionSymbols: simplify isec->getOutputSection(). NFC 2022-03-24 21:54:20 -07:00
Fangrui Song
d3e5b6f753 [ELF] Implement --build-id={md5,sha1} with truncated BLAKE3
--build-id was introduced as "approximation of true uniqueness across all
binaries that might be used by overlapping sets of people". It does not require
the some resistance mentioned below. In practice, people just use --build-id=md5
for 16-byte build ID and --build-id=sha1 for 20-byte build ID.

BLAKE3 has 256-bit key length, which provides 128-bit security against
(second-)preimage, collision, and differentiability attacks. Its portable
implementation is fast. It additionally provides Arm Neon/AVX2/AVX-512. Just
implement --build-id={md5,sha1} with truncated BLAKE3.

Linking clang 14 RelWithDebInfo with --threads=8 on a Skylake CPU:

* 1.13x as fast with --build-id=md5
* 1.15x as fast with --build-id=sha1

--threads=4 on Apple m1:

* 1.25x as fast with --build-id=md5
* 1.17x as fast with --build-id=sha1

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D121531
2022-03-24 11:31:39 -07:00
Jakob Koschel
0c86198b27 Reland "[ELF] Enable new passmanager plugin support for LTO"
This is the orignal patch + a check that LLVM_BUILD_EXAMPLES is enabled before
adding a dependency on the 'Bye' example pass.

Original summary:

Add cli options for new passmanager plugin support to lld.

Currently it is not possible to load dynamic NewPM plugins with lld. This is an
incremental update to D76866. While that patch only added cli options for
llvm-lto2, this adds them for lld as well. This is especially useful for running
dynamic plugins on the linux kernel with LTO.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D120490
2022-03-24 16:29:18 +01:00
Raphael Isemann
1104d79261 Revert "[ELF] Enable new passmanager plugin support for LTO"
This reverts commit 32012eb11b235e1560a253664095676ea8ebd027.

Broke CMake configuration.
2022-03-24 09:57:15 +01:00
Jakob Koschel
32012eb11b [ELF] Enable new passmanager plugin support for LTO
Add cli options for new passmanager plugin support to lld.

Currently it is not possible to load dynamic NewPM plugins with lld. This is an
incremental update to D76866. While that patch only added cli options for
llvm-lto2, this adds them for lld as well. This is especially useful for running
dynamic plugins on the linux kernel with LTO.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D120490
2022-03-24 08:08:54 +01:00
Roger Kim
f858fba631 [lld][Macho][NFC] Encapsulate priorities map in a priority class
`config->priorities` has been used to hold the intermediate state during the construction of the order in which sections should be laid out. This is not a good place to hold this state since the intermediate state is not a "configuration" for LLD. It should be encapsulated in a class for building a mapping from section to priority (which I created in this diff as the `PriorityBuilder` class).

The same thing is being done for `config->callGraphProfile`.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D122156
2022-03-23 13:57:26 -04:00
Jacob Lambert
71b162c4bd [AMDGPU][LLD] Adding support for ABI version 5 option
Code object version 5 will use the same EFlags as version 4, so we only need to add an additional case

Differential Revision: https://reviews.llvm.org/D122190
2022-03-23 01:22:37 -07:00
Jez Ng
c9c2363048 [lld-macho][nfc] Don't mix file sizes with addresses
Update DataInCode's calculation of `endAddr` to use `getSize()` instead
of `getFileSize()` -- while in practice they're the same for
non-zerofill sections (which code sections are), we still should treat
address sizes / offsets as distinct from file sizes / offsets.
2022-03-22 17:52:53 -04:00
Jez Ng
a993d607de [lld-macho][nfc] Add comment explaining why a cast<> is safe 2022-03-21 07:23:09 -04:00
Jez Ng
1c0234dfcc [lld-macho][nfc] Have findContainingSubsection take a Section
... instead of an instance of `Subsections`.

This simplifies the code slightly since all its callsites have a Section
instance anyway.
2022-03-21 07:23:09 -04:00
Sam Clegg
a04a507714 [lld][WebAssembly] Fix crash accessing non-live __tls_base symbol
In programs that don't otherwise depend on `__tls_base` it won't
be marked as live.  However this symbol is used internally in
a couple of places do we need to mark it as live explictily in
those places.

Fixes: #54386

Differential Revision: https://reviews.llvm.org/D121931
2022-03-17 13:59:45 -07:00
henry wong
948d05324a [LTO][ELF] Require asserts for --stats-file= tests.
https://reviews.llvm.org/D121809 causes the build bot failure, add the `REQUIRES: asserts` to fix it.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D121888
2022-03-17 23:57:13 +08:00
wangliushuai
1c04b52b25 [LTO][ELF] Add --stats-file= option.
This patch adds a StatsFile option supported by gold to lld, related patch https://reviews.llvm.org/D45531.

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D121809
2022-03-17 12:01:39 +08:00
Jez Ng
f5ddcf25d6 [lld-macho] Extend lto-internalize-unnamed-addr.ll
* Test the case where a symbol is sometimes linkonce_odr and sometimes weak_odr
* Test the visibility of the symbols at the IR level, after the internalize
  stage of LTO is done. (Previously we only checked the visibility of
  symbols in the final output binary.)

Reviewed By: modimo

Differential Revision: https://reviews.llvm.org/D121428
2022-03-16 17:30:31 -04:00
Sam McCall
75acad41bc Use lit_config.substitute instead of foo % lit_config.params everywhere
This mechanically applies the same changes from D121427 everywhere.

Differential Revision: https://reviews.llvm.org/D121746
2022-03-16 09:57:41 +01:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Fangrui Song
c9dbf407af [ELF] Move invalid binding diagnostic from initializeSymbols to postParse
It is excessive to have a diagnostic for STB_LOCAL. Just reuse the invalid
binding diagnostic for STB_LOCAL.
2022-03-16 00:31:29 -07:00
Fangrui Song
bdb98bd979 [ELF] Use endianness-aware read32 to avoid dispatch. NFC 2022-03-15 23:51:11 -07:00
Fangrui Song
385573e07b [ELF] Inline ARMExidxSyntheticSection::classof. NFC
To optimize the only call site `dyn_cast<ARMExidxSyntheticSection>(first)` and
decrease code size.
2022-03-15 23:41:30 -07:00
Fangrui Song
1a590232f4 [ELF] Optimize "Strip sections"
If SHT_LLVM_SYMPART is unused, don't iterate over inputSections.
If neither --strip-debug/--strip-all, don't iterate over inputSections.
2022-03-15 23:15:43 -07:00
Fangrui Song
7c7702b318 [ELF] Move section assignment from initializeSymbols to postParse
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.

Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).

* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
  has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
  is not ideal but we report a diagnostic to inform that this is unsupported.
  (See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
  section error) makes more sense
* i386-comdat.s: switched to a better approach working around
  .gnu.linkonce.t.__x86.get_pc_thunk.bx in glibc<2.32 for x86-32.
  Drop the ancient no-longer-relevant workaround for __i686.get_pc_thunk.bx

Depends on D120640

Differential Revision: https://reviews.llvm.org/D120626
2022-03-15 19:24:41 -07:00
Fangrui Song
9b61fff0eb Revert D120626 "[ELF] Move section assignment from initializeSymbols to postParse"
This reverts commit c30e6447c0225f675773d07f2b6d4e3c2b962155.
It exposed brittle support for __x86.get_pc_thunk.bx.
Need to think a bit how to support __x86.get_pc_thunk.bx.
2022-03-15 19:00:54 -07:00
Fangrui Song
48a02152ab [ELF][test] Improve i386-linkonce.s
Make it behave like the glibc<2.32 .gnu.linkonce usage that we want to work around.
2022-03-15 18:47:52 -07:00
Sam Clegg
4690bf2ed3 [lld][WebAssembly] Take advantage of extended const expressions when available
In particular we use these in two places:

1. When building PIC code we no longer need to combine output segments
   into a single segment that can be initialized at `__memory_base`.
   Instead each segment can encode its offset from `__memory_base` in
   its initializer.  e.g.

```
(i32.add (global.get __memory_base) (i32.const offset)
```

2. When building PIC code we no longer need to relocation internalized
   global addresses.  We can just initialize them with their correct
   offsets.

Differential Revision: https://reviews.llvm.org/D121420
2022-03-15 17:50:05 -07:00
Jez Ng
8ce3750ff6 [lld-macho] Set FinalDefinitionInLinkageUnit on most LTO externs
Since Mach-O has a two-level namespace (unlike ELF), we can usually set
this property to true.

(I believe this setting is only available in the new LTO backend, so I
can't really use ld64 / libLTO's behavior as a reference here... I'm
just doing what I think is correct.)

See {D119294} for the work done to calculate the `interposable` used in
this diff.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D119506
2022-03-15 20:25:06 -04:00
Fangrui Song
c1d4c67718 [ELF] Suppress duplicate symbol error for __x86.get_pc_thunk.bx 2022-03-15 17:20:29 -07:00
Sam Clegg
86c90f9bfd [lld][WebAssembly] Add --unresolved-symbols=import-dynamic
This is a new mode for handling unresolved symbols that allows all
symbols to be imported in the same that they would be in the case of
`-fpie` or `-shared`, but generting an otherwise fixed/non-relocatable
binary.

Code linked in this way should still be compiled with `-fPIC` so that
data symbols can be resolved via imports.

This essentially allows the building of static binaries that have
dynamic imports.  See:
https://github.com/emscripten-core/emscripten/issues/12682

As with other uses of the experimental dynamic linking ABI, this
behaviour will produce a warning unless run with `--experimental-pic`.

Differential Revision: https://reviews.llvm.org/D91577
2022-03-15 15:10:21 -07:00
Fangrui Song
6be457c14d [ELF] Work around not-fully-supported .gnu.linkonce.t.__x86.get_pc_thunk.bx 2022-03-15 14:48:29 -07:00
Jez Ng
ceff23c6e3 [lld-macho] -flat_namespace for dylibs should make all externs interposable
All references to interposable symbols can be redirected at runtime to
point to a different symbol definition (with the same name). For
example, if both dylib A and B define symbol _foo, and we load A before
B at runtime, then all references to _foo within dylib B will point to
the definition in dylib A.

ld64 makes all extern symbols interposable when linking with
`-flat_namespace`.

TODO 1: Support `-interposable` and `-interposable_list`, which should
just be a matter of parsing those CLI flags and setting the
`Defined::interposable` bit.

TODO 2: Set Reloc::FinalDefinitionInLinkageUnit correctly with this info
(we are currently not setting it at all, so we're erring on the
conservative side, but we should help the LTO backend generate more
optimal code.)

Reviewed By: modimo, MaskRay

Differential Revision: https://reviews.llvm.org/D119294
2022-03-14 22:18:32 -04:00
Jez Ng
7f3ddf8443 [lld-macho][nfc] Allow Defined symbols to be placed in binding sections
Previously, we only allowed this for DylibSymbols. However, in order to
properly support `-flat_namespace` as well as `-interposable`, we need
to allow this for Defined symbols too. Therefore we hoist the
`lazyBindOffset` and the `stubsHelperIndex` into the parent Symbol
class.

The actual change to support interposition under `-flat_namespace` is in
{D119294}; the NFC changes here have been split out for easier review.

Perf regression isn't stat sig on my 3.2 GHz 16-Core Intel Xeon W linking
chromium_framework:

             base           diff           difference (95% CI)
  sys_time   1.227 ± 0.021  1.234 ± 0.031  [  -0.3% ..   +1.5%]
  user_time  3.665 ± 0.036  3.674 ± 0.035  [  -0.2% ..   +0.7%]
  wall_time  4.596 ± 0.055  4.609 ± 0.064  [  -0.3% ..   +0.9%]
  samples    34             47

Max RSS regression is barely stat sig:

           base                           diff                           difference (95% CI)
  time     1003664356.324 ± 15404053.912  1010380403.613 ± 10578309.455  [  +0.0% ..   +1.3%]
  samples  37                             31

Reviewed By: modimo

Differential Revision: https://reviews.llvm.org/D121351
2022-03-14 22:18:32 -04:00
Vy Nguyen
0d5e27623a Reland "[lld-macho] Avoid using bump-alloc in TrieBuider""
This reverts commit ee7a286cd3e4364d2f7d5b6ba4b8a6fc0d524854.
2022-03-14 19:33:13 -04:00
Sterling Augustine
ee7a286cd3 Revert "[lld-macho] Avoid using bump-alloc in TrieBuider"
This reverts commit e049a87f04cff8e81b4177097a6b2fdf37c7b148.

That commit breaks the build with errors of the form:

/usr/local/google/home/saugustine/llvm/llvm-project/lld/MachO/ExportTrie.cpp:148:11: error: definition of implicitly declared destructor
TrieNode::~TrieNode() {
2022-03-14 15:23:04 -07:00
Vy Nguyen
e049a87f04 [lld-macho] Avoid using bump-alloc in TrieBuider
The code can be used in multi-threads and the allocator is not thread safe.

fixes PR/54378

Reviewed By: int3, #lld-macho

Differential Revision: https://reviews.llvm.org/D121638
2022-03-14 17:22:53 -04:00
Fangrui Song
c30e6447c0 [ELF] Move section assignment from initializeSymbols to postParse
https://discourse.llvm.org/t/parallel-input-file-parsing/60164

initializeSymbols currently sets Defined::section and handles non-prevailing
COMDAT groups. Move the code to the parallel postParse to reduce work from the
single-threading code path and make parallel section initialization infeasible.

Postpone reporting duplicate symbol errors so that the messages have the
section information. (`Defined::section` is assigned in postParse and another
thread may not have the information).

* duplicated-synthetic-sym.s: BinaryFile duplicate definition (very rare) now
  has no section information
* comdat-binding: `%t/w.o %t/g.o` leads to an undesired undefined symbol. This
  is not ideal but we report a diagnostic to inform that this is unsupported.
  (See release note)
* comdat-discarded-lazy.s: %tdef.o is unextracted. The new behavior (discarded
  section error) makes more sense

Depends on D120640

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D120626
2022-03-14 14:13:41 -07:00
Fangrui Song
c7cf960d85 [ELF] Set the priority of STB_GNU_UNIQUE the same as STB_WEAK
In GCC -fgnu-unique output, STB_GNU_UNIQUE symbols are always defined relative
to a section in a COMDAT group. Currently `other` cannot be STB_GNU_UNIQUE for
valid input, so this patch is NFC.

If we switch to the model that ignores COMDAT resolution when performing symbol
resolution (D120626), this will fix bogus `relocation refers to a symbol in a
discarded section` errors when mixing -fno-gnu-unique objects with -fgnu-unique
objects.

Differential Revision: https://reviews.llvm.org/D120640
2022-03-14 12:00:15 -07:00
Sam Clegg
9504ab32b7 [WebAssembly] Second phase of implemented extended const proposal
This change continues to lay the ground work for supporting extended
const expressions in the linker.

The included test covers object file reading and writing and the YAML
representation.

Differential Revision: https://reviews.llvm.org/D121349
2022-03-14 08:55:47 -07:00
Nico Weber
17414150cf [lld-link] Tweak winsysroottest.test to have passing links on happy path
Previously, the test checked for a "undefined symbol" error
(instead of the "could not open std*.lib" which would happen without
the flag).

Instead, use /entry: so that the link succeeds.

No behavior change, but maybe makes the test a bit easier to understand.

Differential Revision: https://reviews.llvm.org/D121553
2022-03-14 10:44:26 -04:00
Fangrui Song
7b8fbb796c [ELF] Simplify addCopyRelSymbol with invokeELFT. NFC 2022-03-12 14:08:10 -08:00
Petr Hosek
0c0f6cfb7b [CMake] Rename TARGET_TRIPLE to LLVM_TARGET_TRIPLE
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.

Differential Revision: https://reviews.llvm.org/D119918
2022-03-11 15:43:01 -08:00
Jez Ng
9b7b21d2f7 [lld-macho] Don't allocate memory in parallelForEach
... since BumpPtrAllocator isn't thread-safe.

Reviewed By: #lld-macho, Roger

Differential Revision: https://reviews.llvm.org/D121458
2022-03-11 13:32:24 -05:00
Fangrui Song
4a8de2832a [ELF] Add -z pack-relative-relocs
GNU ld 2.38 added -z pack-relative-relocs which is similar to
--pack-dyn-relocs=relr but synthesizes the `GLIBC_ABI_DT_RELR` version
dependency if a shared object named `libc.so.*` has a `GLIBC_2.*` version
dependency.

This is used to implement the (as some glibc folks call) version lockout
mechanism. Add this option, because glibc does not want to support
--pack-dyn-relocs=relr which does not add `GLIBC_ABI_DT_RELR`.
See https://maskray.me/blog/2021-10-31-relative-relocations-and-relr for
detail.

Close https://github.com/llvm/llvm-project/issues/53775

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D120701
2022-03-10 19:54:21 -08:00
Jez Ng
fc968bcba4
[lld-macho][nfc] Fix formatting in ld64-vs-lld.rst 2022-03-10 18:33:18 -05:00
Jez Ng
4308f031cd [lld-macho] Align cstrings less conservatively
Previously, we aligned every cstring to 16 bytes as a temporary hack to
deal with https://github.com/llvm/llvm-project/issues/50135. However, it
was highly wasteful in terms of binary size.

To recap, in contrast to ELF, which puts strings that need different
alignments into different sections, `clang`'s Mach-O backend puts them
all in one section.  Strings that need to be aligned have the .p2align
directive emitted before them, which simply translates into zero padding
in the object file. In other words, we have to infer the alignment of
the cstrings from their addresses.

We differ slightly from ld64 in how we've chosen to align these
cstrings. Both LLD and ld64 preserve the number of trailing zeros in
each cstring's address in the input object files. When deduplicating
identical cstrings, both linkers pick the cstring whose address has more
trailing zeros, and preserve the alignment of that address in the final
binary. However, ld64 goes a step further and also preserves the offset
of the cstring from the last section-aligned address.  I.e. if a cstring
is at offset 18 in the input, with a section alignment of 16, then both
LLD and ld64 will ensure the final address is 2-byte aligned (since
`18 == 16 + 2`). But ld64 will also ensure that the final address is of
the form 16 * k + 2 for some k (which implies 2-byte alignment).

Note that ld64's heuristic means that a dedup'ed cstring's final address is
dependent on the order of the input object files. E.g. if in addition to the
cstring at offset 18 above, we have a duplicate one in another file with a
`.cstring` section alignment of 2 and an offset of zero, then ld64 will pick
the cstring from the object file earlier on the command line (since both have
the same number of trailing zeros in their address). So the final cstring may
either be at some address `16 * k + 2` or at some address `2 * k`.

I've opted not to follow this behavior primarily for implementation
simplicity, and secondarily to save a few more bytes. It's not clear to me
that preserving the section alignment + offset is ever necessary, and there
are many cases that are clearly redundant. In particular, if an x86_64 object
file contains some strings that are accessed via SIMD instructions, then the
.cstring section in the object file will be 16-byte-aligned (since SIMD
requires its operand addresses to be 16-byte aligned). However, there will
typically also be other cstrings in the same file that aren't used via SIMD
and don't need this alignment. They will be emitted at some arbitrary address
`A`, but ld64 will treat them as being 16-byte aligned with an offset of
`16 % A`.

I have verified that the two repros in https://github.com/llvm/llvm-project/issues/50135
work well with the new alignment behavior.

Fixes https://github.com/llvm/llvm-project/issues/54036.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D121342
2022-03-10 15:18:15 -05:00