This reverts commit 9246df7049b0bb83743f860caff4221413c63de2.
Reason: This patch broke the UBSan buildbots. See more information in
the original phabricator review: https://reviews.llvm.org/D139092
This reverts commit c4fea3905617af89d1ad87319893e250f5b72dd6.
I am reverting this for now until I figure out how to fix
the build bot errors and warnings.
Errors:
llvm-project/lld/ELF/Arch/ARM.cpp:1300:29: error: expected primary-expression before ‘>’ token
osec->writeHeaderTo<ELFT>(++sHdrs);
Warnings:
llvm-project/lld/ELF/Arch/ARM.cpp:1306:31: warning: left operand of comma operator has no effect [-Wunused-value]
This commit provides linker support for Cortex-M Security Extensions (CMSE).
The specification for this feature can be found in ARM v8-M Security Extensions:
Requirements on Development Tools.
The linker synthesizes a security gateway veneer in a special section;
`.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`,
defined relative to the same text section and having the same address. The
address of `<entry>` is retargeted to the starting address of the
linker-synthesized security gateway veneer in section `.gnu.sgstubs`.
In summary, the linker translates input:
```
.text
entry:
__acle_se_entry:
[entry_code]
```
into:
```
.section .gnu.sgstubs
entry:
SG
B.W __acle_se_entry
.text
__acle_se_entry:
[entry_code]
```
If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker
considers that `<entry>` already defines a secure gateway veneer so does not
synthesize one.
If `--out-implib=<out.lib>` is specified, the linker writes the list of secure
gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library
will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway
veneer <entry> at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with
value `<addr>`.
If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import
library `<in.lib>` and preserves the entry function addresses in the resulting
executable and new import library.
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D139092
The test requires that LLVM integreated assembler generates
SUBTRACTOR/UNSIGNED relocations for `.long _minuend_1 - _subtrahend_1`.
This currently works by luck because:
* `_minuend_1` and `_subtrahend_1` are in different fragments (separated by a MCFillFragment)
* and the result is known to be negative at parsing time.
D153096 will change the assembler to fold `.long _minuend_1 - _subtrahend_1` to
a constant, giving ld -order_file no chance to change the result.
To fix the test, move the referenced labels after the label differences to block
constant folding.
Note: you may think the model is somewhat broken and it is. The
.subsections_via_symbols mechanism does not block such constant folding. In
reality SUBTRACTOR/UNSIGNED is for references to another section, which does not
have the problem.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D153382
2700da5fe28d8 added lld/test/Unit/lit.site.cfg.py.in in a state
that half-supports relocatable lld lit tests.
Make them fully relocatable.
See description of fb80b6b2d58c47 for background.
Differential Revision: https://reviews.llvm.org/D152885
Arm has BE8 big endian configuration called a byte-invariant(every byte has the same address on little and big-endian systems).
When in BE8 mode:
1. Instructions are big-endian in relocatable objects but
little-endian in executables and shared objects.
2. Data is big-endian.
3. The data encoding of the ELF file is ELFDATA2MSB.
To support BE8 without an ABI break for relocatable objects,the linker takes on the responsibility of changing the endianness of instructions. At a high level the only difference between BE32 and BE8 in the linker is that for BE8:
1. The linker sets the flag EF_ARM_BE8 in the ELF header.
2. The linker endian reverses the instructions, but not data.
This patch adds BE8 big endian support for Arm. To endian reverse the instructions we'll need access to the mapping symbols. Code sections can contain a mix of Arm, Thumb and literal data. We need to endian reverse Arm instructions as words, Thumb instructions
as half-words and ignore literal data.The only way to find these transitions precisely is by using mapping symbols. The instruction reversal will need to take place after relocation. For Arm BE8 code sections (Section has SHF_EXECINSTR flag ) we inserted a step after relocation to endian reverse the instructions. The implementation strategy i have used here is to write all sections BE32 including SyntheticSections then endian reverse all code in InputSections via mapping symbols.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D150870
This reverts commit aa495214b39d475bab24b468de7a7c676ce9e366.
As discussed in https://github.com/llvm/llvm-project/issues/53475 this patch
allows for using LLD-as-a-lib. It also lets clients link only the drivers that
they want (see unit tests).
This also adds the unit test infra as in the other LLVM projects. Among the
test coverage, I've added the original issue from @krzysz00, see:
https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction
Important note: this doesn't allow (yet) linking in parallel. This will come a
bit later hopefully, in subsequent patches, for COFF at least.
Differential revision: https://reviews.llvm.org/D119049
The left/right shift linker script operators may trigger UB.
E.g. in linkerscript/end-overflow-check.test, the initial REGION1__PADDED_SR_SHIFT is
uint64_t(-3), cause the following expression to trigger an out-of-range shift in
a ubsan build of lld.
REGION1__PADDED_SR_SIZE = MAX(1 << REGION1__PADDED_SR_SHIFT, 32);
Protect such UBs by making RHS less than 64.
As of Xcode 15 there is now a tool ID for LLD, likely driven by Apple's
tests with using LLD for their CAS work in clang. This updates LLD to
use the correct ID, and updates the object library so that llvm-objdump
prints it correctly.
Differential Revision: https://reviews.llvm.org/D152929
The corresponding class definition was removed by:
commit 502d4ce2e4921cc38ef1226ae093e594d900fe46
Author: Reid Kleckner <rnk@google.com>
Date: Mon Jun 26 15:39:52 2017 +0000
LLD terminates with errors when it detects overflows in the
finalizeAddressDependentContent calculation. Although, sometimes, those errors
are not really errors, but an intermediate result of an ongoing address
calculation. If we continue the fixed-point algorithm we can converge to the
correct result.
This patch
* Removes the verification inside the fixed point algorithm.
* Calls checkMemoryRegions at the end.
Reviewed By: peter.smith, MaskRay
Differential Revision: https://reviews.llvm.org/D152170
The warning "ignoring memory region assignment for non-allocatable section" should be generated under the following conditions:
* sections without SHF_ALLOC attribute and,
* presence of input sections or data commands (ByteCommand)
The goal of the change is to reduce spurious warnings that are generated for some output sections that have no input section.
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D151802
MSVC link.exe allows overriding exports on the cmd-line with exports seen in OBJ directives. The typical case is what is described in #62329.
Before this patch, trying to override an export with `/export` or `/def` would generate a duplicate warning. This patches tries to replicate the MSVC behavior. A second override on the cmd-line would still generate the warning.
There's still a case which we don't cover: MSVC link.exe is able to demangle an exported OBJ directive function, and match it with a unmangled export function in a .def file. In the meanwhile, one can use the mangled export in the .def to cover that case.
This fixes#62329
Differential revision: https://reviews.llvm.org/D149611
Reasons for rolling forward:
- the crash reported from Chromium was fixed in D151824 (not related to this patch at all)
- since D152824 was committed, it should now be safe to roll this forward.
New change:
- add an additional _ in name check
This reverts commit 4980eead4d0b4666d53dad07afb091375b3a13a0.
This reverts commit 11d61c079d4b4927efea42a38a27d4586887b764 to re-apply
6836a47b7e6b57927664ec6ec750ae37bb951129 with modifications.
Specifically, the errors in DWARFAbbreviationDeclaration::extract needed
to be moved as they are returned to ensure the right Error constructor
is selected.
Follow up to D151815.
Or else we properly handle the first instance of a file, then error out on the second instance of the same file.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D152198
In trying to hoist errors further up this callstack, I discovered that
if the data in the debug_abbrev section is invalid entirely, the code
that parses the debug_abbrev section may do strange or unpredictable
things. The underlying issue is that DataExtractor will return a value
of 0 when it encounters an error in extracting a LEB128 value. It's thus
difficult to determine if there was an error just by looking at the
return value. This patch aims to bail at the first sight of an error in
the debug_abbrev parsing code.
Differential Revision: https://reviews.llvm.org/D151755
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using `llvm::demangle`. Add a
release note.
Differential Revision: https://reviews.llvm.org/D149104
This change to llvm-objcopy preserves the ELF section sh_link to .symtab
so long as none of the symbol table indices have been changed.
Previously, any invocation of llvm-objcopy including a "no-op" would
clear any section sh_link to .symtab.
Differential Revision: https://reviews.llvm.org/D150859
As per section 4.2.2 of the PowerPC ELFv2 ABI, this value tells the dynamic linker which optimizations it is allowed to do.
Specifically, the higher order bit of the two tells the dynamic linker that there may be multiple TOC pointers in the binary.
When we resolve any NOTOC relocations during linking, we need to set this value because we may be calling
TOC functions from NOTOC functions when the NOTOC function already clobbered the TOC pointer.
In practice, this ensures that the PLT resolver always resolves the call to the GEP (global entry point) of
the TOC function (which will set up the TOC for the TOC function).
Original patch by nemanjai
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D150631
This fixes a runtime error that occurred due to incorrect
internalization of linkonce_odr functions where function pointer
equality was broken. This was hit because the prevailing copy was in a
native object, so the IR copies were not exported, and the existing code
internalized all of the IR copies. It could be fixed by guarding this
internalization on whether the defs are (local_)unnamed_addr, meaning
that their address is not significant (which we have in the summary
currently for linkonce_odr via the CanAutoHide flag). Or we can
propagate reference attributes as we do when determining whether a
global variable is read or write-only (reference edges are annotated
with whether they are read-only, write-only, or neither, and taking the
address of a function would result in a reference edge to the function
that is not read or write-only).
However, this exposed a larger issue with the internalization handling.
Looking at test cases, it appears the intent is to internalize when
there is a single definition of a linkonce/weak ODR symbol (that isn't
exported). This makes sense in the case of functions, because the
inliner can apply its last call to static heuristic when appropriate. In
the case where there is no prevailing copy in IR, internalizing all of
the IR copies of a linkonce_odr, even if legal, just increases binary
size. In that case it is better to fall back to the normal handling of
converting all non-prevailing copies to available_externally so that
they are eliminated after inlining.
In the case of variables, the existing code was attempting to
internalize the non-exported linkonce/weak ODR variables if they were
read or write-only. While this is legal (we propagate reference
attributes to determine this information), we don't even need to
internalize these here as there is later separate handling that
internalizes read and write-only variables when we process the module at
the start of the ThinLTO backend (processGlobalForThinLTO). Instead, we
can also internalize any non-exported variable when there is only one
(IR) definition, which is prevailing. And in that case, we don't need to
require that it is read or write-only, since we are guaranteed that all
uses must use that single definition.
In the new LTO API, if there are multiple defs of a linkonce or weak ODR
it will be marked exported, but it isn't clear that this will always be
true for the legacy LTO API. Therefore, require that there is only a
single (non-local) def, and that it is prevailing.
The test cases changes are both to reflect the change in the handling of
linkonce_odr IR copies where the prevailing def is not in IR (the main
correctness bug fix here), and to reflect the more aggressive
internalization of variables when there is only a single def, it is in
IR, and not exported.
I've also added some additional testing via the new LTO API.
Differential Revision: https://reviews.llvm.org/D151965
With /winsysroot and without /machine, we don't know which paths to add to the search paths.
We do autodetect machine type and add winsysroot search paths in SymbolTable::addFile(), but that happens after all input files are opened. So in the loop where we read files, if we fail to open a file we can retry with the winsysroot search path potentially added by reading a previous file. This will fail if we try to open something in the winsysroot before reading a file that can give us the architecture, but shrug.
Fixes#54409
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D151815
Emit a 4-byte alignment after the .arm directive and a 2-byte alignment
after the .thumb directive. The new behavior matches GNU assembler.
Fixes#53386
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D147763
Enable support for CSPGO for lld MachO targets.
Since lld MachO does not support `-plugin-opt=`, we need to create the `--cs-profile-generate` and `--cs-profile-path=` options and propagate them in `Darwin.cpp`. These flags are not supported by ld64.
Also outline code into `getLastCSProfileGenerateArg()` to share between `CommonArgs.cpp` and `Darwin.cpp`.
CSPGO is already implemented for ELF (https://reviews.llvm.org/D56675) and COFF (https://reviews.llvm.org/D98763).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D151589
Add support for printing the passes run for LTO.
Both ELF and COFF have `--lto-debug-pass-manager` (`-ltodebugpassmanager`) to print the compiler passes run during LTO. This is useful to check that a certain compiler pass is run in a test, e.g., https://reviews.llvm.org/D151589
Reviewed By: #lld-macho, MaskRay, int3
Differential Revision: https://reviews.llvm.org/D151746
With --wrap=foo, we may have `d->file != file` for a defined symbol `foo`.
For the object file defining `foo`, its symbol table may not contain
`foo` after `redirectSymbols` changed the `foo` entry to `__wrap_foo` (see D50569).
Therefore, skipping `foo` with the condition `if (!d || d->file != file)` may
cause `__wrap_foo` not to be updated. See `ab.o w.o --wrap=foo` in the new test
(originally reported by D150220).
We could adjust the condition to `if (!d)`, but that would leave many `anchors`
entries if a symbol is referenced by many files. Switch to iterating over
`symtab` instead.
Note: D149735 (actually not NFC) allowed duplicate `anchors` entries and fixed
`a.o bw.o --wrap=foo`.
Reviewed By: jobnoorman
Differential Revision: https://reviews.llvm.org/D151768
Apple deprecated bitcode in the deployment process in Xcode 14.0. Last
month Apple started requiring Xcode 14.1+ to submit apps to the App
Store. Since there isn't a use for bundling bitcode outside of
submitting to the App Store we should be safe to delete this handling
entirely from LLD.
Differential Revision: https://reviews.llvm.org/D150697
This adds very minimal support for ARM64EC/ARM64X targets,
just enough for interesting test cases. Next patches in the
series extend llvm-objdump and llvm-readobj to provide
better tests. Those will also be useful for testing further
ARM64EC LLD support.
Differential Revision: https://reviews.llvm.org/D149086
This reverts part of commit 44363f2ff2736e4edf4a260f442b513ceac661fc.
Fixup for NO symbol table test has been reserved.
Reviewed By: wxiao3
Differential Revision: https://reviews.llvm.org/D151417
This reverts commit d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6.
Adds the patch by @hans from
https://github.com/llvm/llvm-project/issues/62719
This patch fixes the Windows build.
d763c6e5e2d0a6b34097aa7dabca31e9aff9b0b6 reverted the reviews
D144509 [CMake] Bumps minimum version to 3.20.0.
This partly undoes D137724.
This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193
Note this does not remove work-arounds for older CMake versions, that
will be done in followup patches.
D150532 [OpenMP] Compile assembly files as ASM, not C
Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent)
when compiling a file which has been set as having the language
C. This behaviour change only takes place if "cmake_minimum_required"
is set to 3.20 or newer, or if the policy CMP0119 is set to new.
Attempting to compile assembly files with "-x c" fails, however
this is workarounded in many cases, as OpenMP overrides this with
"-x assembler-with-cpp", however this is only added for non-Windows
targets.
Thus, after increasing cmake_minimum_required to 3.20, this breaks
compiling the GNU assembly for Windows targets; the GNU assembly is
used for ARM and AArch64 Windows targets when building with Clang.
This patch unbreaks that.
D150688 [cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump
The build uses other mechanism to select the runtime.
Fixes#62719
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D151344
The x86-64 medium code model utilizes large data sections, namely .lrodata,
.lbss, and .ldata (along with some variants of .ldata). There is a proposal to
extend the use of large data sections to the large code model as well[1].
This patch aims to place large data sections away from code sections in order to
alleviate relocation overflow pressure caused by code sections referencing
regular data sections.
```
.lrodata
.rodata
.text # if --ro-segment, MAXPAGESIZE alignment
RELRO # MAXPAGESIZE alignment
.data # MAXPAGESIZE alignment
.bss
.ldata # MAXPAGESIZE alignment
.lbss
```
In comparison to GNU ld, which places .lbss, .lrodata, and .ldata after .bss, we
place .lrodata above .rodata to minimize the number of permission transitions in
the memory image.
While GNU ld places .lbss after .bss, the subsequent sections don't reuse the
file offset bytes of BSS.
Our approach is to place .ldata and .lbss after .bss and create a PT_LOAD
segment for .bss to large data section transition in the absence of SECTIONS
commands. assignFileOffsets ensures we insert an alignment instead of allocating
space for BSS, and therefore we don't waste more than MAXPAGESIZE bytes. We have
a missing optimization to prevent all waste, but implementing it would introduce
complexity and likely be error-prone.
GNU ld's layout introduces 2 more MAXPAGESIZE alignments while ours
introduces just one.
[1]: https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU "Large data sections for the large code model"
With help from Arthur Eubanks.
Co-authored-by: James Y Knight <jyknight@google.com>
Reviewed By: aeubanks, tkoeppe
Differential Revision: https://reviews.llvm.org/D150510
This is an ongoing series of commits that are reformatting our
Python code. This catches the last of the python files to
reformat. Since they where so few I bunched them together.
Reformatting is done with `black`.
If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.
If you run into any problems, post to discourse about it and
we will try to help.
RFC Thread below:
https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Reviewed By: jhenderson, #libc, Mordante, sivachandra
Differential Revision: https://reviews.llvm.org/D150784
Much of NOLOAD's intended use is to explicitly change the type of an
output section, so we shouldn't flag these as warnings.
Differential Revision: https://reviews.llvm.org/D151144