293 Commits

Author SHA1 Message Date
Nico Weber
5976a3f5aa Fix a few typos in lld/ELF to cycle bots 2019-10-28 21:41:47 -04:00
Fangrui Song
e47bbd28f8 [ELF] Make MergeInputSection merging aware of output sections
Fixes PR38748

mergeSections() calls getOutputSectionName() to get output section
names. Two MergeInputSections may be merged even if SECTIONS commands
place them in different output sections.

This patch moves mergeSections() after processSectionCommands() and
addOrphanSections() to fix the issue. The new pass is renamed to
OutputSection::finalizeInputSections().

processSectionCommands() and addOrphanSections() are changed to add
sections to InputSectionDescription::sectionBases.

finalizeInputSections() merges MergeInputSections and migrates
`sectionBases` to `sections`.

For the -r case, we drop an optimization that tries keeping sh_entsize
non-zero. This is for the simplicity of addOrphanSections(). The
updated merge-entsize2.s reflects the change.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67504

llvm-svn: 372734
2019-09-24 11:48:31 +00:00
Fangrui Song
5d9f419a2e Revert "Revert r370635, it caused PR43241."
This reverts commit 50d2dca22b3b05d0ee4883b0cbf93d7d15f241fc.

llvm-svn: 371215
2019-09-06 15:57:24 +00:00
Nico Weber
8455294f2a Revert r370635, it caused PR43241.
llvm-svn: 371202
2019-09-06 13:23:42 +00:00
Fangrui Song
d8bc6a48ea [ELF] Do not ICF two sections with different output sections (by SECTIONS commands)
Fixes PR39418. Complements D47241 (the non-linker-script case).

processSectionCommands() assigns input sections to output sections.
ICF is called before it, so .text.foo and .text.bar may be folded even if
their output sections are made different by SECTIONS commands.

```
markLive<ELFT>()
doIcf<ELFT>()                      // During ICF, we don't know the output sections
writeResult()
  combineEhSections<ELFT>()
  script->processSectionCommands() // InputSection -> OutputSection assignment
```

This patch splits processSectionCommands() into processSectionCommands() and
processSymbolAssignments(), and moves processSectionCommands() before ICF:

```
markLive<ELFT>()
combineEhSections<ELFT>()
script->processSectionCommands()
doIcf<ELFT>()                      // should remove folded input sections
writeResult()
  script->processSymbolAssignments()
```

An alternative approach is to unfold a section `sec` in
processSectionCommands() when we find `sec` and `sec->repl` belong to
different output sections. I feel this patch is superior because it
can fold more sections and the decoupling of
SectionCommand/SymbolAssignment gives flexibility:

* An ExprValue can't be evaluated before its section is assigned to an
  output section -> we can delete getOutputSectionVA and simplify
  another place where we had to check if the output section is null.
  Moreover, a case in linkerscript/early-assign-symbol.s can be handled
  now.
* processSectionCommands/processSymbolAssignments can be freely moved
  around.
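
The constraint that the reordering makes possible can be sketched as
follows. This is an illustration under simplifying assumptions, not
lld's ICF: `Sec` and `canFold` are hypothetical, and a string stands in
for section bytes and relocations.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified input section (not lld's InputSection).
struct Sec {
  std::string contents; // stand-in for section bytes and relocations
  std::string outSec;   // output section assigned by the linker script
};

// Two sections are folding candidates only if their contents match and
// the script placed them in the same output section.
bool canFold(const Sec &a, const Sec &b) {
  return a.contents == b.contents && a.outSec == b.outSec;
}
```

With this check, identical .text.foo and .text.bar are folded only when
the SECTIONS commands route both to the same output section.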

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66717

llvm-svn: 370635
2019-09-02 10:33:58 +00:00
Fangrui Song
debcac9fef [ELF] Make LinkerScript::assignAddresses iterative
PR42990. For `SECTIONS { b = a; . = 0xff00 + (a >> 8); a = .; }`,
we currently set st_value(a)=0xff00 while st_value(b)=0xffff.

The following call tree demonstrates the problem:

```
link<ELF64LE>(Args);
  Script->declareSymbols(); // insert a and b as absolute Defined
  Writer<ELFT>().run();
    Script->processSectionCommands();
      addSymbol(cmd);       // a and b are re-inserted. LinkerScript::getSymbolValue
                            // is lazily called by subsequent evaluation
    finalizeSections();
      forEachRelSec(scanRelocations<ELFT>);
        processRelocAux     // another problem PR42506, not affected by this patch
      finalizeAddressDependentContent(); // loop executed once
        script->assignAddresses(); // a = 0, b = 0xff00
    script->assignAddresses(); // a = 0xff00, _end = 0xffff
```

We need another assignAddresses() to finalize the value of `a`.

This patch

1) modifies assignAddresses() to track the original section/value of each
  symbol and return a symbol whose section/value has changed.
2) moves the post-finalizeSections assignAddresses() inside the loop
  of finalizeAddressDependentContent() and makes it iterative.
  Symbol assignment may not converge, so we make a few attempts before
  bailing out.

Note, assignAddresses() must be called at least twice. The penultimate
call finalizes section addresses while the last finalizes symbol values.
This is somewhat obscure and there was no comment.
linkerscript/addr-zero.test tests this.
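
The iteration can be sketched as a fixed-point loop over the script
from the first paragraph. This is a simplified model, not the patch
itself: `Symbols`, `assignOnce` and `assignUntilStable` are hypothetical
names, and the sketch compares whole symbol values instead of tracking
a changed symbol.

```cpp
#include <cassert>
#include <cstdint>

struct Symbols {
  uint64_t a = 0;
  uint64_t b = 0;
};

// One evaluation pass over `b = a; . = 0xff00 + (a >> 8); a = .;`.
// Returns true if any symbol value changed during the pass.
bool assignOnce(Symbols &s) {
  Symbols old = s;
  s.b = s.a;
  uint64_t dot = 0xff00 + (s.a >> 8);
  s.a = dot;
  return s.a != old.a || s.b != old.b;
}

// Repeat passes until nothing changes, giving up after maxPasses.
// Returns true if the values converged.
bool assignUntilStable(Symbols &s, int maxPasses) {
  assert(maxPasses > 0);
  for (int i = 0; i < maxPasses; ++i)
    if (!assignOnce(s))
      return true;
  return false;
}
```

In this model a single pass leaves `a` at 0xff00, and further passes
are needed before `a` and `b` both settle at 0xffff.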

Reviewed By: ruiu

Differential Revision: https://reviews.llvm.org/D66279

llvm-svn: 369889
2019-08-26 10:23:31 +00:00
Rui Ueyama
3837f4273f [Coding style change] Rename variables so that they start with a lowercase letter
This patch is mechanically generated by the clang-llvm-rename tool that I wrote
using Clang Refactoring Engine just for creating this patch. You can see the
source code of the tool at https://reviews.llvm.org/D64123. There's no manual
post-processing; you can generate the same patch by re-running the tool against
lld's code base.

Here is the main discussion thread to change the LLVM coding style:
https://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html
In the discussion thread, I proposed we use lld as a testbed for variable
naming scheme change, and this patch does that.

I chose to rename variables so that they are in camelCase, just because that
is a minimal change to make variables start with a lowercase letter.

Note to downstream patch maintainers: if you are maintaining a downstream lld
repo, just rebasing ahead of this commit would cause massive merge conflicts
because this patch essentially changes every line in the lld subdirectory. But
there's a remedy.

clang-llvm-rename tool is a batch tool, so you can rename variables in your
downstream repo with the tool. Given that, here is how to rebase your repo to
a commit after the mass renaming:

1. rebase to the commit just before the mass variable renaming,
2. apply the tool to your downstream repo to mass-rename variables locally, and
3. rebase again to the head.

Most changes made by the tool should be identical for a downstream repo and
for the head, so at step 3, almost all changes should be merged and
disappear. I'd expect that there would be some lines that you need to merge by
hand, but there shouldn't be too many.

Differential Revision: https://reviews.llvm.org/D64121

llvm-svn: 365595
2019-07-10 05:00:37 +00:00
Fangrui Song
7f1ff68a16 [ELF] Deleted unused forward declarations. NFC
llvm-svn: 361614
2019-05-24 09:25:47 +00:00
Rui Ueyama
68b9f45fee Replace typedef A B with using B = A. NFC.
I did this using Perl.

Differential Revision: https://reviews.llvm.org/D60003

llvm-svn: 357372
2019-04-01 00:11:24 +00:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
George Rimar
f49fe218c2 [LLD][ELF] - Linker script: accept using a file name without a list of sections.
This is a part of
https://bugs.llvm.org/show_bug.cgi?id=39885

Linker script specification says:
"You can specify a file name to include sections from a particular file. You would
do this if one or more of your files contain special data that needs to be at a
particular location in memory."

LLD did not accept this syntax. The patch implements it.

Differential revision: https://reviews.llvm.org/D55324

llvm-svn: 348463
2018-12-06 08:34:52 +00:00
Rui Ueyama
02c7fae348 Move forward declarations to the top of the file and sort.
llvm-svn: 345094
2018-10-23 22:37:14 +00:00
George Rimar
d30a78b3fe [ELF] - Eliminate the AssertCommand.
Currently, LLD supports ASSERT as a separate command.

We support two forms now.

(1) Assignment-expression form: . = ASSERT(0x100)
(old GNU ld required it, and some scripts in the wild still use
something like . = ASSERT((_end - _text <= (512 * 1024 * 1024)), "kernel image bigger than KERNEL_IMAGE_SIZE");)

(2) Command-like form. Nowadays (1) is no longer mandatory and the command-like form is commonly used:
ASSERT(<expr>, "text");

The return value of ASSERT is Dot. That was implemented in D30171.
So (2) looks like just a short version of (1).

GNU ld does *not* list ASSERT as a SECTIONS command:
https://sourceware.org/binutils/docs/ld/SECTIONS.html#SECTIONS

Given the above, we can probably change ASSERT to be an assignment to Dot.
That makes the rest of the code much simpler. This patch does that.
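
A minimal sketch of the idea, assuming a toy expression model rather
than lld's: `Expr`, `dot`, `makeAssert` and `assignDot` are hypothetical
names, and an exception stands in for the linker's error reporting.
Because ASSERT evaluates to the current location counter, both forms
reduce to assigning that value back to Dot.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <string>

using Expr = std::function<uint64_t()>;

uint64_t dot = 0; // hypothetical location counter

// Build an expression that checks `cond` and evaluates to the current dot.
Expr makeAssert(std::function<bool()> cond, std::string msg) {
  return [cond, msg]() -> uint64_t {
    if (!cond())
      throw std::runtime_error(msg);
    return dot;
  };
}

// Both `. = ASSERT(...)` and a bare `ASSERT(...)` become an assignment
// to dot, which leaves dot unchanged when the assertion holds.
void assignDot(const Expr &e) { dot = e(); }
```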

Differential revision: https://reviews.llvm.org/D45434

llvm-svn: 330814
2018-04-25 11:16:31 +00:00
George Rimar
e88b76a989 [ELF] - Reveal more information in -Map file about assignments.
Currently, LLD prints symbol assignment commands to the map file,
but it does not do that for assignments that are outside of the section
descriptions. Such assignments can affect the layout though.

The patch implements the following:

* Teaches LLD to print symbol assignments outside of section declarations.
* Teaches LLD to print PROVIDE/HIDDEN/PROVIDE_HIDDEN commands.

If a symbol is not provided, nothing is printed.

Differential revision: https://reviews.llvm.org/D44894

llvm-svn: 329272
2018-04-05 11:25:58 +00:00
George Rimar
4d2740c6ed [ELF] - Cleanup. NFCI.
Rename field, added comments.

This was split out of D44894.
Requested to be committed as an independent cleanup.

llvm-svn: 329162
2018-04-04 09:39:05 +00:00
George Rimar
a6ce78ece1 This is PR36799.
Currently, we might have a bug with scripts like the one below:

.foo : ALIGN(8)
{
  *(.foo)
} > ram

because we do not expand the memory region when doing ALIGN.

This might result in file range overlaps. The patch fixes the issue.

Differential revision: https://reviews.llvm.org/D44730

llvm-svn: 328479
2018-03-26 08:58:16 +00:00
George Rimar
211e94d666 [ELF] - Fix build bot after rL327612.
Missed this one.

llvm-svn: 327616
2018-03-15 09:40:25 +00:00
George Rimar
61a1f50b39 [ELF] - Fix build bot after rL327612.
Error was: 
error: field 'Size' will be initialized after field 'CommandString' [-Werror,-Wreorder]

llvm-svn: 327613
2018-03-15 09:24:51 +00:00
George Rimar
84bcabcb86 [ELF] - Show data and assignment commands in the map file.
Patch teaches LLD to print BYTE/SHORT/LONG/QUAD and
location move commands to the map file.

Differential revision: https://reviews.llvm.org/D44004

llvm-svn: 327612
2018-03-15 09:16:40 +00:00
George Rimar
796684b451 [ELF] - Implement INSERT BEFORE.
This finishes PR35877.

INSERT BEFORE is used similarly to INSERT AFTER:
it inserts sections before the given target section.

Differential revision: https://reviews.llvm.org/D44380

llvm-svn: 327378
2018-03-13 09:18:11 +00:00
George Rimar
9e2c8a9db1 [ELF] - Support "INSERT AFTER" statement.
This implements INSERT AFTER in the following way:

While reading scripts, it collects all insert statements.
After all files have been read, it inserts the statements into the script commands list.

With that:
* The rest of the code knows nothing about INSERT.
* The approach is straightforward and has no visible limitations.
* It is also easy to support INSERT BEFORE (it was seen in clang code once).
* It should work for PR35877 and similar cases.

Cons:
* It assumes we have a "main" script that describes sections.
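
The collect-then-splice approach can be sketched like this. It is an
illustration only: `InsertAfter` and `applyInserts` are hypothetical
names, and plain strings stand in for output section descriptions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// A pending `INSERT AFTER target` statement collected while parsing.
struct InsertAfter {
  std::string section; // output section description to insert
  std::string target;  // name of the section it must follow
};

// Splice every collected statement into the main command list.
// Statements whose target is missing are skipped in this sketch.
void applyInserts(std::vector<std::string> &commands,
                  const std::vector<InsertAfter> &pending) {
  for (const InsertAfter &ins : pending) {
    auto it = std::find(commands.begin(), commands.end(), ins.target);
    if (it != commands.end())
      commands.insert(it + 1, ins.section);
  }
}
```

After the splice, later passes see an ordinary command list, which is
why they need no knowledge of INSERT.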

Differential revision: https://reviews.llvm.org/D43468

llvm-svn: 327003
2018-03-08 14:54:38 +00:00
George Rimar
162d436c8e [ELF] - Support moving location counter when MEMORY is used.
We do not expand the memory region correctly for scripts like the following:

.foo.1 :
 {
   *(.foo.1)
   . += 0x1000;
 } > ram

The patch generalizes the expansion of output sections and memory
regions into one place and fixes the issue.
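
A minimal sketch of doing the expansion in one place, assuming
simplified types that are not lld's: `MemoryRegion`, `OutputSec` and
`advance` are hypothetical here. Every move of the location counter
goes through a single helper that grows the output section and the
region cursor together.

```cpp
#include <cassert>
#include <cstdint>

struct MemoryRegion {
  uint64_t origin;
  uint64_t length;
  uint64_t curPos; // next free address inside the region
};

struct OutputSec {
  uint64_t addr;
  uint64_t size;
};

// Advance dot by `delta`, growing the output section and the region
// cursor together. Returns false if the region is overflowed.
bool advance(uint64_t &dot, uint64_t delta, OutputSec &sec,
             MemoryRegion &region) {
  dot += delta;
  sec.size = dot - sec.addr;
  region.curPos = dot;
  return region.curPos - region.origin <= region.length;
}
```

The same helper would serve both an input section of some size and an
explicit `. += 0x1000;`, so neither path can forget the region.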

Differential revision: https://reviews.llvm.org/D43999

llvm-svn: 326688
2018-03-05 10:54:03 +00:00
Rui Ueyama
ee17371897 Merge {COFF,ELF}/Strings.cpp to Common/Strings.cpp.
This should resolve the issue that the lld build fails on some hosts
that use a case-insensitive file system.

Differential Revision: https://reviews.llvm.org/D43788

llvm-svn: 326339
2018-02-28 17:38:19 +00:00
George Rimar
db1a062447 [ELF] - Do not remove empty output sections that are explicitly assigned to phdr in script.
This continues the direction started in D43069.

We can keep sections that are explicitly assigned to a segment in the script.
It helps to simplify the code.

Differential revision: https://reviews.llvm.org/D43571

llvm-svn: 325887
2018-02-23 10:53:04 +00:00
Rafael Espindola
c9265e81f4 Run dos2unix in a few files. NFC.
llvm-svn: 323793
2018-01-30 17:24:28 +00:00
Rafael Espindola
22d533568b Sort orphan section if --symbol-ordering-file is given.
Before this patch orphan sections were not sorted.

llvm-svn: 323779
2018-01-30 16:20:08 +00:00
George Rimar
c4ccfb5d93 [ELF] - Define linkerscript symbols early.
Currently, symbols assigned or created by a linker script are not processed early
enough. As a result, it is not possible to version them or assign any other flags/properties.

The patch creates Defined symbols for -defsym and linker script symbols early,
so that the issue above can be addressed.

It is based on Rafael Espindola's version of D38239 patch.

Fixes PR34121.

Differential revision: https://reviews.llvm.org/D41987

llvm-svn: 323729
2018-01-30 09:04:27 +00:00
Rafael Espindola
db9dd5b43e Improve LMARegion handling.
This fixes the crash reported at PR36083.

The issue is that we were trying to put all the sections in the same
PT_LOAD and crashing trying to write past the end of the file.

This also adds accounting for used space in LMARegion. Without it, all
3 PT_LOADs would have the same physical address.

llvm-svn: 323449
2018-01-25 17:42:03 +00:00
Rafael Espindola
667ffcf153 Simplify. NFC.
llvm-svn: 323440
2018-01-25 16:43:49 +00:00
Rafael Espindola
490f0a4da9 Remove MemRegionOffset. NFC.
We can just use a member variable in MemoryRegion.

llvm-svn: 323399
2018-01-25 02:18:00 +00:00
Rafael Espindola
09b53f6fd8 Delete dead code. NFC.
llvm-svn: 319274
2017-11-29 01:55:03 +00:00
Peter Collingbourne
e9a9e0a1e7 ELF: Merge DefinedRegular and Defined.
Now that DefinedRegular is the only remaining derived class of
Defined, we can merge the two classes.

Differential Revision: https://reviews.llvm.org/D39667

llvm-svn: 317448
2017-11-06 04:35:31 +00:00
Rui Ueyama
aa8523e4b6 Move OutputSectionFactory to LinkerScript.cpp. NFC.
That class is used only by LinkerScript.cpp, so we should move it to
that file. Also, it no longer has to be a "factory" class. It can just
be a non-member function.

llvm-svn: 317427
2017-11-04 23:54:25 +00:00
Rui Ueyama
f52496e1e0 Rename SymbolBody -> Symbol
Now that SymbolBody is the only symbol class, "SymbolBody" is a bit
of a strange name. This is a mechanical change generated by

  perl -i -pe s/SymbolBody/Symbol/g $(git grep -l SymbolBody lld/ELF lld/COFF)

and clang-format-diff.

Differential Revision: https://reviews.llvm.org/D39459

llvm-svn: 317370
2017-11-03 21:21:47 +00:00
George Rimar
31b6b0a820 [ELF] - Fix compilation after r317307.
Not sure why this did not break any llvm bots or
my local Windows build, but it was required to fix a compilation
breakage in my Ubuntu build when using
gcc version 8.0.0 20171019 (experimental)

llvm-svn: 317317
2017-11-03 11:57:01 +00:00
George Rimar
8c825db25e [ELF] - Linkerscript: fixed non-determinism when handling MEMORY.
When findMemoryRegion searches for a region for an output section, it
iterates over MemoryRegions, which is a DenseMap and so does not
guarantee iteration in insertion order. As a result, the selected region
depends on its name and not on its definition position.
The test case shows the issue; the patch fixes it. Behavior after applying the patch
seems consistent with bfd.
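
The effect of an order-preserving container can be sketched as below.
This is a simplified illustration, not the patch: `Region` and
`findRegion` are hypothetical, a std::vector stands in for whatever
ordered container is used, and the flag test is a simplification of
the real attribute matching.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct Region {
  std::string name;
  uint64_t flags;
};

// Return the first region, in definition order, whose flags cover
// `secFlags`, or nullptr if none matches. Because the vector keeps
// insertion order, the result does not depend on region names.
const Region *findRegion(const std::vector<Region> &regions,
                         uint64_t secFlags) {
  for (const Region &r : regions)
    if ((r.flags & secFlags) == secFlags)
      return &r;
  return nullptr;
}
```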

Differential revision: https://reviews.llvm.org/D39544

llvm-svn: 317307
2017-11-03 08:21:51 +00:00
George Rimar
f9b04fd91f [ELF] - Simplify output section creation.
When no SECTIONS command is given, all sections are
technically orphans, but we currently handle script orphan sections
and regular "orphan" sections for the non-scripted case differently,
though we can handle them in one place.

The patch makes that change.

Differential revision: https://reviews.llvm.org/D39045

llvm-svn: 316984
2017-10-31 10:31:58 +00:00
Peter Smith
6c9df3fce5 [ELF] Add support for multiple passes to createThunks()
This change allows Thunks to be added on multiple passes. To do this we must
merge only the thunks added in each pass, and deal with thunks that have
drifted out of range of their callers.

A thunk may end up out of range of its caller if enough thunks are added
between the caller and the thunk. To handle this we create another thunk.

Differential Revision: https://reviews.llvm.org/D34692

llvm-svn: 316754
2017-10-27 09:07:10 +00:00
Peter Smith
4a8e11595c [ELF] Record created ThunkSections in InputSectionDescription [NFC].
Instead of maintaining a map of the std::vector to ThunkSections, record the
ThunkSections directly in InputSectionDescription.

Differential Revision: https://reviews.llvm.org/D37743

llvm-svn: 316750
2017-10-27 08:56:20 +00:00
Rafael Espindola
089dac7be1 Make Ctx a plain pointer again.
If a struct has a std::unique_ptr member, the logical interpretation
is that that member will be destroyed with the struct.

That is not the case for Ctx. It has to be deleted earlier and its
lifetime is defined by the functions where the AddressState is
created.
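
The lifetime rule can be sketched as follows. This is a hypothetical
model, not the code in this commit: `Script`, `assign` and `bump` are
made-up names, and the state lives on the stack here purely to make
the scoping visible.

```cpp
#include <cassert>
#include <cstdint>

struct AddressState {
  uint64_t dot = 0;
};

struct Script {
  AddressState *ctx = nullptr; // non-owning; valid only during assign()

  uint64_t assign(uint64_t start) {
    AddressState state; // owned by this function, not by Script
    state.dot = start;
    ctx = &state;
    bump(0x10);
    uint64_t result = ctx->dot;
    ctx = nullptr; // state is about to go out of scope
    return result;
  }

  void bump(uint64_t n) {
    assert(ctx && "bump() called outside assign()");
    ctx->dot += n;
  }
};
```

A plain pointer signals that the struct merely observes the state,
whereas a std::unique_ptr member would suggest the state lives as long
as the struct does.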

llvm-svn: 316378
2017-10-23 21:12:19 +00:00
Rafael Espindola
849d499e6d Don't call buildSectionOrder multiple times.
This takes linking the linux kernel from 1.52s to 0.58s.

llvm-svn: 316251
2017-10-21 00:05:01 +00:00
Rafael Espindola
3558e24ae8 Remove unused argument.
llvm-svn: 316248
2017-10-20 23:28:19 +00:00
George Rimar
ee7c99b63a [ELF] - Make LinkerScript::assignOffsets private. NFC.
llvm-svn: 316073
2017-10-18 12:09:41 +00:00
Rui Ueyama
7ad1e3102a Split LinkerScript::computeInputSections into two functions.
llvm-svn: 315434
2017-10-11 04:50:30 +00:00
Rui Ueyama
722221f5a7 Swap parameters of getSymbolValue.
Usually, a function that does symbol lookup takes symbol name as
its first argument. Also, if a function takes a source location hint,
it is usually the last parameter. So the previous parameter order
was counter-intuitive.

llvm-svn: 315433
2017-10-11 04:34:34 +00:00
Rui Ueyama
f0403c601a Rename BytesDataCommand -> ByteCommand.
llvm-svn: 315431
2017-10-11 04:22:09 +00:00
Rui Ueyama
0543343170 Use more precise type.
llvm-svn: 315426
2017-10-11 04:01:13 +00:00
Rui Ueyama
29b240c671 Rename CurAddressState -> Ctx.
We used CurAddressState to capture a dynamic context just like
we use lambdas to capture static contexts. So, CurAddressState
is used everywhere in LinkerScript.cpp. It is worth a shorter
name.

llvm-svn: 315418
2017-10-11 02:45:54 +00:00
Rui Ueyama
183aa2731e Make LinkerScript::addSymbol a private member function.
llvm-svn: 315416
2017-10-11 02:28:39 +00:00
Rui Ueyama
5908c2f877 Rename processCommands -> processSectionCommands.
llvm-svn: 315415
2017-10-11 02:28:28 +00:00