This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.
Performance numbers:
old(s) new(s)
Without debug info:
chrome 7.178 6.432 (-11.5%)
LLVMgold.so 0.505 0.502 (-0.5%)
clang 0.954 0.827 (-15.4%)
llvm-as 0.052 0.045 (-15.5%)
With debug info:
scylla 5.695 5.613 (-1.5%)
clang 14.396 14.143 (-1.8%)
Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.
The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.
In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.
I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.
This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.
[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html
Differential Revision: http://reviews.llvm.org/D19752
llvm-svn: 268178
There are currently some bugs in tree around SCEV caching an incorrect
loop disposition. Printing out loop dispositions will let us write
whitebox tests as those are fixed.
The dispositions are printed as a list in "inside out" order,
i.e. innermost loop first.
llvm-svn: 268177
The aim of this patch is to make it easy to re-run the command without
updating paths in the command line. Here is a use case.
Assume that Alice is having an issue with lld and is reporting the issue
to developer Bob. Alice's current directly is /home/alice/work and her
command line is "ld.lld -o foo foo.o ../bar.o". She adds "--reproduce repro"
to the command line and re-run. Then the following text will be produced as
response.txt (notice that the paths are rewritten so that they are
relative to /home/alice/work/repro.)
-o home/alice/work/foo home/alice/work/foo.o home/alice/bar.o
The command also produces the following files by copying inputs.
/home/alice/repro/home/alice/work/foo.o
/home/alice/repro/home/alice/bar.o
Alice zips the directory and send it to Bob. Bob get an archive from Alice
and extract it to his home directory as /home/bob/repro. Now his directory
have the following files.
/home/bob/repro/response.txt
/home/bob/repro/home/alice/work/foo.o
/home/bob/repro/home/alice/bar.o
Bob then re-run the command with these files by the following commands.
cd /home/bob/repro
ld.lld @response.txt
This command will run the linker with the same command line options and
the same input files as Alice's, so it is very likely that Bob will see
the same issue as Alice saw.
Differential Revision: http://reviews.llvm.org/D19737
llvm-svn: 268169
Summary:
We need these variables to concatenate two absolute paths to construct
a valid path. Currently, %t\%t is, for example, expanded to C:\foo\C:\foo,
which is not a valid path because ":" is not a valid path character
on Windows. With this patch, %t will be expanded to C\foo.
Differential Revision: http://reviews.llvm.org/D19757
llvm-svn: 268168
This exposes the Clang API bindings clang_getChildDiagnostics (which returns a
CXDiagnosticSet) and clang_getNumDiagnosticsInSet / clang_getDiagnosticInSet (to
traverse the CXDiagnosticSet), and adds a helper children property in the Python
Diagnostic wrapper.
Also, this adds the missing OVERLOAD_CANDIDATE (700) cursor type.
Patch by Hanson Wang!
llvm-svn: 268167
SystemZ on Linux currently has 53-bit address space. In theory, the hardware
could support a full 64-bit address space, but that's not supported due to
kernel limitations (it'd require 5-level page tables), and there are no plans
for that. The default process layout stays within first 4TB of address space
(to avoid creating 4-level page tables), so any offset >= (1 << 42) is fine.
Let's use 1 << 52 here, ie. exactly half the address space.
I've originally used 7 << 50 (uses top 1/8th of the address space), but ASan
runtime assumes there's some space after the shadow area. While this is
fixable, it's simpler to avoid the issue entirely.
Also, I've originally wanted to have the shadow aligned to 1/8th the address
space, so that we can use OR like X86 to assemble the offset. I no longer
think it's a good idea, since using ADD enables us to load the constant just
once and use it with register + register indexed addressing.
Differential Revision: http://reviews.llvm.org/D19650
llvm-svn: 268161
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.
llvm-svn: 268158
In http://reviews.llvm.org/D19100, I introduced a bug: On OS X, existing programs rely on malloc_size() to detect whether a pointer comes from heap memory (malloc_size returns non-zero) or not. We have to distinguish between a zero-sized allocation (where we need to return 1 from malloc_size, due to other binary compatibility reasons, see http://reviews.llvm.org/D19100), and pointers that are not returned from malloc at all.
Differential Revision: http://reviews.llvm.org/D19653
llvm-svn: 268157
If, in between the splat and the load (which does an implicit splat), there is
a read of the splat register, then that register must have another earlier
definition. In that case, we can't replace the load's destination register with
the splat's destination register.
Unfortunately, I don't have a small or non-fragile test case.
llvm-svn: 268152
These would just crash at runtime.
If we ever decide to support rw text segments this should make it easier
to implement as there is now a single point where we notice the problem.
I have tested this with a freebsd buildworld. It found a non pic
assembly file being linked into a .so,. With that fixed, buildworld
finished.
llvm-svn: 268149
If a guard call being lowered by LowerGuardIntrinsics has the
`!make.implicit` metadata attached, then reattach the metadata to the
branch in the resulting expanded form of the intrinsic. This allows us
to implement null checks as guards and still get the benefit of implicit
null checks.
llvm-svn: 268148
support multiple induction variables
This patch enable loop reroll for the following case:
for(int i=0; i<N; i += 2) {
S += *a++;
S += *a++;
};
Differential Revision: http://reviews.llvm.org/D16550
llvm-svn: 268147
We currently don't do a good job of diagnosing inputs that would require
dynamic relocations to be applied to read only segments.
I am about to improve lld in that area, but unfortunately we developed
tests that depend on the current behavior.
To make clear what is actually changing, this first patch just updates
tests to not depend on the current behavior. In most cases this just
means using a rw section instead of a ro one, but that unfortunately
changes many addresses.
llvm-svn: 268145
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.
Reviewers: arsenm
Subscribers: qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18602
llvm-svn: 268143