llvm-capstone/lld/ELF/SymbolTable.h
Rui Ueyama bbf154cf9c Move symbol resolution code out of SymbolTable class.
This is the last patch of the series of patches to make it possible to
resolve symbols without asking SymbolTable to do so.

The main point of this patch is the introduction of
`elf::resolveSymbol(Symbol *Old, Symbol *New)`. That function resolves
or merges given symbols by examining symbol types and call
replaceSymbol (which memcpy's New to Old) if necessary.

With the new function, we have now separated symbol resolution from
symbol lookup. If you already have a Symbol pointer, you can directly
resolve the symbol without asking SymbolTable to do that.

Now that the nice abstraction become available, I can start working on
performance improvement of the linker. As a starter, I'm thinking of
making --{start,end}-lib faster.

--{start,end}-lib is currently unnecessarily slow because it looks up
the symbol table twice for each symbol.

 - The first hash table lookup/insertion occurs when we instantiate a
   LazyObject file to insert LazyObject symbols.

 - The second hash table lookup/insertion occurs when we create an
   ObjFile from LazyObject file. That overwrites LazyObject symbols
   with Defined symbols.

I think it is not too hard to see how we can now eliminate the second
hash table lookup. We can keep LazyObject symbols in Step 1, and then
call elf::resolveSymbol() to do Step 2.

Differential Revision: https://reviews.llvm.org/D61898

llvm-svn: 360975
2019-05-17 01:55:20 +00:00

109 lines
3.7 KiB
C++

//===- SymbolTable.h --------------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLD_ELF_SYMBOL_TABLE_H
#define LLD_ELF_SYMBOL_TABLE_H
#include "InputFiles.h"
#include "LTO.h"
#include "lld/Common/Strings.h"
#include "llvm/ADT/CachedHashString.h"
#include "llvm/ADT/DenseMap.h"
namespace lld {
namespace elf {
class CommonSymbol;
class Defined;
class LazyArchive;
class LazyObject;
class SectionBase;
class SharedSymbol;
class Undefined;
// SymbolTable is a bucket of all known symbols, including defined,
// undefined, or lazy symbols (the last one is symbols in archive
// files whose archive members are not yet loaded).
//
// We put all symbols of all files to a SymbolTable, and the
// SymbolTable selects the "best" symbols if there are name
// conflicts. For example, obviously, a defined symbol is better than
// an undefined symbol. Or, if there's a conflict between a lazy and a
// undefined, it'll read an archive member to read a real definition
// to replace the lazy symbol. The logic is implemented in the
// add*() functions, which are called by input files as they are parsed. There
// is one add* function per symbol type.
class SymbolTable {
public:
template <class ELFT> void addCombinedLTOObject();
void wrap(Symbol *Sym, Symbol *Real, Symbol *Wrap);
ArrayRef<Symbol *> getSymbols() const { return SymVector; }
Symbol *insert(StringRef Name);
Symbol *addSymbol(const Symbol &New);
void fetchLazy(Symbol *Sym);
void scanVersionScript();
Symbol *find(StringRef Name);
void trace(StringRef Name);
void handleDynamicList();
// Set of .so files to not link the same shared object file more than once.
llvm::DenseMap<StringRef, SharedFile *> SoNames;
private:
std::vector<Symbol *> findByVersion(SymbolVersion Ver);
std::vector<Symbol *> findAllByVersion(SymbolVersion Ver);
llvm::StringMap<std::vector<Symbol *>> &getDemangledSyms();
void handleAnonymousVersion();
void assignExactVersion(SymbolVersion Ver, uint16_t VersionId,
StringRef VersionName);
void assignWildcardVersion(SymbolVersion Ver, uint16_t VersionId);
// The order the global symbols are in is not defined. We can use an arbitrary
// order, but it has to be reproducible. That is true even when cross linking.
// The default hashing of StringRef produces different results on 32 and 64
// bit systems so we use a map to a vector. That is arbitrary, deterministic
// but a bit inefficient.
// FIXME: Experiment with passing in a custom hashing or sorting the symbols
// once symbol resolution is finished.
llvm::DenseMap<llvm::CachedHashStringRef, int> SymMap;
std::vector<Symbol *> SymVector;
// Comdat groups define "link once" sections. If two comdat groups have the
// same name, only one of them is linked, and the other is ignored. This set
// is used to uniquify them.
llvm::DenseSet<llvm::CachedHashStringRef> ComdatGroups;
// A map from demangled symbol names to their symbol objects.
// This mapping is 1:N because two symbols with different versions
// can have the same name. We use this map to handle "extern C++ {}"
// directive in version scripts.
llvm::Optional<llvm::StringMap<std::vector<Symbol *>>> DemangledSyms;
// For LTO.
std::unique_ptr<BitcodeCompiler> LTO;
};
extern SymbolTable *Symtab;
void mergeSymbolProperties(Symbol *Old, const Symbol &New);
void resolveSymbol(Symbol *Old, const Symbol &New);
} // namespace elf
} // namespace lld
#endif