208 Commits

Author SHA1 Message Date
Reid Kleckner
4a5ccd44d5 [CodeView] Dump BuildInfoSym and ProcSym type indices
I need to print the type index in hex so that I can match it in
FileCheck for a test I'm writing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308107 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-15 18:10:39 +00:00
Reid Kleckner
fe30dbf0ab [PDB] Fix type server handling for archives
Summary:
This fixes type indices for SDK or CRT static archives. Previously we'd
try to look next to the archive object file path, which would not exist
on the local machine.

Also error out if we can't resolve a type server record. Hypothetically
we can recover from this error by discarding debug info for this object,
but that is not yet implemented.

Reviewers: ruiu, amccarth

Subscribers: aprantl, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D35369

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307946 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-13 20:12:23 +00:00
Reid Kleckner
028eab103d [codeview] Change readobj symbol dumping format
Avoid duplicating DictScope with hand-written names everywhere.  Print
the S_-prefixed symbol kind for every record. This should make it easier
to search for certain kinds of records when debugging PDB linking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-11 23:41:41 +00:00
Reid Kleckner
c80f62248b [codeview] Fix type index discovery for four symbol records
I encountered these when linking LLD, which uses atls.lib. Those objects
appear to use these uncommon symbol records:

  0x115E S_HEAPALLOCSITE
  0x113D S_ENVBLOCK
  0x1113 S_GTHREAD32
  0x1153 S_FILESTATIC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307725 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-11 22:37:25 +00:00
Eugene Zelenko
14a125c6c6 [CodeView, PDB] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306911 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-30 23:06:03 +00:00
Zachary Turner
d36e155be1 [llvm-pdbutil] Output the symbol offset when dumping.
Type records have a unique type index, but symbol records do
not.  Instead, symbol records refer to other symbol records
by referencing their offset in the symbol stream.  In a sense
this is the analogue of the TypeIndex, but we are not printing
it in the dumper.  Printing it not only gives us more useful
information when manually investigating the contents of a PDB,
but also allows us to write better tests by enabling us to
verify that fields that reference other symbol records do
so correctly.

Differential Revision: https://reviews.llvm.org/D34906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306890 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-30 21:35:00 +00:00
Zachary Turner
f330d3e627 [llvm-pdbutil] Add the ability to dump the dependency tree for a type
Previously we had the -type-index option which would dump the record of
a single, but we had no way to follow the dependency graph backwards and
also dump all dependent types.

Having this option makes test-writing better, because we can limit the
test to only those records that are of importance for the thing we're
trying to test, which allows us to use things like CHECK-NEXT to reduce
fragility.

Differential Revision: https://reviews.llvm.org/D34899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-30 18:15:47 +00:00
Eugene Zelenko
bde81f144d [CodeView] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-29 00:05:44 +00:00
Zachary Turner
5d2c917523 [llvm-pdbutil] Dump raw bytes of type and id records.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306167 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-23 21:50:54 +00:00
Reid Kleckner
41428eb757 [PDB] Add symbols to the PDB
Summary:
The main complexity in adding symbol records is that we need to
"relocate" all the type indices. Type indices do not have anything like
relocations, an opaque data structure describing where to find existing
type indices for fixups. The linker just has to "know" where the type
references are in the symbol records. I added an overload of
`discoverTypeIndices` that works on symbol records, and it seems to be
able to link the standard library.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34432

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305933 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-21 17:25:56 +00:00
Reid Kleckner
d89466b34c [PDB] Start emitting source file and line information
Summary:
This is a first step towards getting line info to show up in VS and
windbg. So far, only llvm-pdbutil can parse the PDBs that we produce.
cvdump doesn't like something about our file checksum tables. I'll have
to dig into that next.

This patch adds a new DebugSubsectionRecordBuilder which takes bytes
directly from some other producer, such as a linker, and sticks it into
the PDB. Line tables only need to be relocated. No data needs to be
rewritten.

File checksums and string tables, on the other hand, need to be re-done.

Reviewers: zturner, ruiu

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D34257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305713 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-19 17:21:45 +00:00
Reid Kleckner
e075b10f6b [CodeView] Fix dumping of public symbol record flags
I noticed nonsensical type information while dumping PDBs produced by
MSVC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305708 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-19 16:54:51 +00:00
Zachary Turner
d8c55ee753 Delete TypeDatabase.
Merge the functionality into the random access type collection.
This class was only being used in 2 places, so getting rid of it
simplifies the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305653 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-18 20:52:45 +00:00
Zachary Turner
824edbed01 Don't crash if a type record can't be found.
This was a regression introduced in a previous patch.  Adding
back the code that handles this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305617 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-17 00:02:24 +00:00
Zachary Turner
2757ca62d7 [CodeView] Fix random access of type names.
Suppose we had a type index offsets array with a boundary at type index
N. Then you request the name of the type with index N+1, and that name
requires the name of index N-1 (think a parameter list, for example). We
didn't handle this, and we would print something like (<unknown UDT>,
<unknown UDT>).

The fix for this is not entirely trivial, and speaks to a larger
problem. I think we need to kill TypeDatabase, or at the very least kill
TypeDatabaseVisitor. We need a thing that doesn't do any caching
whatsoever, just given a type index it can compute the type name "the
slow way". The reason for the bug is that we don't have anything like
that. Everything goes through the type database, and if we've visited a
record, then we're "done". It doesn't know how to do the expensive thing
of re-visiting dependent records if they've not yet been visited.

What I've done here is more or less copied the code (albeit greatly
simplified) from TypeDatabaseVisitor, but wrapped it in an interface
that just returns a std::string. The logic of caching the name is now in
LazyRandomTypeCollection. Eventually I'd like to move the record
database here as well and the visited record bitfield here as well, at
which point we can actually just delete TypeDatabase. I don't see any
reason for it if a "sequential" collection is just a special case of a
random access collection with an empty partial offsets array.

Differential Revision: https://reviews.llvm.org/D34297

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305612 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-16 23:42:44 +00:00
Zachary Turner
bd5fe95fd2 Remove some dead code / includes.
I'm trying to get rid of the TypeDatabase class, so the first
step is to minimize its footprint.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305611 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-16 23:42:15 +00:00
Zachary Turner
7b1eb002e9 [llvm-pdbutil] Add support for dumping lines and inlinee lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305529 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 23:56:19 +00:00
Zachary Turner
7e5d31edd7 Resubmit "[llvm-pdbutil] rewrite the "raw" output style."
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.

It was broken due to some weird template issues, which have
since been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305517 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 22:24:24 +00:00
Zachary Turner
48370ee21f Revert "[llvm-pdbutil] rewrite the "raw" output style."
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.

This is failing due to some strange template problems, so reverting
until it can be straightened out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305505 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 20:55:51 +00:00
Zachary Turner
0f6dce0526 [llvm-pdbutil] rewrite the "raw" output style.
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.

One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.

This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.

See the tests for some examples of what the new output looks like.

Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:

The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.

Differential Revision: https://reviews.llvm.org/D34191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305495 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 19:34:41 +00:00
Zachary Turner
63d2fab3cf Resubmit "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This was originally reverted because of some non-deterministic
failures on certain buildbots.  Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305393 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-14 15:59:27 +00:00
Zachary Turner
76c210df33 Revert "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This is causing failures on linux bots with an invalid stream
read.  It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305371 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-14 06:24:24 +00:00
Zachary Turner
16e473d50a [codeview] Make obj2yaml/yaml2obj support .debug$S/T sections.
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.

Differential Revision: https://reviews.llvm.org/D34127

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305366 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-14 05:31:00 +00:00
Sylvestre Ledru
b12244c42f Same expressions on both sides of the return
Summary:
I guess we want PointerToMemberFunction & PointerToDataMember


Fix coverity cid 1376038 


Reviewers: zturner

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34110

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305219 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-12 18:53:46 +00:00
Zachary Turner
bb67f7f534 [pdb] Support CoffSymbolRVA debug subsection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305108 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 20:46:52 +00:00
Zachary Turner
4b665d2445 Allow VarStreamArray to use stateful extractors.
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method.  This led to quite cumbersome
implementation challenges and awkwardness of use.  This patch
brings back support for stateful extractors, making the
implementation and usage simpler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305093 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 17:54:36 +00:00
Bob Haarman
67d04e8fdf [codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305043 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 01:18:10 +00:00
Zachary Turner
680d997aa7 [pdb] Don't crash on unknown debug subsections.
More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305041 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 00:53:59 +00:00
Zachary Turner
68ca30aa36 [CodeView] Support remaining debug subsection types
This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305037 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 00:28:08 +00:00
Zachary Turner
8f318ceb6f [llvm-pdbdump] Support native ordering of subsections in raw mode.
This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305034 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-08 23:49:01 +00:00
Zachary Turner
efcc38aeb3 [CodeView] Fix endianness bug.
We should be outputting in little endian, but we were writing
in host endianness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304741 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-05 22:12:23 +00:00
Zachary Turner
11d1678b59 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304738 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-05 21:40:33 +00:00
Zachary Turner
42d60ef585 [CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way.  This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.

Differential Revision: https://reviews.llvm.org/D33807

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304588 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-02 19:49:14 +00:00
Zachary Turner
ce3608ab25 Fix 2 more -Wreorder warnings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304494 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-01 23:24:50 +00:00
Zachary Turner
6a330c6d5d [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304484 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-01 21:52:41 +00:00
Zachary Turner
ea64a9b812 [CodeView] Move CodeView YAML code to ObjectYAML.
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files.  Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.

By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.

This patch only adds support for CodeView type information
to ObjectYAML.  Subsequent patches will add support for
CodeView symbol information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 21:53:05 +00:00
Zachary Turner
825457abaa [CodeView] Add more DebugSubsection implementations.
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304222 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 17:13:33 +00:00
Zachary Turner
4b1845a38a [CodeView] Rename ModuleDebugFragment -> DebugSubsection.
This is more concise, and matches the terminology used in other
parts of the codebase more closely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304218 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 16:36:15 +00:00
Zachary Turner
3e612404a5 Remove unused member.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303942 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 23:47:56 +00:00
Zachary Turner
f061f6ca68 [CV Type Merging] Find nested type indices faster.
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible.  The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.

This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level.  This allows us many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out".  Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps.  We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.

This is worth a 50-60% performance increase.  On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb

MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s

So this is a huge performance win.

Differential Revision: https://reviews.llvm.org/D33564

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303935 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 23:36:16 +00:00
Zachary Turner
522178bccc [CodeView Type Merging] Don't keep re-allocating temp serializer.
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.

This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.

Differential Revision: https://reviews.llvm.org/D33506

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303919 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 21:15:37 +00:00
Zachary Turner
3f3a4fb6ff [CodeView Type Merging] Avoid record deserialization when possible.
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.

Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.

In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.

Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.

This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.

Differential Revision: https://reviews.llvm.org/D33480

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303914 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 21:06:28 +00:00
Zachary Turner
d448c732cb Don't do a full scan of the type stream before processing records.
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types.  In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted.  Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303707 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-24 00:26:27 +00:00
Zachary Turner
87fb2325ec [CodeView] Eliminate redundant hashes and allocations.
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever.  Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.

These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other.  So all of this hashing was
unnecessary.

To solve this, two fixes are made:

1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one.  When it's finished, all of
its memory is freed.

2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed.  This way the hash table will not grow as other
records from the same type stream get inserted.  Further improvements
could eliminate hashing entirely from this codepath.

This reduces the link time by 85% in my test, from 1 minute to 9
seconds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303676 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 18:56:23 +00:00
Reid Kleckner
2d1ebed099 Speculative build fix for non-Windows
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303667 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 18:28:13 +00:00
Reid Kleckner
9d053875b7 [PDB] Hash types up front when merging types instead of using StringMap
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.

Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.

This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.

Reviewers: zturner, inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303665 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 18:23:59 +00:00
Zachary Turner
e7ff77144c Revert "Make TypeSerializer's StringMap use the same allocator."
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.

This is causing failures on the ASAN bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303640 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 15:50:37 +00:00
Zachary Turner
0897ebf658 Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303577 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-22 21:07:43 +00:00
Zachary Turner
1f0271a22b Make TypeSerializer's StringMap use the same allocator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303576 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-22 21:07:14 +00:00
Zachary Turner
d32a382ebb Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303446 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 19:26:58 +00:00