193 Commits

Author SHA1 Message Date
Zachary Turner
bd5fe95fd2 Remove some dead code / includes.
I'm trying to get rid of the TypeDatabase class, so the first
step is to minimize its footprint.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305611 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-16 23:42:15 +00:00
Zachary Turner
7b1eb002e9 [llvm-pdbutil] Add support for dumping lines and inlinee lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305529 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 23:56:19 +00:00
Zachary Turner
7e5d31edd7 Resubmit "[llvm-pdbutil] rewrite the "raw" output style."
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.

It was broken due to some weird template issues, which have
since been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305517 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 22:24:24 +00:00
Zachary Turner
48370ee21f Revert "[llvm-pdbutil] rewrite the "raw" output style."
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.

This is failing due to some strange template problems, so reverting
until it can be straightened out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305505 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 20:55:51 +00:00
Zachary Turner
0f6dce0526 [llvm-pdbutil] rewrite the "raw" output style.
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.

One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.

This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.

See the tests for some examples of what the new output looks like.

Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:

The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.

Differential Revision: https://reviews.llvm.org/D34191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305495 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-15 19:34:41 +00:00
Zachary Turner
63d2fab3cf Resubmit "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This was originally reverted because of some non-deterministic
failures on certain buildbots.  Luckily ASAN eventually caught
this as a stack-use-after-scope, so the fix is included in
this patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305393 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-14 15:59:27 +00:00
Zachary Turner
76c210df33 Revert "[codeview] Make obj2yaml/yaml2obj support .debug$S..."
This is causing failures on linux bots with an invalid stream
read.  It doesn't repro in any configuration on Windows, so
reverting until I have a chance to investigate on Linux.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305371 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-14 06:24:24 +00:00
Zachary Turner
16e473d50a [codeview] Make obj2yaml/yaml2obj support .debug$S/T sections.
This allows us to use yaml2obj and obj2yaml to round-trip CodeView
symbol and type information without having to manually specify the bytes
of the section. This makes for much easier to maintain tests. See the
tests under lld/COFF in this patch for example. Before they just said
SectionData: <blob> whereas now we can use meaningful record
descriptions. Note that it still supports the SectionData yaml field,
which could be useful for initializing a section to invalid bytes for
testing, for example.

Differential Revision: https://reviews.llvm.org/D34127

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305366 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-14 05:31:00 +00:00
Sylvestre Ledru
b12244c42f Same expressions on both sides of the return
Summary:
I guess we want PointerToMemberFunction & PointerToDataMember


Fix coverity cid 1376038 


Reviewers: zturner

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34110

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305219 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-12 18:53:46 +00:00
Zachary Turner
bb67f7f534 [pdb] Support CoffSymbolRVA debug subsection.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305108 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 20:46:52 +00:00
Zachary Turner
4b665d2445 Allow VarStreamArray to use stateful extractors.
Previously extractors tried to be stateless with any additional
context information needed in order to parse items being passed
in via the extraction method.  This led to quite cumbersome
implementation challenges and awkwardness of use.  This patch
brings back support for stateful extractors, making the
implementation and usage simpler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305093 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 17:54:36 +00:00
Bob Haarman
67d04e8fdf [codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305043 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 01:18:10 +00:00
Zachary Turner
680d997aa7 [pdb] Don't crash on unknown debug subsections.
More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305041 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 00:53:59 +00:00
Zachary Turner
68ca30aa36 [CodeView] Support remaining debug subsection types
This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305037 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-09 00:28:08 +00:00
Zachary Turner
8f318ceb6f [llvm-pdbdump] Support native ordering of subsections in raw mode.
This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305034 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-08 23:49:01 +00:00
Zachary Turner
efcc38aeb3 [CodeView] Fix endianness bug.
We should be outputting in little endian, but we were writing
in host endianness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304741 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-05 22:12:23 +00:00
Zachary Turner
11d1678b59 [CodeView] Handle Cross Module Imports and Exports.
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.

This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304738 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-05 21:40:33 +00:00
Zachary Turner
42d60ef585 [CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way.  This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.

Differential Revision: https://reviews.llvm.org/D33807

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304588 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-02 19:49:14 +00:00
Zachary Turner
ce3608ab25 Fix 2 more -Wreorder warnings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304494 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-01 23:24:50 +00:00
Zachary Turner
6a330c6d5d [CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries.  Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.

Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.

Differential Revision: https://reviews.llvm.org/D33785

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304484 91177308-0d34-0410-b5e6-96231b3b80d8
2017-06-01 21:52:41 +00:00
Zachary Turner
ea64a9b812 [CodeView] Move CodeView YAML code to ObjectYAML.
This is the beginning of an effort to move the codeview yaml
reader / writer into ObjectYAML so that it can be shared.
Currently the only consumer / producer of CodeView YAML is
llvm-pdbdump, but CodeView can exist outside of PDB files, and
indeed is put into object files and passed to the linker to
produce PDB files.  Furthermore, there are subtle differences
in the types of records that show up in object file CodeView
vs PDB file CodeView, but they are otherwise 99% the same.

By having this code in ObjectYAML, we can have llvm-pdbdump
reuse this code, while teaching obj2yaml and yaml2obj to use
this syntax for dealing with object files that can contain
CodeView.

This patch only adds support for CodeView type information
to ObjectYAML.  Subsequent patches will add support for
CodeView symbol information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 21:53:05 +00:00
Zachary Turner
825457abaa [CodeView] Add more DebugSubsection implementations.
This adds implementations for Symbols and FrameData, and renames
the existing codeview::StringTable class to conform to the
DebugSectionStringTable convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304222 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 17:13:33 +00:00
Zachary Turner
4b1845a38a [CodeView] Rename ModuleDebugFragment -> DebugSubsection.
This is more concise, and matches the terminology used in other
parts of the codebase more closely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304218 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-30 16:36:15 +00:00
Zachary Turner
3e612404a5 Remove unused member.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303942 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 23:47:56 +00:00
Zachary Turner
f061f6ca68 [CV Type Merging] Find nested type indices faster.
Merging two type streams is one of the most time consuming
parts of generating a PDB, and as such it needs to be as
fast as possible.  The visitor abstractions used for interoperating
nicely with many different types of inputs and outputs have
been used widely and help greatly for testability and implementing
tools, but the abstractions build up and get in the way of
performance.

This patch removes all of the visitation stuff from the type
stream merger, essentially re-inventing the leaf / member switch
and loop, but at a very low level.  This allows us many other
optimizations, such as not actually deserializing *any* records
(even member records which don't describe their own length), as
the operation of "figure out how long this record is" is somewhat
faster than "figure out how long this record *and* get all its
fields out".  Furthermore, whereas before we had to deserialize,
re-write type indices, then re-serialize, now we don't have to
do any of those 3 steps.  We just find out where the type indices
are and pull them directly out of the byte stream and re-write
them.

This is worth a 50-60% performance increase.  On top of all other
optimizations that have been applied this week, I now get the
following numbers when linking lld.exe and lld.pdb

MSVC: 25.67s
Before This Patch: 18.59s
After This Patch: 8.92s

So this is a huge performance win.

Differential Revision: https://reviews.llvm.org/D33564

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303935 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 23:36:16 +00:00
Zachary Turner
522178bccc [CodeView Type Merging] Don't keep re-allocating temp serializer.
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.

This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.

Differential Revision: https://reviews.llvm.org/D33506

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303919 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 21:15:37 +00:00
Zachary Turner
3f3a4fb6ff [CodeView Type Merging] Avoid record deserialization when possible.
A profile shows the majority of time doing type merging is spent
deserializing records from sequences of bytes into friendly C++ structures
that we can easily access members of in order to find the type indices to
re-write.

Records are prefixed with their length, however, and most records have
type indices that appear at fixed offsets in the record. For these
records, we can save some cycles by just looking at the right place in the
byte sequence and re-writing the value, then skipping the record in the
type stream. This saves us from the costly deserialization of examining
every field, including potentially null terminated strings which are the
slowest, even though it was unnecessary to begin with.

In addition, we apply another optimization. Previously, after
deserializing a record and re-writing its type indices, we would
unconditionally re-serialize it in order to compute the hash of the
re-written record. This would result in an alloc and memcpy for every
record. If no type indices were re-written, however, this was an
unnecessary allocation. In this patch re-writing is made two phase. The
first phase discovers the indices that need to be rewritten and their new
values. This information is passed through to the de-duplication code,
which only copies and re-writes type indices in the serialized byte
sequence if at least one type index is different.

Some records have type indices which only appear after variable length
strings, or which have lists of type indices, or various other situations
that can make it tricky to make this optimization. While I'm not giving up
on optimizing these cases as well, for now we can get the easy cases out
of the way and lay the groundwork for more complicated cases later.

This patch yields another 50% speedup on top of the already large speedups
submitted over the past 2 days. In two tests I have run, I went from 9
seconds to 3 seconds, and from 16 seconds to 8 seconds.

Differential Revision: https://reviews.llvm.org/D33480

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303914 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-25 21:06:28 +00:00
Zachary Turner
d448c732cb Don't do a full scan of the type stream before processing records.
LazyRandomTypeCollection is designed for random access, and in
order to provide this it lazily indexes ranges of types.  In the
case of types from an object file, there is no partial index
to build off of, so it has to index the full stream up front.
However, merging types only requires sequential access, and when
that is needed, this extra work is simply wasted.  Changing the
algorithm to work on sequential arrays of types rather than
random access type collections eliminates this up front scan.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303707 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-24 00:26:27 +00:00
Zachary Turner
87fb2325ec [CodeView] Eliminate redundant hashes and allocations.
When writing field list records, we would construct a temporary
type serializer that shared a bump ptr allocator with the rest
of the application, so anything allocated from here would live
forever.  Furthermore, this temporary serializer had all the
properties of a full blown serializer including record hashing
and de-duplication.

These features are required when you're merging multiple type
streams into each other, because different streams may contain
identical records, but records from the same type stream will
never collide with each other.  So all of this hashing was
unnecessary.

To solve this, two fixes are made:

1) The temporary serializer keeps its own bump ptr allocator
instead of sharing a global one.  When it's finished, all of
its memory is freed.

2) Instead of using the same temporary serializer for the life
of an entire type stream, we use it only for the life of a single
field list record and delete it when the field list record is
completed.  This way the hash table will not grow as other
records from the same type stream get inserted.  Further improvements
could eliminate hashing entirely from this codepath.

This reduces the link time by 85% in my test, from 1 minute to 9
seconds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303676 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 18:56:23 +00:00
Reid Kleckner
2d1ebed099 Speculative build fix for non-Windows
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303667 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 18:28:13 +00:00
Reid Kleckner
9d053875b7 [PDB] Hash types up front when merging types instead of using StringMap
Summary:
First, StringMap uses llvm::HashString, which is only good for short
identifiers and really bad for large blobs of binary data like type
records. Moving to `DenseMap<StringRef, TypeIndex>` with some tricks for
memory allocation fixes that.

Unfortunately, that didn't buy very much performance. Profiling showed
that we spend a long time during DenseMap growth rehashing existing
entries. Also, in general, DenseMap is faster when the keys are small.
This change takes that to the logical conclusion by introducing a small
wrapper value type around a pointer to key data. The key data contains a
precomputed hash, the original record data (pointer and size), and the
type index, which is the "value" of our original map.

This reduces the time to produce llvm-as.exe and llvm-as.pdb from ~15s
on my machine to 3.5s, which is about a 4x improvement.

Reviewers: zturner, inglorion, ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303665 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 18:23:59 +00:00
Zachary Turner
e7ff77144c Revert "Make TypeSerializer's StringMap use the same allocator."
This reverts commit e34ccb7b57da25cc89ded913d8638a2906d1110a.

This is causing failures on the ASAN bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303640 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-23 15:50:37 +00:00
Zachary Turner
0897ebf658 Implement various flavors of type merging.
Previous algotirhm assumed that types and ids are in a single
unified stream.  For inputs that come from object files, this
is the case.  But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.

Differential Revision: https://reviews.llvm.org/D33417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303577 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-22 21:07:43 +00:00
Zachary Turner
1f0271a22b Make TypeSerializer's StringMap use the same allocator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303576 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-22 21:07:14 +00:00
Zachary Turner
d32a382ebb Resubmit "[CodeView] Provide a common interface for type collections."
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows.  After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling.  Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303446 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 19:26:58 +00:00
Zachary Turner
27f68cfeaf Revert "[CodeView] Provide a common interface for type collections."
This is a squash of ~5 reverts of, well, pretty much everything
I did today.  Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures.  I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!

At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303409 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 05:57:45 +00:00
Zachary Turner
95239b531c Don't crash if someone tries to visit an empty type stream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303408 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 05:18:09 +00:00
Zachary Turner
ac2e785519 [CodeView] Reduce memory usage in TypeSerializer.
We were using a BumpPtrAllocator to allocate stable storage for
a record, then trying to insert that into a hash table.  If a
collision occurred, the bytes were never inserted and the
allocation was unnecessary.  At the cost of an extra hash
computation, check first if it exists, and only if it does do
we allocate and insert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303407 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 04:56:48 +00:00
Zachary Turner
943db67405 Fix crasher in CodeView test.
Apparently this was always broken, but previously we were more
graceful about it and we would print "unknown udt" if we couldn't
find the type index, whereas now we just segfault because we
assume it's valid.  But this exposed a real bug, which is that
we weren't looking in the right place.  So fix that, and also
fix this crash at the same time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303397 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-19 00:56:39 +00:00
Zachary Turner
bde49e3081 Fix some build errors and warnings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303391 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-18 23:12:42 +00:00
Zachary Turner
e24978b754 [CodeView] Raise the source to ID map out of the TypeStreamMerger.
This map will be needed to rewrite symbol streams after re-writing
the corresponding type streams.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303390 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-18 23:04:08 +00:00
Zachary Turner
2a4f1171a7 [CodeView] Provide a common interface for type collections.
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).

But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.

This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.

This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.

The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.

Differential Revision: https://reviews.llvm.org/D33293

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303388 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-18 23:03:06 +00:00
Zachary Turner
c254cb777d [CodeView] Simplify the use of visiting type records & streams.
There is often a lot of boilerplate code required to visit a type
record or type stream.  The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine.  Currently this
requires at least 6 lines of code:

  codeview::TypeVisitorCallbackPipeline Pipeline;
  Pipeline.addCallbackToPipeline(Deserializer);
  Pipeline.addCallbackToPipeline(MyCallbacks);

  codeview::CVTypeVisitor Visitor(Pipeline);
  consumeError(Visitor.visitTypeRecord(Record));

With this patch, it becomes one line of code:

  consumeError(codeview::visitTypeRecord(Record, MyCallbacks));

This is done by having the deserialization happen internally inside
of the visitTypeRecord function.  Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.

Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.

Differential Revision: https://reviews.llvm.org/D33245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303271 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-17 16:39:06 +00:00
Zachary Turner
337b2d88de [CodeView] Add a random access type visitor.
This adds a visitor that is capable of accessing type
records randomly and caching intermediate results that it
learns about during partial linear scans.  This yields
amortized O(1) access to a type stream even though type
streams cannot normally be indexed.

Differential Revision: https://reviews.llvm.org/D33009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302936 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 19:18:12 +00:00
Aaron Ballman
9bd179193e Removing a file that is not necessary (and was causing link diagnostics with MSVC 2015); NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302531 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-09 14:22:48 +00:00
Zachary Turner
f019c6b219 [CodeView] Add support for random access type visitors.
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record.  This works fine for situations like dumping,
but not when you want to visit types in random order.  For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.

In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline.  This
patch adds such a mode.  In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.

Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.

Differential Revision: https://reviews.llvm.org/D32928

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302454 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-08 18:38:43 +00:00
Zachary Turner
4a6f9ee16e [CodeView] Reserve TypeDatabase records up front.
Most of the time we know exactly how many type records we
have in a list, and we want to use the visitor to deserialize
them into actual records in a database.  Previously we were
just using push_back() every time without reserving the space
up front in the vector.  This is obviously terrible from a
performance standpoint, and it's not uncommon to have PDB
files with half a million type records, where the performance
degredation was quite noticeable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 22:02:37 +00:00
Zachary Turner
2a1f47ab11 Remove unused private field.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302069 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 19:42:06 +00:00
Davide Italiano
b95636bcab [CodeView] Remove constructor initialization of a removed field.
I should've staged this with my last commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302059 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 18:02:46 +00:00
Zachary Turner
ff94599664 [CodeView] Use actual strings for dealing with checksums and lines.
The raw CodeView format references strings by "offsets", but it's
confusing what table the offset refers to.  In the case of line
number information, it's an offset into a buffer of records,
and an indirection is required to get another offset into a
different table to find the final string.  And in the case of
checksum information, there is no indirection, and the offset
refers directly to the location of the string in another buffer.

This would be less confusing if we always just referred to the
strings by their value, and have the library be smart enough
to correctly resolve the offsets on its own from the right
location.

This patch makes that possible.  When either reading or writing,
all the user deals with are strings, and the library does the
appropriate translations behind the scenes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302053 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-03 17:11:40 +00:00