140 Commits

Author SHA1 Message Date
Zachary Turner
61e0e2783c [llvm-pdbdump] Dump MSF headers to YAML.
This is the simplest possible patch to get some kind of YAML
output.  All it dumps is the MSF header fields so that in
theory an empty MSF file could be reconstructed.
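
A minimal sketch of what dumping the MSF header fields can look like. The 32-byte magic, the six little-endian 32-bit fields, the field names, and the YAML key layout are assumptions made for illustration; they are not taken from this patch or from llvm-pdbdump's actual output.

```python
import struct

# Assumed superblock layout: a 32-byte magic followed by six little-endian u32 fields.
MSF_MAGIC = b"Microsoft C/C++ MSF 7.00\r\n\x1aDS\x00\x00\x00"
FIELDS = ("BlockSize", "FreeBlockMapBlock", "NumBlocks",
          "NumDirectoryBytes", "Unknown1", "BlockMapAddr")


def parse_msf_header(data: bytes) -> dict:
    """Return the header fields as a dict, or raise ValueError on bad input."""
    header_size = len(MSF_MAGIC) + 4 * len(FIELDS)
    if len(data) < header_size:
        raise ValueError("file too small for an MSF header")
    if data[:len(MSF_MAGIC)] != MSF_MAGIC:
        raise ValueError("bad MSF magic")
    values = struct.unpack_from("<6I", data, len(MSF_MAGIC))
    return dict(zip(FIELDS, values))


def to_yaml(header: dict) -> str:
    """Render the fields as a small YAML mapping (hypothetical key layout)."""
    lines = ["MSF:"]
    lines += [f"  {name}: {value}" for name, value in header.items()]
    return "\n".join(lines)


# Synthetic header, for illustration only.
sample = MSF_MAGIC + struct.pack("<6I", 4096, 2, 25, 136, 0, 24)
print(to_yaml(parse_msf_header(sample)))
```

Because every field is a plain integer, the same mapping could be read back and written out again to rebuild an empty file, which is the round trip the message has in mind.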

Reviewed By: ruiu, majnemer
Differential Revision: http://reviews.llvm.org/D20971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271939 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-06 20:37:05 +00:00
Rui Ueyama
24ef682bfb [pdbdump] Print out New FPO stream contents.
The data structure in the new FPO stream is described in the
PE/COFF spec. There is one record per function if the frame pointer
is omitted.

Differential Revision: http://reviews.llvm.org/D20999

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271926 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-06 18:39:21 +00:00
Rui Ueyama
e5f15a26d7 pdbdump: print out TPI hashes.
Differential Revision: http://reviews.llvm.org/D20945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271736 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 20:48:51 +00:00
Zachary Turner
041e9f2c56 [llvm-pdbdump] Introduce an abstraction for the output style.
This opens the door to introducing a YAML outputter which can be
used for machine consumption.  Currently the yaml output style
is unimplemented and returns an error if you try to use it.

Reviewed By: rnk, ruiu
Differential Revision: http://reviews.llvm.org/D20967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271712 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 19:28:33 +00:00
Zachary Turner
f7d0c1f9cf [pdb] Print out file names instead of file offsets.
When printing line information and file checksums, we were printing
the file offset field from the struct header.  This teaches
llvm-pdbdump how to turn those numbers into the filename.  In the
case of file checksums, this is done by looking in the global
string table.  In the case of line contributions, this is done
by indexing into the file names buffer of the DBI stream.  Why
they use a different technique I don't know.
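
Both lookups come down to reading a NUL-terminated string at a byte offset inside a names buffer. A minimal sketch, assuming the buffer is a flat run of NUL-terminated strings; the helper name `string_at` and the sample paths are hypothetical.

```python
def string_at(buffer: bytes, offset: int) -> str:
    """Return the NUL-terminated string that starts at `offset`."""
    if not 0 <= offset < len(buffer):
        raise ValueError(f"offset {offset} is outside the names buffer")
    end = buffer.find(b"\x00", offset)
    if end == -1:
        raise ValueError("string is not NUL-terminated")
    return buffer[offset:end].decode("utf-8", errors="replace")


# Synthetic names buffer: an empty string at offset 0, then two file names.
names = b"\x00c:\\src\\a.cpp\x00c:\\src\\b.h\x00"
for offset in (1, 14):
    print(offset, string_at(names, offset))
```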

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271630 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 05:52:57 +00:00
Zachary Turner
c5689fd37c [pdb] Dump file checksums from pdb codeview line info.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271622 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 04:01:48 +00:00
Zachary Turner
7e17f48869 [codeview] Dump line number and column information.
To facilitate this, a couple of changes had to be made:

1. `ModuleSubstream` got moved from `DebugInfo/PDB` to
`DebugInfo/CodeView`, and various codeview related types are defined
there.  It turns out `DebugInfo/CodeView/Line.h` already defines many of
these structures, but this is really old code that is not endian aware,
doesn't interact well with `StreamInterface`, and is not very helpful for
getting stuff out of a PDB.  Eventually we should migrate the old readobj
`COFFDumper` code to these new structures, or at least merge their
functionality somehow.

2. A `ModuleSubstream` visitor is introduced.  Depending on where your
module substream array comes from, different subsets of record types can
be expected.  We are already hand parsing these substream arrays in many
places especially in `COFFDumper.cpp`.  In the future we can migrate these
paths to the visitor as well, which should reduce a lot of code in
`COFFDumper.cpp`.
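
The visitor idea in item 2 can be sketched as follows. This is an illustration, not the code from the patch: it assumes each substream record is a 32-bit kind, a 32-bit payload length, and then the payload, and it ignores any alignment padding between records. The class and function names are hypothetical.

```python
import struct

# Assumed substream kinds (values as used by CodeView debug subsections).
LINES = 0xF2
FILE_CHECKSUMS = 0xF4


class SubstreamVisitor:
    """Override only the callbacks for the kinds you expect to see."""

    def visit_lines(self, data: bytes) -> None:
        pass

    def visit_file_checksums(self, data: bytes) -> None:
        pass

    def visit_unknown(self, kind: int, data: bytes) -> None:
        pass


def visit_substreams(buffer: bytes, visitor: SubstreamVisitor) -> None:
    """Split `buffer` into (kind, length, payload) records and dispatch each."""
    offset = 0
    while offset < len(buffer):
        if len(buffer) - offset < 8:
            raise ValueError("truncated substream header")
        kind, length = struct.unpack_from("<II", buffer, offset)
        offset += 8
        if length > len(buffer) - offset:
            raise ValueError("substream payload overruns the buffer")
        data = buffer[offset:offset + length]
        offset += length
        if kind == LINES:
            visitor.visit_lines(data)
        elif kind == FILE_CHECKSUMS:
            visitor.visit_file_checksums(data)
        else:
            visitor.visit_unknown(kind, data)


class KindCollector(SubstreamVisitor):
    """Example visitor that records which kinds it saw and their sizes."""

    def __init__(self) -> None:
        self.seen = []

    def visit_lines(self, data: bytes) -> None:
        self.seen.append(("lines", len(data)))

    def visit_file_checksums(self, data: bytes) -> None:
        self.seen.append(("checksums", len(data)))

    def visit_unknown(self, kind: int, data: bytes) -> None:
        self.seen.append((hex(kind), len(data)))


def make_record(kind: int, payload: bytes) -> bytes:
    return struct.pack("<II", kind, len(payload)) + payload


sample = (make_record(FILE_CHECKSUMS, b"\x00" * 8)
          + make_record(LINES, b"\x00" * 12)
          + make_record(0xF1, b""))
collector = KindCollector()
visit_substreams(sample, collector)
print(collector.seen)
```

A caller that only expects a subset of record types overrides just those callbacks, which is what lets the hand-written parsing loops collapse into one shared walker.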

Differential Revision: http://reviews.llvm.org/D20936
Reviewed By: ruiu, majnemer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271621 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 03:25:59 +00:00
Rui Ueyama
a4bbfe5926 Fix indentation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271620 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-03 02:42:30 +00:00
Zachary Turner
b36db5aff2 [llvm-pdbdump] Dump CodeView line information.
This first pass only splits apart the records and dumps the line
info kinds and binary data.  Subsequent patches will parse out
the binary data into more useful information and dump it in
detail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271576 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-02 20:11:22 +00:00
Rui Ueyama
698e829b29 pdbdump: print out COFF section headers.
Unlike other sections that can grow to any size, the COFF section header
stream has a maximum length because each record is a fixed size and the COFF
file format limits the maximum number of sections. So I decided not to
create a specific stream class for it. Instead, I added a member function
to the DbiStream class which returns a vector of COFF headers.
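
Since every record has the same size, the whole stream can be decoded in one pass into a list. A minimal sketch, assuming the standard 40-byte COFF section header layout; the `SectionHeader` type and `parse_section_headers` function are hypothetical names, not the DbiStream API.

```python
import struct
from typing import List, NamedTuple

# 8-byte name, six u32 fields, two u16 counts, u32 characteristics: 40 bytes.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


class SectionHeader(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int
    size_of_raw_data: int
    pointer_to_raw_data: int
    pointer_to_relocations: int
    pointer_to_linenumbers: int
    number_of_relocations: int
    number_of_linenumbers: int
    characteristics: int


def parse_section_headers(stream: bytes) -> List[SectionHeader]:
    """Decode a stream that is nothing but back-to-back section headers."""
    if len(stream) % SECTION_HEADER.size != 0:
        raise ValueError("stream length is not a multiple of the header size")
    headers = []
    for fields in SECTION_HEADER.iter_unpack(stream):
        name = fields[0].rstrip(b"\x00").decode("ascii", errors="replace")
        headers.append(SectionHeader(name, *fields[1:]))
    return headers


# Synthetic two-section stream, for illustration only.
sample = (SECTION_HEADER.pack(b".text", 0x1000, 0x1000, 0x200, 0x400,
                              0, 0, 0, 0, 0x60000020)
          + SECTION_HEADER.pack(b".data", 0x80, 0x2000, 0x200, 0x600,
                                0, 0, 0, 0, 0xC0000040))
for header in parse_section_headers(sample):
    print(header.name, hex(header.virtual_address))
```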

Differential Revision: http://reviews.llvm.org/D20717

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271557 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-02 18:20:20 +00:00
Zachary Turner
8839a0f88f [pdb] Parse and dump section map and section contribs
Differential Revision: http://reviews.llvm.org/D20876
Reviewed By: rnk, ruiu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271488 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-02 05:07:49 +00:00
Reid Kleckner
25565d97e7 [codeview] Improve readability of type record assembly
Adds the method MCStreamer::EmitBinaryData, which is usually an alias
for EmitBytes. In the MCAsmStreamer case, it is overridden to emit hex
dump output like this:
        .byte   0x0e, 0x00, 0x08, 0x10
        .byte   0x03, 0x00, 0x00, 0x00
        .byte   0x00, 0x00, 0x00, 0x00
        .byte   0x00, 0x10, 0x00, 0x00

Also, when verbose asm comments are enabled, this patch prints the dump
output for each record as a comment before the record, like this:
        # ArgList (0x1000) {
        #   TypeLeafKind: LF_ARGLIST (0x1201)
        #   NumArgs: 0
        #   Arguments [
        #   ]
        # }
        .byte   0x06, 0x00, 0x01, 0x12
        .byte   0x00, 0x00, 0x00, 0x00

This should make debugging easier and testing more convenient.
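
The row formatting shown above can be sketched as a small helper. This is an illustration of the output shape only, not the MCAsmStreamer implementation; the function name and the four-bytes-per-row default are assumptions read off the sample listing.

```python
def format_binary_data(data: bytes, bytes_per_row: int = 4) -> str:
    """Format `data` as `.byte` directives, a fixed number of bytes per row."""
    rows = []
    for start in range(0, len(data), bytes_per_row):
        chunk = data[start:start + bytes_per_row]
        values = ", ".join(f"0x{byte:02x}" for byte in chunk)
        rows.append(f"        .byte   {values}")
    return "\n".join(rows)


# The ArgList record from the listing above.
print(format_binary_data(bytes([0x06, 0x00, 0x01, 0x12, 0x00, 0x00, 0x00, 0x00])))
```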

Reviewers: aaboud

Subscribers: majnemer, zturner, amccarth, aaboud, llvm-commits

Differential Revision: http://reviews.llvm.org/D20711

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271313 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-31 18:45:36 +00:00
David Majnemer
872acbc405 llvm-pdbdump should have a non-zero exit code on error
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271132 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-28 18:25:15 +00:00
Zachary Turner
5cfb6469b8 [pdb] Finish conversion to zero copy pdb access.
This converts remaining uses of ByteStream, which was still
left in the symbol stream and type stream, to using the new
StreamInterface zero-copy classes.

RecordIterator is finally deleted, so this is the only way left
now.  Additionally, more error checking is added when iterating
the various streams.

With this, the transition to zero copy pdb access is complete.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271101 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-28 05:21:57 +00:00
Zachary Turner
9cc0faee65 [codeview] Remove StreamReader copying method.
Since we want to move toward zero-copy access to stream data, we
want to remove all instances of copying operations.  So get rid
of some of those here.

Differential Revision: http://reviews.llvm.org/D20720
Reviewed By: ruiu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270960 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 03:51:53 +00:00
Zachary Turner
0212bc82e0 [codeview,pdb] Try really hard to conserve memory when reading.
PDBs can be extremely large.  We're already mapping the entire
PDB into the process's address space, but to make matters worse
the blocks of the PDB are not arranged contiguously.  So, when
we have something like an array or a string embedded into the
stream, we have to make a copy.  Since it's convenient to use
traditional data structures to iterate and manipulate these
records, we need the memory to be contiguous.

As a result of this, we were using roughly twice as much memory
as the file size of the PDB, because every stream was copied
out and re-stitched together contiguously.

This patch addresses this by improving the MappedBlockStream
to allocate from a BumpPtrAllocator only when a read requires
a discontiguous read.  Furthermore, it introduces some data
structures backed by a stream which can iterate over both
fixed and variable length records of a PDB.  Since everything
is backed by a stream and not a buffer, we can read almost
everything from the PDB with zero copies.
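
The copy-only-when-needed idea can be sketched like this. It is a simplified illustration rather than the real MappedBlockStream: a Python `memoryview` slice stands in for a reference into the mapped file, a fresh `bytes` object stands in for a BumpPtrAllocator allocation, the stream length is rounded up to whole blocks, and the `copies` counter is a hypothetical addition that makes the behavior observable.

```python
class MappedBlockStream:
    """A stream made of fixed-size blocks scattered through a mapped file."""

    def __init__(self, file_data: bytes, block_size: int, blocks: list) -> None:
        self.data = memoryview(file_data)
        self.block_size = block_size
        self.blocks = blocks      # block indices of this stream, in stream order
        self.copies = 0           # how many reads had to allocate a copy

    def read(self, offset: int, size: int) -> memoryview:
        """Return `size` bytes starting at stream offset `offset`."""
        limit = len(self.blocks) * self.block_size
        if offset < 0 or size < 0 or offset + size > limit:
            raise ValueError("read is outside the stream")
        if size == 0:
            return self.data[0:0]
        first = offset // self.block_size
        last = (offset + size - 1) // self.block_size
        run = self.blocks[first:last + 1]
        # The request is contiguous in the file if its block indices are consecutive.
        if all(a + 1 == b for a, b in zip(run, run[1:])):
            start = run[0] * self.block_size + offset % self.block_size
            return self.data[start:start + size]      # zero-copy view
        # Discontiguous: stitch the pieces into freshly allocated memory.
        self.copies += 1
        out = bytearray()
        position, remaining = offset, size
        while remaining > 0:
            block, within = divmod(position, self.block_size)
            count = min(remaining, self.block_size - within)
            start = self.blocks[block] * self.block_size + within
            out += bytes(self.data[start:start + count])
            position += count
            remaining -= count
        return memoryview(bytes(out))


# Synthetic "file" of four 4-byte blocks.
file_data = b"AAAABBBBCCCCDDDD"
adjacent = MappedBlockStream(file_data, 4, [1, 2])
scattered = MappedBlockStream(file_data, 4, [0, 2])
print(bytes(adjacent.read(2, 4)), adjacent.copies)
print(bytes(scattered.read(2, 4)), scattered.copies)
```

Reads that stay inside one block, or that span blocks which happen to sit next to each other in the file, never allocate; only a read that crosses a gap pays for a copy.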

Differential Revision: http://reviews.llvm.org/D20654
Reviewed By: ruiu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270951 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 01:54:44 +00:00
Rui Ueyama
4c87e7cf1a pdbdump: print out the name of the stream 0.
Differential Revision: http://reviews.llvm.org/D20712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270943 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 00:32:07 +00:00
Rui Ueyama
3438b38d19 pdbdump: Add -raw-all to enable all -raw-* flags.
Differential Revision: http://reviews.llvm.org/D20707

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270937 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-26 23:26:55 +00:00
Rui Ueyama
5b155a0d90 Fix typo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270934 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-26 23:01:05 +00:00
Zachary Turner
213d3d3c81 [codeview] Move StreamInterface and StreamReader to libcodeview.
We need to reuse this functionality, including making
additional generic stream types that are smarter about how and
when they copy memory versus referencing the original memory.
So all of these structures belong in the common library
rather than being pdb specific.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270751 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 20:37:03 +00:00
Zachary Turner
f50e8cdfbd [llvm-pdbdump] Dump raw stream contents as binary block.
Dumping it as ASCII makes it fairly useless.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270742 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 18:32:07 +00:00
Zachary Turner
4f4018ce2c [llvm-pdbdump] Decipher the remaining PDB streams.
We now at least know the meaning of every stream of the
PDB file.  Yay!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270669 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 05:49:48 +00:00
Zachary Turner
fbbe74464a [llvm-pdbdump] Dump the IPI stream and all records.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270661 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 04:35:22 +00:00
Zachary Turner
3b27e60206 [llvm-pdbdump] Stream 0 isn't actually the MSF superblock.
Oddly enough, I realized we don't actually know what stream
0 is (if anything).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270655 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 03:53:16 +00:00
Zachary Turner
ec842a3379 [llvm-pdbdump] Dump stream summary list.
Try to figure out what each stream is, and dump its name.

This gives us a better picture of what streams we still don't
understand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270653 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 03:43:17 +00:00
Zachary Turner
45ab259f8d [llvm-pdbdump] Rework command line options.
When dumping huge PDB files, too many of the options were grouped
together so you would get a never-ending spew of output.  This patch
introduces more granular display options so you can dump only the
fields you actually care about.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270607 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 20:31:48 +00:00
Zachary Turner
653eb429f5 [codeview, pdb] Dump symbol records in publics stream
Differential Revision: http://reviews.llvm.org/D20580
Reviewed By: ruiu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270597 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 18:55:14 +00:00
Zachary Turner
fe03e32b28 Dump symbol record details in llvm-pdbdump
This makes use of the newly introduced `CVSymbolVisitor` to dump details
of each type of symbol record in the symbol streams.  Future patches will
bring this visitor based dumping to the publics stream, as well as
creating a `SymbolDumpDelegate` to print more information about
relocations etc.

Differential Revision: http://reviews.llvm.org/D20545
Reviewed By: ruiu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270585 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-24 17:30:25 +00:00
Rui Ueyama
59427f6dc7 pdbdump: print out symbol names referred to by the publics stream.
DBI stream contains a stream number of the symbol record stream.
The symbol record stream is an array of length-type-value members.
Each member represents one symbol.

Publics stream contains offsets to the symbol record stream.
This patch is to print out all symbols that are referenced by
the publics stream.
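
A minimal sketch of following publics-stream offsets into a length-type-value symbol record stream. It assumes each record starts with a 16-bit length that counts the bytes after the length field, followed by a 16-bit kind. The payload here is simplified to just a NUL-terminated name, which is a hypothetical layout chosen for illustration; real public symbol records carry more fields before the name.

```python
import struct

S_PUB32 = 0x110E  # assumed record kind for a public symbol


def read_symbol_record(stream: bytes, offset: int):
    """Decode the record at `offset`; return (kind, payload, next_offset)."""
    if offset < 0 or offset + 4 > len(stream):
        raise ValueError("offset does not point at a record header")
    length, kind = struct.unpack_from("<HH", stream, offset)
    end = offset + 2 + length     # the length field does not count itself
    if length < 2 or end > len(stream):
        raise ValueError("record length is invalid")
    return kind, stream[offset + 4:end], end


def names_for_offsets(stream: bytes, offsets) -> list:
    """Resolve each offset to the (simplified) name stored in its record."""
    names = []
    for offset in offsets:
        _kind, payload, _next = read_symbol_record(stream, offset)
        names.append(payload.split(b"\x00", 1)[0].decode("utf-8", errors="replace"))
    return names


def make_record(kind: int, name: bytes) -> bytes:
    payload = name + b"\x00"
    return struct.pack("<HH", 2 + len(payload), kind) + payload


# Synthetic symbol record stream with two records, at offsets 0 and 9.
symbols = make_record(S_PUB32, b"main") + make_record(S_PUB32, b"foo")
print(names_for_offsets(symbols, [0, 9]))
```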

Note that even with this patch, llvm-pdbdump cannot dump all the
information in a publics stream since it contains more information
than symbol names. I'll improve it in followup patches.

Differential Revision: http://reviews.llvm.org/D20480

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270262 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-20 19:55:17 +00:00
Rui Ueyama
cc76d42fd3 pdbdump: Rename NumberOfSymbols -> SymbolRecordStreamIndex.
Differential Revision: http://reviews.llvm.org/D20441

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270088 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-19 18:05:58 +00:00
Rui Ueyama
17e3d064b5 pdbdump: Print out section offsets in the publics stream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269955 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 16:24:16 +00:00
Rui Ueyama
dce33b5457 pdbdump: Print out more structures.
I don't yet fully understand the meaning of these data structures,
but at least it seems that their sizes and types are correct.
With this change, we can read publics streams to the end.

Differential Revision: http://reviews.llvm.org/D20343

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269861 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-17 23:07:48 +00:00
Rui Ueyama
2acb5f6a3a pdbdump: Print "Publics" stream.
Publics stream seems to contain information as to public symbols.
It actually contains a serialized hash table along with fixed-sized
headers. This patch is not complete. It only scans to the end of
the stream and dumps the header information. I'll write code to
de-serialize the hash table later.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269484 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 21:21:53 +00:00
Reid Kleckner
30e317fc1e [codeview] Try to handle errors better in record iterator
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269381 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-12 23:26:23 +00:00
Zachary Turner
2391618ef9 Fix build error with ambiguity of size_t.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268948 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 18:45:21 +00:00
Zachary Turner
a95bc25b20 [pdb] Parse the module info stream for each module.
Differential Revision: http://reviews.llvm.org/D20026
Reviewed By: rnk

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268942 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 17:45:21 +00:00
Zachary Turner
5c1193559a Make TypeIterator generic so it can iterate symbols too.
Reviewed By: amccarth
Differential Revision: http://reviews.llvm.org/D20038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268941 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 17:44:58 +00:00
Zachary Turner
b7d84117e3 Make llvm-pdbdump print CV type records
This reuses the CVTypeDumper from libcodeview to dump full
information about type records within a PDB file.

Differential Revision: http://reviews.llvm.org/D20022
Reviewed By: rnk

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 22:15:42 +00:00
Zachary Turner
fdb6da3015 Port DebugInfoPDB over to using llvm::Error.
Differential Revision: http://reviews.llvm.org/D19940
Reviewed By: rnk

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268791 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 20:51:57 +00:00
Reid Kleckner
d1310b73a3 Reland "Use ScopedPrinter in llvm-pdbdump"
This reverts r268508 and reinstates r268506 with an additional cast from
TypeLeafKind to unsigned to allow conversion to HexNumber.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268517 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-04 16:09:04 +00:00
Chad Rosier
ea1c623e32 Revert "Use ScopedPrinter in llvm-pdbdump"
This reverts commit r268506 due to build breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268508 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-04 15:25:06 +00:00
Zachary Turner
02f1f19dae Use ScopedPrinter in llvm-pdbdump
When printing raw PDB file fields, streams, and records, use the
ScopedPrinter class so we have consistency with llvm-readobj's output
format.

For the most part this is pretty mechanical, but I had to fix up the test
file to conform to the new YAMLesque output format. I added a few
additional helper functions to the ScopedPrinter such as one to print a
dotted version, etc.

Differential Revision: http://reviews.llvm.org/D19897
Reviewed By: rnk

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268506 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-04 15:05:12 +00:00
Zachary Turner
546d80a65a Fix template type deduction error on some compilers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268458 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 22:37:12 +00:00
Zachary Turner
14c490a90d Move CodeViewTypeStream to DebugInfo/CodeView
Ability to parse codeview type streams is also needed by
DebugInfoPDB for parsing PDBs, so moving this into a library
gives us this option.  Since DebugInfoPDB had already hand
rolled some code to do this, that code is now converted over
to using this common abstraction.

Differential Revision: http://reviews.llvm.org/D19887
Reviewed By: dblaikie, amccarth

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268454 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 22:18:17 +00:00
Zachary Turner
84bc70bd9c Parse the TPI (type information) stream of PDB files.
This parses the TPI stream (stream 2) from the PDB file. This stream
contains some header information followed by a series of codeview records.
There is some additional complexity here in that alongside this stream of
codeview records is a serialized hash table in order to efficiently query
the types. We parse the necessary bookkeeping information to allow us to
reconstruct the hash table, but we do not actually construct it yet as
there are still a few things that need to be understood first.

Differential Revision: http://reviews.llvm.org/D19840
Reviewed By: ruiu, rnk

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268343 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-03 00:28:21 +00:00
Zachary Turner
0e6ef97ce0 Parse PDB Name Hash Table
PDB has a lot of similar data structures.  We already have code
for parsing a Name Map, but PDB seems to have a different but
very similar structure that is a hash table.  This is the
beginning of code needed in order to parse the name hash table,
but it is not yet complete.  It parses the basic metadata of
the hash table, the bucket array, and the names buffer, but
doesn't use any of these fields yet as the data structure
requires a non-trivial amount of work to understand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268268 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-02 18:09:14 +00:00
Zachary Turner
a91bcf5593 Put PDB parsing code into a pdb namespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268072 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 17:28:47 +00:00
Zachary Turner
64cc1b2eb3 Refactor the PDB Stream reading interface.
The motivation for this change is that PDB has the notion of
streams and substreams.  Substreams often consist of variable
length structures that are convenient to be able to treat as
guaranteed, contiguous byte arrays, whereas the streams they
are contained in are not necessarily so, as a single stream
could be spread across many discontiguous blocks.

So, when processing data from a substream, we want to be able
to assume that we have a contiguous byte array so that we can
cast pointers to variable length arrays and such.

This leads to the question of how to be able to read the same
data structure from either a stream or a substream using the
same interface, which is where this patch comes in.

We separate out the stream's read state from the underlying
representation, and introduce a `StreamReader` class.  Then
we change the name of `PDBStream` to `MappedBlockStream`, and
introduce a second kind of stream called a `ByteStream` which is
simply a sequence of contiguous bytes.  Finally, we update all
of the std::vectors in `PDBDbiStream` to use `ByteStream` instead
as a proof of concept.
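
The split between read state and storage can be sketched as below. The class names echo the ones in the message, but the method names and signatures are hypothetical, and the block-backed stream is reduced to a list of equally sized byte strings.

```python
import struct


class ByteStream:
    """A stream backed by one contiguous run of bytes."""

    def __init__(self, data: bytes) -> None:
        self._data = bytes(data)

    def length(self) -> int:
        return len(self._data)

    def read_bytes(self, offset: int, size: int) -> bytes:
        if offset < 0 or size < 0 or offset + size > len(self._data):
            raise ValueError("read is outside the stream")
        return self._data[offset:offset + size]


class BlockStream:
    """A stream whose bytes are spread over separate fixed-size blocks."""

    def __init__(self, blocks: list, block_size: int) -> None:
        self._blocks = blocks
        self._block_size = block_size

    def length(self) -> int:
        return len(self._blocks) * self._block_size

    def read_bytes(self, offset: int, size: int) -> bytes:
        if offset < 0 or size < 0 or offset + size > self.length():
            raise ValueError("read is outside the stream")
        out = bytearray()
        while size > 0:
            block, within = divmod(offset, self._block_size)
            count = min(size, self._block_size - within)
            out += self._blocks[block][within:within + count]
            offset += count
            size -= count
        return bytes(out)


class StreamReader:
    """Holds only the cursor; works with any object that has read_bytes()."""

    def __init__(self, stream) -> None:
        self.stream = stream
        self.offset = 0

    def read_bytes(self, size: int) -> bytes:
        data = self.stream.read_bytes(self.offset, size)
        self.offset += size
        return data

    def read_u16(self) -> int:
        return struct.unpack("<H", self.read_bytes(2))[0]

    def read_u32(self) -> int:
        return struct.unpack("<I", self.read_bytes(4))[0]


def read_record(reader: StreamReader):
    """The same parsing code runs against either kind of stream."""
    return reader.read_u32(), reader.read_u16(), reader.read_bytes(2)


payload = struct.pack("<IH", 0xDEADBEEF, 7) + b"xy"
blocks = [payload[i:i + 2] for i in range(0, len(payload), 2)]
print(read_record(StreamReader(ByteStream(payload))))
print(read_record(StreamReader(BlockStream(blocks, 2))))
```

Because the cursor lives in the reader, several readers can walk the same stream independently, and parsing code does not need to know which representation it was handed.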

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268071 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 17:22:58 +00:00
David Majnemer
00a5029707 [llvm-pdbdump] Restore error messages, handle bad block sizes
We lost the ability to report errors, bring it back.  Also, correctly
validate the block size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267955 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 23:47:27 +00:00
Zachary Turner
5515858465 Read the rest of the DBI substreams, and parse source info.
We now read out the rest of the substreams from the DBI streams.  One of
these substreams, the FileInfo substream, contains information about which
source files contribute to each module (aka compiland).  This patch
additionally parses out the file information from that substream, and
dumps it in llvm-pdbdump.

Differential Revision: http://reviews.llvm.org/D19634
Reviewed by: ruiu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267928 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 20:05:18 +00:00