Commit Graph

115 Commits

Author SHA1 Message Date
Zachary Turner
88e1ef47a6 [pdb] Use MsfBuilder to handle the writing PDBs.
Previously we would read a PDB, then write some of it back out,
but write the directory, super block, and other pertinent metadata
back out unchanged.  This generates incorrect PDBs since the amount
of data written was not always the same as the amount of data read.

This patch changes things to use the newly introduced `MsfBuilder`
class to write out a correct and accurate set of Msf metadata for
the data *actually* written, which opens up the door for adding and
removing type records, symbol records, and other types of data to
an existing PDB.

llvm-svn: 275627
2016-07-15 22:16:56 +00:00
Rui Ueyama
eb8764db35 Re-enable TPI hash verification for enum records.
We didn't read unique names correctly. As a result, we computed
hashes on (non-)unique names instead of unique names.

llvm-svn: 275150
2016-07-12 03:25:03 +00:00
Benjamin Kramer
8095a0cda8 [codeview] Drop unused private inheritance.
There is no polymorphism here, and StreamRef already contains a
StreamInterface pointer. Dropping the base class makes StreamRef more
transparent to the compiler, for example it can find unused variables.

llvm-svn: 275013
2016-07-10 10:17:36 +00:00
David Majnemer
634a992c4e [llvm-pdbdump] Propagate errors a little more consistently
PDBFile::getBlockData didn't really return any indication that it
failed.  It merely returned an empty buffer.

llvm-svn: 275009
2016-07-10 03:34:47 +00:00
Adrian Prantl
b0bebc48b4 Reapply "Define a module map entry for DebugInfo/CodeView."
This reapplies r274313 with two additional #include directives needed
when submodule visibility is enabled.

Fixes PR28384.

llvm-svn: 274358
2016-07-01 15:54:46 +00:00
Zachary Turner
05b0d33b0c [pdb] Re-add code to write PDB files.
Somehow all the functionality to write PDB files got removed,
probably accidentally when uploading the patch perhaps the wrong
one got uploaded.  This re-adds all the code, as well as the
corresponding test.

llvm-svn: 274248
2016-06-30 17:43:00 +00:00
David Majnemer
2bb75b48c6 [CodeView] Healthy paranoia around strings
Make sure strings don't get too big for a record, truncate them if
need-be.

llvm-svn: 273710
2016-06-24 19:34:41 +00:00
Reid Kleckner
e8d9ec6b43 [codeview] Use one byte for S_FRAMECOOKIE CookieKind and add flags byte
We bailed out while printing codeview for an MSVC compiled
SemaExprCXX.cpp that used this record. The MS reference headers look
incorrect here, which is probably why we had this bug. They use a 32-bit
enum as the field type, but the actual record appears to use one byte
for the cookie kind followed by a flags byte.

llvm-svn: 273691
2016-06-24 17:23:49 +00:00
Reid Kleckner
6e294f46a8 [codeview] Fix trivial bug in OneMethodRecord::isIntroducingVirtual
These should be equality comparisons. Fixes assertions while
self-hosting clang with codeview debug info.

Ultimately this is going to be covered by real tests for virtual method
emission, so I'm not adding a "don't crash on this input" test that I'll
remove soon afterwards.

llvm-svn: 273446
2016-06-22 17:32:59 +00:00
Reid Kleckner
805d357d67 [codeview] Add support for splitting field list records over 64KB
The basic structure is that once a list record goes over 64K, the last
subrecord of the list is an LF_INDEX record that refers to the next
record. Because the type record graph must be toplogically sorted, this
means we have to emit them in reverse order. We build the type record in
order of declaration, so this means that if we don't want extra copies,
we need to detect when we were about to split a record, and leave space
for a continuation subrecord that will point to the eventual split
top-level record.

Also adds dumping support for these records.

Next we should make sure that large method overload lists work properly.

llvm-svn: 273294
2016-06-21 18:33:01 +00:00
Zachary Turner
b871327aa8 Resubmit "[pdb] Change type visitor pattern to be dynamic."
There was a regression introduced during type stream merging when
visiting a field list record.  This has been fixed in this patch.

llvm-svn: 272929
2016-06-16 18:22:27 +00:00
Zachary Turner
9dbc164c30 Revert "[pdb] Change type visitor pattern to be dynamic."
This reverts commit fb0dd311e1ad945827b8ffd5354f4810e2be1579.

This breaks some llvm-readobj tests.

llvm-svn: 272927
2016-06-16 18:09:04 +00:00
Zachary Turner
9300409ecf [pdb] Change type visitor pattern to be dynamic.
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.

Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.

Reviewed By: ruiu, rnk
Differential Revision: http://reviews.llvm.org/D21410

llvm-svn: 272926
2016-06-16 18:00:28 +00:00
Rui Ueyama
f7a0e93409 [codeview] Pass CVRecord to visitTypeBegin callback.
Both parameters to visitTypeBegin are actually members of CVRecord,
so we can just pass CVRecord instead of destructuring it.

Differential Revision: http://reviews.llvm.org/D21435

llvm-svn: 272899
2016-06-16 14:47:23 +00:00
Rui Ueyama
1b33098d46 [codeview] Remove unused parameter.
Differential Revision: http://reviews.llvm.org/D21433

llvm-svn: 272898
2016-06-16 14:41:22 +00:00
Rui Ueyama
c81d309c64 Implement pdb::hashBufferV8 hash function.
llvm-svn: 272894
2016-06-16 13:48:16 +00:00
Rui Ueyama
44f2539d12 [Codeview] Add a class for LF_UDT_MOD_SRC_LINE.
Differential Revision: http://reviews.llvm.org/D21406

llvm-svn: 272843
2016-06-15 21:25:29 +00:00
Reid Kleckner
2eba5b5cb6 [codeview] Move deserialization methods out of line
They aren't performance critical and don't need to be inline.

llvm-svn: 272829
2016-06-15 20:30:34 +00:00
Reid Kleckner
2655e6fd2e [codeview] Use ArrayRef instead of a non-const vector reference
llvm-svn: 272817
2016-06-15 18:48:35 +00:00
Zachary Turner
ebef1e3fab Resubmit "[pdb] Actually write a PDB to disk from YAML.""
Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21220

llvm-svn: 272708
2016-06-14 20:48:36 +00:00
Zachary Turner
d084f1683f Revert "[pdb] Actually write a PDB to disk from YAML."
This reverts commit 879139e1c6577b09df52de56a6bab856a19ed185.

This was committed accidentally when I blindly typed git svn
dcommit instead of the command to generate a patch.

llvm-svn: 272693
2016-06-14 18:51:35 +00:00
Zachary Turner
9646f68e1a [pdb] Actually write a PDB to disk from YAML.
llvm-svn: 272692
2016-06-14 18:49:36 +00:00
Zachary Turner
8110e5a8a3 Add support for writing through StreamInterface.
This adds method and tests for writing to a PDB stream.  With
this, even a PDB stream which is discontiguous can be treated
as a sequential stream of bytes for the purposes of writing.

Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21157

llvm-svn: 272369
2016-06-10 05:09:12 +00:00
Benjamin Kramer
e3b0933b91 [CodeView] Remove manual expansion of the default copy ctor.
It provides nothing over the default one but makes the class not
trivially copyable. No functionality change intended.

llvm-svn: 272186
2016-06-08 18:19:38 +00:00
Zachary Turner
1431c0d45e [pdb] Fix a potential overflow and remove unnecessary comments.
llvm-svn: 272043
2016-06-07 18:42:39 +00:00
Zachary Turner
df1bab5ad7 [pdb] Use MappedBlockStream to parse the PDB directory.
In order to efficiently write PDBs, we need to be able to make a
StreamWriter class similar to a StreamReader, which can transparently deal
with writing to discontiguous streams, and we need to use this for all
writing, similar to how we use StreamReader for all reading.

Most discontiguous streams are the typical numbered streams that appear in
a PDB file and are described by the directory, but the exception to this,
that until now has been parsed by hand, is the directory itself.
MappedBlockStream works by querying the directory to find out which blocks
a stream occupies and various other things, so naturally the same logic
could not possibly work to describe the blocks that the directory itself
resided on.

To solve this, I've introduced an abstraction IPDBStreamData, which allows
the client to query for the list of blocks occupied by the stream, as well
as the stream length. I provide two implementations of this: one which
queries the directory (for indexed streams), and one which queries the
super block (for the directory stream).

This has the side benefit of vastly simplifying the code to parse the
directory. Whereas before a mini state machine was rolled by hand, now we
simply use FixedStreamArray to read out the stream sizes, then build a
vector of FixedStreamArrays for the stream map, all in just a few lines of
code.

Reviewed By: ruiu
Differential Revision: http://reviews.llvm.org/D21046

llvm-svn: 271982
2016-06-07 05:28:55 +00:00
NAKAMURA Takumi
ce4daad0d6 Trailing whitespace.
llvm-svn: 271861
2016-06-06 00:31:45 +00:00
NAKAMURA Takumi
c3ea8d0e63 Untabify.
llvm-svn: 271860
2016-06-06 00:31:28 +00:00
David Majnemer
838aba6c09 [CodeView] Validate the vftable offset
llvm-svn: 271791
2016-06-04 15:40:29 +00:00
Rui Ueyama
05c45592e0 pdbdump: print out TPI hashes.
Differential Revision: http://reviews.llvm.org/D20945

llvm-svn: 271736
2016-06-03 20:48:51 +00:00
Reid Kleckner
14799a2f9b [codeview] Add basic record type translation
This only translates data members for now. Translating overloaded
methods is complicated, so I stopped short of doing that.

Reviewers: aaboud

Differential Revision: http://reviews.llvm.org/D20924

llvm-svn: 271680
2016-06-03 15:58:20 +00:00
Zachary Turner
eace145381 [pdb] Print out file names instead of file offsets.
When printing line information and file checksums, we were printing
the file offset field from the struct header.  This teaches
llvm-pdbdump how to turn those numbers into the filename.  In the
case of file checksums, this is done by looking in the global
string table.  In the case of line contributions, this is done
by indexing into the file names buffer of the DBI stream.  Why
they use a different technique I don't know.

llvm-svn: 271630
2016-06-03 05:52:57 +00:00
Zachary Turner
6fb9f9896d [pdb] Dump file checksums from pdb codeview line info.
llvm-svn: 271622
2016-06-03 04:01:48 +00:00
Zachary Turner
9277831e4a [codeview] Dump line number and column information.
To facilitate this, a couple of changes had to be made:

1. `ModuleSubstream` got moved from `DebugInfo/PDB` to
`DebugInfo/CodeView`, and various codeview related types are defined
there.  It turns out `DebugInfo/CodeView/Line.h` already defines many of
these structures, but this is really old code that is not endian aware,
doesn't interact well with `StreamInterface` and not very helpful for
getting stuff out of a PDB.  Eventually we should migrate the old readobj
`COFFDumper` code to these new structures, or at least merge their
functionality somehow.

2. A `ModuleSubstream` visitor is introduced.  Depending on where your
module substream array comes from, different subsets of record types can
be expected.  We are already hand parsing these substream arrays in many
places especially in `COFFDumper.cpp`.  In the future we can migrate these
paths to the visitor as well, which should reduce a lot of code in
`COFFDumper.cpp`.

Differential Revision: http://reviews.llvm.org/D20936
Reviewed By: ruiu, majnemer

llvm-svn: 271621
2016-06-03 03:25:59 +00:00
Zachary Turner
8fad65b692 [llvm-pdbdump] Dump CodeView line information.
This first pass only splits apart the records and dumps the line
info kinds and binary data.  Subsequent patches will parse out
the binary data into more useful information and dump it in
detail.

llvm-svn: 271576
2016-06-02 20:11:22 +00:00
Zachary Turner
d356ecd7d3 [codeview] Fix a nasty use after free.
StreamRef was designed to be a thin wrapper over an abstract
stream interface that could itself be treated the same as any
other stream interface.  For this reason, it inherited publicly
from StreamInterface, and stored a StreamInterface* internally.

But StreamRef was also designed to be lightweight and easily
copyable, similar to ArrayRef.  This led to two misuses of
the classes.

1) When creating a StreamRef A from another StreamRef B, it was
   possible to end up with A storing a pointer to B, even when
   B was a temporary object, leading to use after free.
2) The above situation could be repeated ad nauseum, so that
   A stores a pointer to B, which itself stores a pointer to
   another StreamRef C, and so on and so on, creating an
   unnecessarily level of nesting depth.

This patch removes the public inheritance relationship between
StreamRef and StreamInterface, making it so that we can never
accidentally convert a StreamRef to a StreamInterface.

llvm-svn: 271570
2016-06-02 19:51:48 +00:00
David Majnemer
7085a0544f [CodeView] Use None instead of Void if there is no subprogram
llvm-svn: 271566
2016-06-02 18:51:24 +00:00
Zachary Turner
2ff589d36c Fix uninitialized members in VarStreamArrayIterator.
llvm-svn: 271529
2016-06-02 16:28:52 +00:00
David Majnemer
a8e21e2a83 [CodeView] Use the right type index for long long
We used T_INT8 instead of T_QUAD.

llvm-svn: 271497
2016-06-02 07:02:32 +00:00
David Majnemer
7be7a53496 [CodeView] Take the StreamRef::readBytes offset into account when validating
We only considered the length of the operation and the length of the
StreamRef without considered what it meant for the offset to be at a
non-zero position.

llvm-svn: 271496
2016-06-02 06:21:44 +00:00
Zachary Turner
cc571053da [pdb] Parse and dump section map and section contribs
Differential Revision: http://reviews.llvm.org/D20876
Reviewed By: rnk, ruiu

llvm-svn: 271488
2016-06-02 05:07:49 +00:00
David Majnemer
0d36b051c9 [CodeView] Simplify StreamArray operator++
llvm-svn: 271419
2016-06-01 18:13:08 +00:00
David Majnemer
818581cd0c [CodeView] Make sure StreamRef::readBytes doesn't read too much
llvm-svn: 271418
2016-06-01 18:13:06 +00:00
Reid Kleckner
05a06ad643 [codeview] Improve readability of type record assembly
Adds the method MCStreamer::EmitBinaryData, which is usually an alias
for EmitBytes. In the MCAsmStreamer case, it is overridden to emit hex
dump output like this:
        .byte   0x0e, 0x00, 0x08, 0x10
        .byte   0x03, 0x00, 0x00, 0x00
        .byte   0x00, 0x00, 0x00, 0x00
        .byte   0x00, 0x10, 0x00, 0x00

Also, when verbose asm comments are enabled, this patch prints the dump
output for each comment before its record, like this:
        # ArgList (0x1000) {
        #   TypeLeafKind: LF_ARGLIST (0x1201)
        #   NumArgs: 0
        #   Arguments [
        #   ]
        # }
        .byte   0x06, 0x00, 0x01, 0x12
        .byte   0x00, 0x00, 0x00, 0x00

This should make debugging easier and testing more convenient.

Reviewers: aaboud

Subscribers: majnemer, zturner, amccarth, aaboud, llvm-commits

Differential Revision: http://reviews.llvm.org/D20711

llvm-svn: 271313
2016-05-31 18:45:36 +00:00
Reid Kleckner
0021f4c0e0 [codeview] Add a CVTypeDumper::dump(ArrayRef<uint8_t>) overload
This is a convenient wrapper when the type record is already laid out as
bytes in memory.

llvm-svn: 271309
2016-05-31 18:15:23 +00:00
David Majnemer
a1f3014900 [CVRecord] Don't assume that the record has two bytes of data in it
llvm-svn: 271171
2016-05-29 06:18:04 +00:00
David Majnemer
2fa58d5a19 Don't let the readArray size calculation overflow
llvm-svn: 271170
2016-05-29 06:18:01 +00:00
David Majnemer
0f549afb7e [CVSymbolVisitor] It's possible for an error to occur in begin()
If the begin iterator fails, we cannot dereference it's contents.
Instead, we must immediately stop processing symbols.

llvm-svn: 271141
2016-05-28 19:45:54 +00:00
David Majnemer
f5f0beba17 An empty record cannot be null-terminated
llvm-svn: 271104
2016-05-28 05:59:22 +00:00
Zachary Turner
94e5255730 [pdb] Finish conversion to zero copy pdb access.
This converts remaining uses of ByteStream, which was still
left in the symbol stream and type stream, to using the new
StreamInterface zero-copy classes.

RecordIterator is finally deleted, so this is the only way left
now.  Additionally, more error checking is added when iterating
the various streams.

With this, the transition to zero copy pdb access is complete.

llvm-svn: 271101
2016-05-28 05:21:57 +00:00