VarStreamArray was built on the assumption that it is backed by a
StreamRef, and offset 0 of that StreamRef is the first byte of the first
record in the array.
This is a logical and intuitive assumption, but unfortunately we have
use cases where it doesn't hold. Specifically, a PDB module's symbol
stream is prefixed by 4 bytes containing a magic value, and the first
byte of record data in the array is actually at offset 4 of this byte
sequence.
Previously, we would just truncate the first 4 bytes and then construct
the VarStreamArray with the resulting StreamRef, so that offset 0 of the
underlying stream did correspond to the first byte of the first record,
but this is problematic, because symbol records reference other symbol
records by the absolute offset including that initial magic 4 bytes. So
if another record wants to refer to the first record in the array, it
would say "the record at offset 4".
This led to extremely confusing hacks and semantics in loading code, and
after spending 30 minutes trying to get some math right and failing, I
decided to fix this in the underlying implementation of VarStreamArray.
Now, we can say that a stream is skewed by a particular amount. This
way, when we access a record by absolute offset, we can use the same
values that the records themselves contain, instead of having to do
fixups.
Differential Revision: https://reviews.llvm.org/D55344
llvm-svn: 348499
The Globals table is a hash table keyed on symbol name, so
it's possible to lookup symbols by name in O(1) time. Add
a function to the globals stream to do this, and add an option
to llvm-pdbutil to exercise this, then use it to write some
tests to verify correctness.
llvm-svn: 343951
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.
This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.
llvm-svn: 304738
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way. This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.
Differential Revision: https://reviews.llvm.org/D33807
llvm-svn: 304588
In preparation for introducing writing capabilities for each of
these classes, I would like to adopt a Foo / FooRef naming
convention, where Foo indicates that the class can manipulate and
serialize Foos, and FooRef indicates that it is an immutable view of
an existing Foo. In other words, Foo is a writer and FooRef is a
reader. This patch names some existing readers to conform to the
FooRef convention, while offering no functional change.
llvm-svn: 301810
There is a lot of duplicate code for printing line info between
YAML and the raw output printer. This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.
llvm-svn: 301728
The llvm-readobj parsing code currently exists in our CodeView
library, so we use that to parse instead of re-writing the logic
in the tool.
llvm-svn: 301718
Previously parsing of these were all grouped together into a
single master class that could parse any type of debug info
fragment.
With writing forthcoming, the complexity of each individual
fragment is enough to warrant them having their own classes so
that reading and writing of each fragment type can be grouped
together, but isolated from the code for reading and writing
other fragment types.
In doing so, I found a place where parsing code was duplicated
for the FileChecksums fragment, across llvm-readobj and the
CodeView library, and one of the implementations had a bug.
Now that the codepaths are merged, the bug is resolved.
Differential Revision: https://reviews.llvm.org/D32547
llvm-svn: 301557
We have a lot of very similarly named classes related to
dealing with module debug info. This patch has NFC, it just
renames some classes to be more descriptive (albeit slightly
more to type). The mapping from old to new class names is as
follows:
Old | New
ModInfo | DbiModuleDescriptor
ModuleSubstream | ModuleDebugFragment
ModStream | ModuleDebugStream
With the corresponding Builder classes renamed accordingly.
Differential Revision: https://reviews.llvm.org/D32506
llvm-svn: 301555