Summary:
When dumping these records from an object file section, we should use
only one type database. However, when dumping from a PDB, we should use
two: one for the type stream and one for the IPI stream.
Certain type records that normally live in the .debug$T object file
section get moved over to the IPI stream of the PDB file and they get
new indices.
So far, I've noticed that the MSVC linker always moves these records
into IPI:
- LF_FUNC_ID
- LF_MFUNC_ID
- LF_STRING_ID
- LF_SUBSTR_LIST
- LF_BUILDINFO
- LF_UDT_MOD_SRC_LINE
These records have index fields that can point into TPI or IPI. In
particular, LF_SUBSTR_LIST and LF_BUILDINFO point to LF_STRING_ID
records to describe compilation command lines.
I've modified the dumper to have an optional pointer to the item DB, and
to do type name lookup of these fields in that DB. See printItemIndex.
The result is that our pdbdump-headers.test is more faithful to the PDB
contents and the output is less confusing.
Reviewers: ruiu
Subscribers: amccarth, zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D31309
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298649 91177308-0d34-0410-b5e6-96231b3b80d8
Adds -color-output option to llvm-pdbdump pretty commands that lets the user
specify whether the output should have color. The default depends on whether
the output is going to a TTY (per prior discussion in
https://reviews.llvm.org/D31246).
This will enable tests that pipe llvm-pdbdump output to FileCheck to work
across platforms without regard to the differences in ANSI codes.
Differential Revision: https://reviews.llvm.org/D31263
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298610 91177308-0d34-0410-b5e6-96231b3b80d8
They are structurally the same, but now we need to distinguish them
because one record lives in the IPI stream and the other lives in TPI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298474 91177308-0d34-0410-b5e6-96231b3b80d8
This was discovered when running `llvm-pdbdump diff` against
two files, the second of which was generated by running the
first one through pdb2yaml and then yaml2pdb.
The second one was missing some bytes from the PDB Stream, and
tracking this down showed that at the end of the PDB Stream were
some additional bytes that we were ignoring. Looking back
to the reference code, these seem to specify some additional
flags that indicate whether the PDB supports various optional
features.
This patch adds support for reading, writing, and round-tripping
these flags through YAML and the raw dumper, and updates the
tests accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297984 91177308-0d34-0410-b5e6-96231b3b80d8
In doing so I discovered that we completely ignore some bytes
of the PDB Stream after we "finish" loading it. These bytes
seem to specify some additional information about what kind
of data is present in the PDB. A subsequent patch will add
code to read in those fields and store their values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297983 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we did not have support for writing detailed
module information for each module, as well as the symbol
records. This patch adds support for this, and in doing
so enables the ability to construct minimal PDBs from
just a few lines of YAML. A test is added to illustrate
this functionality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297900 91177308-0d34-0410-b5e6-96231b3b80d8
Together, these allow lldb-pdbdump to list all the modules from a PDB using a
native reader (rather than DIA).
Note that I'll probably be specializing NativeRawSymbol in a subsequent patch.
Differential Revision: https://reviews.llvm.org/D30956
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297883 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we could round-trip type records from PDB -> Yaml ->
PDB, but for symbols we could only go from PDB -> Yaml. This
completes the round-tripping for symbols as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297625 91177308-0d34-0410-b5e6-96231b3b80d8
Before the endianness was specified on each call to read
or write of the StreamReader / StreamWriter, but in practice
it's extremely rare for streams to have data encoded in
multiple different endiannesses, so we should optimize for the
99% use case.
This makes the code cleaner and more general, but otherwise
has NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296415 91177308-0d34-0410-b5e6-96231b3b80d8
This was reverted because it was breaking some builds, and
because of incorrect error code usage. Since the CL was
large and contained many different things, I'm resubmitting
it in pieces.
This portion is NFC, and consists of:
1) Renaming classes to follow a consistent naming convention.
2) Fixing the const-ness of the interface methods.
3) Adding detailed doxygen comments.
4) Fixing a few instances of passing `const BinaryStream& X`. These
are now passed as `BinaryStreamRef X`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296394 91177308-0d34-0410-b5e6-96231b3b80d8
r296215, "[PDB] General improvements to Stream library."
r296217, "Disable BinaryStreamTest.StreamReaderObject temporarily."
r296220, "Re-enable BinaryStreamTest.StreamReaderObject."
r296244, "[PDB] Disable some tests that are breaking bots."
r296249, "Add static_cast to silence -Wc++11-narrowing."
std::errc::no_buffer_space should be used for OS-oriented errors for socket transmission.
(Seek discussions around llvm/xray.)
I could substitute s/no_buffer_space/others/g, but I revert whole them ATM.
Could we define and use LLVM errors there?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296258 91177308-0d34-0410-b5e6-96231b3b80d8
This adds various new functionality and cleanup surrounding the
use of the Stream library. Major changes include:
* Renaming of all classes for more consistency / meaningfulness
* Addition of some new methods for reading multiple values at once.
* Full suite of unit tests for reader / writer functionality.
* Full set of doxygen comments for all classes.
* Streams now store their own endianness.
* Fixed some bugs in a few of the classes that were discovered
by the unit tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296215 91177308-0d34-0410-b5e6-96231b3b80d8
This is part of a larger effort to get the Stream code moved
up to Support. I don't want to do it in one large patch, in
part because the changes are so big that it will treat everything
as file deletions and add, losing history in the process.
Aside from that though, it's just a good idea in general to
make small changes.
So this change only changes the names of the Stream related
source files, and applies necessary source fix ups.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296211 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces the `analyze` subcommand. For now there is only
one option, to analyze hash collisions in the type streams. In
the future, however, we could add many more things here, such
as performing size analyses, compacting, and statistics about
the type of records etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293795 91177308-0d34-0410-b5e6-96231b3b80d8
This is not a list of pairs, it is a hash table data structure. We now
correctly parse this out and dump it from llvm-pdbdump.
We still need to understand the conditions that lead to a type
getting an entry in the hash adjuster table. That will be done
in a followup investigation / patch.
Differential Revision: https://reviews.llvm.org/D29090
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293090 91177308-0d34-0410-b5e6-96231b3b80d8
While the builder pattern has proven useful for certain other
larger types, in this case it was hampering the ability to use
the data structure, as for runtime access we need a map that
we can efficiently read from and write to. So the two are merged
into a single data structure that can efficiently be read to,
written from, deserialized from bytes, and serialized to bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292664 91177308-0d34-0410-b5e6-96231b3b80d8
Previously the type dumper itself was passed around to a lot of different
places and manipulated in ways that were more appropriate on the type
database. For example, the entire TypeDumper was passed into the symbol
dumper, when all the symbol dumper wanted to do was lookup the name of a
TypeIndex so it could print it. That's what the TypeDatabase is for --
mapping type indices to names.
Another example is how if the user runs llvm-pdbdump with the option to
dump symbols but not types, we still have to visit all types so that we
can print minimal information about the type of a symbol, but just without
dumping full symbol records. The way we did this before is by hacking it
up so that we run everything through the type dumper with a null printer,
so that the output goes to /dev/null. But really, we don't need to dump
anything, all we want to do is build the type database. Since
TypeDatabaseVisitor now exists independently of TypeDumper, we can do
this. We just build a custom visitor callback pipeline that includes a
database visitor but not a dumper.
All the hackery around printers etc goes away. After this patch, we could
probably even delete the entire CVTypeDumper class since really all it is
at this point is a thin wrapper that hides the details of how to build a
useful visitation pipeline. It's not a priority though, so CVTypeDumper
remains for now.
After this patch we will be able to easily plug in a different style of
type dumper by only implementing the proper visitation methods to dump
one-line output and then sticking it on the pipeline.
Differential Revision: https://reviews.llvm.org/D28524
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291724 91177308-0d34-0410-b5e6-96231b3b80d8
We were starting to get some name clashes between llvm-pdbdump
and the common CodeView framework, so I took this opportunity
to rename a bunch of files to more accurately describe their
usage. This also helps in llvm-pdbdump to distinguish
between different files and whether they are used for pretty
dump mode or raw dump mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291627 91177308-0d34-0410-b5e6-96231b3b80d8
Since we type-erase the Windows GUID structure, use unsigned bytes
rather than char, which may be signed (-fsigned-char). NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290765 91177308-0d34-0410-b5e6-96231b3b80d8
The original patch was broken due to some undefined behavior
as well as warnings that were triggering -Werror.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290000 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The code we use to read PDBs assumed that streams we ask it to read exist, and would read memory outside a vector and crash if this wasn't the case. This would, for example, cause llvm-pdbdump to crash on PDBs generated by lld. This patch handles such cases more gracefully: the PDB reading code in LLVM now reports errors when asked to get a stream that is not present, and llvm-pdbdump will report missing streams and continue processing streams that are present.
Reviewers: ruiu, zturner
Subscribers: thakis, amccarth
Differential Revision: https://reviews.llvm.org/D27325
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288722 91177308-0d34-0410-b5e6-96231b3b80d8
Previously support had been added for using CodeViewRecordIO
to read (deserialize) CodeView type records. This patch adds
support for writing those same records. With this patch,
reading and writing of CodeView type records finally uses a single
codepath.
Differential Revision: https://reviews.llvm.org/D26253
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286304 91177308-0d34-0410-b5e6-96231b3b80d8
Using a pattern similar to that of YamlIO, this allows
us to have a single codepath for translating codeview
records to and from serialized byte streams. The
current patch only hooks this up to the reading of
CodeView type records. A subsequent patch will hook
it up for writing of CodeView type records, and then a
third patch will hook up the reading and writing of
CodeView symbols.
Differential Revision: https://reviews.llvm.org/D26040
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285836 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This adds support for dumping the globals stream from PDB files using llvm-pdbdump, similar to the support we have for the publics stream.
Reviewers: ruiu, zturner
Subscribers: beanz, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D25801
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284861 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first step towards round-tripping symbol information,
and thusly being able to write symbol information to a PDB.
This patch writes the symbol information for each compiland to
the Yaml when running in pdb2yaml mode. There's still some loose
ends, such as what to do about relocations (necessary in order to
print linkage names), how to print enums with friendly names, and
how to give the dumper access to the StringTable, but this is a
good first start.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283641 91177308-0d34-0410-b5e6-96231b3b80d8
Type visitor code had already been refactored previously to
decouple the visitor and the visitor callback interface. This
was necessary for having the flexibility to visit in different
ways (for example, dumping to yaml, reading from yaml, dumping
to ScopedPrinter, etc).
This patch merely implements the same visitation pattern for
symbol records that has already been implemented for type records.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283609 91177308-0d34-0410-b5e6-96231b3b80d8
When we create a PDB file using PDBFileBuilder, the information
in the superblock, such as the size of the resulting file, is not
available.
Previously, PDBFileBuilder::initialize took a superblock assuming
that all the members of the struct are correct. That is useful when
you want to restore the exact information from a YAML file, but
that's probably the only use case in which that is useful.
When we are creating a PDB file on the fly, we have to backfill the
members.
This patch redefines PDBFileBuilder::initialize to take only a
block size. Now all the other members are left as default values,
so that they'll be updated when commit() is called.
Differential Revision: https://reviews.llvm.org/D25108
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282944 91177308-0d34-0410-b5e6-96231b3b80d8
WritableStream needs the exact file size to open a file, but
until we fix the final layout of a PDB file, we don't know the
size of the file.
This patch changes the parameter type of PDBFileBuilder::commit
to solve that chiecken-and-egg problem. Now the function opens
a file after fixing the layout, so it can create a file with the
exact size.
Differential Revision: https://reviews.llvm.org/D25107
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282940 91177308-0d34-0410-b5e6-96231b3b80d8
The IPI stream is structurally identical to the TPI stream, but it
contains different record types. So we just re-use the TPI writing
code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281638 91177308-0d34-0410-b5e6-96231b3b80d8
The `CVType` had two redundant fields which were confusing and
error-prone to fill out. By treating member records as a distinct
type from leaf records, we are able to simplify this quite a bit.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24432
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281556 91177308-0d34-0410-b5e6-96231b3b80d8
We have various command line options that print the type of a
stream, the size of a stream, etc but nowhere that it can all be
viewed together.
Since a previous patch introduced the ability to dump the bytes
of a stream, this seems like a good place to present a full view
of the stream's properties including its size, what kind of data
it represents, and the blocks it occupies. So I added the
ability to print that information to the -stream-data command
line option.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281077 91177308-0d34-0410-b5e6-96231b3b80d8
I ran into a situation where I wanted to print out the contents of
page 6 of a PDB as a binary blob, and there was no straightforward
way to do that.
In addition to adding that, this patch also adds the ability to dump
a stream by index as a binary blob, and it will stitch together all
the blocks and dump the whole thing as one seemingly contiguous
sequence of bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281070 91177308-0d34-0410-b5e6-96231b3b80d8
This simplifies a lot of code, and will actually be necessary for
an upcoming patch to serialize TPI record hash values.
The idea before was that visitors should be examining records, not
modifying them. But this is no longer true with a visitor that
constructs a CVRecord from Yaml. To handle this until now, we
were doing some fixups on CVRecord objects at a higher level, but
the code is really awkward, and it makes sense to just have the
visitor write the bytes into the CVRecord. In doing so I uncovered
a few bugs related to `Data` and `RawData` and fixed those.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24362
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281067 91177308-0d34-0410-b5e6-96231b3b80d8