Summary: The code we use to read PDBs assumed that streams we ask it to read exist, and would read memory outside a vector and crash if this wasn't the case. This would, for example, cause llvm-pdbdump to crash on PDBs generated by lld. This patch handles such cases more gracefully: the PDB reading code in LLVM now reports errors when asked to get a stream that is not present, and llvm-pdbdump will report missing streams and continue processing streams that are present.
Reviewers: ruiu, zturner
Subscribers: thakis, amccarth
Differential Revision: https://reviews.llvm.org/D27325
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288722 91177308-0d34-0410-b5e6-96231b3b80d8
PDBFileBuilder supports two different ways to create files.
One is PDBFileBuilder::commit. That function takes a filename
and write a result to the file. The other is PDBFileBuilder::build.
That returns a new PDBFile object.
This patch removes the latter because no one is using it and
in a real life situation we are very unlikely to need it.
Even if you need it, it'd be easy to write a new PDB to a memory
buffer and read it back.
Removing PDBFileBuilder::build enables us to remove other classes
build transitively.
Differential Revision: https://reviews.llvm.org/D26987
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287697 91177308-0d34-0410-b5e6-96231b3b80d8
This is required by DbiStream, but DbiStreamBuilder didn't align
these substreams, so the output of DbiSTreamBuilder couldn't be
read by DbiStream.
Test will be added to LLD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287067 91177308-0d34-0410-b5e6-96231b3b80d8
These numbers are intended to be capped at 65535, but
`std::max<uint16_t>(UINT16_MAX, N)` always returns N for any N because
the expression is the same as `std::max((uint16_t)UINT16_MAX, (uint16_t)N)`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287060 91177308-0d34-0410-b5e6-96231b3b80d8
This patch defines a new function to add a SectionContribs stream
to a PDB file. Unlike SectionMap, SectionContribs contains a list
of input sections as opposed to output sections.
Note that this patch needs improving because currently we do not
set Module field in SectionContribs entries. In a follow-up patch,
I'll add Modules and then fix it after that.
Differential Revision: https://reviews.llvm.org/D26210
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286677 91177308-0d34-0410-b5e6-96231b3b80d8
This change enables LLD to construct a Section Map stream in a PDB file.
I do not understand all these fields in the Section Map yet, but it seems
like a copy of a COFF section header in another format.
With this patch, DbiStreamBuilder can emit a Section Map which
llvm-pdbdump can dump.
Differential Revision: https://reviews.llvm.org/D26112
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285606 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This adds support for dumping the globals stream from PDB files using llvm-pdbdump, similar to the support we have for the publics stream.
Reviewers: ruiu, zturner
Subscribers: beanz, mgorny, modocache
Differential Revision: https://reviews.llvm.org/D25801
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284861 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have dropped MSVC 2013, all supported compilers support
noexcept and we can drop this portability macro.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284672 91177308-0d34-0410-b5e6-96231b3b80d8
This is just a quick utility handy for getting rough summaries of types
in a given object or dwo file. I've been using it to investigate the
amount of type info redundancy across a project build, for example.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284537 91177308-0d34-0410-b5e6-96231b3b80d8
The previous commit was failing because we filled empty slots of
the debug stream index with kInvalidStreamIndex. It should've been 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283925 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, there is no way to create a stream other than pre-defined
special stream such as DBI or IPI. This patch adds a new method,
addDbgStream, to add a debug stream to a PDB file.
Differential Revision: https://reviews.llvm.org/D25356
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283823 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first step towards round-tripping symbol information,
and thusly being able to write symbol information to a PDB.
This patch writes the symbol information for each compiland to
the Yaml when running in pdb2yaml mode. There's still some loose
ends, such as what to do about relocations (necessary in order to
print linkage names), how to print enums with friendly names, and
how to give the dumper access to the StringTable, but this is a
good first start.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283641 91177308-0d34-0410-b5e6-96231b3b80d8
When we create a PDB file using PDBFileBuilder, the information
in the superblock, such as the size of the resulting file, is not
available.
Previously, PDBFileBuilder::initialize took a superblock assuming
that all the members of the struct are correct. That is useful when
you want to restore the exact information from a YAML file, but
that's probably the only use case in which that is useful.
When we are creating a PDB file on the fly, we have to backfill the
members.
This patch redefines PDBFileBuilder::initialize to take only a
block size. Now all the other members are left as default values,
so that they'll be updated when commit() is called.
Differential Revision: https://reviews.llvm.org/D25108
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282944 91177308-0d34-0410-b5e6-96231b3b80d8
WritableStream needs the exact file size to open a file, but
until we fix the final layout of a PDB file, we don't know the
size of the file.
This patch changes the parameter type of PDBFileBuilder::commit
to solve that chiecken-and-egg problem. Now the function opens
a file after fixing the layout, so it can create a file with the
exact size.
Differential Revision: https://reviews.llvm.org/D25107
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282940 91177308-0d34-0410-b5e6-96231b3b80d8
The IPI stream is structurally identical to the TPI stream, but it
contains different record types. So we just re-use the TPI writing
code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281638 91177308-0d34-0410-b5e6-96231b3b80d8
We were inadvertently adding the size of the hash value stream to
the size of the TPI stream, even though the hash value stream is
an entirely separate stream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281636 91177308-0d34-0410-b5e6-96231b3b80d8
The `CVType` had two redundant fields which were confusing and
error-prone to fill out. By treating member records as a distinct
type from leaf records, we are able to simplify this quite a bit.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24432
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281556 91177308-0d34-0410-b5e6-96231b3b80d8
This simplifies a lot of code, and will actually be necessary for
an upcoming patch to serialize TPI record hash values.
The idea before was that visitors should be examining records, not
modifying them. But this is no longer true with a visitor that
constructs a CVRecord from Yaml. To handle this until now, we
were doing some fixups on CVRecord objects at a higher level, but
the code is really awkward, and it makes sense to just have the
visitor write the bytes into the CVRecord. In doing so I uncovered
a few bugs related to `Data` and `RawData` and fixed those.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D24362
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281067 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we were assuming that any visitation of types would
necessarily be against a type we had binary data for. Reasonable
assumption when were just reading PDBs and dumping them, but once
we start writing PDBs from Yaml this breaks down, because we have
no binary data yet, only Yaml, and from that we need to read the
record kind and perform the switch based on that.
So this patch does that. Instead of having the visitor switch
on the kind that is already in the CVType record, we change the
visitTypeBegin() method to return the Kind, and switch on the
returned value. This way, the default implementation can still
return the value from the CVType, but the implementation which
visits Yaml records and serializes binary PDB type records can
use the field in the Yaml as the source of the switch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280307 91177308-0d34-0410-b5e6-96231b3b80d8
We were kind of hacking this together before by embedding the
ability to forward requests into the TypeDeserializer. When
we want to start adding more different kinds of visitor callback
interfaces though, this doesn't scale well and is very inflexible.
So introduce the notion of a pipeline, which itself implements
the TypeVisitorCallbacks interface, but which contains an internal
list of other callbacks to invoke in sequence.
Also update the existing uses of CVTypeVisitor to use this new
pipeline class for deserializing records before visiting them
with another visitor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280293 91177308-0d34-0410-b5e6-96231b3b80d8
Until now, our use case for the visitor has been to take a stream of bytes
representing a type stream, deserialize the records in sequence, and do
something with them, where "something" is determined by how the user
implements a particular set of callbacks on an abstract class.
For actually writing PDBs, however, we want to do the reverse. We have
some kind of description of the list of records in their in-memory format,
and we want to process each one. Perhaps by serializing them to a byte
stream, or perhaps by converting them from one description format (Yaml)
to another (in-memory representation).
This was difficult in the current model because deserialization and
invoking the callbacks were tightly coupled.
With this patch we change this so that TypeDeserializer is itself an
implementation of the particular set of callbacks. This decouples
deserialization from the iteration over a list of records and invocation
of the callbacks. TypeDeserializer is initialized with another
implementation of the callback interface, so that upon deserialization it
can pass the deserialized record through to the next set of callbacks. In
a sense this is like an implementation of the Decorator design pattern,
where the Deserializer is a decorator.
This will be useful for writing Pdbs from yaml, where we have a
description of the type records in Yaml format. In this case, the visitor
implementation would have each visitation callback method implemented in
such a way as to extract the proper set of fields from the Yaml, and it
could maintain state that builds up a list of these records. Finally at
the end we can pass this information through to another set of callbacks
which serializes them into a byte stream.
Reviewed By: majnemer, ruiu, rnk
Differential Revision: https://reviews.llvm.org/D23177
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277871 91177308-0d34-0410-b5e6-96231b3b80d8
pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit
without calling DbiStreamBuilder::finalize. Because `finalize` initializes
`Header` member, `Header` remained nullptr which caused a crash bug.
Differential Revision: https://reviews.llvm.org/D23143
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277681 91177308-0d34-0410-b5e6-96231b3b80d8
MappedBlockSTream can work with any sequence of block data where
the ordering is specified by a list of block numbers. So rather
than manually stitch them together in the case of the FPM, reuse
this functionality so that we can treat the FPM as if it were
contiguous.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23066
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277609 91177308-0d34-0410-b5e6-96231b3b80d8
The FPM is split at regular intervals across the MSF file, as the MS code
suggests. It turns out that the value of the interval is precisely the
block size. If the block size is 4096, then there are two Fpm pages every
4096 blocks.
So here we teach the PDBFile class to parse a split FPM, and also add more
options when dumping the FPM to display some additional information such
as orphaned pages (pages which the FPM says are allocated, but which
nothing appears to use), use after free pages (pages which the FPM says
are not allocated, but which are referenced by a stream), and multiple use
pages (pages which the FPM says are allocated but are used more than
once).
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D23022
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277388 91177308-0d34-0410-b5e6-96231b3b80d8
Previously this change was submitted from a Windows machine, so
changes made to the case of filenames and directory names did
not survive the commit, and as a result the CMake source file
names and the on-disk file names did not match on case-sensitive
file systems.
I'm resubmitting this patch from a Linux system, which hopefully
allows the case changes to make it through unfettered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277213 91177308-0d34-0410-b5e6-96231b3b80d8
In a previous patch, it was suggested to use all caps instead of
rolling caps for initialisms, so this patch changes everything
to do this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277190 91177308-0d34-0410-b5e6-96231b3b80d8
This was a pure virtual base class whose purpose was to abstract
away the notion of how you retrieve the layout of a discontiguous
stream of blocks in an Msf file. This led to too many layers of
abstraction making it difficult to figure out what was going on
and extend things. Ultimately, a stream's layout is decided by
its length and the array of block numbers that it lives on. So
rather than have an abstract base class which can return this in
any number of ways, it's more straightforward to simply store them
as fields of a trivial struct, and also to give a more appropriate
name.
This patch does that. It renames IMsfStreamData to MsfStreamLayout,
and deletes the 2 concrete implementations, DirectoryStreamData
and IndexedStreamData. MsfStreamLayout is a trivial struct
with the necessary data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277018 91177308-0d34-0410-b5e6-96231b3b80d8
Previously it was storing all the fields of an msf::Layout as
separate members. This is a trivial cleanup to make it store
an msf::Layout directly. This makes the code more readable
since it becomes clear which fields of PDBFile are actually the
msf specific layout information in a sea of other bookkeeping
fields.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276460 91177308-0d34-0410-b5e6-96231b3b80d8
This makes it easier to have the writable and readable PDB
interfaces share code since the read/write and write-only
interfaces now share a single allocator, you don't have to worry
about a builder building a read only interface and then having
the read-only interface's data become corrupt when the builder
goes out of scope. Now the allocator is specified explicitly
to all constructors, so all interfaces can share a single allocator
that is scoped appropriately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276459 91177308-0d34-0410-b5e6-96231b3b80d8
This provides a better layering of responsibilities among different
aspects of PDB writing code. Some of the MSF related code was
contained in CodeView, and some was in PDB prior to this. Further,
we were often saying PDB when we meant MSF, and the two are
actually independent of each other since in theory you can have
other types of data besides PDB data in an MSF. So, this patch
separates the MSF specific code into its own library, with no
dependencies on anything else, and DebugInfoCodeView and
DebugInfoPDB take dependencies on DebugInfoMsf.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276458 91177308-0d34-0410-b5e6-96231b3b80d8
This facilitates code reuse between the builder classes and the
"frozen" read only versions of the classes used for parsing
existing PDB files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276427 91177308-0d34-0410-b5e6-96231b3b80d8
This implements support for writing compiland and compiland source
file info to a binary PDB. This is tested by adding support for
dumping these fields from an existing PDB to yaml, reading them
back in, and dumping them again and verifying the values are as
expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276426 91177308-0d34-0410-b5e6-96231b3b80d8
Block 1 and 2 of an MSF file are bit vectors that represent the
list of blocks allocated and free in the file. We had been using
these blocks to write stream data and other data, so we mark them
as the free page map now. We don't yet serialize these pages to
the disk, but at least we make a note of what it is, and avoid
writing random data to them.
Doing this also necessitated cleaning up some of the tests to be
more general and hardcode fewer values, which is nice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275629 91177308-0d34-0410-b5e6-96231b3b80d8