This is useful when you want to look at a specific chunk of a
stream or look for discontinuities, and you need to know the
list of blocks occupied by a stream.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306150 91177308-0d34-0410-b5e6-96231b3b80d8
This patch dumps the raw bytes of the pdb name map which contains
the mapping of stream name to stream index for the string table
and other reserved streams.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306148 91177308-0d34-0410-b5e6-96231b3b80d8
Normally we can only make sense of the content of a PDB in terms
of streams and blocks, but in some cases it may be useful to dump
bytes at a specific absolute file offset. For example, if you
know that some interesting data is at a particular location and
you want to see some surrounding data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306146 91177308-0d34-0410-b5e6-96231b3b80d8
The goal here is to make it possible to display absolute
file offsets when dumping byets from an MSF. The problem is
that when dumping bytes from an MSF, often the bytes will
cross a block boundary and encounter a discontinuity. We
can't use the normal formatBinary() function for this because
this would just treat the sequence as entirely ascending, and
not account out-of-order blocks.
This patch adds a formatMsfData() function to our printer, and
then uses this function to improve the output of the -stream-data
command line option for dumping bytes from a particular stream.
Test coverage is also expanded to make sure to include all possible
scenarios of offsets, sizes, and crossing block boundaries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306141 91177308-0d34-0410-b5e6-96231b3b80d8
This idea originally came about when I was doing some deep
investigation of why certain bytes in a PDB that we round-tripped
differed from their original bytes in the source PDB. I found
myself having to hack up the code in many places to dump the
bytes of this substream, or that record. It would be nice if
we could just do this for every possible stream, substream,
debug chunk type, etc.
It doesn't make sense to put this under dump because there's just
so many options that would detract from the more common use case
of just dumping deserialized records. So making a new subcommand
seems like the most logical course of action. In doing so, we
already have two command line options that are suitable for this
new subcommand, so start out by moving them there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306056 91177308-0d34-0410-b5e6-96231b3b80d8
Now you run llvm-pdbutil dump <options>. This is a followup
after having renamed the tool, whereas before raw was obviously
just the style of dumping, whereas now "dump" is the action to
perform with the "util".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306055 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This fixes a bug where we always treat APSInts in Codeview as
signed when writing them to YAML. One symptom of this problem is that
llvm-pdbdump raw would show Enumerator Values that differ between the
original PDB and a PDB that has been round-tripped through YAML.
Reviewers: zturner
Reviewed By: zturner
Subscribers: llvm-commits, fhahn
Differential Revision: https://reviews.llvm.org/D34013
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305965 91177308-0d34-0410-b5e6-96231b3b80d8
We forgot to serialize these because llvm-readobj didn't dump them. They
are typically all zeros in an object file. The linker fills them in with
relocations before adding them to the PDB. Now we can properly round
trip these symbols through pdb2yaml -> yaml2pdb.
I made these fields optional with a zero default so that we can elide
them from our test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305857 91177308-0d34-0410-b5e6-96231b3b80d8
This was regressed in a previous patch that re-wrote the dumper,
and I'm incrementally adding back the pieces that are missing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305524 91177308-0d34-0410-b5e6-96231b3b80d8
This resubmits commit c0c249e9f2ef83e1d1e5f166b50673d92f3579d7.
It was broken due to some weird template issues, which have
since been fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305517 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 83ea17ebf2106859a51fbc2a86031b44d33696ad.
This is failing due to some strange template problems, so reverting
until it can be straightened out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305505 91177308-0d34-0410-b5e6-96231b3b80d8
After some internal discussions, we agreed that the raw output style had
outlived its usefulness. It was originally created before we had even
thought of dumping to YAML, and it was intended to give us some insight
into the internals of a PDB file. Now we have YAML mode which does
almost exactly this but is more powerful in that it can round-trip back
to a PDB, which the raw mode could not do. So the raw mode had become
purely a maintenance burden.
One option was to just delete it. However, its original goal was to be
as readable as possible while staying close to the "metal" - i.e.
presenting the output in a way that maps directly to the underlying file
format. We don't actually need that last requirement anymore since it's
covered by the yaml mode, so we could repurpose "raw" mode to actually
just be as readable as possible.
This patch implements about 80% of the functionality previously in raw
mode, but in a completely different style that is more akin to what
cvdump outputs. Records are very compressed, often times appearing on
just one line. One nice thing about this is that it makes full record
matching easier, because you can grep for indices, names, and leaf types
on a single line often.
See the tests for some examples of what the new output looks like.
Note that this patch actually regresses the functionality of raw mode in
a few areas, but only because the patch was already unreasonably large
and going 100% would have been even worse. Specifically, this patch is
missing:
The ability to dump module debug subsections (checksums, lines, etc)
The ability to dump section headers
Aside from that everything is here. While goign through the tests fixing
them all up, I found many duplicate tests. They've been deleted. In
subsequent patches I will go through and re-add the missing
functionality.
Differential Revision: https://reviews.llvm.org/D34191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305495 91177308-0d34-0410-b5e6-96231b3b80d8
When we get an unknown symbol type, we might as well at least
dump it. Same goes for round-tripping through YAML, we can
dump the record contents as raw bytes even if we don't know
how to interpret it semantically.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305248 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.
Fixes PR33335.
Reviewers: zturner, hiraditya!
Reviewed By: zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33968
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305043 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for Symbols, StringTable, and FrameData subsection
types. Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB. The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them. And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time. Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305037 91177308-0d34-0410-b5e6-96231b3b80d8
This is the same change for the YAML Output style applied to the
raw output style. Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order. This was because some subsections need to be
read first in order to properly dump later subsections. This
patch allows them to be dumped in the order they appear.
Differential Revision: https://reviews.llvm.org/D34015
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305034 91177308-0d34-0410-b5e6-96231b3b80d8
The pdb2yaml and raw subcommands did something very
similar but with a different output format, and they
used a lot of the same command line options, but each
one re-implemented the command line option with slightly
different spellings / options. This patch merges them
together into a single definition which is shared by
both subcommands. This new syntax also allows for more
flexibility in the way debug subsections are dumped.
Differential Revision: https://reviews.llvm.org/D33996
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305032 91177308-0d34-0410-b5e6-96231b3b80d8
While it's not entirely clear why a compiler or linker might
put this information into an object or PDB file, one has been
spotted in the wild which was causing llvm-pdbdump to crash.
This patch adds support for reading-writing these sections.
Since I don't know how to get one of the native tools to
generate this kind of debug info, the only test here is one
in which we feed YAML into the tool to produce a PDB and
then spit out YAML from the resulting PDB and make sure that
it matches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304738 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way. This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.
Differential Revision: https://reviews.llvm.org/D33807
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304588 91177308-0d34-0410-b5e6-96231b3b80d8
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries. Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.
Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.
Differential Revision: https://reviews.llvm.org/D33785
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304484 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
DbiStreamBuilder calculated the offset of the source file names inside
the file info substream as the size of the file info substream minus
the size of the file names. Since the file info substream is padded to
a multiple of 4 bytes, this caused the first file name to be aligned
on a 4-byte boundary. By contrast, DbiModuleList would read the file
names immediately after the file name offset table, without skipping
to the next 4-byte boundary. This change makes it so that the file
names are written to the location where DbiModuleList expects them,
and puts any necessary padding for the file info substream after the
file names instead of before it.
Reviewers: amccarth, rnk, zturner
Reviewed By: amccarth, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33475
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303917 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Previously, the yaml2pdb subcommand of llvm-pdbdump only
included object file names in module info if a module info stream was
present. This change makes it so that we include the object file name
even if there is no module info stream for the module. As a result,
running
llvm-pdbdump pdb2yaml -dbi-module-info original.pdb > original.yaml &&
llvm-pdbdump yaml2pdb -pdb=new.pdb original.yaml && llvm-pdbdump
pdb2yaml -dbi-module-info new.pdb > new.yaml now produces identical
original.yaml and new.yaml files.
Reviewers: amccarth, zturner
Reviewed By: zturner
Subscribers: fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D33463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303891 91177308-0d34-0410-b5e6-96231b3b80d8
Previous algotirhm assumed that types and ids are in a single
unified stream. For inputs that come from object files, this
is the case. But if the input is already a PDB, or is the result
of a previous merge, then the types and ids will already have
been split up, in which case we need an algorithm that can
accept operate on independent streams of types and ids that
refer across stream boundaries to each other.
Differential Revision: https://reviews.llvm.org/D33417
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303577 91177308-0d34-0410-b5e6-96231b3b80d8
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows. After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling. Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303446 91177308-0d34-0410-b5e6-96231b3b80d8
This is a squash of ~5 reverts of, well, pretty much everything
I did today. Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures. I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!
At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303409 91177308-0d34-0410-b5e6-96231b3b80d8
Similar to my previous fix, it turns out llvm-pdbdump has been
printing an incorrect value since the beginning of time, but
we didn't know it was incorrect. Specifically, we were interpreting
a TypeIndex as referencing a type from the TPI stream when it
actually should come from the IPI stream. So we were printing a
string that looked like a valid string, but was just from the
wrong place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303403 91177308-0d34-0410-b5e6-96231b3b80d8
Merging PDBs is a feature that will be used heavily by
the linker. The functionality already exists but does not
have deep test coverage because it's not easily exposed through
any tools. This patch aims to address that by adding the
ability to merge PDBs via llvm-pdbdump. It takes arbitrarily
many PDBs and outputs a single PDB.
Using this new functionality, a test is added for merging
type records. Future patches will add the ability to merge
symbol records, module information, etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303389 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we wrote line information and file checksum
information, but we did not write information about inlinee
lines and functions. This patch adds support for that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301936 91177308-0d34-0410-b5e6-96231b3b80d8
There is a lot of duplicate code for printing line info between
YAML and the raw output printer. This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301728 91177308-0d34-0410-b5e6-96231b3b80d8
The previous algorithm processed one character at a time, which is very
painful on a modern CPU. Replace it with xxHash64, which both already
exists in the codebase and is fairly fast.
Patch from Scott Smith!
Differential Revision: https://reviews.llvm.org/D32509
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301487 91177308-0d34-0410-b5e6-96231b3b80d8
We were already parsing and dumping this to the human readable
format, but not to the YAML format. This does so, in preparation
for reading it in and reconstructing the line information from
YAML.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301357 91177308-0d34-0410-b5e6-96231b3b80d8
Previously the dumping of class definitions was very primitive,
and it made it hard to do more than the most trivial of output
formats when dumping. As such, we would only dump one line for
each field, and then dump non-layout items like nested types
and enums.
With this patch, we do a complete analysis of the object
hierarchy including aggregate types, bases, virtual bases,
vftable analysis, etc. The only immediately visible effects
of this are that a) we can now dump a line for the vfptr where
before we would treat that as padding, and b) we now don't
treat virtual bases that come at the end of a class as padding
since we have a more detailed analysis of the class's storage
usage.
In subsequent patches, we should be able to use this analysis
to display a complete graphical view of a class's layout including
recursing arbitrarily deep into an object's base class / aggregate
member hierarchy.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300133 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This lets PDB readers lookup type record data by type index in O(log n)
time. It also enables makes `cvdump -t` work on PDBs produced by LLD.
cvdump will not dump a PDB that doesn't have an index-to-offset table.
The table is sorted by type index, and has an entry every 8KB. Looking
up a type record by index is a binary search of this table, followed by
a scan of at most 8KB.
Reviewers: ruiu, zturner, inglorion
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31636
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299958 91177308-0d34-0410-b5e6-96231b3b80d8
When dumping classes, show where padding occurs, and at the end of the
class print statistics about how many bytes total of padding exist in a
class.
Since PDB doesn't specifically contain information about padding, we have
to mimic this by sort of reversing a small portion of the record layout
algorithm (e.g. looking at offsets and sizes and trying to determine
whether something is part of the same field or a new field).
Differential Revision: https://reviews.llvm.org/D31800
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299869 91177308-0d34-0410-b5e6-96231b3b80d8
* Adds support for pointers to arrays, which was missing
* Adds some tests
* Improves consistency of const and volatile qualifiers
* Eliminates non-composable special case code for arrays and function by using
a more general recursive approach
* Has a hack for getting the calling convention into the right spot for
pointer-to-functions
Given the rapid changes happenning in llvm-pdbdump, this may be difficult to
merge.
Differential Revision: https://reviews.llvm.org/D31832
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299848 91177308-0d34-0410-b5e6-96231b3b80d8
Previously when dumping class definitions, there were only
two modes - on or off. But it's useful to sometimes get a
little more fine-grained. For example, you might only want
to see the record layout (for example to look for extraneous
padding). This patch adds a third mode, layout mode, which
does exactly that. Only this-relative data members are
displayed in this mode.
Differential Revision: https://reviews.llvm.org/D31794
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299733 91177308-0d34-0410-b5e6-96231b3b80d8
This should work on all platforms now that r299006 has landed. Tested locally
on Windows and Linux.
This moves exe symbol-specific method implementations out of NativeRawSymbol
into a concrete subclass. Also adds implementations for hasCTypes and
hasPrivateSymbols and a simple test to ensure the native reader can access the
summary information for the executable from the PDB.
Original Differential Revision: https://reviews.llvm.org/D31059
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299019 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When dumping these records from an object file section, we should use
only one type database. However, when dumping from a PDB, we should use
two: one for the type stream and one for the IPI stream.
Certain type records that normally live in the .debug$T object file
section get moved over to the IPI stream of the PDB file and they get
new indices.
So far, I've noticed that the MSVC linker always moves these records
into IPI:
- LF_FUNC_ID
- LF_MFUNC_ID
- LF_STRING_ID
- LF_SUBSTR_LIST
- LF_BUILDINFO
- LF_UDT_MOD_SRC_LINE
These records have index fields that can point into TPI or IPI. In
particular, LF_SUBSTR_LIST and LF_BUILDINFO point to LF_STRING_ID
records to describe compilation command lines.
I've modified the dumper to have an optional pointer to the item DB, and
to do type name lookup of these fields in that DB. See printItemIndex.
The result is that our pdbdump-headers.test is more faithful to the PDB
contents and the output is less confusing.
Reviewers: ruiu
Subscribers: amccarth, zturner, llvm-commits
Differential Revision: https://reviews.llvm.org/D31309
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298649 91177308-0d34-0410-b5e6-96231b3b80d8
Reverting until I can figure out the root cause.
Revert "Re-land: Make NativeExeSymbol a concrete subclass of NativeRawSymbol [PDB]"
This reverts commit f461a70cc376f0f91c8b4917be79479cc86330a5.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298626 91177308-0d34-0410-b5e6-96231b3b80d8
Use the -color-output option explicitly to eliminate the ANSI color codes in
pdb-native-summary.test. (The default should have done this.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298625 91177308-0d34-0410-b5e6-96231b3b80d8
The new test should pass on all platforms now that llvm-pdbdump has the
`-color-output` option.
This moves exe symbol-specific method implementations out of NativeRawSymbol
into a concrete subclass. Also adds implementations for hasCTypes and
hasPrivateSymbols and a simple test to ensure the native reader can access
the summary information for the executable from the PDB.
Original Differential Revision: https://reviews.llvm.org/D31059
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298623 91177308-0d34-0410-b5e6-96231b3b80d8