Commit Graph

146 Commits

Author SHA1 Message Date
Rafael Espindola
35fe018057 keep only the StringRef version of getFileOrSTDIN.
llvm-svn: 184826
2013-06-25 05:28:34 +00:00
Bill Wendling
49ef14ef73 Use pointers to the MCAsmInfo and MCRegInfo.
Someone may want to do something crazy, like replace these objects if they
change or something.

No functionality change intended.

llvm-svn: 184175
2013-06-18 07:20:20 +00:00
Rui Ueyama
9fac39d7e4 readobj: Dump PE/COFF optional records.
These records are mandatory for executables and are used by the loader.

Reviewers: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D939

llvm-svn: 183852
2013-06-12 19:10:33 +00:00
Kevin Enderby
644e3fc29e Teach llvm-objdump with the -macho parser how to use the data in code table
from the LC_DATA_IN_CODE load command.  And when disassembling print
the data in code formatted for the kind of data it and not disassemble those
bytes.

I added the format specific functionality to the derived class MachOObjectFile
since these tables only appears in Mach-O object files. This is my first
attempt to modify the libObject stuff so if folks have better suggestions
how to fit this in or suggestions on the implementation please let me know.

rdar://11791371

llvm-svn: 183424
2013-06-06 17:20:50 +00:00
Rafael Espindola
6f41305fd8 Handle relocations that don't point to symbols.
In ELF (as in MachO), not all relocations point to symbols. Represent this
properly by using a symbol_iterator instead of a SymbolRef. Update llvm-readobj
ELF's dumper to handle relocatios without symbols.

llvm-svn: 183284
2013-06-05 01:33:53 +00:00
NAKAMURA Takumi
f9fd8805ba llvm-objdump.cpp: Appease MSC16 x64. utostr(n++) causes internal compiler error.
llvm-svn: 182722
2013-05-27 00:02:48 +00:00
Michael J. Spencer
c195b8a813 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Ahmed Bougacha
eedbfb8aab MC: Disassembled CFG reconstruction.
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
  contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
  backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
  disassembler. It first builds an atom for each section. It can also
  construct the CFG, and this splits the text atoms into basic blocks.

MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.

In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).

This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.

Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.

llvm-svn: 182628
2013-05-24 01:07:04 +00:00
Ahmed Bougacha
6979d48fa4 Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
  in the X86 and ARM disassemblers to symbolize immediate operands and
  to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
  translate relocations (either object::RelocationRef, or disassembler
  C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
  finds in an object::ObjectFile. This makes simple symbolization (with
  no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
  support the C API VariantKinds.

Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations:       call _foo-_bar; call _foo-4
- __cf?string:       leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).

As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should have already have
most of what is needed to know what to symbolize, so this can
definitely be improved.

I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).

llvm-svn: 182625
2013-05-24 00:39:57 +00:00
Ahmed Bougacha
92a36ededf llvm-objdump: Initialize MCDisassembler once instead of for each section.
llvm-svn: 182054
2013-05-16 21:28:23 +00:00
Rafael Espindola
237980d752 Remove the MachineMove class.
It was just a less powerful and more confusing version of
MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
dwarf register numbers, calls to getDwarfRegNum are pushed out, which
should allow further simplifications.

I left the MachineModuleInfo::addFrameMove interface unchanged since
this patch was already fairly big.

llvm-svn: 181680
2013-05-13 01:16:13 +00:00
Rafael Espindola
8d7780bd2b Introduce convenience typedefs for the 4 ELF object types.
llvm-svn: 181509
2013-05-09 13:13:28 +00:00
Rafael Espindola
9541db893e Clarify getRelocationAddress x getRelocationOffset a bit.
getRelocationAddress is for dynamic libraries and executables,
getRelocationOffset for relocatable objects.

Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a
test of ELF's. llvm-readobj -r now prints the same values as readelf -r.

llvm-svn: 180259
2013-04-25 12:28:45 +00:00
Rafael Espindola
f56c9bfd51 Don't read one command past the end.
Thanks to Evgeniy Stepanov for reporting this.

It might be a good idea to add a command iterator abstraction to MachO.h, but
this fixes the bug for now.

llvm-svn: 179848
2013-04-19 11:36:47 +00:00
Rafael Espindola
c44b97c596 At Jim Grosbach's request detemplate Object/MachO.h.
We are still able to handle mixed endian objects by swapping one struct at a
time.

llvm-svn: 179778
2013-04-18 18:08:55 +00:00
Alexey Samsonov
8fcb809efb llvm-objdump: Don't print contents of BSS sections: it makes no sense and crashes llvm-objdump on relocated objects with large bss
llvm-svn: 179589
2013-04-16 10:53:11 +00:00
Rafael Espindola
7beab5afc5 Finish templating MachObjectFile over endianness.
We are now able to handle big endian macho files in llvm-readobject. Thanks to
David Fang for providing the object files.

llvm-svn: 179440
2013-04-13 01:45:40 +00:00
Rafael Espindola
09d9620af5 Simplify the code. No functionality change.
llvm-svn: 179259
2013-04-11 03:34:37 +00:00
Rafael Espindola
636657c964 Template the MachO types over endianness.
For now they are still only used as little endian.

llvm-svn: 179147
2013-04-10 03:48:25 +00:00
Rafael Espindola
8c4963712f Convert MachOObjectFile to a template.
For now it is templated only on being 64 or 32 bits. I will add little/big
endian next.

llvm-svn: 179097
2013-04-09 14:49:08 +00:00
Rafael Espindola
9e1ac1d494 Implement MachOObjectFile::getHeader directly.
llvm-svn: 178994
2013-04-07 19:26:57 +00:00
Rafael Espindola
583b0c9980 Remove LoadCommandInfo now that we always have a pointer to the command.
LoadCommandInfo was needed to keep a command and its offset in the file. Now
that we always have a pointer to the command, we don't need the offset.

llvm-svn: 178991
2013-04-07 18:42:06 +00:00
Rafael Espindola
d0f09e97b6 Add MachOObjectFile::LoadCommandInfo.
This avoids using MachOObject::getLoadCommandInfo.

llvm-svn: 178990
2013-04-07 18:08:12 +00:00
Rafael Espindola
fcd60f68e0 Remove MachOObjectFile::getObject.
llvm-svn: 178986
2013-04-07 16:07:35 +00:00
Rafael Espindola
ac8e6c6147 Make getObject const. Remove a const_cast.
llvm-svn: 178980
2013-04-07 14:50:40 +00:00
Rafael Espindola
c5166cdd27 Remove last use of InMemoryStruct in llvm-objdump.
llvm-svn: 178979
2013-04-07 14:40:18 +00:00
Rafael Espindola
80ee9cc2a6 Remove dead code.
llvm-svn: 178977
2013-04-07 14:30:21 +00:00
Rafael Espindola
f2a182a43c Remove unused argument.
llvm-svn: 178976
2013-04-07 14:25:39 +00:00
Rafael Espindola
79787bd764 Don't fetch pointers from a InMemoryStruct.
InMemoryStruct is extremely dangerous as it returns data from an internal
buffer when the endiannes doesn't match. This should fix the tests on big
endian hosts.

llvm-svn: 178875
2013-04-05 15:15:22 +00:00
Eric Christopher
4dc4bfd311 Don't disassemble symbols with an unknown address or size.
Patch by Nico Rieck!

llvm-svn: 178678
2013-04-03 18:31:23 +00:00
Shankar Easwaran
6818273b65 print TLS segment
llvm-svn: 176192
2013-02-27 17:57:17 +00:00
Michael J. Spencer
25f5a420ec [objdump] Add PT_PHDR.
llvm-svn: 175709
2013-02-21 02:21:29 +00:00
Michael J. Spencer
95d35d8c5e [objdump] Print the PT_INTERP and PT_DYNAMIC correcctly.
llvm-svn: 175659
2013-02-20 20:18:10 +00:00
Guy Benyei
92dac48079 Add static cast to unsigned char whenever a character classification function is called with a signed char argument, in order to avoid assertions in Windows Debug configuration.
llvm-svn: 175006
2013-02-12 21:21:59 +00:00
Michael J. Spencer
09867f6b42 [objdump,readobj] Document the purpose and goals of each tool.
llvm-svn: 174439
2013-02-05 20:27:22 +00:00
Jakub Staszak
280583f1aa Remove unneeded #include.
llvm-svn: 173088
2013-01-21 21:02:47 +00:00
Chandler Carruth
7c0d682f47 Sort all of the includes. Several files got checked in with mis-sorted
includes.

llvm-svn: 172891
2013-01-19 08:03:47 +00:00
Michael J. Spencer
265466c97e [Object][ELF] Simplify ELFObjectFile by using ELFType.
This simplifies the usage and implementation of ELFObjectFile by using ELFType
to replace:

<endianness target_endianness, std::size_t max_alignment, bool is64Bits>

This does complicate the base ELF types as they must now use template template
parameters to partially specialize for the 32 and 64bit cases. However these
are only defined once.

llvm-svn: 172515
2013-01-15 07:44:25 +00:00
Michael J. Spencer
6a1aee19b4 [llvm-objdump] Emit addresses with the correct number of leading 0's.
llvm-svn: 172130
2013-01-10 22:40:50 +00:00
Michael J. Spencer
00f3025f6f [objdump] Use correct format specifiers and fix C++03 variadic warning.
llvm-svn: 171651
2013-01-06 05:23:59 +00:00
Michael J. Spencer
b8ff7ef369 [objdump] Add --private-headers, -p.
This currently prints the ELF program headers.

llvm-svn: 171649
2013-01-06 03:56:49 +00:00
Chandler Carruth
03fda8f594 Sort a few more #include lines in tools/... unittests/... and utils/...
llvm-svn: 171363
2013-01-02 10:26:28 +00:00
Rafael Espindola
cca2985848 Add a function to get the segment name of a section.
On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one
anonymous, segment.

This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be inform
the linker with segment this section should go to.

The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.

The main difference from the previous patch is that it doesn't use
InMemoryStruct. It is extremely dangerous: if the endians match it returns
a pointer to the file buffer, if not, it returns a pointer to an internal buffer
that is overwritten in the next API call.

We should change all of this code to use
support::detail::packed_endian_specific_integral like ELF, but since these
functions only handle strings, they work with big and little endian machines
as is.

I have tested this by installing ubuntu 12.10 ppc on qemu, that is why it took
so long :-)

llvm-svn: 170838
2012-12-21 03:47:03 +00:00
Rafael Espindola
e919c7cf9e Revert 170545 while I debug the ppc failures.
llvm-svn: 170547
2012-12-19 14:48:05 +00:00
Rafael Espindola
f95d54b477 Add r170095 back.
I cannot reproduce it the failures locally, so I will keep an eye at the ppc
bots. This patch does add the change to the "Disassembly of section" message,
but that is not what was failing on the bots.

Original message:

Add a funciton to get the segment name of a section.

On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one
anonymous, segment.

This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be infor
the linker with segment this section should go to.

The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.

llvm-svn: 170545
2012-12-19 14:15:04 +00:00
Eric Christopher
f555863a16 Revert "Add a funciton to get the segment name of a section."
This reverts commit r170095 since it appears to be breaking the bots.

llvm-svn: 170105
2012-12-13 06:36:18 +00:00
Rafael Espindola
7eb5b9d2f9 Add a funciton to get the segment name of a section.
On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one,
anonymous, segment.

This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be informing
the linker with segment this section should go to.

The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.

llvm-svn: 170095
2012-12-13 04:07:18 +00:00
Michael J. Spencer
e054e34b38 Quick build fix for c++03 clang. This needs a proper solution. Note that these offsets are guaranteed to be correct by Endian.h.
llvm-svn: 169438
2012-12-05 22:38:01 +00:00
Michael J. Spencer
938a1f875b Add dump of Win64 EH unwind data.
The new command line option -unwind-info dumps the Win64 EH unwind
data to the console. This is a nice feature if you need to debug
generated EH data (e.g. from LLVM). Includes a test case.

Initial patch by João Matos, extensions and rework by Kai Nacke.

llvm-svn: 169415
2012-12-05 20:12:35 +00:00
Chandler Carruth
800daa7d3d Sort the #include lines for tools/...
Again, tools are trickier to pick the main module header for than
library source files. I've started to follow the pattern of using
LLVMContext.h when it is included as a stub for program source files.

llvm-svn: 169252
2012-12-04 10:44:52 +00:00