While using dwarf5 `.debug_names` in internal large targets, we noticed
a performance issue (around 10 seconds delay) while `lldb-vscode` tries
to show `scopes` for a compile unit. Profiling shows the bottleneck is
inside `DebugNamesDWARFIndex::GetGlobalVariables` which linearly search
all index entries belongs to a compile unit.
This patch improves the performance by using the compile units list to
filter first before checking index entries. This significantly improves
the performance (drops from 10 seconds => under 1 second) in the split
dwarf situation because each compile unit has its own named index.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
This patch changes the way plugin objects used with Scripted Interfaces
are created.
Instead of implementing a different SWIG method to create the object for
every scripted interface, this patch makes the creation more generic by
re-using some of the ScriptedPythonInterface templated Dispatch code.
This patch also improves error handling of the object creation by
returning an `llvm::Expected`.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Fixes#68987
Early on we load the interpreter (most commonly ld-linux) in
LoadInterpreterModule. Then later when we get the first DYLD rendezvous
we get a list of libraries that commonly includes ld-linux again.
Previously we would load this duplicate, see that it was a duplicate,
and unload it.
Problem was that this unloaded the section information of the first copy
of ld-linux. On platforms where you can place a breakpoint using only an
address, this wasn't an issue.
On ARM you have ARM and Thumb modes. We must know which one the section
we're breaking in is, otherwise we'll go there in the wrong mode and
SIGILL. This happened on ARM when lldb tried to call mmap during
expression evaluation.
To fix this, I am making the assumption that the base address we see in
the module prior to loading can be compared with what we know the
interpreter base address is. Then we don't have to load the module to
know we can ignore it.
This fixes the lldb test suite on Ubuntu versions where
https://bugs.launchpad.net/ubuntu/+source/gdb/+bug/1927192 has been
fixed. Which was recently done on Jammy.
FEAT_SME_FA64 (smefa64 in Linux cpuinfo) allows the use of the full A64
instruction set while in streaming SVE mode.
See https://developer.arm.com/documentation/ddi0616/latest/ for details.
This means for example if we want to write to the ffr register during or
use floating point registers while in streaming mode, we need this
extension.
I initially was using QEMU which has it by default, and switched to
Arm's FVP which does not. So this change adds a more strict check and
converts most of the tests to use that. It would be possible in some
cases to avoid the offending instructions but it would be a lot of
effort and liable to fail randomly as the C library changes.
It is also my assumption that the majority of systems will have smefa64
as QEMU has chosen to have. If I turn out to be wrong, we can make the
effort to get the tests working without smefa64.
`isAArch64SME` remains for some tests, which are as follows:
* `test_aarch64_dynamic_regset_config` merely checks for the presence of
a register set, which appears for any SME system not just one with
smefa64.
* `test_aarch64_dynamic_regset_config_sme_za_disabled` only needs the ZA
register and does not enter streaming mode.
* `test_sme_not_present` tests for the absence of the SME register set,
so must be skipped if any form of SME is present.
* Various tests in `TestSVERegisters.py` need to know if SME is present
at all to generate an expected SVCR value. Earlier in the callstack
something else checked `isAArch64SMEFA64` already.
* `TestAArch64LinuxTLSRegisters.py` needs to test the `tpidr2` register
if any form of SME is present. msr/mrs instructions are used to do this
and are allowed even if smefa64 is not present.
This register reports the configuration of the AArch64 Linux tagged
address ABI, part of which is the memory tagging (MTE) settings.
It will always be present in core files because even without MTE, there
are parts of the tagged address ABI that can be configured (these parts
use the Top Byte Ignore feature).
I missed adding this when I previously worked on MTE support. Until now
you could read memory tags from a core file but not this register.
This reverts commit 8d80a452b8.
The pointer to the invalidates lists needs to be non-const. Though in this case
I don't think it's ever modified.
Also I realised that the invalidate list was being set on svg not vg.
Should be the other way around.
This fixes a bug where writing vg during streaming mode
could prevent you reading za directly afterwards.
vg is invalidated just prior to us reading it in AArch64Reconfigure,
but svg was not. This lead to some situations where vg would be
updated or cleared and re-read, but svg would not be.
This meant it had some undefined value which lead to errors
that prevented us reading ZA. Likely we received a lot more
data than we were expecting.
There are at least 2 ways to get into this situation:
* Explicit write by the user to vg.
* We have just stopped and need to get the potentially new svg and vg.
The first is handled by invalidating svg client side before fetching the
new one. This also
covers some but not all of the second scenario. For the second, I've
made writes to vg
invalidate svg by noting this in the register information.
Whichever one of those kicks in, we'll get the latest value of svg.
The bug may depend on timing, I could not find a consistent way
to trigger it. I originally found it when checking whether za
is disabled after a vg change, so I've added checks for that
to TestZAThreadedDynamic.
The SVE VG version of the bug did show up on the buildbot,
but not consistently. So it's possible that TestZAThreadedDynamic
does in fact cover this, but I haven't run it enough times to know.
As ReadRegister always read into a uint64_t, when it called operator=
with uint64_t it was setting the RegisterValue's type to eTypeUInt64
regardless of its size.
This mostly works because most registers are 64 bit, and very few bits
of code rely on the type being correct. However, cpsr, fpsr and fpcr are
in fact 32 bit, and my upcoming register fields code relies on this type
being correct.
Which is how I found this bug and unfortunately is the only way to test
it. As RegisterValue::Type never makes it out via the API anywhere. So
this change will be tested once I start adding register field
information.
We've been using the backtick as our escape character, however that
leads to a weird experience on VS Code, because on most hosts, as soon
as you type the backtick on VS Code, the IDE will introduce another
backtick. As changing the default escape character might be out of
question because other plugins might rely on it, we can instead
introduce an option to change this variable upon lldb-vscode
initialization.
FWIW, my users will be using : instead ot the backtick.
When moving the GetTypeForDIE function from DWARFASTParserClang to
DWARFASTParser, I kept the signature as-is. To match the rest of the
function signatures in DWARFASTParser, remove the full name
(lldb_private::plugin::dwarf::DWARFDIE -> DWARFDIE) in the signature of
DWARFASTParser::GetTypeForDIE.
The "protected" was accidentally removed during refactoring of
SymbolFileDWARF. Reintroduce it and also make DWARFASTParser a friend
class since 040c4f4d98f3306e068521e3c218bdbc170f81f3 was merged and
won't build without it, as it dependeds on a method which was made
public by accident.
This is relanding of https://github.com/llvm/llvm-project/pull/69253.
`TestTemplatePackArgs.py` is passing now.
https://github.com/llvm/llvm-project/pull/68012/files added new data
formatters for LibStdC++ std::variant.
However, this formatter can crash if std::variant's index field has
invalid value (exceeds the number of template arguments).
This can happen if the current IP stops at a place std::variant is not
initialized yet.
This patch fixes the crash by ensuring the index is a valid value and
fix GetNthTemplateArgument() to make sure it is not crashing.
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
This patch updates the documentation to match recent changes and make it
more clear. More specifically, the process for installing sphinx has
changed with the transition to myst with the requirements.txt in
llvm/docs being the preferred method for installation now. In addition,
the docs-lldb-html target is never generated if swig isn't installed, so
having something expliti in the documentation section (even if it is
mentioned as a dependency of lldb itself above) probably doesn't hurt.
This means you don't have to do RegisterField("", 0, 0), you can do
RegisterField("", 0).
Which is useful for testing and even more useful when we are writing
definitions of real registers which have 10s of single bit fields.
This patch moves the template files for the various scripting
affordances to a separate directory.
This is a preparatory work for upcoming improvements and consolidations
to other scripting affordances.
Differential Revision: https://reviews.llvm.org/D159310
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
As we're consolidating and streamlining the various scripting
affordances of lldb, we keep creating new interface files.
This patch groups all the current interface files into a separate sub
directory called `Interfaces` both in the core `Interpreter` directory
and the `ScriptInterpreter` plugin directory.
Differential Revision: https://reviews.llvm.org/D158833
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
lldb/test/Shell/Breakpoint/breakpoint-command.test adds a python
command, to be executed when a breakpoint hits, that writes out a
number. It then runs, hits the breakpoint and checks that the number is
present exactly once.
The problem is that on some systems the test can be run in a filepath
that happens to contain the number (e.g. auto-generated directory
names). The number is then detected multiple times and the test fails.
This patch fixes the issue by using a string instead, particularly a
string with spaces, which is very unlikely to be auto-generated by any
system.
When the debug info refers to a dwo with relative `DW_AT_comp_dir` and
`DW_AT_dwo_name`, we only print the `DW_AT_comp_dir` in our error
message if we can't find it. This often isn't very helpful, especially
when the `DW_AT_comp_dir` is ".":
```
(lldb) fr v
error: unable to locate .dwo debug file "." for skeleton DIE 0x000000000000003c
```
I'm updating the error message to include both `DW_AT_comp_dir` (if it
exists) and `DW_AT_dwo_name` when the `DW_AT_dwo_name` is relative. The
behavior when `DW_AT_dwo_name` is absolute should be the same.
Before target.xml, lldb had its own method for querying the remote stub
of the registers it supports, qRegisterInfo. The gdb standard method of
using a target.xml file to describe the available registers has become
commonplace, and the lldb method for doing this is no longer needed.
Stubs should describe their registers to lldb, but it should be with the
target.xml file.
The underlying timezone classes are being reimplemented in Swift, and these
strings will be Swift strings, without the ObjC `@` prefix. Leaving off the `@`
makes these tests usable both before and after the reimplmentation of
Foundation in Swift.
There are some uses of "vscode" in
`lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py` where
I'm not sure if it's referring to the adapter or VS Code itself, so
those remain.
This adds a release note for all the SME support now in LLDB and a page
where I have documented the user experience (for want of a better term)
when using SVE and SME.
This includes things like which mode transitions can or cannot be
triggered from within LLDB. I hope this will serve to A: document what
I've implemented and B: be a user's guide to these extensions.
(though it is not a design document, read the commits and code for that
sort of detail)
The method DWARFDebugInfoEntry::Extract needs to skip over all the data
in the debug_info / debug_types section for each DIE. It had the logic
to do so hardcoded inside a loop, when it already exists in a neatly
isolated function.
CompileUnit::SetSupportFiles had two overloads, one that took and lvalue
reference and one that takes an rvalue reference. This removes both and
replaces it with an overload that takes the FileSpecList by value and
moves it into the member variable.
Because we're storing the value as a member, this covers both cases. If
the new FileSpecList was passed by lvalue reference, we'd copy it into
the member anyway. If it was passed as an rvalue reference, we'll have
created a new instance using its move and then immediately move it again
into our member. In either case the number of copies remains unchanged.
Rename lldb-vscode to lldb-dap. This change is largely mechanical. The
following substitutions cover the majority of the changes in this
commit:
s/VSCODE/DAP/
s/VSCode/DAP/
s/vscode/dap/
s/g_vsc/g_dap/
Discourse RFC:
https://discourse.llvm.org/t/rfc-rename-lldb-vscode-to-lldb-dap/74075/
LLVM detects when ncurses has a separate terminfo library, but linking to it
was broken in lldb since b66339575a (LLVM14)
due to a change of variables. This commit fixes that oversight.
https://github.com/llvm/llvm-project/pull/68012/files added new data
formatters for LibStdC++ std::variant.
However, this formatter can crash if std::variant's index field has
invalid value (exceeds the number of template arguments).
This can happen if the current IP stops at a place std::variant is not
initialized yet.
This patch fixes the crash by ensuring the index is a valid value.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
CompilerType constructors rely on the NDEBUG macro, so it's better to move them to their cpp file so that the header doesn't get confused when this macro is used differently for other compilation units.
This can be used to have VS Code debug various emulators, remote
systems, hardware probes, etc.
In my case I was doing this for the Gameboy Advance,
https://github.com/stuij/gba-llvm-devkit/blob/main/docs/Debugging.md#debugging-using-visual-studio-code.
It's not very complex if you know LLDB well, but when using another
plugin, CodeLLDB, I was very glad that they had an example for it. So we
should have one too.
This patch adds a `SBType::FindDirectNestedType(name)` function which performs a non-recursive search in given class for a type with specified name. The intent is to perform a fast search in debug info, so that it can be used in formatters, and let them remain responsive.
This is driven by my work on formatters for Clang and LLVM types. In particular, by [`PointerIntPairInfo::MaskAndShiftConstants`](cde9f9df79/llvm/include/llvm/ADT/PointerIntPair.h (L174C16-L174C16)), which is required to extract pointer and integer from `PointerIntPair`.
Related Discourse thread: https://discourse.llvm.org/t/traversing-member-types-of-a-type/72452
As a followup of https://github.com/llvm/llvm-project/pull/67851, I'm
defining a new namespace `lldb_plugin::dwarf` for the classes in this
Plugins/SymbolFile/DWARF folder. This change is very NFC and helped me
with exporting the necessary symbols for my out-of-tree language plugin.
The only class that I didn't change is ClangDWARFASTParser, because that
shouldn't be in the same namespace as the generic language-agnostic
dwarf parser.
It would be a good idea if other plugins follow the same namespace
scheme.
To get the number of children for a VectorType (i.e.,
a type declared with a `vector_size`/`ext_vector_type` attribute)
LLDB previously did following calculation:
1. Get byte-size of the vector container from Clang (`getTypeInfo`).
2. Get byte-size of the element type we want to interpret the array as.
(e.g., sometimes we want to interpret an `unsigned char vec[16]`
as a `float32[]`).
3. `numChildren = containerSize / reinterpretedElementSize`
However, for step 1, clang will return us the *aligned* container
byte-size.
So for a type such as `float __attribute__((ext_vector_type(3)))`
(which is an array of 3 4-byte floats), clang will round up the
byte-width of the array to `16`.
(see
[here](ab6a66dbec/clang/lib/AST/ASTContext.cpp (L1987-L1992)))
This means that for vectors where the size isn't a power-of-2, LLDB
will miscalculate the number of elements.
**Solution**
This patch changes step 1 such that we calculate the container size
as `numElementsInSource * byteSizeOfElement`.
The type formatter code is effectively considering empty strings as read
errors, which is wrong. The fix is very simple. We should rely on the
error object and stop checking the size. I also added a test.
The `po` alias now matches the behavior of the `expression` command when
the it can apply a Fix-It to an expression.
Modifications
- Add has `m_fixed_expression` to the `CommandObjectDWIMPrint` class a
`protected` member that stores the post Fix-It expression, just like the
`CommandObjectExpression` class.
- Converted messages to present tense.
- Add test cases that confirms a Fix-It for a C++ expression for both
`po` and `expressions`
rdar://115317419
Since D101206 (`ba79fb2e1ff7130cde02fbbd325f0f96f8a522ca`) the `__hash_node::__value_`
member is wrapped in an anonymous union. `ValueObject::GetChildMemberWithName` doesn't see
through the union.
This patch accounts for this possible new layout by getting a handle to
the union before doing the by-name `__value_` lookup.
Started failing since D101206, but root-cause is unclear. It's
definitely not an issue with th libc++ patch itself however. So disable
the test until we know what's going on.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an
enum. This patch replaces support::{big,little,native} with
llvm::endianness::{big,little,native}.
PR#66035 introduced a test failure that causes windows build bots to
fail. These unit tests shouldn't be running on Windows.
Summary:
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
lldb-vscode had installation instructions based on creating a folder
inside ~/.vscode/extensions, which no longer works. A different
installation mechanism is needed based on a VSCode command. More can be
read in the contents of this patch.
Closes https://github.com/llvm/llvm-project/issues/63655
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
support::endianness with llvm::endianness.
The `po` alias now matches the behavior of the `expression` command when
the it can apply a Fix-It to an expression.
Modifications
- Add has `m_fixed_expression` to the `CommandObjectDWIMPrint` class a
`protected` member that stores the post Fix-It expression, just like the
`CommandObjectExpression` class.
- Converted messages to present tense.
- Add test cases that confirms a Fix-It for a C++ expression for both
`po` and `expressions`
rdar://115317419
Co-authored-by: Pete Lawrence <plawrence@apple.com>
## Description
This pull request adds a new `stop-at-user-entry` option to LLDB
`process launch` command, allowing users to launch a process and pause
execution at the entry point of the program (for C-based languages,
`main` function).
## Motivation
This option provides a convenient way to begin debugging a program by
launching it and breaking at the desired entry point.
## Changes Made
- Added `stop-at-user-entry` option to `Options.td` and the
corresponding case in `CommandOptionsProcessLaunch.cpp` (short option is
'm')
- Implemented `GetUserEntryPointName` method in the Language plugins
available at the moment.
- Declared the `CreateBreakpointAtUserEntry` method in the Target API.
- Create Shell test for the command
`command-process-launch-user-entry.test`.
## Usage
`process launch --stop-at-user-entry` or `process launch -m` launches
the process and pauses execution at the entry point of the program.
There are only ever 2 FilterRules and their operations are either
"regex" or "match". This does not benefit from deduplication since the
strings have static lifetime and we can just compare StringRefs pointing
to them. This is also not on a fast path, so it doesn't really benefit
from the pointer comparisons of ConstStrings.
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
llvm::support::endianness with llvm::endianness.
Split out the assertions that fail on Windows in preparation to
XFAILing them.
Drive-by change:
* Add a missing `self.build()` call in `test_union_in_anon_namespace`
* Fix formatting
* Add expectedFailureWindows decorator
Bridge network means that you can get to any port on the VM,
from the host, which is great. However it is quite involved to
setup in some cases, and I've certainly messed it up in the past.
An alternative is forwarding a block of ports and using some
hidden options to lldb-server to limit what it uses. This
commit documents that and the pitfall that the port list isn't shared.
The theory also works for Arm's FVP (which inspired me to write
this up) but since QEMU is the preferred option upstream, it goes
in that document.
Along the way I fixed a link to the QEMU page that used the URL
not a relative link to the document.
**Background**
Prior to DWARFv4, there was no clear normative text on how to handle
static data members. Non-normative text suggested that compilers should
use `DW_AT_external` to mark static data members of structrues/unions.
Clang does this consistently. However, GCC doesn't, e.g., when the
structure/union is in an anonymous namespace (which is C++ standard
conformant). Additionally, GCC never emits `DW_AT_data_member_location`s
for union members (regardless of storage linkage and storage duration).
Since DWARFv5 (issue 161118.1), static data members get emitted as
`DW_TAG_variable`.
LLDB used to differentiate between static and non-static members by
checking the `DW_AT_external` flag and the absence of
`DW_AT_data_member_location`. With
[D18008](https://reviews.llvm.org/D18008) LLDB started to pretend that
union members always have a `0` `DW_AT_data_member_location` by default
(because GCC never emits these locations).
In [D124409](https://reviews.llvm.org/D124409) LLDB stopped checking the
`DW_AT_external` flag to account for the case where GCC doesn't emit the
flag for types in anonymous namespaces; instead we only check for
presence of `DW_AT_data_member_location`s.
The combination of these changes then meant that LLDB would never
correctly detect that a union has static data members.
**Solution**
Instead of unconditionally initializing the `member_byte_offset` to `0`
specifically for union members, this patch proposes to check for both
the absence of `DW_AT_data_member_location` and `DW_AT_declaration`,
which consistently gets emitted for static data members on GCC and
Clang.
We initialize the `member_byte_offset` to `0` anyway if we determine it
wasn't a static. So removing the special case for unions makes this code
simpler to reason about.
Long-term, we should just use DWARFv5's new representation for static
data members.
Fixes#68135
LLDB has the cmake flag `LLDB_EXPORT_ALL_SYMBOLS` that exports the lldb,
lldb_private namespaces, as well as other symbols like python and lua
(see `lldb/source/API/liblldb-private.exports`). However, not all
symbols in lldb fall into these categories and in order to get access to
some symbols that live in plugin folders (like dwarf parsing symbols),
it's useful to be able to specify a custom exports file giving more
control to the developer using lldb as a library.
This adds the new cmake flag `LLDB_EXPORT_ALL_SYMBOLS_EXPORTS_FILE` that
is used when `LLDB_EXPORT_ALL_SYMBOLS` is enabled to specify that custom
exports file.
This is a follow up of https://github.com/llvm/llvm-project/pull/67851
Currently this non-trivial calculation is repeated multiple times,
making it hard to reason about when the
`byte_offset`/`member_byte_offset` is being set or not.
This patch simply moves all those instances of the same calculation into
a helper function.
We return an optional to remain an NFC patch. Default initializing the
offset would make sense but requires further analysis and can be done in
a follow-up patch.
C++20 will automatically generate an operator== with reversed operand
order, which is ambiguous with the written operator== when one argument
is marked const and the other isn't.
These operators currently trigger -Wambiguous-reversed-operator at usage
sites lldb/source/Symbol/SymbolFileOnDemand.cpp:68 and
lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp:1286.
https://github.com/llvm/llvm-project/pull/68012 works on my CentOS Linux
and Macbook but seems to fail for certain build bots. The error log
complains "No Value" check failure for `std::variant` but not very
actionable without a reproduce.
To unblock the build bots, I am commenting out the "No Value" checks.
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
The implemtation support parsing kernel module for FreeBSD Kernel and
has been test on x86-64 and arm64.
In summary, this class parse the linked list resides in the kernel
memory that record all kernel module and load the debug symbol file to
facilitate debug process
I plan on replacing LLDB's DWARFUnitHeader implementation with LLVM's.
LLVM's DWARFUnitHeader::extract applies the DWARFUnitIndex::Entry to a
given DWARFUnitHeader outside of the extraction because the index entry
is only relevant to one place where we may parse DWARFUnitHeaders
(specifically when we're creating a DWARFUnit in a DWO context). To ease
the transition, I've reshaped LLDB's implementation to look closer to
LLVM's.
Reviewed By: aprantl, fdeazeve
Differential Revision: https://reviews.llvm.org/D151919
This patch tentatively fixes the various test failures introduced
following 0ea3d88bdb:
https://green.lab.llvm.org/green/view/LLDB/job/as-lldb-cmake/6316/
From my understanding, the main issue here is that we can't find some headers
when evaluating C++ expressions since those headers have been promoted
to be system modules, and to be shipped as part of the toolchain.
Prior to 0ea3d88bdb, the `BuiltinHeadersInSystemModules` flag for in
the clang `LangOpts` struct was always set, however, after it landed,
the flag becomes opt-in, depending on toolchain that is used with the
compiler instance. This gets set in `clang::createInvocation` down to
`Darwin::addClangTargetOptions`, as this is used mostly on Apple platforms.
However, since `ClangExpressionParser` makes a dummy `CompilerInstance`,
and sets the various language options arbitrarily, instead of using the
`clang::createInvocation`, the flag remains unset, which causes the
various error messages:
```
AssertionError: 'error: module.modulemap:96:11: header 'stdarg.h' not found
96 | header "stdarg.h" // note: supplied by the compiler
| ^
```
Given that this flag was opt-out previously, this patch brings back that
behavior by setting it in lldb's `ClangExpressionParser` constructor,
until we actually decide to pull the language options from the compiler driver.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
LLDB_EXPORT_ALL_SYMBOLS is useful when building out-of-tree plugins and
extensions that rely on LLDB's internal symbols. For example, this is
how the Mojo language provides its REPL and debugger support.
Supporting this on windows is kind of tricky because this is normally
expected to be done using dllexport/dllimport, but lldb uses these with
the public api. This PR takes an approach similar to what LLVM does with
LLVM_EXPORT_SYMBOLS_FOR_PLUGINS, and what chromium does for
[abseil](253d14e20f/third_party/abseil-cpp/generate_def_files.py),
and uses a python script to extract the necessary symbols by looking at
the symbol table for the various lldb libraries.
Also default to disassembling a and m features
Some code taken from https://reviews.llvm.org/D62732 , which hasn't been
updated in a year.
Tested with 32 and 64 bit Linux user space QEMU
Reviewed By: jasonmolenda
Differential Revision: https://reviews.llvm.org/D159101
The LLVM implementation of DWARFDebugAbbrev does not have a way of
listing all the DW_FORM values that have been parsed but are unsupported
or otherwise unknown. AFAICT this functionality does not exist in LLVM
at all. Since my primary goal is to unify the implementations and not
judge the usefulness or completeness of this functionality, I decided to
move it out of LLDB's implementation of DWARFDebugAbbrev for the time
being.
We just forget to check for interrupt while waiting for the answer to the prompt. But if we are in the interrupt state then the lower
layers of the EditLine code just eat all characters so we never get out of the query prompt. You're pretty much stuck and have to kill lldb.
The solution is to check for the interrupt. The patch is a little bigger because where I needed to check the Interrupt state I only
had the ::EditLine object, but the editor state is held in lldb's EditLine wrapper, so I had to do a little work to get my hands on it.
This patch implements the thread local storage support for linux
(https://github.com/llvm/llvm-project/issues/28766).
TLS feature is originally only implemented for Mac. With my previous
patch to enable `fs_base` register for Linux
(https://reviews.llvm.org/D155256), now it is feasible to implement this
feature for Linux.
The major changes are:
* Track the main module's link address during launch
* Fetch thread pointer from `fs_base` register
* Create register alias for thread pointer
* Read pthread metadata from target memory instead of process so that it
works for coredump
With the patch the failing test is passing now. Note: I am only enabling
this test for Mac and Linux because I do not have machine to test for
FreeBSD/NetBSD.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
In https://reviews.llvm.org/D136620 I changed debugserver to stop using
the kernel-provided functions
arm_thread_state64_get_{pc,lr,sp,fp} to postprocess those four registers
on aarch64 systems after we thread_get_state() them. The kernel stores
these four registers with signing internally, either from the inferior
process' actual signing, or its own.
When a program had crashed by doing an authenticated BL to an address
with improper signing, the inferior process would crash and that
improperly signed pc would be given to debugserver via thread_get_state.
debugserver would run that through arm_thread_state64_get_pc() and then
debugserver would crash when authenticating & stripping the value, on
newer Mac hardware.
To avoid debugserver crashing on a crashed inferior process, I switched
from using these system functions to strip the values, to simply
clearing the bits outright in debugserver.
However, lr is a special case where the inferior may have signed this
value (against the stack pointer value at the time). Or it may not yet
have any authentication bits, right after a BL. In the latter case, the
kernel will add its own auth bits for while it is stored inside the
kernel. In the case of a user lr value, we cannot authenticate it in
debugserver without knowing the sp value it was signed against (and the
way it is signed is not specified by the ABI) so an "improperly" signed
lr (whatever that means) won't cause debugserver to crash.
debugserver can thread_get_state the inferior's lr, run it through
arm_thread_state64_get_lr(), and get the actual signed 64-bit value that
the inferior process is using. And the specifics of how that lr is
signed may be important for debugging the process, instead of how I am
currently clearing the auth bits outright.
This patch reverts that change for lr only, and also adds a new logging
to debugserver specifically for the four sp/fp/lr/pc values that
thread_get_state hands to us, before we process them at all.
I want to work towards unifying the implementations. It would be a lot
easier to do if LLDB's DWARFDebugAbbrev looked more similar to LLVM's
implementation, so this change moves in that direction.
This patch adds a LLVM_FORCE_VC_REVISION option to force a custom
VC revision to be included instead of trying to fetch one from a
git command. This is helpful in environments where git is not
available or is non-functional but the vc revision is available
through some other means.
This is my second recent change to TestStepOverWatchpoint.py,
the first was to change the two global variables it is watching
from 'char' to 'long' so they're more likely to be on separate
words/doublewords of memory that can be watched indepdently.
I believe this removes the need for the MIPS and S390X skips.
The test was testing a combination of read and write watchpoints,
stepping over a function that hits them, and then instruction stepping
over a source line which hits them. But previously it was
always starting with the read watchpoint in both of them, it
didn't test the instruction-stepping for the read watchpoint.
I now have to tests in TestStepOverWatchpoint.py, one which
runs to the read-watchpoint function, sets that watchpoint,
steps over another function which reads from that global, then
instruction steps over a source line that reads from that global.
The second test runs to the write-watchpoint function, sets that
watchpoint. Steps over another function which writes to that
global, then instruction steps over a source line that writes to
that global.
These are both xfailed on Darwin systems because watchpoints and
hardware breakpoints are currently disabled by the kernel when
instruction stepping.
This patch fixes:
lldb/include/lldb/Symbol/Function.h:270:3: error:
'lldb_private::CallEdge' has virtual functions but non-virtual
destructor [-Werror,-Wnon-virtual-dtor]
GetObjectPointer (and other related methods) do not need `ConstString`
parameters. The string parameter in these methods boil down to getting a
StringRef and calling `StackFrame::GetValueForVariableExpressionPath`
which takes a `StringRef` parameter. All the users of `GetObjectPointer`
(and related methods) end up creating ConstString objects to pass to
these methods, but they could just as easily be StringRefs (potentially
saving us some allocations in the StringPool).
Prior to this the command would simply crash when run on a running
process.
Of the three register commands, "info" was the only one missing these
requirements. On some level it makes sense because you're not going to
read a value or modify anything, but practically I think lldb assumes
any time you're going to access register related stuff, the process
should be paused.
I noticed this debugging with a remote gdb stub, so I've recreated that
scenario using attach in a new test case.
We noticed some performance issue while in lldb-vscode for grabing the
name of the SBValue. Profiling shows SBValue::GetName() can cause
synthetic children provider of shared/unique_ptr to deference underlying
object and complete it type.
This patch lazily moves the dereference from synthetic child provider's
Update() method to GetChildAtIndex() so that SBValue::GetName() won't
trigger the slow code path.
Here is the culprit slow code path:
```
...
frame #59: 0x00007ff4102e0660 liblldb.so.15`SymbolFileDWARF::CompleteType(this=<unavailable>, compiler_type=0x00007ffdd9829450) at SymbolFileDWARF.cpp:1567:25 [opt]
...
frame #67: 0x00007ff40fdf9bd4 liblldb.so.15`lldb_private::ValueObject::Dereference(this=0x0000022bb5dfe980, error=0x00007ffdd9829970) at ValueObject.cpp:2672:41 [opt]
frame #68: 0x00007ff41011bb0a liblldb.so.15`(anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::Update(this=0x000002298fb94380) at LibStdcpp.cpp:403:40 [opt]
frame #69: 0x00007ff41011af9a liblldb.so.15`lldb_private::formatters::LibStdcppSharedPtrSyntheticFrontEndCreator(lldb_private::CXXSyntheticChildren*, std::shared_ptr<lldb_private::ValueObject>) [inlined] (anonymous namespace)::LibStdcppSharedPtrSyntheticFrontEnd::LibStdcppSharedPtrSyntheticFrontEnd(this=0x000002298fb94380, valobj_sp=<unavailable>) at LibStdcpp.cpp:371:5 [opt]
...
frame #78: 0x00007ff40fdf6e42 liblldb.so.15`lldb_private::ValueObject::CalculateSyntheticValue(this=0x000002296c66a500) at ValueObject.cpp:1836:27 [opt]
frame #79: 0x00007ff40fdf1939 liblldb.so.15`lldb_private::ValueObject::GetSyntheticValue(this=<unavailable>) at ValueObject.cpp:1867:3 [opt]
frame #80: 0x00007ff40fc89008 liblldb.so.15`ValueImpl::GetSP(this=0x0000022c71b90de0, stop_locker=0x00007ffdd9829d00, lock=0x00007ffdd9829d08, error=0x00007ffdd9829d18) at SBValue.cpp:141:46 [opt]
frame #81: 0x00007ff40fc7d82a liblldb.so.15`lldb::SBValue::GetSP(ValueLocker&) const [inlined] ValueLocker::GetLockedSP(this=0x00007ffdd9829d00, in_value=<unavailable>) at SBValue.cpp:208:21 [opt]
frame #82: 0x00007ff40fc7d817 liblldb.so.15`lldb::SBValue::GetSP(this=0x00007ffdd9829d90, locker=0x00007ffdd9829d00) const at SBValue.cpp:1047:17 [opt]
frame #83: 0x00007ff40fc7da6f liblldb.so.15`lldb::SBValue::GetName(this=0x00007ffdd9829d90) at SBValue.cpp:294:32 [opt]
...
```
Differential Revision: https://reviews.llvm.org/D159542
`watch set expression` was passing the OptionGroupWatchpoint enum
in to Target where the LLDB_WATCH_TYPE_* bitfield was expected.
Modify matched READ|WRITE and resulted in a test failure in
TestWatchTaggedAddress.py. David temporarily changed the test to
expect this incorrect output; this fixes the bug and updates the
test case to test it for correctness again.
I just fixed a bug in the core file equivalent where this was the issue.
This class avoids the issue by setting m_sve_state early but should
still be fixed so it doesn't crop up in a later refactor.
This reverts commit 3fa5035823.
m_sve_state was not initialised which (I'm guessing) meant that
it could potentially be a value that matched a real SVE state.
Then we'd be acting as if we're streaming mode, for example,
without ever having the data required to back that up.
By sheer luck this only turn up on x86, AArch64 and ARM were fine.
It is UB regardless.
This adds the ability to read streaming SVE registers,
ZA, SVCR and SVG from core files.
Streaming SVE is in a new note NT_ARM_SSVE but otherwise
has the same format as SVE. So I've done the same as I
did for live processes and reused the existing SVE state
with an extra state for the mode variable.
ZA is in a note NT_ARM_ZA and again the handling matches
live processes. Except that it gets setup only once. A
disabled ZA reads as 0s as usual.
SVCR and SVG are pseudo registers, generated from the notes.
An important detail is that the notes represent what
you would have got if you read from ptrace at the time of
the crash.
This means that for a corefile in non-streaming mode,
there is still an NT_ARM_SSVE note and we check the header
flags to tell if it is active. We cannot just say if you
have the note you're in streaming mode.
The kernel does not provide register values for the inactive
mode and even if it did, they would be undefined, so if we find
streaming state, we ignore the non-streaming state.
Same for ZA, a disabled ZA still has the header in the note.
The tests do not cover all combinations but enough different
vector lengths, modes and ZA states to be confident.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D158500
This reverts commit a7b78cac9a.
With updates to the tests.
TestWatchTaggedAddress.py: Updated the expected watchpoint types,
though I'm not sure there should be a differnt default for the two
ways of setting them, that needs to be confirmed.
TestStepOverWatchpoint.py: Skipped this everywhere because I think
what used to happen is you couldn't put 2 watchpoints on the same
address (after alignment). I guess that this is now allowed because
modify watchpoints aren't accounted for, but likely should be.
Needs investigating.
The test is expecting watchpoint hits that are valid on aarch64
systems but not on Intel. I need to update this test to match the
actual behavior on Intel macs, and don't want the CI bots to stay
broken while I get that figured out.
I changed the test so I could tell whether the problem was sometimes the
interrupt was failing, or maybe the was just racy. It failed again, but
in the new failures we waited 20 seconds for the attach-wait to get interrupted
and that never happened.
So there seems to be some real raciness in the feature of interrupting an
attach-wait, but only on Linux & Windows. The bug fix that this test was
testing was for a bug that would cause us to never try to interrupt in this
case. So it looks like this test is uncovering some flakiness in the underlying
interrupt support when in this state. That's a separate bug that needs fixing.
For now, I disabled the test except on macOS where it seems to run reliably.
`std::basic_string<char>` is not a regex, and treating it as such could
unintentionally
cause a formatter to substring match a template type parameter, for
example:
`std::vector<std::basic_string<char>>`.
Differential Revision: https://reviews.llvm.org/D158663
Watchpoints in lldb can be either 'read', 'write', or 'read/write'. This
is exposing the actual behavior of hardware watchpoints. gdb has a
different behavior: a "write" type watchpoint only stops when the
watched memory region *changes*.
A user is using a watchpoint for one of three reasons:
1. Want to find what is changing/corrupting this memory.
2. Want to find what is writing to this memory.
3. Want to find what is reading from this memory.
I believe (1) is the most common use case for watchpoints, and it
currently can't be done in lldb -- the user needs to continue every time
the same value is written to the watched-memory manually. I think gdb's
behavior is the correct one. There are some use cases where a developer
wants to find every function that writes/reads to/from a memory region,
regardless of value, I want to still allow that functionality.
This is also a bit of groundwork for my large watchpoint support
proposal
https://discourse.llvm.org/t/rfc-large-watchpoint-support-in-lldb/72116
where I will be adding support for AArch64 MASK watchpoints which watch
power-of-2 memory regions. A user might ask to watch 24 bytes, and a
MASK watchpoint stub can do this with a 32-byte MASK watchpoint if it is
properly aligned. And we need to ignore writes to the final 8 bytes of
that watched region, and not show those hits to the user.
This patch adds a new 'modify' watchpoint type and it is the default.
Re-landing this patch after addressing testsuite failures found in CI on
Linux, Intel machines, and windows.
rdar://108234227
command.
We were reading the command output right after sending the interrupt, but
sometimes that wasn't long enough for the command result text to have been emitted.
I added a poll for the state change to eStateExited, and then added a bit more sleep
to give the command a chance to complete.
This fixes 46b961f36b.
Prior to the SME changes the steps were:
* Invalidate all registers.
* Update value of VG and use that to reconfigure the registers.
* Invalidate all registers again.
With the changes for SME I removed the initial invalidate thinking
that it didn't make sense to do if we were going to invalidate them
all anyway after reconfiguring.
Well the reason it made sense was that it forced us to get the
latest value of vg which we needed to reconfigure properly.
Not doing so caused a test failure on our Graviton bot which has SVE
(https://lab.llvm.org/buildbot/#/builders/96/builds/45722). It was
flaky and looping it locally would always fail within a few minutes.
Presumably it was using an invalid value of vg, which caused some offsets
to be calculated incorrectly.
To fix this I've invalided vg in AArch64Reconfigure just before we read
it. This is the same as the fix I have in review for SME's svg register.
Pushing this directly to fix the ongoing test failure.
Auto summaries were only being used when non-pointer/reference variables
didn't have values nor summaries. Greg pointed out that it should be
better to simply use auto summaries when the variable doesn't have a
summary of its own, regardless of other conditions.
This led to code simplification and correct visualization of auto
summaries for pointer/reference types, as seen in this screenshot.
<img width="310" alt="Screenshot 2023-09-19 at 7 04 55 PM"
src="https://github.com/llvm/llvm-project/assets/1613874/d356d579-13f2-487b-ae3a-f3443dce778f">
Software can tell if it is in streaming SVE mode by checking
the Streaming Vector Control Register (SVCR).
"E3.1.9 SVCR, Streaming Vector Control Register" in
"Arm® Architecture Reference Manual Supplement, The Scalable Matrix
Extension (SME), for Armv9-A"
https://developer.arm.com/documentation/ddi0616/latest/
This is especially useful for debug because the names of the
SVE registers are the same betweeen non-streaming and streaming mode.
The Linux Kernel chose to not put this register behind ptrace,
and it can be read from EL0. However, this would mean running code
in process to read it. That can be done but we already know all
the information just from ptrace.
So this is a pseudo register that matches the architectural
content. The name is just "svcr", which aligns with GDB's proposed naming,
and it's added to the existing SME register set.
The SVCR register contains two bits:
0 : Whether streaming SVE mode is enabled (SM)
1 : Whether the array storage is enabled (ZA)
Array storage can be active when streaming mode is not, so this register
can have any permutation of those bits.
This register is currently read only. We can emulate the result of
writing to it, using ptrace. However at this point the utility of that
is not obvious.
Existing tests have been updated to check for appropriate SVCR values
at various points.
Given that this register is a read only pseudo, there is no need
to save and restore it around expressions.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D154927
The problem is that the when the "attach" command is initiated, the
ExecutionContext for the command has a process - it's the exited one
from the previour run. But the `attach wait` creates a new process for
the attach, and then errors out instead of interrupting when it finds
that its process and the one in the command's ExecutionContext don't
match.
This change checks that if we're returning a target from
GetExecutionContext, we fill the context with it's current process, not
some historical one.
Instead of copying memory out of the PythonString (via a std::string)
and then using that to create a ConstString, it would make more sense to
just create the ConstString from the original StringRef in the first
place.
An SME enabled program has the following extra state:
* Streaming mode or non-streaming mode.
* ZA enabled or disabled.
* The active vector length.
Covering the transition between all possible states and all other
possible states is not viable, therefore the testing added here is a cross
section of that, all of which found real bugs in LLDB and the Linux
Kernel during development.
Many of those transitions will not be possible via LLDB
(e.g. disabling ZA) and many more are possible but unlikely to be
used in normal use.
Added testing:
* TestSVEThreadedDynamic now checks for correct SVG values.
* New test TestZAThreadedDynamic creates 3 threads with different ZA sizes
and states and switches between them verifying the register value
(derived from the existing threaded SVE test).
* New test TestZARegisterSaveRestore starts in a given SME state, runs a
set of expressions in various orders, then checks that the original
state has been restored.
* TestArm64DynamicRegsets has ZA and SVG checks added, including writing
to ZA to enable it.
Running these tests will as usual require QEMU as there is no
real SME hardware available at this time, and a very recent
kernel.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159505
The size of ZA depends on the streaming vector length regardless
of the active mode. So in addition to vg (which reports the active
mode) we must send the client svg.
Otherwise the mechanics are the same as for non-streaming SVE.
Use the svg value to update the defined size of ZA, accounting
for the fact that ZA is not a single vector but a suqare matrix.
So if svg is 8, a single streaming vector would be 8*8 = 64 bytes.
ZA is that squared, so 64*64 = 4096 bytes.
Testing is included in a later patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159504
This adds a register "svg" which mirrors SVE's "vg" register.
This reports the streaming vector length at all times, read
from the ZA ptrace header.
This register is needed first to implement ZA resizing as
the streaming vector length changes. Like vg, svg will be
expedited to the client so it can reconfigure its register
definitions.
The other use is for users to be able to know the streaming
vector length without resorting to counting the (many, many)
bytes in ZA, or temporarily entering streaming mode (which
would be destructive).
Some refactoring has been done so we don't have to recalculate the
register offsets twice.
Testing for this will come in a later patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159503
Note: This requires later commits for ZA to function properly,
it is split for ease of review. Testing is also in a later patch.
The "Matrix" part of the Scalable Matrix Extension is a new register
"ZA". You can think of this as a square matrix made up of scalable rows,
where each row is one scalable vector long. However it is not made
of the existing scalable vector registers, it is its own register.
Meaning that the size of ZA is the vector length in bytes * the vector
length in bytes.
https://developer.arm.com/documentation/ddi0616/latest/
It uses the streaming vector length, even when streaming mode itself
is not active. For this reason, it's register data header always
includes the streaming vector length.
Due to it's size I've changed kMaxRegisterByteSize to the maximum
possible ZA size and kTypicalRegisterByteSize will be the maximum
possible scalable vector size. Therefore ZA transactions will cause heap
allocations, and non ZA registers will perform exactly as before.
ZA can be enabled and disabled independently of streaming mode. The way
this works in ptrace is different to SVE versus streaming SVE. Writing
NT_ARM_ZA without register data disables ZA, writing NT_ARM_ZA with
register data enables ZA (LLDB will only support the latter, and only
because it's convenient for us to do so).
https://kernel.org/doc/html/v6.2/arm64/sme.html
LLDB does not handle registers that can appear and dissappear at
runtime. Rather than add complexity to implement that, LLDB will
show a block of 0s when ZA is disabled.
The alternative is not only updating the vector lengths every stop,
but every register definition. It's possible but I'm not sure it's worth
pursuing.
Users should refer to the SVCR register (added in later patches)
for the final word on whether ZA is active or not.
Writing to "VG" during streaming mode will change the size of the
streaming sve registers and ZA. LLDB will not attempt to preserve
register values in this case, we'll just read back the undefined
content the kernel shows. This is in line with, as stated, the
kernel ABIs and the prospective software ABIs look like.
ZA is defined as a vector of size SVL*SVL, so the display in lldb
is very basic. A giant block of values. This is no worse than SVE,
just larger. There is scope to improve this but that can wait
until we see some use cases.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D159502
Running:
$ clang-format -i $(find -regex "\./lldb/.*\.c") $(find -regex
"\./lldb/.*\.cpp") $(find -regex "\./lldb/.*\.h")
Resulted in:
1602 files changed, 25090 insertions(+), 25849 deletions(-)
(note: this includes tests which we wouldn't format, just using this as
an example)
The vast majority of which were whitespace changes. So as far as
formatting we're not deviating from llvm for any reason other than not
churning old code.
Formatting aside, the major features of lldb (single line if, early
return) are all reflected in llvm's style. We differ mainly on variable
naming (proposed to change in
https://llvm.org/docs/Proposals/VariableNames.html anyway) and use of
asserts. Which was already documented.
Watchpoints in lldb can be either 'read', 'write', or 'read/write'. This
is exposing the actual behavior of hardware watchpoints. gdb has a
different behavior: a "write" type watchpoint only stops when the
watched memory region *changes*.
A user is using a watchpoint for one of three reasons:
1. Want to find what is changing/corrupting this memory.
2. Want to find what is writing to this memory.
3. Want to find what is reading from this memory.
I believe (1) is the most common use case for watchpoints, and it
currently can't be done in lldb -- the user needs to continue every time
the same value is written to the watched-memory manually. I think gdb's
behavior is the correct one. There are some use cases where a developer
wants to find every function that writes/reads to/from a memory region,
regardless of value, I want to still allow that functionality.
This is also a bit of groundwork for my large watchpoint support
proposal
https://discourse.llvm.org/t/rfc-large-watchpoint-support-in-lldb/72116
where I will be adding support for AArch64 MASK watchpoints which watch
power-of-2 memory regions. A user might ask to watch 24 bytes, and a
MASK watchpoint stub can do this with a 32-byte MASK watchpoint if it is
properly aligned. And we need to ignore writes to the final 8 bytes of
that watched region, and not show those hits to the user.
This patch adds a new 'modify' watchpoint type and it is the default.
rdar://108234227
In 014c41d688 I tried to fix these tests,
but it seems that I needed to change TEST for TEST_F to make that work.
It's a pain that these failures don't repro on any of my machines, but I
verified thta the initialization code for the tests is invoked.
This test was broken by 710276a250 because
DumpDataExtractor now accesses the Target properties, which someone ends
up relying on the file system.
This is an instance of this error https://lab.llvm.org/buildbot/#/builders/96/builds/45607/steps/6/logs/stdio
I cannot reproduce this locally, but it seems that the error happens
because we are not initializing the FileSystem and the Host as part of
the test setup.
As suggested by Greg in https://github.com/llvm/llvm-project/pull/66534,
I'm adding a setting at the Target level that controls whether to show
leading zeroes in hex ValueObject values.
This has the benefit of reducing the amount of characters displayed in
certain interfaces, like VSCode.
This is a (rough) port of architecture/varformats.html page that got
lost during the migration to RST in edb874b. I'm not sure how much of
its content is still relevant today. However, the page is pretty
extensive and seems like it's worth preserving.
Neither actually linked anywhere, but I did find a good target for the
introduction.
For the architecture side I don't see a page that would fit, so I've
just removed the sentence.
Add a configuration entry for whether LLDB was configured with wide
character support in Editline and use it in a decorator to guard the
UTF-8 prompt test.
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
The remaining use of ConstString in StructuredData is the Dictionary
class. Internally it's backed by a `std::map<ConstString, ObjectSP>`.
I propose that we replace it with a `llvm::StringMap<ObjectSP>`.
Many StructuredData::Dictionary objects are ephemeral and only exist for
a short amount of time. Many of these Dictionaries are only produced
once and are never used again. That leaves us with a lot of string data
in the ConstString StringPool that is sitting there never to be used
again. Even if the same string is used many times for keys of different
Dictionary objects, that is something we can measure and adjust for
instead of assuming that every key may be reused at some point in the
future.
Quick comparisons of key data is likely not a concern with Dictionary,
but the use of `llvm::StringMap` means that lookups should be fast with
its hashing strategy.
Switching to a llvm::StringMap meant that the iteration order may be
different. To account for this when serializing/dumping the dictionary,
I added some code to sort the output by key before emitting anything.
Differential Revision: https://reviews.llvm.org/D159313
The majority of UnixSignals strings are static in the sense that they do
not change. The overwhelming majority of these strings are string
literals. Using ConstString to manage their lifetime does not make
sense. The only exception to this is one of the subclasses of
UnixSignals, for which I have created a StringSet local to that file
which will guarantee the lifetimes of these StringRefs.
As for the other benefits of ConstString, string uniqueness is not a
concern (as many of them are already string literals) and comparing
signal names and aliases should not be a hot path.
Differential Revision: https://reviews.llvm.org/D159011
The testing page already has some page about debugging failures.
I'm not linking directly to that section because:
* The earlier sections about running single tests and such
are just as useful for debugging in general.
* The new theme has a nice sidebar on the right that makes
it really easy to find what you want once on the page.
* We'll probably add more content to the testing page later.
This patch simplifies the color handling logic in Editline and
IOHandlerEditline:
- Remove the m_color_prompts property from Editline and use the prompt
ANSI prefix and suffix as the single source of truth. This avoids
having to redraw the prompt unnecessarily, for example when colors
are enabled but the prompt prefix and suffix are empty.
- Rename m_color_prompts to just m_color in IOHandlerEditline and use
it to ensure consistency between colored prompts and colored
auto-suggestions. Some IOHandler explicitly turn off colors (such as
IOHandlerConfirm) and it doesn't really make sense to have one or the
other.
Users often want to change the look of their prompt and currently the
only way to do that is by using ANSI escape codes in the prompt itself.
This is not only tedious, it also results in extra whitespace because
our Editline wrapper, when computing the cursor column, doesn't ignore
the invisible escape codes.
We already have various *-ansi-{prefix,suffix} settings that allow the
users to customize the color of auto-suggestions and progress events,
using mnemonics like ${ansi.fg.yellow}. This patch brings the same
mechanism to the prompt.
rdar://115390406
PExpect 4.6, when using 'utf-8' throws a TypeError when trying to write
to the log file:
File "llvm-project/lldb/third_party/Python/module/pexpect-4.6/pexpect/spawnbase.py", line 126, in _log
self.logfile.write(s)
TypeError: a bytes-like object is required, not 'str'
This looks like a bug in PExpect to me. Since the log file is only used
for tracing, work around the issue by disabling the log file when
specifying an encoding. This should fix the Debian bot [1] which runs
the test with tracing enabled (-t).
[1] https://lab.llvm.org/buildbot/#/builders/68/builds/59955
Account for Unicode when computing the prompt column width. Previously,
the string length (i.e. number of bytes) rather than the width of the
Glyph was used to compute the cursor position. The result was that the
cursor would be offset to the right when using a prompt containing
Unicode.
Value::ResolveValue calls Value::GetValueAsData as part of its
implementation. The latter can receive an optional Module pointer, which
is always null when called from the former. Allow threading in the
Module in Value::ResolveValue.
rdar://115021869
Previously we would check all built-ins first for suggestions,
then check built-ins and aliases. This meant that if you had
an alias brkpt -> breakpoint, "br" would complete to "breakpoint".
Instead of giving you the choice of "brkpt" or "breakpoint".
SME reuses SVE's register state but adds new modes to it. Therefore
we can't check all those in the same test as the existing SVE
checks.
SME's ZA, SVG and SVCR register checks will be added to this test
in later patches.
Prior to this we didn't have any testing of writing streaming mode
SVE registers from lldb, only writing SVE registers in normal
(non-streaming) SVE mode.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D157846
Instead of creating a copy of the vector, we should just pass a
reference along. The only method that calls this Ctor also holds onto a
non-mutable reference to the vector of strings so a copy should be
unnecessary.
This reverts commit 8012518f60.
The x86 register write test had one that expected "\$rax" so on.
As these patterns were previously regex, the $ had to be escaped.
Now they are just plain strings to this is not needed.
Add a new LC_NOTE for Mach-O corefiles, "proces metadata", which is a
JSON string. Currently there may be a `threads` key in the JSON,
and if `threads` is present, it is an array with the same number of
elements as there are LC_THREADs in the corefile. This patch adds
support for a `thread_id` key-value for each `thread` entry, to
supply a thread ID for that LC_THREAD.
Differential Revision: https://reviews.llvm.org/D158785
rdar://113037252
"descriptive summaries" should only be used for small to medium binaries
because of the performance penalty the cause when completing types. I'm
defaulting it to false.
Besides that, the "raw child" for synthetics should be optional as well.
I'm defaulting it to false.
Both options can be set via a launch or attach config, following the
pattern of most settings. javascript extension wrappers can set these
settings on their own as well.
* Assert no completions for tests that should not find completions.
* Remove regex mode from complete_from_to, which was unused.
This exposed bugs in 2 of the tests, target stop-hook and
process unload. These were fixed in previous commits but
couldn't be tested properly until this patch.
Some functions in Process were using LLDB_INVALID_ADDRESS instead of
LLDB_INVALID_TOKEN.
The only visible effect of this appears to be that "process unload
<tab>" would complete to 0 even after the image was unloaded. Since the
command is checking for LLDB_INVALID_TOKEN.
Everything else worked somehow. I've added a check to the existing load
unload tests anyway.
The tab completion cannot be checked as is, but when I make them more
strict in a later patch it will be tested.
This already applies to enable and disable, delete was missing
a check.
This cannot be tested properly with the current completion tests,
but it will be when I make them more strict in a follow up patch.
While working in support for SME's ZA register, I found a circumstance
where restoring ZA after SVE, when the current SVE mode is non-streaming,
will kick the process back into FPSIMD mode. Meaning the SVE values that
you just wrote are now cut off at 128 bit.
The fix for that is to write ZA then SVE. Problem with that
is, the current ReadAll/WriteAll makes a lot of assumptions about the
saved data length.
This patch changes the format so there is a "type" written before
each data block. This tells WriteAllRegisterValues what it's looking at
without brittle checks on length, or assumptions about ordering.
If we want to change the order of restoration, all we now have to
do is change the order of saving.
This exposes a bug where the TLS registers are not restored.
This will be fixed by https://reviews.llvm.org/D156512 in some form,
depending on what lands first.
Existing SVE tests certainly check restoration and when I got this
wrong, many, many tests failed. So I think we have enough coverage
already, and more will be coming with future ZA changes.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D156687
The default constructor for SBCommandInterpreter was (unintentionally)
made protected in 27b6a4e63a. The goal of the patch was to make the
constructor taking an lldb_private type protected, but due to the
presence of a default argument, this ctor also served as the default
constructor. The latter should remain public.
This commit reinstates the original behavior by removing the default
argument and having an explicit, public default constructor.
While looking at https://github.com/llvm/llvm-project/issues/49528 I
found that, happily, aliases can now be tab completed.
However, if there is a built-in match that will always be taken. Which
is a bit surprising, though logical if we don't want people really
messing up their commands I guess.
Meaning "b" tab completes to our built-in breakpoint alias, before it
looks at any of the aliases. "bf" doesn't match "b", so it looks through
the aliases.
I didn't find any tests for this in the obvious places, so this adds
some.
In the LDDB documentation you have the following sentence:
"The --format (which you can shorten to -f) option accepts a format name."
The link points to the wrong place.
I pointed it to the table that precedes the Type summary section.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D151668
The issues link as it was was including PRs, so I've made that
issues only and only those in an open state.
Code review is now on Gtihub so same thing, PRs only, open state,
label lldb.
Adds the following:
* A note that you can use attaching to debug the right lldb-server
process, though there are drawbacks.
* A section on debugging the remote protocol.
* Reducing bugs, including reducing ptrace bugs to remove the need for
LLDB.
I've added a standlone ptrace program to the examples folder because:
* There's no better place to put it.
* Adding it to the page seems like wasting space, and would be harder to
update.
* I link to Eli Bendersky's classic blog on the subject, but we are
safer with our own example as well.
* Eli's example is for 32 bit Intel, AArch64 is more common these days.
* It's easier to show the software breakpoint steps in code than explain
it (though I still do that in the text).
* It was living on my laptop not helping anyone so I think it's good to
have it upstream for others, including future me.
D157648 broke the function because it put the blocking wait into a
critical section. This meant that, if m_iohandler_sync was not updated
before entering the function, no amount of waiting would help.
Fix that by restriciting the scope of the critical section to the
iohandler check.
This reverts commit dc3f758ddc.
Lit decided to show me the least interesting part of the
test output, but from what I gather on Mac OS the DWARF
stays in the object files (https://stackoverflow.com/a/12827463).
So either split DWARF options do nothing or they produce
files I don't know the name of that aren't .dwo, so I'm
skipping these tests on Darwin.
lldb already has a `ValueSP` type. This was confusing to me when reading
TypeCategoryMap, especially when `ValueSP` is not qualified. From first
glance it looks like it's referring to a
`std::shared_ptr<lldb_private::Value>` when it's really referring to a
`std::shared_ptr<lldb_private::TypeCategoryImpl>`.
Fixes#28667
There's a bunch of ways to end up building split DWARF where the
DWO file is not next to the program file. On top of that you may
distribute the program in various ways, move files about, switch
machines, flatten the directories, etc.
This change adds a few more strategies to find DWO files:
* Appending the DW_AT_COMP_DIR and DWO name to all the debug
search paths.
* Appending the same to the binary's dir.
* Appending the DWO name (e.g. a/b/foo.dwo) to all the debug
search paths.
* Appending the DWO name to the binary's location.
* Appending the DWO filename (e.g. foo.dwo) to the debug
search paths.
* Appending the DWO filename to the binary's location.
They are applied in that order and some will be skipped
if the DW_AT_COMP_DIR is relative or absolute, same for
the DWO name (though that seems to always be relative).
This uses the setting target.debug-file-search-paths, which
is used for DWP files already.
The added tests likely do not cover every part of the
strategies listed, it's a best effort.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D157609
We've been displaying types and addresses for containers, but that's not
very useful information. A better approach is to compose the summary of
containers with the summary of a few of its children.
Not only that, we can dereference simple pointers and references to get
the summary of the pointer variable, which is also better than just
showing an anddress.
And in the rare case where the user wants to inspect the raw address,
they can always use the debug console for that.
For the record, this is very similar to what the CodeLLDB extension
does, and it seems to give a better experience.
An example of the new output:
<img width="494" alt="Screenshot 2023-09-06 at 2 24 27 PM"
src="https://github.com/llvm/llvm-project/assets/1613874/588659b8-421a-4865-8d67-ce4b6182c4f9">
And this is the
<img width="476" alt="Screenshot 2023-09-06 at 2 46 30 PM"
src="https://github.com/llvm/llvm-project/assets/1613874/5768a52e-a773-449d-9aab-1b2fb2a98035">
old output:
Fixes `lldb/test/Shell/SymbolFile/NativePDB/inline_sites.test` to use the correct line number now that f2f36c9b29 is causing the inline call site info to be taken into account.
We have docs about how to use lldb on other programs, this tells you how
to use lldb on ldlb and lldb-server.
Lacking any Mac experience I've not included any debugserver information
apart from stating it will be similar but not the same.
I plan for this page to include sections on debugging tests and other
things but this initial commit is purely about the two main binaries
involved.
The main way I cross build lldb is to point CMake at an existing host
build to get the native tablegen tools. This is what we had documented
before.
There is another option where you start from scratch and the host tools
are built for you. This patch documents that and explains which one to
choose.
Added another arm64 example which uses this. So the frst one is the
"automatic" build and the second is the traditional approach.
For ease of copy paste and understanding, I've kept the full command in
each section and noted the one difference between them.
Along the way I updated some of the preamble to explain the two
approaches and updated some language e.g. removing "just ...". Eveyone's
"just" is different, doubly so when cross-compiling.
Our LLDB parser didn't correctly handle archives of all flavors on different systems, it currently only correctly handled BSD archives, normal and thin, on macOS, but I noticed that it was getting incorrect information when decoding a variety of archives on linux. There were subtle changes to how names were encoded that we didn't handle correctly and we also didn't set the result of GetObjectSize() correctly as there was some bad math. This didn't matter when exracting .o files from .a files for LLDB because the size was always way too big, but it was big enough to at least read enough bytes for each object within the archive.
This patch does the following:
- switch over to use LLVM's archive parser and avoids previous code duplication
- remove values from ObjectContainerBSDArchive::Object that we don't use like:
- uid
- gid
- mode
- fix ths ObjectContainerBSDArchive::Object::file_size value to be correct
- adds tests to test that we get the correct module specifications
Differential Revision: https://reviews.llvm.org/D159408
This always succeeds. While I'm here, document why we check the size
of p0 against the value of VG.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D157845
This method was working as expected if LLDB_ENABLE_LIBEDIT is false, however, if it was true, then the variable m_current_lines_ptr was always pointing to an empty list, because Editline only updates its contents once the full input has been completed (see https://github.com/llvm/llvm-project/blob/main/lldb/source/Host/common/Editline.cpp#L1576).
A simple fix is to invoke Editline::GetInputAsStringList() from GetCurrentLines(), which is already used in many places as the common way to get the full input list.
Add support for syntax color highlighting disassembly in LLDB. This
patch relies on 77d1032516, which introduces support for syntax
highlighting in MC.
Currently only AArch64 and X86 have color support, but other interested
backends can adopt WithColor in their respective MCInstPrinter.
Differential revision: https://reviews.llvm.org/D159164
Fixes#6515707c215e8a8 added some extra logging
which compiles ok with clang but not msvc. Use LLVM_PRETTY_FUNCTION
to fix that.
Fix suggested by Carlos Alberto Enciso.
In Apple's downstream fork, there is support for understanding the swift
AST sections in various binaries. Even though the lldb on llvm.org does
not have support for debugging swift, I think it makes sense to move
support for recognizing swift ast sections upstream.
Differential Revision: https://reviews.llvm.org/D159142
This should fix the test failure in breakpoint_function_callback.test
since SBStructuredData can now display the content of SBStructuredData.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch re-lands f0731d5b61 with more fixes and improvements.
First, this patch removes `__eq__` implementations from classes that
didn't implemented `operator!=` on the C++ implementation.
This patch removes sphinx document generation for special members such
as `__len__`, since there is no straightforward way to skip class that
don't implement them. We also don't want to introduce a change in
behavior by implementing artifical special members for classes that are
missing them.
Finally, this patch improve the ergonomics of some classes by
implementing special members where it makes sense, i.e. `hex(SBFrame)`
is equivalent to `SBFrame.GetPC()`.
Differential Revision: https://reviews.llvm.org/D159017
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
ConstString can be implicitly converted into a llvm::StringRef. This is
very useful in many places, but it also hides places where we are
creating a ConstString only to use it as a StringRef for the entire
lifespan of the ConstString object.
I locally removed the implicit conversion and found some of the places we
were doing this.
Differential Revision: https://reviews.llvm.org/D159237
The primary goal of this change is to change `if (pair != *(--m_dict.end()))`
to `if (std::next(iter) != m_dict.end())`. I was experimenting with
changing the underlying type of `m_dict` and found that this was an
issue. Specifically, it assumes that m_dict iterators are bidirectional.
This change should make it so we only need to assume m_dict iterators can move
forward.
Differential Revision: https://reviews.llvm.org/D159150
We ran into a case where shared libraries would fail to load in some processes on linux. The issue turned out to be if the main executable or a shared library defined a symbol named "_r_debug", then it would cause problems once the executable that contained it was loaded into the process. The "_r_debug" structure is currently found by looking through the .dynamic section in the main executable and finding the DT_DEBUG entry which points to this structure. The dynamic loader will update this structure as shared libraries are loaded and LLDB watches the contents of this structure as the dyld breakpoint is hit. Currently we expect the "state" in this structure to change as things happen. An issue comes up if someone defines another "_r_debug" struct in their program:
```
r_debug _r_debug;
```
If this code is included, a new "_r_debug" structure is created and it causes problems once the executable is loaded. This is because of the way symbol lookups happen in linux: they use the shared library list in the order it created and the dynamic loader is always last. So at some point the dynamic loader will start updating this other copy of "_r_debug", yet LLDB is only watching the copy inside of the dynamic loader.
Steps that show the problem are:
- lldb finds the "_r_debug" structure via the DT_DEBUG entry in the .dynamic section and this points to the "_r_debug" in ld.so
- ld.so modifies its copy of "_r_debug" with "state = eAdd" before it loads the shared libraries and calls the dyld function that LLDB has set a breakpoint on and we find this state and do nothing (we are waiting for a state of eConsistent to tell us the shared libraries have been fully loaded)
- ld.so loads the main executable and any dependent shared libraries and wants to update the "_r_debug" structure, but it now finds "_r_debug" in the a.out program and updates the state in this other copy
- lldb hits the notification breakpoint and checks the ld.so copy of "_r_debug" which still has a state of "eAdd". LLDB wants the new "eConsistent" state which will trigger the shared libraries to load, but it gets stale data and doesn't do anyhing and library load is missed. The "_r_debug" in a.out has the state set correctly, but we don't know which "_r_debug" is the right one.
The new fix detects the two "eAdd" states and loads shared libraries and will emit a log message in the "log enable lldb dyld" log channel which states there might be multiple "_r_debug" structs.
The correct solution is that no one should be adding a duplicate "_r_debug" symbol to their binaries, but we have programs that are doing this already and since it can be done, we should be able to work with this and keep debug sessions working as expected. If a user #includes the <link.h> file, they can just use the existing "_r_debug" structure as it is defined in this header file as "extern struct r_debug _r_debug;" and no local copies need to be made.
If your ld.so has debug info, you can easily see the duplicate "_r_debug" structs by doing:
```
(lldb) target variable _r_debug --raw
(r_debug) _r_debug = {
r_version = 1
r_map = 0x00007ffff7e30210
r_brk = 140737349972416
r_state = RT_CONSISTENT
r_ldbase = 0
}
(r_debug) _r_debug = {
r_version = 1
r_map = 0x00007ffff7e30210
r_brk = 140737349972416
r_state = RT_ADD
r_ldbase = 140737349943296
}
(lldb) target variable &_r_debug
(r_debug *) &_r_debug = 0x0000555555601040
(r_debug *) &_r_debug = 0x00007ffff7e301e0
```
And if you do a "image lookup --address <addr>" in the addresses, you can see one is in the a.out and one in the ld.so.
Adding more logging to print out the m_previous and m_current Rendezvous structures to make things more clear. Also added a log when we detect multiple eAdd states in a row to detect this problem in logs.
Differential Revision: https://reviews.llvm.org/D158583
This reverts 3 commit:
- f0731d5b61.
- 8e0a087571.
- f2f5d6fb8d.
This changes were introduced to silence the warnings that are printed
when generating the lldb module documentation for the website but it
changed the python bindings and causes test failures on the macos bot:
https://green.lab.llvm.org/green/job/lldb-cmake/59438/
We will have to consider other options to silence these warnings.
This has always worked but had no coverage. Adding testing now so that
later I can refactor the save/restore code safely.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D157488
While doing some refactoring I forgot to carry over the copying in of
SIMD data in normal mode, but no tests failed.
Turns out, it's very easy for us to get the restore wrong because
even if you forget the memcopy, setting the buffer to valid may
just read the data you had before the expression evaluation.
So I've extended the SVE SIMD testing (which includes the plain SIMD mode)
to check expression save/restore. This is the only test that fails
if you forget to do `m_fpu_is_valid = true` so I take from that, that
prior to this it wasn't tested at all.
As a bonus, we now have coverage of the same thing for SVE and SSVE modes.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D157000
Previously we would "process continue" then wait for the number of
threads to be 3 before proceeding with the test.
Testing this on QEMU I saw it would sometimes get stuck at this check,
with one of the threads on a breakpoint before the other had started.
We do want it to be on a breakpoint, but we need the other thread to have
at least started so lldb can interact with both.
I've also seen it timeout on the Graviton buildbot, likely the same
cause.
To fix this add 2 variables to stall either thread until the other
has started up. Then it doesn't matter which one hits its breakpoint
first, the test will just continue the one that didn't, until both
are on the expected breakpoint.
Differential Revision: https://reviews.llvm.org/D157967
A followon to https://reviews.llvm.org/D158237 ,
where this text can print stdout text when run
under address-sanitizer, and the test harness
does not expect any output, resulting in a test
failure on a sanitizer CI bot.
- Allow the definition of synthetic formatters in C++ even when LLDB is built without python scripting support.
- Fix linking problems with the CXXSyntheticChildren
Differential Revision: https://reviews.llvm.org/D158010
When sending file from a Linux host to a Windows remote, Linux host will try to copy the source file's permission bits, which will contain `_S_I?GRP` and `_S_I?OTH` bits. Those bits are rejected by `_wsopen_s`, causing it to return EINVAL.
This patch masks out the rejected bits.
GitHub issue: #64313
Reviewed By: jasonmolenda, DavidSpickett
Differential Revision: https://reviews.llvm.org/D156817
This is the next step in removing ConstString from StructuredData. There
are StringRef overloads already, let's use those where we can.
Differential Revision: https://reviews.llvm.org/D152870