This new tool (`dserverdbg`) runs on the host but connects to
darlingserver and makes unmanaged calls to retrieve debugging
information.
The initial subcommands available in this tool are `ps`,
`lsport`, `lspset`, and `lsmsg`:
* `ps` lists processes currently registered with the server and how
many Mach ports they have
* `lsport` lists the ports of a given process (via PID) and their
rights and message counts (for receive rights)
* `lspset` lists the members of a given portset (via PID and port
name) and provides the same information about each port as `lsport`
* `lsmsg` lists the messages of a given port (via PID and port name),
providing sender PID (if available) and size
This tool may be expanded later to allow e.g. modifying logging settings
while darlingserver is running or perhaps searching through and
filtering the logs.
Make sure to pass a file mode argument to `open` (this is required
with `O_CREAT`). Also, make sure to check for success (i.e. a
non-negative FD) before trying to use the log file.
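A minimal sketch of the pattern described above. The helper name
`openLogFile`, the flag set, and the `0644` mode are assumptions chosen
for illustration, not necessarily what darlingserver uses:

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical helper: opens (or creates) a log file for appending.
// Returns a non-negative file descriptor on success, or -1 on failure.
int openLogFile(const char* path) {
	// With O_CREAT, open() reads a third (mode) argument. If it is
	// omitted, the permission bits of a newly created file are
	// unspecified.
	int fd = open(path, O_WRONLY | O_APPEND | O_CREAT | O_CLOEXEC, 0644);
	if (fd < 0) {
		// Report the failure and let the caller decide what to do;
		// the descriptor must not be used in this case.
		perror("open");
		return -1;
	}
	return fd;
}
```

Callers are expected to check for a negative return value before
writing to the descriptor.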
This commit does not enable any categories with this new behavior, but
it allows for critical categories to always be logged, regardless of
log level. The main use case for this is for `kprintf` messages.
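One way to picture the check, as a sketch only. `LogCategory`,
`alwaysLog`, and `shouldEmit` are hypothetical names, and levels are
assumed to be integers where a higher value means more severe:

```cpp
// Hypothetical category descriptor (not the actual darlingserver type).
struct LogCategory {
	const char* name;
	bool alwaysLog; // true for critical categories
};

// A message is emitted if its category is critical, or if its level
// is at least the configured minimum level.
bool shouldEmit(const LogCategory& category, int messageLevel, int minimumLevel) {
	return category.alwaysLog || messageLevel >= minimumLevel;
}
```

With this shape, a critical category bypasses the level filter
entirely, while every other category is still filtered as before.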
As the comment I added says, sometimes a process is killed while
user-suspended (e.g. when LLDB sends the kill signal while debugging).
In such cases, trying to save the state back to the process will fail
(since it no longer exists). We can safely ignore such errors, but let's
also log a warning just in case.
Do not mount /dev/shm with MS_NOEXEC flag on WSL1. A bug on WSL1
(https://github.com/microsoft/WSL/issues/8777) prevents files from
being mapped using mmap if the underlying filesystem is mounted
with MS_NOEXEC.
Darling can now be used without overlayfs by setting
the environment variable "DARLING_NOOVERLAYFS". Darling also
disables overlayfs when it detects itself running in a WSL1
environment.
Without overlayfs, Darling will have to recursively copy all files
and folders from LIBEXEC_PATH to DPREFIX.
- Implemented an alternative to pidfd_open for kernels older than 5.3.
mldr should send a "lifetime pipe" to darlingserver during process start.
When the process dies, darlingserver should receive a POLLHUP event.
- Set increased_limit.rlim_cur to default_limit.rlim_max on systems without
/proc/sys/fs/nr_open. On WSL1, this greatly increases the number of open file
descriptors available.
- For systems without NSpid in /proc/self/status, implemented a way to manage
thread IDs in darlingserver during checkin. darlingserver should receive a hint
address on the thread's stack, and then compare it with a stack pointer retrieved using
PTRACE_GETREGS.
- Avoided sending socket messages when msg_hdr.msg_name->sun_path is an empty string.
A null msg_name is used instead; otherwise, on some systems, the send would fail with EINVAL.
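The lifetime pipe mechanism from the first item can be sketched as
follows. This assumes the server ends up holding the read end while the
process keeps the write end open for as long as it lives; which end is
actually sent, and the helper name `peerHasDied`, are assumptions for
illustration:

```cpp
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: returns true once every write end of the pipe
// has been closed, which happens when the process holding it exits.
bool peerHasDied(int readFd, int timeoutMs) {
	struct pollfd pfd = {};
	pfd.fd = readFd;
	pfd.events = 0; // POLLHUP is always reported; it need not be requested

	int ready = poll(&pfd, 1, timeoutMs);
	return ready > 0 && (pfd.revents & POLLHUP) != 0;
}
```

Unlike pidfd_open, this needs no kernel support beyond pipes and
poll, since the kernel closes the process's end of the pipe when the
process dies.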
Debug logging produces *lots* of output *very* quickly, so that's
disabled by default now. The log level can be controlled with the new
`DSERVER_LOG_LEVEL` env var. Just set it to the minimum level
you want to see in the output. It defaults to "error" so that only
error messages are logged.
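A rough sketch of how such a variable could be read. Only the variable
name and the "error" default come from this commit; the `LogLevel` enum
and the accepted level names other than "error" are assumptions:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical level ordering: later enumerators are more severe.
enum class LogLevel { Debug, Info, Warning, Error };

// Maps the textual value to a level. An unset or unrecognized value
// falls back to Error, matching the documented default.
LogLevel parseLogLevel(const char* value) {
	if (value == nullptr) {
		return LogLevel::Error;
	}
	const std::string text(value);
	if (text == "debug") return LogLevel::Debug;
	if (text == "info") return LogLevel::Info;
	if (text == "warning") return LogLevel::Warning;
	return LogLevel::Error;
}

// Reads the minimum level from the environment.
LogLevel minimumLogLevel() {
	return parseLogLevel(std::getenv("DSERVER_LOG_LEVEL"));
}
```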
This is used to avoid the server reading incorrect/corrupted reply
contents for pushed replies. This was happening because clients were
sending the push-reply call with the pointer to the message contents,
but they were immediately returning after sending it. This led to a race
condition in which the server would sometimes read the data after the
client had already overwritten/discarded said data.
The thread might have died after sending the message, so
it might not exist by the time the server gets the message.
In that case, just ignore/drop the message.
We were previously always updating the timer deadline. This meant that,
when a later deadline than the current one came along, we would update
the deadline to the later one. In effect, we were scheduling a timer for
the latest deadline available rather than the earliest.
The fix involves keeping track of the current deadline and not updating
it if the new deadline is later than the current one. There is an option
to override this behavior, however, because sometimes the timer_call code
changes the deadline on us to a later time and we *do* want to update it
when it tells us to do so explicitly. For example, the deadline returned
by timer_queue_expire is definitive: that's definitely the next deadline
we want. The deadline passed to timer_queue_assign, on the other hand,
is merely a suggestion.
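The tracking logic can be sketched roughly like this. `DeadlineTracker`
and its members are hypothetical names; the real code lives in the
timer handling and may be structured differently:

```cpp
#include <cstdint>

// Hypothetical tracker for the currently armed timer deadline.
class DeadlineTracker {
public:
	// Returns true if the stored deadline changed, meaning the
	// underlying timer needs to be re-armed.
	// `force` is for definitive deadlines (e.g. the value returned by
	// timer_queue_expire); suggestions (e.g. the value passed to
	// timer_queue_assign) leave it false.
	bool update(uint64_t newDeadline, bool force = false) {
		if (!force && _armed && newDeadline >= _current) {
			// A suggested deadline that is not earlier than the
			// current one is ignored.
			return false;
		}
		_current = newDeadline;
		_armed = true;
		return true;
	}

	uint64_t current() const { return _current; }

private:
	uint64_t _current = 0;
	bool _armed = false;
};
```

Without the `force` path, a definitive later deadline could never
replace an earlier one that has already fired or been cancelled.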
We were writing out the path to the target process (i.e. the one we're
looking up), but we should instead write it out to the process that made
the call.
This resolves a race condition where we receive a call and then
immediately receive an interrupt while that call is still pending.
The new behavior is to go ahead and process the pending call, but we
trigger interrupt processing as soon as the call suspends.
See DarlingServer::Kqchan::MachPort::_read() for why this is necessary.
This fixes crashes in libkqueue due to out-of-order kqchannel messages,
mainly visible in aslmanager.
Together with the corresponding changes in mldr, darlingserver no longer
requires capabilities while running! The next step towards making
Darling completely unprivileged would be to remove SUID from the main
Darling binary, but that's a task for some other time.
I originally started doing this to see if some issues I was seeing with
LLDB were related to the capabilities in mldr, but it seems they're
unrelated.
What this means is that we no longer release and destroy Thread and
Process instances when the threads and processes they manage die.
Instead, we keep them alive to perform some cleanup (like finishing
active calls).
This should fix the duct-tape panic where threads and tasks are still
referenced at death.
Best of all, there don't seem to be any leaks with this approach: for
each `process dying` or `thread dying` message in the log, there's a
`process being destroyed` or `thread being destroyed` message later
on. This means we're not leaking any processes or threads.
This call needs to access lots of private thread members, so it's better
to provide a single private helper that handles the call in the Thread
class rather than have it all in a Call.
This allows kernel runner threads to be created as necessary to process
the work that comes in through `kernelAsync` and `kernelSync`.
There's currently a hardcoded max of 10 permanent kernel runners.
However, if the workload is too much, temporary runners can be spawned;
each temporary worker processes a single work item and then exits. There
is no limit on the number of temporary workers that can be spawned.
This commit allows Darling processes to convert private memory in other
Darling processes into shared memory that they can access. This is
necessary, e.g. for LLDB.
std::stoul is base 10 by default, so we were trying to process hex
values as decimal values (producing incorrect values, as expected).
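For illustration, a minimal sketch of the difference. `parseHex` is a
hypothetical helper, not a function from this codebase:

```cpp
#include <string>

// Hypothetical helper: parses an unprefixed (or 0x-prefixed)
// hexadecimal field by passing base 16 explicitly.
unsigned long parseHex(const std::string& text) {
	return std::stoul(text, nullptr, 16);
}

// With the default base of 10, parsing stops at the first non-decimal
// character, so a hex string like "7f00" silently yields 7.
unsigned long parseAsDecimal(const std::string& text) {
	return std::stoul(text);
}
```

Note that with base 10, an input that begins with a hex letter (such as
"ff") throws std::invalid_argument instead of returning a wrong value.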
Also, memoryRegionInfo now returns a structure with the info rather than
having everything passed in as a reference, just like memoryInfo was
recently changed to do as well. This should make it easier to add more
info fields later.