This patch implements the majority of the public interface for the new IPC
handle passing design.
The rough design is an expansion of `geckoargs` to allow passing
`UniqueFileHandle` arguments to child processes. This replaces the existing
extra options array to make the list of files explicit.
This currently just replaces things which were already passed this way on the
command line from outside of GeckoChildProcessHost. Note that this does not
migrate callers which were not already passing file handles using geckoargs,
and does not implement the actual OS-level support for passing arguments this
way.
Differential Revision: https://phabricator.services.mozilla.com/D221371
This patch removes no longer needed WorkerPrivate* parameters from the
runnable constructors. It also converts some instances of using
WorkerThreadRunnable as a base class when MainThreadWorkerRunnable is
more suitable.
Differential Revision: https://phabricator.services.mozilla.com/D222735
There are many other uses of OtherPid which could be switched over to
OtherChildID, but these were a couple of obvious low-hanging fruit use-cases
which will work better when using OtherChildID.
Differential Revision: https://phabricator.services.mozilla.com/D217120
This requires quite a bit of piping to get the ChildID passed everywhere where
we currently pass the pid in IPC. This is done by adding a new struct type
(EndpointProcInfo), which is passed around instead of OtherPid in these places,
and contains the full pid.
In most cases, it was a fairly painless change to move over, however in some
cases, more complex changes were required, as the pid was being stored
previously in something like an Atomic<...>, and needed to be switched to using
a mutex-protected value.
In the future, it may be possible to remove OtherPid from IPDL actors once
everything is migrated to ChildID, but we're still a long way off from that, so
for now we unfortunately need to pass both around.
Differential Revision: https://phabricator.services.mozilla.com/D217118
This disables the legacy Direct2D backend in Windows Nightly in favor
of using Accelerated Canvas2D instead. The goal of this experiment
is to collect performance and other bug feedback on AC2D versus Direct2D.
In general, AC2D has been fairly stable on other platforms, so it would be
nice to gradually move towards using AC2D on Windows as well so that we
are using one consistent acceleration solution.
This experiment is the first step towards that end.
Differential Revision: https://phabricator.services.mozilla.com/D217851
This disables the legacy Direct2D backend in Windows Nightly in favor
of using Accelerated Canvas2D instead. The goal of this experiment
is to collect performance and other bug feedback on AC2D versus Direct2D.
In general, AC2D has been fairly stable on other platforms, so it would be
nice to gradually move towards using AC2D on Windows as well so that we
are using one consistent acceleration solution.
This experiment is the first step towards that end.
Differential Revision: https://phabricator.services.mozilla.com/D217851
Rather than determine whether the previous GPU process was stable or
unstable when launching the subsequent process, do so immediately
following the process loss. This allows us to check the application's
foreground/background state at the time of the loss, and ensure that
we do not treat the process as unstable if we are killed whilst in the
background.
This is desireable because background process kills are expected
behavior on Android, and not in any way indicitive of the GPU process
being unstable. Treating them as unstable was leading to fallback to
software rendering and eventually the gpu process being disabled. This
patch is attempting to prevent that from occuring.
Previously added code that treated GPU processes that were *launched*
whilst in the background as stable can be removed, as it is no longer
required with this change.
Differential Revision: https://phabricator.services.mozilla.com/D217460
Extend the sync IPC timeout mechanism in CompositorManagerChild to
additionally cover UiCompositorControllerChild. As
UiCompositorControllerChild runs on the Android UI thread, we ensure
GPUProcessManager::KillProcess dispatches to the gecko main thread.
Along with the previous patch in this series this should provide us
with crash reports when the Android UI thread is hung waiting for the
GPU process to reply.
Differential Revision: https://phabricator.services.mozilla.com/D202167
When sync IPC under the top-level PCompositorManager protocol does not
reply within a certain time threshold we purposefully kill the GPU
process. While this allows the user to recover from a stuck GPU
process, we have little visibility about the underlying cause.
This patch makes it so that we generate a paired minidump for the GPU
and parent processes prior to killing the GPU process in
GPUProcessHost::KillHard(). The implementation roughly follows the
equivalent for content processes in ContentParent::KillHard().
As the GPU process can be purposefully killed during normal operation,
and because generating minidumps can be expensive, we are careful to
only do so when the new argument aGenerateMinidump is true. We
additionally remove the aReason argument as it is unused (and
currently innacurate in some places).
As these minidumps may not automatically submitted we limit the
minidumps generation to twice per session in order to avoid
accumulating a large number of unsubmitted minidumps on disk.
Differential Revision: https://phabricator.services.mozilla.com/D202166
Extend the sync IPC timeout mechanism in CompositorManagerChild to
additionally cover UiCompositorControllerChild. As
UiCompositorControllerChild runs on the Android UI thread, we ensure
GPUProcessManager::KillProcess dispatches to the gecko main thread.
Along with the previous patch in this series this should provide us
with crash reports when the Android UI thread is hung waiting for the
GPU process to reply.
Differential Revision: https://phabricator.services.mozilla.com/D202167
When sync IPC under the top-level PCompositorManager protocol does not
reply within a certain time threshold we purposefully kill the GPU
process. While this allows the user to recover from a stuck GPU
process, we have little visibility about the underlying cause.
This patch makes it so that we generate a paired minidump for the GPU
and parent processes prior to killing the GPU process in
GPUProcessHost::KillHard(). The implementation roughly follows the
equivalent for content processes in ContentParent::KillHard().
As the GPU process can be purposefully killed during normal operation,
and because generating minidumps can be expensive, we are careful to
only do so when the new argument aGenerateMinidump is true. We
additionally remove the aReason argument as it is unused (and
currently innacurate in some places).
As these minidumps may not automatically submitted we limit the
minidumps generation to twice per session in order to avoid
accumulating a large number of unsubmitted minidumps on disk.
Differential Revision: https://phabricator.services.mozilla.com/D202166
Extend the sync IPC timeout mechanism in CompositorManagerChild to
additionally cover UiCompositorControllerChild. As
UiCompositorControllerChild runs on the Android UI thread, we ensure
GPUProcessManager::KillProcess dispatches to the gecko main thread.
Along with the previous patch in this series this should provide us
with crash reports when the Android UI thread is hung waiting for the
GPU process to reply.
Differential Revision: https://phabricator.services.mozilla.com/D202167
When sync IPC under the top-level PCompositorManager protocol does not
reply within a certain time threshold we purposefully kill the GPU
process. While this allows the user to recover from a stuck GPU
process, we have little visibility about the underlying cause.
This patch makes it so that we generate a paired minidump for the GPU
and parent processes prior to killing the GPU process in
GPUProcessHost::KillHard(). The implementation roughly follows the
equivalent for content processes in ContentParent::KillHard().
As the GPU process can be purposefully killed during normal operation,
and because generating minidumps can be expensive, we are careful to
only do so when the new argument aGenerateMinidump is true. We
additionally remove the aReason argument as it is unused (and
currently innacurate in some places).
As these minidumps may not automatically submitted we limit the
minidumps generation to twice per session in order to avoid
accumulating a large number of unsubmitted minidumps on disk.
Differential Revision: https://phabricator.services.mozilla.com/D202166
`device_hardware_decoding_support` was previously only recorded on
Windows and we want to extend this probe to other platforms as well.
Differential Revision: https://phabricator.services.mozilla.com/D208245
`device_hardware_decoding_support` was previously only recorded on
Windows and we want to extend this probe to other platforms as well.
Differential Revision: https://phabricator.services.mozilla.com/D208245
There are modifications needed for PDMfactory and the decoder modules in
order to run their methods on non-mainthread and keep them threadsafe.
Differential Revision: https://phabricator.services.mozilla.com/D206420
The strong reference from CanvasManagerChild to WebGPUChild was never cleared
when the WebGPUChild dies, meaning that it would form a strong cycle through
the `Manager()` reference.
Under the new system, the WebGPUChild actor is kept alive by the IPC
connection, and will be torn down when the IPC connection dies.
Differential Revision: https://phabricator.services.mozilla.com/D198625
WorkerRunnable no longer keeps a raw pointer(mWorkerPrivate) for the associated WorkerPrivate in this patch.
Removing the WorkerRunnable::mWorkerPrivate needs to fix the following problems.
1. Thread assertions in WorkerRunnable::Dispatch()
To fix this problem, the associated WorkerPrivate is as a parameter and passed to WorkerRunnable::Dispatch() for the dispatching thread assertions. This associated WorkerPrivate is also propagated to PreDispatch() and PostDispatch() for the children classes of WorkerRunnable()
2. Get the associated WorkerPrivate in WorkerRunnable::Run() for environment setup(GlobabObject, JSContext setting for the runnable)
- For WorkerThreadRunnable
Since WorkerThreadRunnable is supposed to run on the worker thread, it does not need to keep a raw pointer to WorkerPrivate as its class member. GetCurrentThreadWorkerPrivate() should always get the correct WorkerPrivate for WorkerThreadRunnable.
- For WorkerParentThreadRunnable
WorkerParentRef is introduced to keep a RefPtr<WorkerPrivate> for WorkerParentThreadRunnable instead of using a raw pointer.
Checking the associated WorkerPrivate existence by WorkerParentRef at the beginning of WorkerParentThreadRunnable::Run(). If the Worker has already shut down, WorkerParentThreadRunnable cannot do anything with the associated WorkerPrivate, so WorkerParentThreadRunnable::Run() will return NS_OK directly but with a warning.
The associated WorkerPrivate is also passed into WorkerRun(), PreRun(), and PostRun(), so the majority of implementations of child classes of WorkerRunnable do not need to be changed.
If there are any cases in which the child classes of WorkerThreadRunnable/WorkerParentThreadRunnable want to keep the associated WorkerPrivate, they should use WorkerRefs instead of raw pointers.
Depends on D205679
Differential Revision: https://phabricator.services.mozilla.com/D207039
This is the first step in splitting the parent thread runnable out of WorkerRunnable.
To reuse the runnable dispatching codes in Worker, we still need a base class for runnable on the worker thread and the parent thread.
In this patch, we rename the original WorkerRunnable to WorkerThreadRunnable and make WorkerRunnable to be WorkerThreadRunnable's parent class.
In the second patch, we will create WorkerParentThreadRunnable and its sub-classes, split from WorkerThreadRunnable for runnable on the Worker's parent thread.
And in the third patch, we will re-structure the content of WorkerParentThreadRunnable to remove unnecessary members.
Differential Revision: https://phabricator.services.mozilla.com/D205178
There are modifications needed for PDMfactory and the decoder modules in
order to run their methods on non-mainthread and keep them threadsafe.
Differential Revision: https://phabricator.services.mozilla.com/D206420
There are modifications needed for PDMfactory and the decoder modules in
order to run their methods on non-mainthread and keep them threadsafe.
Differential Revision: https://phabricator.services.mozilla.com/D206420
This shares a global SharedContextWebgl among all instances of CanvasTranslator.
The goal is that regardless of how many windows are open, we only have to pay the
startup costs and shader compilation times for SharedContextWebgl once. In the
event that all CanvasTranslators are gone, the SharedContextWebgl is kept around
while its internal caches and textures are discarded to avoid significant memory
usage when no canvases are in use, while at the same time saving on startup
costs the next time a first live CanvasTranslator is created.
Differential Revision: https://phabricator.services.mozilla.com/D205977
Remote canvas can run in the GPU process, and if the GPU process
crashes, we need to notify the application using canvas. Historically we
just failed, and the application may have been able to continue drawing
but with the contents prior to the crash lost. Later we regressed to
prevent the canvas from being used at all.
This patch makes it so that we can restore functionality to any
application that supports the contextlost/contextrestored events. This
will allow for a theoretical complete graceful recovery for the user
with minimal disruption.
Differential Revision: https://phabricator.services.mozilla.com/D205608
Remote canvas can run in the GPU process, and if the GPU process
crashes, we need to notify the application using canvas. Historically we
just failed, and the application may have been able to continue drawing
but with the contents prior to the crash lost. Later we regressed to
prevent the canvas from being used at all.
This patch makes it so that we can restore functionality to any
application that supports the contextlost/contextrestored events. This
will allow for a theoretical complete graceful recovery for the user
with minimal disruption.
Differential Revision: https://phabricator.services.mozilla.com/D205608
We previously refactor canvas shutdown to account for the fact that they
needed to be shutdown in conjunction with the DOM worker reference
kept alive by the CanvasManagerChild. Unfortunately if the compositor
process crashes, or otherwise the CanvasManagerChild actor is torn down,
we also prematurely shutdown the canvas when it would previously
fallback to Skia in the content process.
This patch abstracts out canvas shutdown into the CanvasShutdownManager
which has the owning reference to the ThreadSafeWorkerRef. It corrects a
similar bug on the main thread as well for HTMLCanvasElement.
Differential Revision: https://phabricator.services.mozilla.com/D204988
We previously refactor canvas shutdown to account for the fact that they
needed to be shutdown in conjunction with the DOM worker reference
kept alive by the CanvasManagerChild. Unfortunately if the compositor
process crashes, or otherwise the CanvasManagerChild actor is torn down,
we also prematurely shutdown the canvas when it would previously
fallback to Skia in the content process.
This patch abstracts out canvas shutdown into the CanvasShutdownManager
which has the owning reference to the ThreadSafeWorkerRef. It corrects a
similar bug on the main thread as well for HTMLCanvasElement.
Differential Revision: https://phabricator.services.mozilla.com/D204988
This changes comes with several different refactorings all rolled into one,
unfotunately I couldn't find a way to pull them apart:
- First of all annotations now can either recorded (that is, we copy the value
and have the crash reporting code own the copy) or registered. Several
annotations are changed to use this functionality so that we don't need to
update them as their value change.
- The code in the exception handler is modified to read the annotations from
the mozannotation_client crate. This has the unfortunate side-effect that
we need three different bits of code to serialize them: one for annotations
read from a child process, one for reading annotations from the main process
outside of the exception handler and one for reading annotations from the
main process within the exception handler. As we move to fully
out-of-process crash reporting the last two methods will go away.
- The mozannotation_client crate now doesn't record annotation types anymore.
I realized as I was working on this that storing types at runtime has two
issues: the first one is that buggy code might change the type of an
annotation (that is record it under two different types at two different
moments), the second issue is that types might become corrupt during a
crash, so better enforce them at annotation-writing time. The end result is
that the mozannotation_* crates now only store byte buffers, track the
format the data is stored in (null-terminated string, fixed size buffer,
etc...) but not the type of data each annotation is supposed to contain.
- Which brings us to the next change: concrete types for annotations are now
enforced when they're written out. If an annotation doesn't match the
expected type it's skipped. Storing an annotation with the wrong type will
also trigger an assertion in debug builds.
Differential Revision: https://phabricator.services.mozilla.com/D195248