Extend the sync IPC timeout mechanism in CompositorManagerChild to
additionally cover UiCompositorControllerChild. As
UiCompositorControllerChild runs on the Android UI thread, we ensure
GPUProcessManager::KillProcess dispatches to the gecko main thread.
Along with the previous patch in this series this should provide us
with crash reports when the Android UI thread is hung waiting for the
GPU process to reply.
Differential Revision: https://phabricator.services.mozilla.com/D202167
When sync IPC under the top-level PCompositorManager protocol does not
reply within a certain time threshold we purposefully kill the GPU
process. While this allows the user to recover from a stuck GPU
process, we have little visibility into the underlying cause.
This patch makes it so that we generate a paired minidump for the GPU
and parent processes prior to killing the GPU process in
GPUProcessHost::KillHard(). The implementation roughly follows the
equivalent for content processes in ContentParent::KillHard().
As the GPU process can be purposefully killed during normal operation,
and because generating minidumps can be expensive, we are careful to
only do so when the new argument aGenerateMinidump is true. We
additionally remove the aReason argument as it is unused (and
currently inaccurate in some places).
As these minidumps may not be automatically submitted, we limit
minidump generation to twice per session in order to avoid
accumulating a large number of unsubmitted minidumps on disk.
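The gating described above could be sketched roughly as follows. This is a minimal illustration rather than the code in GPUProcessHost: MinidumpBudget and ShouldGenerateMinidump are hypothetical names, and the limit is passed in instead of being hard-coded.

```cpp
#include <atomic>
#include <cassert>

// Sketch only: a per-session cap on minidump generation.
// MinidumpBudget is a hypothetical helper, not a Gecko class.
class MinidumpBudget {
 public:
  explicit MinidumpBudget(int aMax) : mMax(aMax) { assert(aMax >= 0); }

  // Consumes one slot and returns true while the budget is not exhausted.
  bool TryAcquire() {
    int used = mUsed.load(std::memory_order_relaxed);
    while (used < mMax) {
      // On failure compare_exchange_weak reloads `used`, so the bound is
      // re-checked before retrying.
      if (mUsed.compare_exchange_weak(used, used + 1,
                                      std::memory_order_relaxed)) {
        return true;
      }
    }
    return false;
  }

 private:
  const int mMax;
  std::atomic<int> mUsed{0};
};

// Hypothetical stand-in for the decision made on the kill path: a dump is
// only generated when the caller asks for one and the budget allows it.
bool ShouldGenerateMinidump(MinidumpBudget& aBudget, bool aGenerateMinidump) {
  return aGenerateMinidump && aBudget.TryAcquire();
}
```

Kills that do not request a minidump leave the budget untouched, so routine shutdowns do not use up the two slots.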
Differential Revision: https://phabricator.services.mozilla.com/D202166
`device_hardware_decoding_support` was previously only recorded on
Windows and we want to extend this probe to other platforms as well.
Differential Revision: https://phabricator.services.mozilla.com/D208245
Modifications are needed in PDMFactory and the decoder modules in
order to run their methods off the main thread and keep them thread-safe.
Differential Revision: https://phabricator.services.mozilla.com/D206420
The strong reference from CanvasManagerChild to WebGPUChild was never cleared
when the WebGPUChild died, meaning that it would form a strong cycle through
the `Manager()` reference.
Under the new system, the WebGPUChild actor is kept alive by the IPC
connection, and will be torn down when the IPC connection dies.
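To illustrate the ownership change, here is a rough sketch that uses std::shared_ptr in place of Gecko's RefPtr. The types Manager, Child and Connection are hypothetical stand-ins for CanvasManagerChild, WebGPUChild and the IPC connection. The non-owning back-reference is an assumption made for the example, not necessarily how the patch is written.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Manager;

// Stand-in for the child actor: it holds a strong reference to its manager,
// analogous to the `Manager()` reference.
struct Child {
  explicit Child(std::shared_ptr<Manager> aManager)
      : mManager(std::move(aManager)) {
    assert(mManager);
  }
  std::shared_ptr<Manager> mManager;
};

// Stand-in for the manager: it only observes the child. If this were a
// shared_ptr, Manager and Child would keep each other alive forever.
struct Manager {
  std::weak_ptr<Child> mChild;
};

// Stand-in for the IPC connection, which owns the actors and releases them
// when it is closed.
struct Connection {
  std::vector<std::shared_ptr<Child>> mActors;
  void Close() { mActors.clear(); }
};
```

With this layout, closing the connection drops the last strong reference to the child, which in turn releases its reference to the manager.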
Differential Revision: https://phabricator.services.mozilla.com/D198625
WorkerRunnable no longer keeps a raw pointer (mWorkerPrivate) to the associated WorkerPrivate in this patch.
Removing WorkerRunnable::mWorkerPrivate requires fixing the following problems.
1. Thread assertions in WorkerRunnable::Dispatch()
To fix this problem, the associated WorkerPrivate is passed as a parameter to WorkerRunnable::Dispatch() for the dispatching thread assertions. It is also propagated to PreDispatch() and PostDispatch() for the child classes of WorkerRunnable.
2. Getting the associated WorkerPrivate in WorkerRunnable::Run() for environment setup (GlobalObject and JSContext setup for the runnable)
- For WorkerThreadRunnable
Since WorkerThreadRunnable is supposed to run on the worker thread, it does not need to keep a raw pointer to WorkerPrivate as a class member. GetCurrentThreadWorkerPrivate() should always return the correct WorkerPrivate for WorkerThreadRunnable.
- For WorkerParentThreadRunnable
WorkerParentRef is introduced to keep a RefPtr<WorkerPrivate> for WorkerParentThreadRunnable instead of using a raw pointer.
WorkerParentThreadRunnable::Run() begins by checking through WorkerParentRef whether the associated WorkerPrivate still exists. If the Worker has already shut down, WorkerParentThreadRunnable cannot do anything with the associated WorkerPrivate, so WorkerParentThreadRunnable::Run() returns NS_OK directly, but with a warning.
The associated WorkerPrivate is also passed into WorkerRun(), PreRun(), and PostRun(), so the majority of implementations of child classes of WorkerRunnable do not need to be changed.
If there are any cases in which the child classes of WorkerThreadRunnable/WorkerParentThreadRunnable want to keep the associated WorkerPrivate, they should use WorkerRefs instead of raw pointers.
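The parent-thread case can be sketched as below. This is a simplified model, not the Gecko implementation. All types carry a Sketch suffix to mark them as hypothetical, std::shared_ptr stands in for RefPtr, threading is ignored, and Run() returns a bool (whether the work ran) instead of NS_OK.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <utility>

// Hypothetical stand-in for WorkerPrivate.
struct WorkerPrivateSketch {
  int mRunCount = 0;
};

// Hypothetical stand-in for WorkerParentRef: it holds a strong reference to
// the worker that is dropped once the worker has shut down.
class WorkerParentRefSketch {
 public:
  explicit WorkerParentRefSketch(std::shared_ptr<WorkerPrivateSketch> aWorker)
      : mWorker(std::move(aWorker)) {
    assert(mWorker);
  }
  std::shared_ptr<WorkerPrivateSketch> Get() const { return mWorker; }
  void DropWorker() { mWorker.reset(); }

 private:
  std::shared_ptr<WorkerPrivateSketch> mWorker;
};

// Hypothetical stand-in for a WorkerParentThreadRunnable subclass.
class ParentThreadRunnableSketch {
 public:
  explicit ParentThreadRunnableSketch(
      std::shared_ptr<WorkerParentRefSketch> aRef)
      : mRef(std::move(aRef)) {
    assert(mRef);
  }

  // Returns true if WorkerRun() was invoked, false if the worker was gone.
  bool Run() {
    std::shared_ptr<WorkerPrivateSketch> worker = mRef->Get();
    if (!worker) {
      std::cerr << "warning: worker already shut down, skipping runnable\n";
      return false;
    }
    WorkerRun(*worker);
    return true;
  }

 private:
  // The worker is passed in, so the runnable never stores a raw pointer.
  void WorkerRun(WorkerPrivateSketch& aWorker) { ++aWorker.mRunCount; }

  std::shared_ptr<WorkerParentRefSketch> mRef;
};
```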
Depends on D205679
Differential Revision: https://phabricator.services.mozilla.com/D207039
This is the first step in splitting the parent thread runnable out of WorkerRunnable.
To reuse the runnable dispatching code in Worker, we still need a base class for runnables on the worker thread and the parent thread.
In this patch, we rename the original WorkerRunnable to WorkerThreadRunnable and make WorkerRunnable the parent class of WorkerThreadRunnable.
In the second patch, we will create WorkerParentThreadRunnable and its sub-classes, split from WorkerThreadRunnable, for runnables on the Worker's parent thread.
And in the third patch, we will restructure the content of WorkerParentThreadRunnable to remove unnecessary members.
Differential Revision: https://phabricator.services.mozilla.com/D205178
This shares a global SharedContextWebgl among all instances of CanvasTranslator.
The goal is that regardless of how many windows are open, we only have to pay the
startup costs and shader compilation times for SharedContextWebgl once. In the
event that all CanvasTranslators are gone, the SharedContextWebgl is kept around
while its internal caches and textures are discarded to avoid significant memory
usage when no canvases are in use, while at the same time saving on startup
costs the next time a first live CanvasTranslator is created.
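The lifetime policy described above could look roughly like this. It is only a sketch: SharedContextSketch and its methods are hypothetical, and a plain map stands in for the real caches and textures.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a process-wide shared context. The instance is
// created once and then kept for the lifetime of the process.
class SharedContextSketch {
 public:
  static SharedContextSketch& Get() {
    static SharedContextSketch sInstance;  // expensive setup happens once
    return sInstance;
  }

  // Called when a translator starts using the context.
  void AddUser() { ++mUsers; }

  // Called when a translator goes away. When the last user is gone, the
  // caches are discarded but the context itself is kept.
  void RemoveUser() {
    assert(mUsers > 0);
    if (--mUsers == 0) {
      mTextureCache.clear();
    }
  }

  void CacheTexture(const std::string& aKey, int aHandle) {
    mTextureCache[aKey] = aHandle;
  }
  std::size_t CachedTextures() const { return mTextureCache.size(); }

 private:
  SharedContextSketch() = default;

  int mUsers = 0;
  std::map<std::string, int> mTextureCache;
};
```

Keeping the instance while dropping its caches trades a small amount of resident state for not having to repeat startup work when the next canvas appears.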
Differential Revision: https://phabricator.services.mozilla.com/D205977
Remote canvas can run in the GPU process, and if the GPU process
crashes, we need to notify the application using canvas. Historically we
just failed, and the application may have been able to continue drawing
but with the contents prior to the crash lost. Later we regressed to
prevent the canvas from being used at all.
This patch makes it so that we can restore functionality to any
application that supports the contextlost/contextrestored events. This
will allow for a theoretical complete graceful recovery for the user
with minimal disruption.
Differential Revision: https://phabricator.services.mozilla.com/D205608
We previously refactored canvas shutdown to account for the fact that canvases
needed to be shut down in conjunction with the DOM worker reference
kept alive by the CanvasManagerChild. Unfortunately if the compositor
process crashes, or otherwise the CanvasManagerChild actor is torn down,
we also prematurely shut down the canvas when it would previously
fall back to Skia in the content process.
This patch abstracts out canvas shutdown into the CanvasShutdownManager
which has the owning reference to the ThreadSafeWorkerRef. It corrects a
similar bug on the main thread as well for HTMLCanvasElement.
Differential Revision: https://phabricator.services.mozilla.com/D204988
This change comes with several different refactorings all rolled into one;
unfortunately I couldn't find a way to pull them apart:
- First of all, annotations can now be either recorded (that is, we copy the value
and have the crash reporting code own the copy) or registered. Several
annotations are changed to use this functionality so that we don't need to
update them as their values change.
- The code in the exception handler is modified to read the annotations from
the mozannotation_client crate. This has the unfortunate side-effect that
we need three different bits of code to serialize them: one for annotations
read from a child process, one for reading annotations from the main process
outside of the exception handler and one for reading annotations from the
main process within the exception handler. As we move to fully
out-of-process crash reporting the last two methods will go away.
- The mozannotation_client crate no longer records annotation types.
I realized as I was working on this that storing types at runtime has two
issues: the first one is that buggy code might change the type of an
annotation (that is, record it under two different types at two different
moments), the second issue is that types might become corrupt during a
crash, so it's better to enforce them at annotation-writing time. The end result is
that the mozannotation_* crates now only store byte buffers, track the
format the data is stored in (null-terminated string, fixed size buffer,
etc...) but not the type of data each annotation is supposed to contain.
- Which brings us to the next change: concrete types for annotations are now
enforced when they're written out. If an annotation doesn't match the
expected type it's skipped. Storing an annotation with the wrong type will
also trigger an assertion in debug builds.
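As a rough illustration of enforcing types at write-out time, the sketch below keeps only raw bytes plus a storage format per annotation and checks them against a table of expected types when serializing. Every name here is hypothetical and does not reflect the mozannotation_* crates or the exception handler. The debug-build assertion on storing a wrongly typed annotation is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// How the bytes are stored (known at runtime) versus what the annotation is
// supposed to contain (fixed ahead of time). All names are hypothetical.
enum class StorageFormat { NulTerminatedString, FixedSizeBuffer };
enum class ExpectedType { String, Boolean };

struct StoredAnnotation {
  std::string key;
  StorageFormat format;
  // For strings, the characters without the terminator.
  std::vector<std::uint8_t> bytes;
};

// True when the stored bytes can be interpreted as the expected type.
bool MatchesExpectedType(const StoredAnnotation& aAnnotation,
                         ExpectedType aType) {
  switch (aType) {
    case ExpectedType::String:
      return aAnnotation.format == StorageFormat::NulTerminatedString;
    case ExpectedType::Boolean:
      return aAnnotation.format == StorageFormat::FixedSizeBuffer &&
             aAnnotation.bytes.size() == 1;
  }
  return false;
}

// Produces "key=value" lines, skipping annotations that are unknown or whose
// stored data does not match the expected type.
std::vector<std::string> WriteAnnotations(
    const std::vector<StoredAnnotation>& aStored,
    const std::map<std::string, ExpectedType>& aExpected) {
  std::vector<std::string> lines;
  for (const auto& annotation : aStored) {
    assert(!annotation.key.empty());
    auto it = aExpected.find(annotation.key);
    if (it == aExpected.end() ||
        !MatchesExpectedType(annotation, it->second)) {
      continue;
    }
    if (it->second == ExpectedType::String) {
      lines.push_back(annotation.key + "=" +
                      std::string(annotation.bytes.begin(),
                                  annotation.bytes.end()));
    } else {
      lines.push_back(annotation.key + "=" +
                      (annotation.bytes[0] ? "1" : "0"));
    }
  }
  return lines;
}
```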
Differential Revision: https://phabricator.services.mozilla.com/D195248
By using IDXGIOutput6, we can check whether the system has HDR enabled on Windows.
DeviceManagerDx::SystemHDREnabled() is expected to be called in the GPU process.
Differential Revision: https://phabricator.services.mozilla.com/D202186
We are seeing Android users running software webrender at a much
higher frequency than desired due to encountering too many unstable
GPU processes. We currently fall back to software webrender after
encountering 6 unstable GPU processes in total over the course of the
parent process' lifetime. This patch makes it so that the counter is
reset once a GPU process is declared stable, meaning we will only fall
back if enough crashes/errors are encountered in quick succession.
Hopefully this helps keep users running with a GPU process and
hardware acceleration over the course of a long session, whilst still
falling back if they hit an unrecoverable error/crash.
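The counting policy can be sketched as follows. GpuFallbackTrackerSketch is a hypothetical class, not GPUProcessManager. Falling back exactly when the count reaches the threshold is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical tracker for consecutive unstable GPU processes.
class GpuFallbackTrackerSketch {
 public:
  explicit GpuFallbackTrackerSketch(int aThreshold) : mThreshold(aThreshold) {
    assert(aThreshold > 0);
  }

  // Called when a GPU process crashes or errors before being declared
  // stable. Returns true once we should fall back to software rendering.
  bool OnProcessUnstable() { return ++mUnstableCount >= mThreshold; }

  // Called once a GPU process is declared stable: earlier failures no
  // longer count towards the fallback decision.
  void OnProcessStable() { mUnstableCount = 0; }

 private:
  const int mThreshold;
  int mUnstableCount = 0;
};
```

With a threshold of 6, five failures followed by a stable process and then five more failures do not trigger the fallback. Six failures in a row do.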
Differential Revision: https://phabricator.services.mozilla.com/D201204
Normally, when a D3D11Texture2D is copied by ID3D11DeviceContext::CopySubresourceRegion() with the compositor device, WebRender does not need to wait for the copy to complete, since WebRender also uses the compositor device.
But with non-Intel GPUs (like NVIDIA), there are cases where the copy's completion needs to be waited for explicitly, even with the compositor device.
mSyncObject->Synchronize() cannot be used with the compositor device.
The wait on the query is not performed in D3D11DXVA2Manager::CopyToImage(), since the wait could take a long time. Instead, the wait on the query is deferred until just before blitting for video overlay.
Differential Revision: https://phabricator.services.mozilla.com/D200041
This patch makes it so that we always shut down gracefully by making
us flush the event queue for CanvasRenderThread, after blocking on the
task queues draining. This allows our dependencies like RemoteTextureMap
to shut down successfully.
It also fixes a bug where CanvasTranslator::CanSend would still return
true on Windows while blocked in CanvasTranslator::ActorDestroy
waiting for the task queue to shut down. The CanSend status is only
updated after ActorDestroy is called from IProtocol::DestroySubtree.
Differential Revision: https://phabricator.services.mozilla.com/D199186