This patch adds a new layout profiling sub-category "LAYOUT_Printing" for the
markers added here.
I'm adding an "interval"-type marker ("AUTO_PROFILER_MARKER_TEXT") for the
function-calls that seem likely to occupy measurable amounts of time (due to
touching the filesystem or printer driver), vs. single-point-in-time markers
("PROFILER_MARKER_TEXT") for functions whose duration isn't particularly long
or interesting, or whose durations we're already measuring with other
closely-associated interval-markers.
Differential Revision: https://phabricator.services.mozilla.com/D191001
SymInitialize can fail with ERROR_INVALID_PARAMETER if some other piece
of code has already called it with the handle value that we are
providing. Therefore it is not recommended to pass GetCurrentProcess()
to this function. Instead, this patch duplicates the handle for the
current process, so that we can pass a unique handle to the current
process and thus avoid collision with handle values that other
components might pass to SymInitialize.
Differential Revision: https://phabricator.services.mozilla.com/D189360
Stack walking can currently produce crashes when we fail to delay-load
DbgHelp.dll. This patch ensures that the library is already loaded in
the process before we try to call any delay-imported function from it.
The patch also improves thread-safety for our DbgHelp initialization
code.
Differential Revision: https://phabricator.services.mozilla.com/D188956
Tail calls manipulate stack data, so the profiler's frame iterator needs
to know where the caller's RA and FP are stored. The platform now
preserves the temporary registers used to store FP/RA using collapse
frame operations.
Differential Revision: https://phabricator.services.mozilla.com/D183269
We currently fail to guarantee that OnEndDllLoad is called on the same
gLoaderObserver as OnBeginDllLoad. We must implement additional
synchronization to prevent a race condition where a call to
LoaderPrivateAPIImp::SetObserver would come in between the two and
change gLoaderObserver.
This has led to issues when using MOZ_PROFILER_STARTUP=1 where we would
have sStackWalkSuppressions reach (size_t)-1 instead of 0, later
resulting in deadlock or missing stacks. See bug 1687510 comment 10 for
extra details.
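
As an illustration of the invariant only (not the synchronization this
patch implements), the sketch below reads the global observer once and
uses that same pointer for both callbacks. All names are hypothetical,
the counter merely stands in for sStackWalkSuppressions, and observers
are assumed to outlive any in-flight load:

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for a loader observer; not the real interface.
struct SketchObserver {
  // Loosely modeled on sStackWalkSuppressions: an unpaired decrement
  // wraps this from 0 to (size_t)-1.
  std::size_t suppressions = 0;
  void OnBeginDllLoad() { ++suppressions; }
  void OnEndDllLoad() { --suppressions; }
};

// Hypothetical global, playing the role of gLoaderObserver.
std::atomic<SketchObserver*> gSketchObserver{nullptr};

void SketchSetObserver(SketchObserver* aObserver) {
  gSketchObserver.store(aObserver, std::memory_order_release);
}

// RAII guard: captures the observer at construction, so the end
// notification goes to the observer that received the begin
// notification, even if the global is swapped in between.
class SketchLoadNotification {
 public:
  SketchLoadNotification()
      : mObserver(gSketchObserver.load(std::memory_order_acquire)) {
    if (mObserver) {
      mObserver->OnBeginDllLoad();
    }
  }
  ~SketchLoadNotification() {
    if (mObserver) {
      mObserver->OnEndDllLoad();
    }
  }
  SketchLoadNotification(const SketchLoadNotification&) = delete;
  SketchLoadNotification& operator=(const SketchLoadNotification&) = delete;

 private:
  SketchObserver* const mObserver;
};
```

With this shape, swapping the observer while a SketchLoadNotification is
alive leaves both the old and the new observer with balanced counters.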
Depends on D181436
Differential Revision: https://phabricator.services.mozilla.com/D181437
InitializeWin64ProfilerHooks is called by the profiler to avoid deadlock
situations that can occur during stack walking. Actually, this is needed
not only for the profiler, but for any code that relies on stack
walking; and in particular the background hang monitor. So, let's move
this part outside of profiler code, and call it from the background
hang monitor.
Depends on D181435
Differential Revision: https://phabricator.services.mozilla.com/D181436
On Windows aarch64 and x64 builds, stack walking relies on
RtlLookupFunctionEntry. This can lead to deadlock, which we avoid in x64
builds by adding stack walking suppressions. We must do the same in
aarch64 builds to avoid the same deadlock situation, but we are
missing some stack walk suppression paths. Let's fix that.
Differential Revision: https://phabricator.services.mozilla.com/D181435
When the first hang is detected, the BHMgr Monitor thread needs to commit
5 more pages of stack to run profiler_suspend_and_sample_thread, which
contains big stack variables. If that occurs while we are low on memory,
failure to commit stack pages can crash the process.
In bug 1716727, we added delays on failed allocations to try to avoid
crashing the main process under low-memory conditions. These delays
could trigger the background hang monitor, which could in turn crash
the process, because they occur in a low-memory condition where we will
likely fail to commit.
We can pre-commit the 5 pages of stack at thread initialization to
ensure that they will already be committed when we later need them. Or at
least, we can try to see if that works.
We do that with a wrapper for the __chkstk function. We add a new test
in the NativeNt cppunit test, to ensure that our wrapper function
behaves as expected.
Differential Revision: https://phabricator.services.mozilla.com/D182582
Bug 1596930 added support for detouring a pattern of code used by eScan
Internet Security Suite. The patch also added tests to make sure
that we correctly detour this pattern.
The pattern involves a PUSH instruction followed by a RET instruction.
This pattern is forbidden by Intel CET, which enforces at RET time that
we always return to an address that was pushed on the stack by a
prior CALL instruction. Executing the pattern thus crashes if Intel CET
is active.
If CET is active, we must thus skip the execution part of the test, or
the test crashes. We will still check that our detouring code
recognized the pattern and detoured it, but we will not run the detoured
pattern anymore under active Intel CET.
Differential Revision: https://phabricator.services.mozilla.com/D163468
10 years ago (!) Bug 914190 already choked on the fact that bionic's
getline implementation could realloc a buffer using a function call
we cannot intercept, resulting in different memory allocators being used
to allocate and free the getline buffer.
This got hit again by bug 1850948, causing a backout. The approach taken
at that time (use std::getline) is neither future-proof (as demonstrated
by the backout) nor always satisfying (std::string has a few limitations
in terms of low-level buffer manipulation).
Provide our implementation for Android, as hinted by the original bug.
Differential Revision: https://phabricator.services.mozilla.com/D187270
By coupling the state of `signwidth` and `sign`, we provide enough
information to the compiler for it to get rid of an extra mov as a
result of `-ftrivial-auto-var-init`.
Differential Revision: https://phabricator.services.mozilla.com/D187047
This requires making the existing Decimal constructor constexpr, which is
incompatible with the weak linkage implied by MFBT_API.
As an alternative, provide a constexpr user-defined-literal that creates
a temporary DecimalLiteral that can be used by a new Decimal constexpr
constructor.
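
The shape of that alternative can be sketched as follows. The types are
heavily simplified, the `_d` suffix and the member names are
hypothetical, and export annotations are left out entirely, so this is
an illustration rather than the actual Decimal code:

```cpp
#include <cstdint>

// Hypothetical plain aggregate carrying the parsed literal.
struct SketchDecimalLiteral {
  std::uint64_t coefficient;
  int exponent;
};

// constexpr user-defined literal producing the temporary. Only integer
// literals are handled in this sketch.
constexpr SketchDecimalLiteral operator""_d(unsigned long long aValue) {
  return SketchDecimalLiteral{aValue, 0};
}

class SketchDecimal {
 public:
  // New constexpr constructor taking the literal; defined inline so it
  // can be evaluated at compile time.
  constexpr explicit SketchDecimal(SketchDecimalLiteral aLiteral)
      : mCoefficient(aLiteral.coefficient), mExponent(aLiteral.exponent) {}

  constexpr std::uint64_t coefficient() const { return mCoefficient; }
  constexpr int exponent() const { return mExponent; }

 private:
  std::uint64_t mCoefficient;
  int mExponent;
};

// Usage: the constant is built at compile time.
constexpr SketchDecimal kSketchHundred(100_d);
static_assert(kSketchHundred.coefficient() == 100,
              "literal is evaluated at compile time");
```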
Differential Revision: https://phabricator.services.mozilla.com/D184552
Mingw trunk recently gained the missing pieces we were #define'ing.
Unfortunately, the way we were doing that is not compatible with them
being there now, so we change it so that it works with both the current
version of mingw we use, and trunk by:
- using a typedef for HREPORT instead of a #define, which is the same
  declaration as in trunk
- because PWER_SUBMIT_RESULT is a typedef with the underlying
  WER_SUBMIT_RESULT type defined inline, we can't typedef it.
  Fortunately, there's only one thing using PWER_SUBMIT_RESULT in the
  old version of werapi.h in mingw (WerReportSubmit), so we #define
  it to change its definition instead.
- WER_MAX_PREFERRED_MODULES_BUFFER is a #define without parens in
  the new header, which would conflict, but #define'ing to the same
  value as in the new header, without the parens makes it work.
Differential Revision: https://phabricator.services.mozilla.com/D186412
We have discovered that clock_gettime(CLOCK_MONOTONIC) can be slow on
certain arm64 devices because Linux kernel workarounds for CPU errata
bypass the vDSO fast path. CLOCK_MONOTONIC_COARSE is
unaffected by these issues. This patch adds an implementation of
TimeStamp::NowLoRes() using the coarse clock, meaning that when lower
precision timestamps are adequate we do not pay the penalty of hitting
the slow path.
CLOCK_MONOTONIC_COARSE is Linux-specific, therefore its usage is
guarded by ifdefs as well as at runtime.
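
A rough sketch of the double guard on a POSIX system. The function names
are hypothetical, and the runtime check shown here (falling back when
the call fails) is an assumption, not necessarily what the patch does:

```cpp
#include <cstdint>
#include <time.h>  // clock_gettime, clockid_t (POSIX)

// Reads the given clock in nanoseconds; returns false if the call fails.
inline bool SketchReadClockNs(clockid_t aClock, std::uint64_t* aOut) {
  timespec ts;
  if (clock_gettime(aClock, &ts) != 0) {
    return false;
  }
  *aOut = static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ull +
          static_cast<std::uint64_t>(ts.tv_nsec);
  return true;
}

// Low-resolution "now": prefers the coarse clock when it is available at
// compile time and works at runtime, else uses CLOCK_MONOTONIC.
inline std::uint64_t SketchNowLoResNs() {
  std::uint64_t ns = 0;
#ifdef CLOCK_MONOTONIC_COARSE
  if (SketchReadClockNs(CLOCK_MONOTONIC_COARSE, &ns)) {
    return ns;
  }
#endif
  SketchReadClockNs(CLOCK_MONOTONIC, &ns);
  return ns;
}
```

Callers that need full precision would keep using the regular clock;
only callers for which a coarser reading is adequate would switch to the
low-resolution variant.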
Differential Revision: https://phabricator.services.mozilla.com/D185004
This also does minimal refactoring of cases where the directives were
protecting a simple expression that could be refactored back to the
callers.
Differential Revision: https://phabricator.services.mozilla.com/D184399