Improve the chance that we catch memory errors with PHC by:
* Increasing the number of PHC slots (from 64 to 4096 in most
cases).
* Lowering the delay before the first PHC allocation. With more slots we can
afford to trap allocations from earlier in Firefox's startup.
These changes improve PHC's chance of catching an error by 64x, at the cost of
an additional 1MB of allocation metadata (on systems with a 4KB page size). It
shouldn't impact performance more than enabling PHC already does.
This patch was originally developed by Randell Jesup.
Differential Revision: https://phabricator.services.mozilla.com/D161934
This changes replace_malloc_usable_size to allow measuring the size of
pointers that appear anywhere within an allocation page, not just pointers to
the start of the allocation pages.
This also modifies a gtest to explicitly check the usable size of some
modified pointers, not just the usable size of the pages where those
pointers are located.
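
As a rough illustration of looking up a usable size from an interior pointer,
here is a minimal sketch. It is not PHC's actual code: `PageTableSketch`,
`kPageSize` and `kNumSlots` are hypothetical names, and the guard pages that
PHC places around allocation pages are left out for brevity. The assumption is
that any address inside a page maps to the same slot as the page's start.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants for the sketch; PHC's real values differ.
constexpr size_t kPageSize = 4096;
constexpr size_t kNumSlots = 4;

struct PageTableSketch {
  uintptr_t mBase;                // Address of the first page, page-aligned.
  size_t mUsableSize[kNumSlots];  // Usable size per slot, 0 if unused.

  // Returns the usable size recorded for the page containing aPtr, or 0 if
  // aPtr lies outside the region managed by this table.
  size_t UsableSize(uintptr_t aPtr) const {
    if (aPtr < mBase || aPtr >= mBase + kNumSlots * kPageSize) {
      return 0;
    }
    // Integer division maps every address within a page to the same index,
    // so interior pointers resolve just like the page's first address.
    size_t index = (aPtr - mBase) / kPageSize;
    assert(index < kNumSlots);
    return mUsableSize[index];
  }
};
```

With a table such as `PageTableSketch{0x10000, {16, 0, 64, 0}}`, the lookups
for `0x10000` and `0x10000 + 100` both return 16.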
Differential Revision: https://phabricator.services.mozilla.com/D170963
See my comment here for more context on my investigation:
https://bugzilla.mozilla.org/show_bug.cgi?id=1779257#c9
The saved context is invalid once the function that called `getcontext`
returns. We need to call `getcontext` while the frame that calls it is
still on the stack. That's why this patch moves the call to `getcontext` into
the parent function, inlining the SyncPopulate content with a macro.
This has to be a macro rather than a function because the stack pointer address
will be invalid once `Registers::SyncPopulate` returns. I tried marking this
method inline, but that didn't help either.
Differential Revision: https://phabricator.services.mozilla.com/D170133
The system calls that release a chunk of memory can be costly and should be
made outside the arena lock's critical section so that other threads aren't
blocked waiting for the lock.
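
The general pattern can be sketched as follows. This is not the allocator's
code: `ArenaSketch` and its members are hypothetical, and `std::malloc` /
`std::free` stand in for the real chunk mapping and unmapping system calls.
The point is that the lock is held only while the list of chunks is detached,
and the expensive release happens after the lock is dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <vector>

class ArenaSketch {
 public:
  // Allocates a stand-in "chunk" and records it as spare.
  void AddSpareChunk(size_t aSize) {
    void* chunk = std::malloc(aSize);
    assert(chunk != nullptr);
    std::lock_guard<std::mutex> guard(mLock);
    mSpareChunks.push_back(chunk);
  }

  // Releases all spare chunks and returns how many were released.
  size_t ReleaseSpareChunks() {
    std::vector<void*> toRelease;
    {
      // Critical section: only detach the list, which is cheap.
      std::lock_guard<std::mutex> guard(mLock);
      toRelease.swap(mSpareChunks);
    }
    // The costly release runs without the lock, so other threads can use
    // the arena in the meantime.
    for (void* chunk : toRelease) {
      std::free(chunk);
    }
    return toRelease.size();
  }

 private:
  std::mutex mLock;
  std::vector<void*> mSpareChunks;
};
```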
Differential Revision: https://phabricator.services.mozilla.com/D166775
Some of the runtime options are compiled-in constants when MOZ_DEBUG is not
defined. But it can be useful to enable these configuration options without
enabling the rest of MOZ_DEBUG (e.g. assertions). This patch adds a new
preprocessor macro to enable runtime configuration.
Differential Revision: https://phabricator.services.mozilla.com/D165920
Poisoning was hardcoded, but to compare different allocators I wanted to
disable it. This patch lets us control poisoning with the MALLOC_OPTIONS
environment variable.
Differential Revision: https://phabricator.services.mozilla.com/D165919
In bug 1794059 it was noted that the IPC shared-memory allocation code
would like to be able to stall-and-retry as well using the same logic.
While it doesn't use VirtualAlloc, the principle is otherwise the same.
Shuffle the relevant code around so that the stall-and-retry logic is
separate from the allocation, in preparation for exporting it.
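
One way to picture the separation is a retry helper that knows nothing about
how the allocation is made. The sketch below is illustrative only:
`StallAndRetrySketch`, its parameters and its retry policy are hypothetical
and do not reflect the actual function or its tuning.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Calls aAlloc until it returns a non-null pointer or aMaxAttempts is
// reached, sleeping for aDelay between failed attempts. The allocation
// itself is supplied by the caller, so the same logic can wrap
// VirtualAlloc, a shared-memory mapping, or anything else.
template <typename AllocFn>
void* StallAndRetrySketch(AllocFn&& aAlloc, size_t aMaxAttempts,
                          std::chrono::milliseconds aDelay) {
  assert(aMaxAttempts > 0);
  for (size_t attempt = 0; attempt < aMaxAttempts; ++attempt) {
    if (void* result = aAlloc()) {
      return result;
    }
    // Do not stall after the final failed attempt.
    if (attempt + 1 < aMaxAttempts) {
      std::this_thread::sleep_for(aDelay);
    }
  }
  return nullptr;
}
```

A caller passes a lambda that performs one allocation attempt and returns
null on failure.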
Differential Revision: https://phabricator.services.mozilla.com/D164106
We want to be precise about types used here. Although in practice unsigned
is the same as uint32_t, it's not guaranteed. We want to definitely use
32-bit multiplication as it can be faster than 64-bit.
Differential Revision: https://phabricator.services.mozilla.com/D164889
Rename the `m` and `p` variables to match those used in the Hacker's Delight
book where the algorithm is presented. There were also some inconsistent
names in comments that this fixes.
Differential Revision: https://phabricator.services.mozilla.com/D164887
This structure is more efficient if it is roughly aligned with the system's
cache line length (which we assume is 64 bytes, although that's not always
true). This reduces the number of cache lines required to access one record on
average. On 32-bit systems we can manage 32 bytes; on 64-bit systems we can
manage 48 bytes. We do this by:
* Making mRunSize the number of pages in a run rather than bytes, so that it can
be stored in a single byte and save some space in bin headers.
* Making mNumRuns a uint32_t on all platforms.
Differential Revision: https://phabricator.services.mozilla.com/D140036
This code path chose between several ways of dividing numbers.
By calculating the inverse of the divisor early we can avoid all the
branches along this code path and make it faster than the previous code
or naive division.
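
For readers unfamiliar with the technique, here is a minimal sketch of
division by a precomputed inverse using 32-bit multiplication. It is not the
code from this patch: `FastDivisorSketch`, the shift of 17 and the range
checks are assumptions chosen so that the result can be shown to match
ordinary division for every numerator up to the stated maximum.

```cpp
#include <cassert>
#include <cstdint>

class FastDivisorSketch {
 public:
  // The shift amount ("p"). 17 is an assumption made for this sketch.
  static constexpr uint32_t kShift = 17;

  FastDivisorSketch(uint32_t aDivisor, uint32_t aMaxNumerator) {
    assert(aDivisor != 0);
    // Sufficient condition for exact results: n * (d - 1) < 2^p for all n.
    assert(uint64_t(aMaxNumerator) * (aDivisor - 1) <
           (uint64_t(1) << kShift));
    // The inverse ("m") is ceil(2^p / d), computed once up front.
    mInverse = ((uint32_t(1) << kShift) + aDivisor - 1) / aDivisor;
    // The 32-bit product n * m must not overflow.
    assert(uint64_t(aMaxNumerator) * mInverse <= UINT32_MAX);
  }

  // Branch-free division: one 32-bit multiply and one shift.
  uint32_t Divide(uint32_t aNumerator) const {
    return (aNumerator * mInverse) >> kShift;
  }

 private:
  uint32_t mInverse;
};

// Checks the fast path against naive division over the whole range.
bool MatchesNaiveDivision(uint32_t aDivisor, uint32_t aMaxNumerator) {
  FastDivisorSketch fast(aDivisor, aMaxNumerator);
  for (uint32_t n = 0; n <= aMaxNumerator; ++n) {
    if (fast.Divide(n) != n / aDivisor) {
      return false;
    }
  }
  return true;
}
```

The constructor pays for one real division; every later `Divide` call avoids
both the division instruction and any branching on the divisor.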
Differential Revision: https://phabricator.services.mozilla.com/D132322