While MOZ_MEMORY_BSD is set for kFreeBSD, FreeBSD and NetBSD, the only
use of MOZ_MEMORY_BSD is along a check for __GLIBC__ which can only be
true on GNU/kFreeBSD, which doesn't have a XP_ macro, but which we
usually match with __FreeBSD_kernel__.
--HG--
extra : rebase_source : 2f143f8ed641f4e338024a58e20c62b2b30e653a
For some reason, we don't have a XP_ macro for Android, and we generally use
ANDROID or MOZ_WIDGET_ANDROID. The former is more appropriate.
--HG--
extra : rebase_source : 50576584f194aaf6b090fd530598ce9954e170b3
Back when it was added (for Windows CE, in bug 488608), mozjemalloc was
C and all the supported compilers didn't support C99 bools. Now
mozjemalloc is C++, and all the supported compilers support C99 bools
for the cases where the type is used from C.
--HG--
extra : rebase_source : b9c710a0c48dc36cb473af59e3119131d13523ce
Back when mozjemalloc was considered third-party, and before bug
1365194, mozjemalloc was calling abort() and that was redirectory to
MOZ_CRASH through a moz_abort() function. Bug 1365194 changed that so
that moz_abort is called directly instead of abort, but the indirection
is actually not necessary anymore.
So we just kill moz_abort, which is unused anywhere else, and use
MOZ_CRASH directly.
Note this will (obviously) change crash signatures involving moz_abort.
--HG--
extra : rebase_source : 67698ffd8c5e52e62b9a0b7f28efb0352c8fe8ce
This patch moves measurement of ComputedValues objects from Rust to C++.
Measurement now happens (a) via DOM elements and (b) remaining elements via
the frame tree. Likewise for the style structs hanging off ComputedValues
objects.
Here is an example of the output.
> ├──27,600,448 B (26.49%) -- active/window(https://en.wikipedia.org/wiki/Barack_Obama)
> │ ├──12,772,544 B (12.26%) -- layout
> │ │ ├───4,483,744 B (04.30%) -- frames
> │ │ │ ├──1,653,552 B (01.59%) ── nsInlineFrame
> │ │ │ ├──1,415,760 B (01.36%) ── nsTextFrame
> │ │ │ ├────431,376 B (00.41%) ── nsBlockFrame
> │ │ │ ├────340,560 B (00.33%) ── nsHTMLScrollFrame
> │ │ │ ├────302,544 B (00.29%) ── nsContinuingTextFrame
> │ │ │ ├────156,408 B (00.15%) ── nsBulletFrame
> │ │ │ ├─────73,024 B (00.07%) ── nsPlaceholderFrame
> │ │ │ ├─────27,656 B (00.03%) ── sundries
> │ │ │ ├─────23,520 B (00.02%) ── nsTableCellFrame
> │ │ │ ├─────16,704 B (00.02%) ── nsImageFrame
> │ │ │ ├─────15,488 B (00.01%) ── nsTableRowFrame
> │ │ │ ├─────13,776 B (00.01%) ── nsTableColFrame
> │ │ │ └─────13,376 B (00.01%) ── nsTableFrame
> │ │ ├───3,412,192 B (03.28%) -- servo-style-structs
> │ │ │ ├──1,288,224 B (01.24%) ── Display
> │ │ │ ├────742,400 B (00.71%) ── Position
> │ │ │ ├────308,736 B (00.30%) ── Font
> │ │ │ ├────226,512 B (00.22%) ── Background
> │ │ │ ├────218,304 B (00.21%) ── TextReset
> │ │ │ ├────214,896 B (00.21%) ── Text
> │ │ │ ├────130,560 B (00.13%) ── Border
> │ │ │ ├─────81,408 B (00.08%) ── UIReset
> │ │ │ ├─────61,440 B (00.06%) ── Padding
> │ │ │ ├─────38,176 B (00.04%) ── UserInterface
> │ │ │ ├─────29,232 B (00.03%) ── Margin
> │ │ │ ├─────21,824 B (00.02%) ── sundries
> │ │ │ ├─────20,080 B (00.02%) ── Color
> │ │ │ ├─────20,080 B (00.02%) ── Column
> │ │ │ └─────10,320 B (00.01%) ── Effects
> │ │ ├───2,227,680 B (02.14%) -- computed-values
> │ │ │ ├──1,182,928 B (01.14%) ── non-dom
> │ │ │ └──1,044,752 B (01.00%) ── dom
> │ │ ├───1,500,016 B (01.44%) ── text-runs
> │ │ ├─────492,640 B (00.47%) ── line-boxes
> │ │ ├─────326,688 B (00.31%) ── frame-properties
> │ │ ├─────301,760 B (00.29%) ── pres-shell
> │ │ ├──────27,648 B (00.03%) ── pres-contexts
> │ │ └─────────176 B (00.00%) ── style-sets
The 'servo-style-structs' and 'computed-values' sub-trees are new. (Prior to
this patch, ComputedValues under DOM elements were tallied under the the
'dom/element-nodes' sub-tree, and ComputedValues not under DOM element were
ignored.) 'servo-style-structs/sundries' aggregates all the style structs that
are smaller than 8 KiB.
Other notable things done by the patch are as follows.
- It significantly changes the signatures of the methods measuring nsINode and
its subclasses, in order to handle the tallying of style structs separately
from element-nodes. Likewise for nsIFrame.
- It renames the 'layout/style-structs' sub-tree as
'layout/gecko-style-structs', to clearly distinguish it from the new
'layout/servo-style-structs' sub-tree.
- It adds some FFI functions to access various Rust-side data structures from
C++ code.
- There is a nasty hack used twice to measure Arcs, by stepping backwards from
an interior pointer to a base pointer. It works, but I want to replace it
with something better eventually. The "XXX WARNING" comments have details.
- It makes DMD print a line to the console if it sees a pointer it doesn't
recognise. This is useful for detecting when we are measuring an interior
pointer instead of a base pointer, which is bad but easy to do when Arcs are
involved.
- It removes the Rust code for measuring CVs, because it's now all done on the
C++ side.
MozReview-Commit-ID: BKebACLKtCi
--HG--
extra : rebase_source : 4d9a8c6b198a0ff025b811759a6bfa9f33a260ba
When recycling chunks, mozjemalloc tries to avoid keeping around more
than 128MB worth of chunks around, but it doesn't actually look at the
size of the chunks that are recycled, so if chunk larger than 128MB is
recycled, it is kept as a whole, going well over the limit.
The chunks are still properly reused, and further recycling doesn't
occur, but that can limit other mmap users from getting enough address
space.
With this change, mozjemalloc now doesn't keep more than 128MB, by
splitting the chunks it recycles if they are too large.
Note this was not a problem on Windows, where chunks larger than 1MB are
never recycled (per CAN_RECYCLE).
--HG--
extra : rebase_source : 6765fd30b78ca5ddc7d55aac861355d960e47828
When recycling chunks, mozjemalloc tries to avoid keeping around more
than 128MB worth of chunks around, but it doesn't actually look at the
size of the chunks that are recycled, so if chunk larger than 128MB is
recycled, it is kept as a whole, going well over the limit.
The chunks are still properly reused, and further recycling doesn't
occur, but that can limit other mmap users from getting enough address
space.
With this change, mozjemalloc now doesn't keep more than 128MB, by
splitting the chunks it recycles if they are too large.
Note this was not a problem on Windows, where chunks larger than 1MB are
never recycled (per CAN_RECYCLE).
--HG--
extra : rebase_source : f81e94c6960ba253037f77de1a39c9ff74651e80
The current default is 24, which is equal to the maximum number of stack frames
that DMD will record. And that's a terrible value because it splits up too many
related stack traces into separate records. There is no single best value, but
8 is a much better default.
--HG--
extra : rebase_source : c423fc4fe0e490ff6d58fa8f7116bc01c86a366e
Just one caller (in DMD) actually looks at it, and that's in an unimportant way
-- if the return value was false, mLength would be zero anyway.
--HG--
extra : rebase_source : 0463ab3765744742a9e854964342d631095fa55f
This patch does he following.
- Avoids some unnecessary casting.
- Renames the |bp| parameter as |aBp|.
- Makes the no-op FramePointerStackWalk() signature match the real one.
(Clearly it's dead code in all built configurations!)
--HG--
extra : rebase_source : 3fe606d1ff9b063294f4028ff884c20661ed9e0a
MozStackWalk() is different on Windows to the other platforms. It has two extra
arguments, which can be used to walk the stack of a different thread.
This patch makes those differences clearer. Instead of having a single function
and forbidding those two arguments on non-Windows, it removes those arguments
from MozStackWalk, and splits off MozStackWalkThread() which retains them. This
also allows those arguments to have more appropriate types (HANDLE instead of
uintptr_t; CONTEXT* instead of than void*) and names (aContext instead of
aPlatformData).
The patch also removes unnecessary reinterpret_casts for the aClosure argument
at a couple of MozStackWalk() callsites.
--HG--
extra : rebase_source : 111ab7d6426d7be921facc2264f6db86c501d127
Bug 1186064 removed most of it when we started requiring VS 2015u2, but
the "frex" function exported through mozglue.def.in was only used
through the MSVCRT being patched by fixcrt.py, which is not done anymore.
So the "frex" export is not used anymore, and so the "dumb_free_thunk"
function is not used anymore as well.
--HG--
extra : rebase_source : 879c469c317c8b6749410a4a476d6c951c9a1d0f
It causes abort() on errors from munmap/VirtualFree (which in practice
don't happen except maybe in case of memory corruption), and was only
enabled on debug builds.
--HG--
extra : rebase_source : fb0c8c82dc1e2d1f14a588a82144cf82e4f4f773
Since bug 1378258 remove malloc_print_stats, there are a bunch of
allocator stats that are now unused, reducing the memory footprint of
allocator metadata.
--HG--
extra : rebase_source : 337ef3b647c20119334b6576d591006f6bb3dd16
When initializing a new chunk for use as an arena, we started by zeroing
out the chunk (if that wasn't the case) and then initializing a new
arena chunk in there. It turns out this can have a noticeable overhead,
especially when e.g. the new arena chunk is used for a large allocation
filled out by something that is realloc()ated.
OTOH, the chunk recycle code only ever keeps zeroed or arena chunks
around (there is a "recycled" type too, but in practice, at the moment,
this means they were arena chunks before). Arena chunks that were
recycled were totally emptied, so all the runs they may contain will
contain zeroed-out or poisoned data. They also contain a header, that is
overwritten by the new arena chunk initialization.
This means we can get away with reusing non-zeroed recycled chunks
without zeroing them, as long as the arena chunk header marks the runs
as madvised instead of zeroed.
Code-wise, this would benefit from getting a ChunkType out of
chunk_alloc, but this would require more refactoring than I'm willing to
do at the moment.
Before returning a chunk, chunk_recycle calls pages_commit (when
MALLOC_DECOMMIT is enabled), which is guaranteed to zero the chunk.
The code further zeroing the chunk afterwards, which is now moved out to
chunk_alloc callers, never took advantage of that fact, duplicating the
effort of zeroing the chunk on Windows.
By indicating to the callers that the chunk has already been zeroed, we
allow callers to skip zeroing on their own.
The current code only allows chunk_calloc() callers to tell whether they
want zeroed memory or not, but some might be okay either way, assuming
they act accordingly afterwards. So move the zeroing out of chunk_alloc.
Many functions in the mozjemalloc codebase like to return the opposite
boolean one would tend to expect. Pages_purge is one of them, and this
reverses the logic to match expectations.
Also make it static.
It turns out that not recycling some kinds of chunk can lead to the
recycle queue being starved in some scenarios. When that happens, we end
up mmap()ing new memory, but that turns out to be significantly slower.
So instead of not recycling huge chunks, we force-clean them, before
madvising so that the pages can still be reclaimed in case of memory
pressure.
--HG--
extra : rebase_source : 2dbd028daca92c9cd7c8079eb3dc5a0cfa06495b
This avoids MozStackWalk(), which has become unusably slow on Mac due to
changes in libunwind, and gets us back to decent speed.
The code for getting the frame pointer and stack end was copied from the Gecko
Profiler, which also uses FramePointerStackWalk() on Mac.
--HG--
extra : rebase_source : 58c32c2df8716c7c8123a4a8fb692182d066caca
Going through the system zone allocator for every call to realloc/free
on OSX is costly, because the zone allocator needs to first verify that
the allocations do belong to the allocator it invokes (which ends up
calling jemalloc's malloc_usable_size), which is unnecessary when we
expect the allocations to belong to jemalloc.
So, we export the malloc/realloc/free/etc. symbols from
libmozglue.dylib, such that libraries and programs linked against it
call directly into jemalloc instead of going through the system zone
allocator, effectively shortcutting the allocator verification.
The risk is that some things in Gecko try to realloc/free pointers it
got from system libraries, if those were allocated with a system zone
that is not jemalloc.
--HG--
extra : rebase_source : ee0b29e1275176f52e64f4648dfa7ce25d61292e
MOZ_REPLACE_MALLOC_LINKAGE was added back when there were problems with
getting weak references working properly for replace-malloc.
Versions of OSX < 10.6 needed flat namespace, but aren't supported
anymore.
Versions of Xcode < 4.5 required flat namespace + a dummy library in
order to produce proper weak references. There is virtually nobody still
building with such an ancient toolchain.
Keeping those around doesn't /really/ hurt, except recent versions of
Xcode don't expose dyldinfo in /usr/bin, used for the configure test.
Consequently, MOZ_REPLACE_MALLOC_LINKAGE ended up being set to use the
dummy library setup, which, by using flat namespace, now causes harm in
bug 1356701, causing bug 1378332.
--HG--
extra : rebase_source : e3edc1f2cf905943c33fafeb631f2f88fc87167e
At the same time:
- replace "(nullptr)" with "nullptr",
- replace "x == nullptr" with "!x",
- replace "x != nullptr" with "x".
--HG--
extra : rebase_source : 5bc5577d0de53244afae9ebadb84962f38306368
malloc_print_stats is designed as a end-of-process dump of malloc stats,
but it has limited value in Firefox itself. It has some usefulness as
something to invoke from a debugger while Firefox is running (after
manually setting opt_print_stats to true), but it also lacks information
about live allocations, which would make it more useful.
All in all, if we want some stats about jemalloc allocations in a more
useful way, we'd be better off augmenting the jemalloc_stats() API and
make the data available to e.g. about:memory, than keeping
malloc_print_stats.
--HG--
extra : rebase_source : f6fbc97dfddd52c8620e37e2e189ae0087a845b0
Going through the system zone allocator for every call to realloc/free
on OSX is costly, because the zone allocator needs to first verify that
the allocations do belong to the allocator it invokes (which ends up
calling jemalloc's malloc_usable_size), which is unnecessary when we
expect the allocations to belong to jemalloc.
So, we export the malloc/realloc/free/etc. symbols from
libmozglue.dylib, such that libraries and programs linked against it
calls directly into jemalloc instead of going through the system zone
allocator, effectively shortcutting the allocator verification.
The risk is that some things in Gecko try to realloc/free pointers it
got from system libraries, if those were allocated with a system zone
that is not jemalloc.
--HG--
extra : rebase_source : 45b9b98499760a7f946878d41d2fdaadb6dff4d6