# DMD heap scan mode
Firefox's DMD heap scan mode tracks the set of all live blocks of
malloc-allocated memory and their allocation stacks, and allows you to
log these blocks, and the values stored in them, to a file. When
combined with cycle collector logging, this can be used to investigate
leaks of refcounted cycle collected objects, by figuring out what holds
a strong reference to a leaked object.

**When should you use this?** DMD heap scan mode is intended to be used
to investigate leaks of cycle collected (CCed) objects. DMD heap scan
mode is a "tool of last resort" that should only be used when all
other avenues have been tried and failed, except possibly [ref count
logging](refcount_tracing_and_balancing.md).
It is particularly useful if you have no idea what is causing the leak.
If you have a patch that introduces a leak, you are probably better off
auditing all of the strong references that your patch creates before
trying this.

The particular steps given below are intended for the case where the
leaked object is alive all the way through shutdown. You could modify
these steps for leaks that go away in shutdown by collecting a CC and
DMD log prior to shutdown. However, in that case it may be easier to use
refcount logging, or rr with a conditional breakpoint set on calls to
`Release()` for the leaking object, to see what object actually does the
release that causes the leaked object to go away.
## Prerequisites
- A debug DMD build of Firefox. [This page](dmd.md) describes how to do
  that. This should probably be an optimized build. Non-optimized DMD
  builds will generate better stack traces, but they can be so slow as
  to be useless. A sketch of a possible mozconfig is shown after this
  list.
- The build is going to be very slow, so you may need to disable some
  shutdown checks. First, in
  `toolkit/components/terminator/nsTerminator.cpp`, delete everything
  in `RunWatchDog` but the call to `NS_SetCurrentThreadName`. This
  will keep the watchdog from killing the browser when shutdown
  takes multiple minutes. Secondly, you may need to comment out the
  call to `MOZ_CRASH("NSS_Shutdown failed");` in
  `xpcom/build/XPCOMInit.cpp`, as this also seems to trigger when
  shutdown is extremely slow.
- You need the cycle collector analysis script `find_roots.py`, which
  can be downloaded as part of [this repo on
  Github](https://github.com/amccreight/heapgraph).
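For the first item, a minimal mozconfig sketch for an optimized debug
build with DMD enabled might look something like this; see the DMD page
linked above for the authoritative build instructions.

    # Sketch of a possible mozconfig for an optimized debug DMD build.
    ac_add_options --enable-debug
    ac_add_options --enable-optimize
    ac_add_options --enable-dmd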
## Generating Logs
The next step is to generate a number of log files. You need to get a
shutdown CC log and a DMD log from a single run.

**Definitions:** I'll write `$objdir` for the object directory of your
Firefox DMD build, `$srcdir` for the top level of the Firefox source
directory, `$heapgraph` for the location of the heapgraph repo, and
`$logdir` for the location you want logs to go to. `$logdir` should end
in a path separator. For instance, `~/logs/leak/`.
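If you want to use these names literally in the commands below, you can
define them as shell variables first. The paths here are only examples;
substitute your own.

    # Example values only; adjust the paths for your machine.
    objdir=~/mozilla/obj-dmd
    srcdir=~/mozilla/mozilla-central
    heapgraph=~/tools/heapgraph
    logdir=~/logs/leak/   # note the trailing path separator
    mkdir -p $logdir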
The command you need to run Firefox will look something like this:

    XPCOM_MEM_BLOAT_LOG=1 MOZ_CC_LOG_SHUTDOWN=1 MOZ_DISABLE_CONTENT_SANDBOX=t MOZ_CC_LOG_DIRECTORY=$logdir MOZ_CC_LOG_PROCESS=content MOZ_CC_LOG_THREAD=main MOZ_DMD_SHUTDOWN_LOG=$logdir MOZ_DMD_LOG_PROCESS=tab ./mach run --dmd --mode=scan
Breaking this down:

- `XPCOM_MEM_BLOAT_LOG=1`: This reports a list of the counts of every
  object created and destroyed that is tracked by the XPCOM leak
  tracking system. From this list, you can see how many objects of a
  particular type were leaked through shutdown. This can come in handy
  during the manual analysis phase later, to get evidence to support
  your hunches. For instance, if you think that an `nsFoo` object
  might be holding your leaking object alive, you can use this to
  easily see if we leaked an `nsFoo` object. (See the note after this
  list for one way to keep a copy of this output.)
- `MOZ_CC_LOG_SHUTDOWN=1`: This generates a cycle collector log during
  shutdown. Creating this log during shutdown is nice because there
  are fewer things unrelated to the leak in the log, and various cycle
  collector optimizations are disabled. A garbage collector log will
  also be created, which you may not need.
- `MOZ_DISABLE_CONTENT_SANDBOX=t`: This disables the content process
  sandbox, which is needed because the DMD and CC log files are
  created directly by the child processes.
- `MOZ_CC_LOG_DIRECTORY=$logdir`: This selects the location for cycle
  collector logs to be saved.
- `MOZ_CC_LOG_PROCESS=content MOZ_CC_LOG_THREAD=main`: These options
  specify that we only want CC logs for the main thread of content
  processes, to make shutdown less slow. If your leak is happening in
  a different process or thread, change the options, which are listed
  in `xpcom/base/nsCycleCollector.cpp`.
- `MOZ_DMD_SHUTDOWN_LOG=$logdir`: This option specifies that we want a
  DMD log to be taken very late in XPCOM shutdown, and the location
  for that log to be saved. Like with the CC log, we want this log
  very late to avoid as many non-leaking things as possible.
- `MOZ_DMD_LOG_PROCESS=tab`: As with the CC log options, this means
  that we only want these logs in content processes, in order to make
  shutdown faster. The allowed values here are the same as those
  returned by `XRE_GetProcessType()`, so adjust as needed.
- Finally, the `--dmd` option needs to be passed in so that DMD will be
  run. `--mode=scan` is needed so that the entire contents of each
  block of memory are saved in the DMD log for later analysis.
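With `XPCOM_MEM_BLOAT_LOG=1`, the bloat tables should end up in the
terminal output when the processes shut down. One optional way to keep a
copy of them for the analysis phase is to capture the whole run's output
with `tee`; the output file name here is just an example.

    # Optional: same command as above, but keeping a copy of everything
    # the run prints (including the XPCOM_MEM_BLOAT_LOG tables).
    XPCOM_MEM_BLOAT_LOG=1 MOZ_CC_LOG_SHUTDOWN=1 MOZ_DISABLE_CONTENT_SANDBOX=t MOZ_CC_LOG_DIRECTORY=$logdir MOZ_CC_LOG_PROCESS=content MOZ_CC_LOG_THREAD=main MOZ_DMD_SHUTDOWN_LOG=$logdir MOZ_DMD_LOG_PROCESS=tab ./mach run --dmd --mode=scan 2>&1 | tee $logdir/run-output.txt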
With that command line in hand, you can start Firefox. Be aware that
this may take multiple minutes if you have optimization disabled.

Once it has started, go through the steps you need to reproduce your
leak. If your leak is a ghost window, it can be handy to get an
`about:memory` report and write down the PID of the leaking process. You
may want to wait 10 or so seconds after this to make sure as much as
possible is cleaned up.

Next, exit the browser. This will cause a lot of logs to be written out,
so it can take a while.
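When shutdown finishes, the logs land in `$logdir`. Roughly speaking,
with the options above and a single content process, you should see
something like the listing below. The PID in the file names (`15873`
here, reused from the example later in this page) will be whatever your
content process's PID was.

    $ ls $logdir
    cc-edges.15873.log     # cycle collector log from the content process
    gc-edges.15873.log     # garbage collector log (often not needed)
    dmd-15873.log.gz       # DMD heap scan log from the same process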
## Analyzing the Logs
### Getting the PID and address of the leaking object
The first step is to figure out the **PID** of the leaking process. The
second step is to figure out **the address of the leaking object**,
usually a window. Conveniently, you can usually do both at once using
the cycle collector log. If you are investigating a leak of
`www.example.com`, then from `$logdir` you can do
`grep nsGlobalWindow cc-edges* | grep example.com`. This looks through
all of the windows in all of the CC logs (which may have leaked, this
late in shutdown), and then keeps only the windows whose URL contains
`example.com`.
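Spelled out as a command (the site name is whatever you are
investigating):

    cd $logdir
    grep nsGlobalWindow cc-edges* | grep example.com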
The result of that grep will contain output that looks something like
this:

    cc-edges.15873.log:0x7f0897082c00 [rc=1285] nsGlobalWindowInner # 2147483662 inner https://www.example.com/

`cc-edges.15873.log`: The first part is the file name where it was
found. `15873` is the PID of the process that leaked. You'll want to
write down the name of the file and the PID. Let's call the file
`$cclog` and the pid `$pid`.

`0x7f0897082c00`: This is the address of the leaking window. You'll
also want to write that down. Let's call this `$winaddr`.
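Using the values from this example (substitute the ones from your own
grep output), that would be:

    # Values from the example output above; use your own.
    cclog=cc-edges.15873.log
    pid=15873
    winaddr=0x7f0897082c00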
If there are multiple files, you'll end up with one that looks like
`cc-edges.$pid.log` and one or more that look like
`cc-edges.$pid-$n.log` for various values of `$n`. You want the one with
the largest `$n`, as this was recorded the latest, and so it will
contain the least non-garbage.
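One quick way to check which of these logs was written last is to sort
them by modification time, newest first:

    ls -lt $logdir/cc-edges.$pid*.log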
### Identifying the root in the cycle collector log
The next step is to figure out why the cycle collector could not collect
the window, using the `find_roots.py` script from the heapgraph
repository. The command to invoke this looks like this:

    python $heapgraph/find_roots.py $cclog $winaddr

This may take a few seconds. It will eventually produce some output.
You'll want to save a copy of this output for later.

The output will look something like this, after a message about loading
progress:

    0x7f0882fe3230 [FragmentOrElement (xhtml) script https://www.example.com]
    --[[via hash] mListenerManager]--> 0x7f0899b4e550 [EventListenerManager]
    --[mListeners event=onload listenerType=3 [i]]--> 0x7f0882ff8f80 [CallbackObject]
    --[mIncumbentGlobal]--> 0x7f0897082c00 [nsGlobalWindowInner # 2147483662 inner https://www.example.com]

    Root 0x7f0882fe3230 is a ref counted object with 1 unknown edge(s).
    known edges:
    0x7f08975a24c0 [FragmentOrElement (xhtml) head https://www.example.com] --[mAttrsAndChildren[i]]--> 0x7f0882fe3230
    0x7f08967e7b20 [JS Object (HTMLScriptElement)] --[UnwrapDOMObject(obj)]--> 0x7f0882fe3230

The first two lines mean that the script element `0x7f0882fe3230`
contains a strong reference to the EventListenerManager
`0x7f0899b4e550`. "[via hash] mListenerManager" is a description of
that strong reference. Together, these lines show a chain of strong
references from an object the cycle collector thinks needs to be kept
alive, `0x7f0882fe3230`, to the object `0x7f0897082c00` that you asked
about. Most of the time, the actual chain is not important, because the
cycle collector can only tell us about what went right. Let us call the
address of the leaking object (`0x7f0882fe3230` in this case)
`$leakaddr`.

Besides `$leakaddr`, the other interesting part is the chunk at the
bottom. It tells us that there is 1 unknown edge, and 2 known edges.
What this means is that the leaking object has a refcount of 3, but the
cycle collector was only told about two of those references: in this
case, one from a head element and one from a JS object (the JS reflector
of the script element). We need to figure out what the unknown reference
is from, as that is where our leak really is.
### Figuring out what is holding the leaking object alive
Now we need to use the DMD heap scan logs. These contain the contents of
every live block of memory.

The first step to using the DMD heap scan logs is to do some
pre-processing of the DMD log. Stacks need to be symbolicated, and we
need to clamp the values contained in the heap. Clamping is the same
kind of analysis that a conservative GC does: if a word-aligned value in
a heap block points to somewhere within another heap block, replace that
value with the address of the block.

Both kinds of preprocessing are done by the `dmd.py` script, which can
be invoked like this:

    $objdir/dist/bin/dmd.py --clamp-contents dmd-$pid.log.gz

This can take a few minutes due to symbolication, but you only need to
run it once on a log file.

After that is done, we can finally find out which objects (possibly)
point to other objects, using the `block_analyzer.py` script:

    python $srcdir/memory/replace/dmd/block_analyzer.py dmd-$pid.log.gz $leakaddr

This will look through every block of memory in the log, and give some
basic information about any block of memory that (possibly) contains a
pointer to that object. You can pass some additional options to affect
how the results are displayed: `-sfl 10000 -a` is useful. The
`-sfl 10000` tells it to not truncate stack frames, and `-a` tells it to
not display generic frames related to the allocator.
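Putting those options together, the invocation looks something like
this:

    python $srcdir/memory/replace/dmd/block_analyzer.py -sfl 10000 -a dmd-$pid.log.gz $leakaddr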
Caveat: I think `block_analyzer.py` does not attempt to clamp the
address you pass into it, so if it is an offset from the start of the
block, it won't find it.

`block_analyzer.py` will return a series of entries that look like this
(with the `[...]` indicating where I have removed things):

    0x7f089306b000 size = 4096 bytes at byte offset 2168
    nsAttrAndChildArray::GrowBy[...]
    nsAttrAndChildArray::InsertChildAt[...]
    [...]

`0x7f089306b000` is the address of the block that contains `$leakaddr`.
4096 bytes is the size of that block. That can be useful for confirming
your guess about what class the block actually is. The byte offset tells
you where in the block the pointer is. This is mostly useful for larger
objects, and you can potentially combine this with debugging information
to figure out exactly what field this is. The rest of the entry is the
stack trace for the allocation of the block, which is the most useful
piece of information.

What you need to do now is to go through every one of these entries and
place it into three categories: strong reference known to the cycle
collector, weak reference, or something else! The goal is to eventually
shrink down the "something else" category until there are only as many
things in it as there are unknown references to the leaking object, and
then you have your leaker.

To place an entry into one of the categories, you must look at the code
locations given in the stack trace, and see if you can tell what the
object is based on that, then compare that to what `find_roots.py` told
you.

For instance, one of the strong references in the CC log is from a head
element to its child via `mAttrsAndChildren`, and that sounds a lot like
this entry, so we can mark it as being a known strong reference.

This is an iterative process, where you first go through and mark off
the things that are easily categorizable, and repeat until you have a
small list of things to analyze.
### Example analysis of block_analyzer.py results
In one debugging session where I was investigating the leak from bug
1451985, I eventually reduced the list of entries until this was the
most suspicious looking entry:

    0x7f0892f29630 size = 392 bytes at byte offset 56
    mozilla::dom::ScriptLoader::ProcessExternalScript[...]
    [...]

I went to that line of `ScriptLoader::ProcessExternalScript()`, and it
contained a call to `ScriptLoader::CreateLoadRequest()`. Fortunately,
this method mostly just contains two calls to `new`, one for
`ScriptLoadRequest` and one for `ModuleLoadRequest`. (This is where an
unoptimized build comes in handy, as it would have pointed out the exact
line. Unfortunately, in this particular case, the unoptimized build was
so slow I wasn't getting any logs.) I then looked through the list of
leaked objects generated by `XPCOM_MEM_BLOAT_LOG` and saw that we were
leaking a `ScriptLoadRequest`, so I went and looked at its class
definition, where I noticed that `ScriptLoadRequest` had a strong
reference to an element that it wasn't telling the cycle collector
about, which seemed suspicious.

The first thing I did to try to confirm that this was the source of the
leak was to pass the address of this object into the cycle collector
analysis script, `find_roots.py`, that we used at an earlier step.
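The invocation is the same as in the earlier step, just with the address
of the suspicious block instead of the window; the addresses here are
the ones from this example.

    python $heapgraph/find_roots.py $cclog 0x7f0892f29630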
That gave a result that contained this:

    0x7f0882fe3230 [FragmentOrElement (xhtml) script [...]
    --[mNodeInfo]--> 0x7f0897431f00 [NodeInfo (xhtml) script]
    [...]
    --[mLoadingAsyncRequests]--> 0x7f0892f29630 [ScriptLoadRequest]
This confirms that this block is actually a `ScriptLoadRequest`.
Secondly, notice that the load request is being held alive by the very
same script element that is causing the window leak! This strongly
suggests that there is a cycle of strong references between the script
element and the load request. I then added the missing field to the
traverse and unlink methods of `ScriptLoadRequest`, and confirmed that I
couldn't reproduce the leak.

Keep in mind that you may need to run `block_analyzer.py` multiple
times. For instance, if the script element was being held alive by some
container that was in turn being held alive by a runnable, we'd first
need to figure out that the container was holding the element. If the
allocation stack doesn't make it possible to figure out what is holding
the containing block alive, you'd have to run `block_analyzer.py` again,
this time on the address of that containing block. This isn't too bad,
because unlike ref count logging, we have the full state of memory in
our existing log, so we don't need to run the browser again.
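In other words, you feed the address of the containing block back into
`block_analyzer.py` and repeat. Using the block address from the earlier
example:

    python $srcdir/memory/replace/dmd/block_analyzer.py -sfl 10000 -a dmd-$pid.log.gz 0x7f089306b000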