How to debug memory leaks/refcnt leaks

Last update: October 21st, 1999

What tools do we have?

The Mozilla team has developed a number of memory analysis tools to augment commercial tools like Purify. These can help us more quickly spot and fix memory leaks and memory bloat (our term for taking up too much memory, aka footprint). Here's a list of what we have at our disposal:
1. BloatView
2. Boehm GC Leak Detector
3. Refcount Tracing
4. Leaky
5. Purify
More description on each of these is provided below.

How to turn on refcnt/memory logging

Assuming you have a build with refcnt logging enabled (we'll tell you how to do that later), here's what you have to do to use it. Each of the log environment variables is set to a value that says where the log goes; the examples in this document use 1 (log to the console) or the name of a file. The log environment variables are:
XPCOM_MEM_BLOAT_LOG
If this environment variable is set, xpcom will use the "bloat" trackers. The bloat trackers gather data for the BloatView output that is produced when the program exits, when about:bloat is loaded, or when nsTraceRefcnt::DumpStatistics is called.

When an addref/release/ctor/dtor call is made, the data is logged and attributed to the particular data type.

By default, enabling this environment variable will cause the BloatView software to dump out the entire database of collected data. If all you want to see is the data for objects that leaked, set the environment variable XPCOM_MEM_LEAK_LOG.

XPCOM_MEM_LEAK_LOG
This is basically a subset of XPCOM_MEM_BLOAT_LOG. It only shows classes that had objects that were leaked, instead of statistics for all classes.
XPCOM_MEM_REFCNT_LOG
Setting this environment variable enables refcount tracing.
Only enable this for severe pain (unless you are using refcount tracing or leaky, see below). What this does is to enable logging (to stdout) of each and every call to addref/release without discrimination to the types involved. The output includes mapping the call-stacks at the time of the call to symbolic forms (on platforms that support this) and thus will be *very* *VERY* *VERY* slow. Did I say slow?
XPCOM_MEM_ALLOC_LOG
For losing architectures (those that don't have stack-crawl software written for them), xpcom supports logging at the *call site* to AddRef/Release using the usual cpp __FILE__ and __LINE__ number macro expansion hackery. This results in slower code, but at least you get *some* data about where the leaks might be occurring from.
XPCOM_MEM_LEAKY_LOG
For platforms that support leaky, xpcom will endeavor to find at run time the symbols "__log_addref" and "__log_release". If they are found, then instead of doing the slow, painful stack crawls at program execution time, it will pass the buck to the leaky software. This will allow your program to actually run in user-friendly real time, but does require that your platform support leaky. Currently only Linux supports leaky.
In addition, the following variable may be set to a list of class names:
XPCOM_MEM_LOG_CLASSES
Instead of slowing to a useless crawl, you can slow to a mere crawl by using this option. When enabled, the xpcom logging software will look for the XPCOM_MEM_LOG_CLASSES environment variable (for platforms that support getenv). The variable contains a comma-separated list of names which will be compared against the types of the objects being logged. For example:
env XPCOM_MEM_LOG_CLASSES=nsWebShell XPCOM_MEM_REFCNT_LOG=1 ./apprunner
will show you just the AddRef/Release calls to instances of nsWebShell while running apprunner. Note that setting XPCOM_MEM_LOG_CLASSES will also list the serial number of each object that leaked in the "bloat log" (that is, the file specified by the XPCOM_MEM_BLOAT_LOG variable). An object's serial number is simply a unique number, starting at one, that is assigned to the object when it is allocated.
You may use an object's serial number with the following variable to further restrict the reference count tracing:
XPCOM_MEM_LOG_OBJECTS
Set this variable to a comma-separated list of object serial numbers. When this is set, along with XPCOM_MEM_LOG_CLASSES and XPCOM_MEM_REFCNT_LOG, a stack trace will be generated for only the specific objects that you list. For example,
env XPCOM_MEM_LOG_CLASSES=nsWebShell XPCOM_MEM_LOG_OBJECTS=2 XPCOM_MEM_REFCNT_LOG=1 ./apprunner
will dump stack traces to the console for the 2nd nsWebShell object that gets allocated, and nothing else.
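The way the class list and the serial-number list narrow the logging can be sketched in a few lines. This is only an illustration of the filtering described above, not xpcom's actual code: SplitCommaList, ListFromEnv and ShouldLog are hypothetical names, and treating an empty list as "match everything" is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>
#include <sstream>
#include <string>

// Split a comma-separated list such as "nsWebShell,nsView" into a set.
// Empty items (from stray commas) are ignored.
std::set<std::string> SplitCommaList(const std::string& list)
{
  std::set<std::string> items;
  std::istringstream in(list);
  std::string item;
  while (std::getline(in, item, ',')) {
    if (!item.empty()) {
      items.insert(item);
    }
  }
  return items;
}

// Read a list from the environment; an unset variable yields an empty set.
std::set<std::string> ListFromEnv(const char* name)
{
  const char* value = std::getenv(name);
  return SplitCommaList(value ? value : "");
}

// Hypothetical filter: decide whether one AddRef/Release call gets logged.
// Assumption: an empty class list means "every class" and an empty
// serial-number list means "every object of the selected classes".
bool ShouldLog(const std::set<std::string>& classes,
               const std::set<std::string>& serials,
               const std::string& className,
               unsigned long serialNumber)
{
  assert(serialNumber >= 1);  // serial numbers start at one
  if (!classes.empty() && classes.count(className) == 0) {
    return false;
  }
  if (!serials.empty() && serials.count(std::to_string(serialNumber)) == 0) {
    return false;
  }
  return true;
}
```

With the class list set to nsWebShell and the serial list set to 2, ShouldLog accepts only the second nsWebShell object, which matches the behavior of the last example above.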

1. BloatView

BloatView dumps out per-class statistics on allocations and refcounts, and provides gross numbers on the amount of memory being leaked broken down by class. Here's a sample of the BloatView output:
== BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS

     |<------Class----->|<-----Bytes------>|<----------------Objects---------------->|<--------------References-------------->|
                          Per-Inst   Leaked    Total      Rem      Mean       StdDev     Total      Rem      Mean       StdDev
   0 TOTAL                     193  2480436   316271    12852 ( 5377.07 +/-  5376.38)   410590    16079 ( 2850.93 +/-  2849.79)
   1 StyleSetImpl               32        0        8        0 (    3.88 +/-     3.15)     6304        0 (    7.18 +/-     6.63)
   2 SinkContext                32        0       19        0 (    1.87 +/-     1.04)        0        0 (    0.00 +/-     0.00)
   3 nsXPCClasses               12        0        2        0 (    1.00 +/-     0.71)       41        0 (    5.57 +/-     4.98)
   4 NameSpaceURIKey             8       72      158        9 (    8.16 +/-     7.62)        0        0 (    0.00 +/-     0.00)
   5 nsSupportsArray            36    11304     2581      314 (  477.13 +/-   476.53)     9223      314 (  579.23 +/-   578.64)
   6 nsView                     96        0       57        0 (   27.64 +/-    26.98)        0        0 (    0.00 +/-     0.00)
   7 nsEnderDocumentObser       12        0        1        0 (    0.50 +/-     0.87)        1        0 (    0.50 +/-     0.87)
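Here's how you interpret the columns:
Class
The name of the class, truncated to 20 characters (see nsEnderDocumentObser above).
Bytes Per-Inst
The size of one instance of the class, in bytes.
Bytes Leaked
The number of bytes leaked by the class: Per-Inst multiplied by the number of objects remaining (for example, 8 x 9 = 72 for NameSpaceURIKey).
Objects Total, Rem
The total number of objects of the class that were allocated, and the number remaining (not yet destroyed) when the statistics were dumped.
References Total, Rem
The same counts for references to objects of the class.
Mean, StdDev
The mean and standard deviation of the corresponding count.

Interesting things to look for: nonzero values in the Leaked and Rem columns, which mean that objects or references were still outstanding when the statistics were dumped.

You can also dump out bloat statistics interactively by typing about:bloat in the location bar, or by using the menu items under the QA menu in debug builds. Note that you need to have the XPCOM_MEM_BLOAT_LOG or XPCOM_MEM_LEAK_LOG environment variable defined first.

You can also type about:bloat?new to get a log since the last time you called it, or about:bloat?clear to clear the current set of statistics completely (use this option with caution, as it can result in what look like negative refcounts, etc.). Whenever these options are used, the log data is dumped to a file relative to the program's directory: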
bloatlogs/all-1999-10-16-010302.txt        (a complete log resulting from the about:bloat command)
bloatlogs/new-1999-10-16-010423.txt     (an incremental log resulting from the about:bloat?new command)
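In the sample report above, the Leaked column of each row equals Per-Inst times the Objects Rem column. A minimal sketch of that arithmetic follows; BloatRow and LeakedBytes are hypothetical names made up for this illustration and are not part of BloatView.

```cpp
#include <cassert>
#include <cstdint>

// One row of the sample report, reduced to the fields used here.
struct BloatRow {
  std::uint64_t bytesPerInstance;  // "Per-Inst" column
  std::uint64_t objectsTotal;      // Objects "Total" column
  std::uint64_t objectsRemaining;  // Objects "Rem" column
};

// Bytes held by objects that had not been destroyed at dump time.
std::uint64_t LeakedBytes(const BloatRow& row)
{
  // A class cannot have more objects remaining than were ever allocated.
  assert(row.objectsRemaining <= row.objectsTotal);
  return row.bytesPerInstance * row.objectsRemaining;
}
```

For the nsSupportsArray row (36 bytes per instance, 314 objects remaining) this gives 11304, the value shown in the Leaked column.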

Comparing Bloat Logs

You can also compare any two bloat logs (either those produced when the program shuts down, or written to the bloatlogs directory) by running the following program:
perl mozilla/tools/tinderbox/bloatdiff.pl <previous-log> <current-log>
This will give you output of the form:
Bloat/Leak Delta Report
Current file:  dist/win32_D.OBJ/bin/bloatlogs/all-1999-10-22-133450.txt
Previous file: dist/win32_D.OBJ/bin/bloatlogs/all-1999-10-16-010302.txt
--------------------------------------------------------------------------
CLASS                     LEAKS       delta      BLOAT       delta
--------------------------------------------------------------------------
TOTAL                   6113530       2.79%   67064808       9.18%
StyleContextImpl         265440      81.19%     283584     -26.99%
CToken                   236500      17.32%     306676      20.64%
nsStr                    217760      14.94%    5817060       7.63%
nsXULAttribute           113048     -70.92%     113568     -71.16%
LiteralImpl               53280      26.62%      75840      19.40%
nsXULElement              51648       0.00%      51648       0.00%
nsProfile                 51224       0.00%      51224       0.00%
nsFrame                   47568     -26.15%      48096     -50.49%
CSSDeclarationImpl        42984       0.67%      43488       0.67%
This "delta report" shows the leak offenders, sorted from most leaks to fewest. The delta numbers show the percentage change between runs for the amount of leaks and amount of bloat (negative numbers are better!). The bloat number is a metric determined by multiplying the total number of objects allocated of a given class by the class size. Note that although this isn't necessarily the amount of memory consumed at any given time, it does give an indication of how much memory we're consuming. The more memory in general, the worse the performance and footprint. The percentage 99999.99% will show up indicating an "infinite" amount of leakage. This happens when something that didn't leak before is now leaking.
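The two numbers in the delta report can be sketched as follows. This is an assumption-laden illustration, not the code in bloatdiff.pl: it assumes the delta is a plain percentage change relative to the previous run, and BloatBytes, DeltaPercent and kInfiniteDelta are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Value the report shows when a class that did not leak before now leaks.
const double kInfiniteDelta = 99999.99;

// Bloat metric described above: objects allocated times the class size.
std::uint64_t BloatBytes(std::uint64_t objectsTotal, std::uint64_t classSize)
{
  // Guard against overflow of the product.
  assert(classSize == 0 ||
         objectsTotal <= std::numeric_limits<std::uint64_t>::max() / classSize);
  return objectsTotal * classSize;
}

// Percentage change from the previous run to the current one.
// Assumption: a previous value of zero with a nonzero current value is
// reported as the "infinite" sentinel; zero to zero is no change.
double DeltaPercent(std::uint64_t previous, std::uint64_t current)
{
  if (previous == 0) {
    return current == 0 ? 0.0 : kInfiniteDelta;
  }
  return (static_cast<double>(current) - static_cast<double>(previous))
         * 100.0 / static_cast<double>(previous);
}
```

A negative result means the number went down between runs, which is the direction we want.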

Bloat Statistics on Tinderbox

Each build rectangle on Tinderbox will soon be capable of displaying the total leaks delta and bloat delta percentages from one build to the next. Hooray!
 
 
 warren    L C    L:-3    B:+21

Hmmm. Warren checked in and the number of leaks went down by 3%. (Yes!) But the amount of bloat went up by 21%. (Ouch!) This probably should be investigated further. Sometimes bloat can go up because new features were added that just take up more memory (or because the set of test URLs was changed, and the activity is different from last time), but in general we'd like to see both of these numbers continue to go down. You can look at the end of the log (by clicking on the L) to see the bloat statistics and delta report for a breakdown of what actually happened.


2. Boehm GC Leak Detector

more...

3. Refcount Tracing

Refcount tracing is used to capture stack traces of AddRef and Release calls to use with the Refcount Balancer. It is best to set the XPCOM_MEM_REFCNT_LOG environment variable to point to a file when using it.

See Refcount Balancer for more information.


4. Leaky

Using this stuff with leaky

First, set up these environment variables:
setenv LD_PRELOAD ../lib/libleaky.so (assumes you execute apprunner/viewer in the dist/bin directory)
setenv LIBMALLOC_LOG 8 (tells leaky to log addref/release calls)
setenv XPCOM_MEM_LEAKY_LOG 1 (use leaky)
setenv XPCOM_MEM_LOG_CLASSES "a,b,c" (the list of types you care about)
Then run the viewer or the apprunner and run your test. Then exit it. The result will be a large file in your current directory called "malloc-log" and a small file called "malloc-map". If these aren't there, then something's wrong.

If it works properly, then you now have the tracing data for the problem you are chasing in malloc-log. Use leaky to convert it to human readable form and debug away:

leaky -dRq <viewer|apprunner> malloc-log > /tmp/log
Leaky used to require c++filt, but now it does it itself. With the -R option, leaky will only log the refcnts that actually leaked (those that didn't go to zero).
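The idea behind reporting only the refcnts that didn't go to zero can be sketched like this. It is an illustration only, not leaky's implementation: RefOp, RefEvent and FindLeaked are hypothetical names, and the sketch assumes each address identifies one object for the whole run (it ignores address reuse).

```cpp
#include <cstdint>
#include <map>
#include <vector>

enum class RefOp { AddRef, Release };

// One logged call: which object it was made on, and which call it was.
struct RefEvent {
  std::uintptr_t object;  // address of the object, as it appears in the log
  RefOp op;
};

// Replay the events and return the objects whose balance of AddRef and
// Release calls is not zero at the end, in increasing address order.
std::vector<std::uintptr_t> FindLeaked(const std::vector<RefEvent>& events)
{
  std::map<std::uintptr_t, long> balance;
  for (const RefEvent& e : events) {
    balance[e.object] += (e.op == RefOp::AddRef) ? 1 : -1;
  }
  std::vector<std::uintptr_t> leaked;
  for (const auto& entry : balance) {
    if (entry.second != 0) {
      leaked.push_back(entry.first);
    }
  }
  return leaked;
}
```

Objects with a balanced history drop out of the result, so only the stacks for the unbalanced ones need to be read.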

Leaky environment variables

LD_PRELOAD
Set this to the pathname to libleaky.so if you are using leaky to track memory operations.
LIBMALLOC_LOG
Set this to "8" to enable leaky to track addref/release calls that are logged by xpcom. Note that you must set bit 8 in xpcomrefcnt to connect xpcom's tracing to leaky's tracing.

Sample output

Here is what you see when you enable some logging with XPCOM_MEM_LOG_CLASSES set to something:

nsWebShell      0x81189f8       Release 5       nsWebShell::Release(void)+0x59  nsCOMPtr<nsIContentViewerContainer>::~nsCOMPtr(void)+0x34       nsChannelListener::OnStartRequest(nsIChannel *, nsISupports *)+0x550        nsFileChannel::OnStartRequest(nsIChannel *, nsISupports *)+0x7b nsOnStartRequestEvent::HandleEvent(void)+0x46       nsStreamListenerEvent::HandlePLEvent(PLEvent *)+0x62    PL_HandleEvent+0x57     PL_ProcessPendingEvents+0x90    nsEventQueueImpl::ProcessPendingEvents(void)+0x1d   nsAppShell::SetDispatchListener(nsDispatchListener *)+0x3e      gdk_get_show_events+0xbb        g_io_add_watch+0xaa     g_get_current_time+0x136    g_get_current_time+0x6f1        g_main_run+0x81 gtk_main+0xb9   nsAppShell::Run(void)+0x245     nsAppShell::Run(void)+0xc7a92ede   nsAppShell::Run(void)+0xc7a9317c __libc_start_main+0xeb

Here is what you see when you use the leaky tool to dump out addref/release leaks:

addref     082cccc8     0 00000001 --> CViewSourceHTML::AddRef(void) CViewSourceHTML::QueryInterface(nsID &, void **) NS_NewViewSourceHTML(nsIDTD **) .LM708 GetSharedObjects(void) nsParser::RegisterDTD(nsIDTD *) RDFXMLDataSourceImpl::Refresh(int) nsChromeRegistry::InitRegistry(void) nsChromeProtocolHandler::NewChannel(char *, nsIURI *, nsILoadGroup *, nsIEventSinkGetter *, nsIChannel **) nsIOService::NewChannelFromURI(char *, nsIURI *, nsILoadGroup *, nsIEventSinkGetter *, nsIChannel **) NS_OpenURI(nsIChannel **, nsIURI *, nsILoadGroup *, nsIEventSinkGetter *) NS_OpenURI(nsIInputStream **, nsIURI *) CSSLoaderImpl::LoadSheet(URLKey &, SheetLoadData *) CSSLoaderImpl::LoadChildSheet(nsICSSStyleSheet *, nsIURI *, nsString &, int, int) CSSParserImpl::ProcessImport(int &, nsString &, nsString &) CSSParserImpl::ParseImportRule(int &) CSSParserImpl::ParseAtRule(int &) CSSParserImpl::Parse(nsIUnicharInputStream *, nsIURI *, nsICSSStyleSheet *&) CSSLoaderImpl::ParseSheet(nsIUnicharInputStream *, SheetLoadData *, int &, nsICSSStyleSheet *&) CSSLoaderImpl::LoadAgentSheet(nsIURI *, nsICSSStyleSheet *&, int &, void (*)(nsICSSStyleSheet *, void *), void *) nsLayoutModule::Initialize(void) nsLayoutModule::GetClassObject(nsIComponentManager *, nsID &, nsID &, void **) nsNativeComponentLoader::GetFactoryFromModule(nsDll *, nsID &, nsIFactory **) nsNativeComponentLoader::GetFactory(nsID &, char *, char *, nsIFactory **) .LM1381 nsComponentManagerImpl::FindFactory(nsID &, nsIFactory **) nsComponentManagerImpl::CreateInstance(nsID &, nsISupports *, nsID &, void **) nsComponentManager::CreateInstance(nsID &, nsISupports *, nsID &, void **) RDFXMLDataSourceImpl::Refresh(int) nsChromeRegistry::InitRegistry(void) nsChromeProtocolHandler::NewChannel(char *, nsIURI *, nsILoadGroup *, nsIEventSinkGetter *, nsIChannel **) nsIOService::NewChannelFromURI(char *, nsIURI *, nsILoadGroup *, nsIEventSinkGetter *, nsIChannel **) NS_OpenURI(nsIChannel **, nsIURI *, nsILoadGroup *, nsIEventSinkGetter 
*) nsDocumentBindInfo::Bind(nsIURI *, nsILoadGroup *, nsIInputStream *, unsigned short *) nsDocLoaderImpl::LoadDocument(nsIURI *, char *, nsIContentViewerContainer *, nsIInputStream *, nsISupports *, unsigned int, unsigned int, unsigned short *) nsWebShell::DoLoadURL(nsIURI *, char *, nsIInputStream *, unsigned int, unsigned int, unsigned short *) nsWebShell::LoadURI(nsIURI *, char *, nsIInputStream *, int, unsigned int, unsigned int, nsISupports *, unsigned short *) nsWebShell::LoadURL(unsigned short *, char *, nsIInputStream *, int, unsigned int, unsigned int, nsISupports *, unsigned short *) nsWebShell::LoadURL(unsigned short *, nsIInputStream *, int, unsigned int, unsigned int, nsISupports *, unsigned short *) nsWebShellWindow::Initialize(nsIWebShellWindow *, nsIAppShell *, nsIURI *, int, int, nsIXULWindowCallbacks *, int, int, nsWidgetInitData &) nsAppShellService::JustCreateTopWindow(nsIWebShellWindow *, nsIURI *, int, int, unsigned int, nsIXULWindowCallbacks *, int, int, nsIWebShellWindow **) nsAppShellService::CreateTopLevelWindow(nsIWebShellWindow *, nsIURI *, int, int, unsigned int, nsIXULWindowCallbacks *, int, int, nsIWebShellWindow **) OpenChromURL(char *, int, int) HandleBrowserStartup(nsICmdLineService *, nsIPref *, int) DoCommandLines(nsICmdLineService *, int) main1(int, char **) main __libc_start_main


5. Purify

more...

How to build xpcom with refcnt/memory logging

Built into xpcom is the ability to support the debugging of memory leaks. By default, an optimized build of xpcom has this disabled. Also by default, the debug builds have the logging facilities enabled. You can control either of these options by changing environment variables before you build mozilla:
FORCE_BUILD_REFCNT_LOGGING
If this is defined then regardless of the type of build, refcnt logging (and related memory debugging) will be enabled in the build.
NO_BUILD_REFCNT_LOGGING
If this is defined then regardless of the type of build or of the setting of the FORCE_BUILD_REFCNT_LOGGING, no refcnt logging will be enabled and no memory debugging will be enabled. This variable overrides FORCE_BUILD_REFCNT_LOGGING.
The remaining discussion assumes that, one way or another, xpcom has been built with refcnt/memory logging enabled.
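The precedence between the build type and the two variables can be written down as a small truth function. This is a sketch of the rules stated above, not the build system's code; RefcntLoggingBuilt and its parameter names are hypothetical.

```cpp
// Whether refcnt logging ends up compiled in, following the rules above.
bool RefcntLoggingBuilt(bool debugBuild, bool forceDefined, bool noDefined)
{
  if (noDefined) {
    return false;     // NO_BUILD_REFCNT_LOGGING overrides everything else
  }
  if (forceDefined) {
    return true;      // FORCE_BUILD_REFCNT_LOGGING overrides the build type
  }
  return debugBuild;  // default: on for debug builds, off for optimized ones
}
```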

How to instrument your objects for refcnt/memory logging

First, if your object is an xpcom object and you use the NS_IMPL_ADDREF and NS_IMPL_RELEASE (or a variation thereof) macros to implement your AddRef and Release methods, then there is nothing you need to do. By default, those macros support refcnt logging directly.

If your object is not an xpcom object then some manual editing is in order. The following sample code shows what must be done:

MOZ_DECL_CTOR_COUNTER(MyType);

MyType::MyType()
{
  MOZ_COUNT_CTOR(MyType);
}

MyType::~MyType()
{
  MOZ_COUNT_DTOR(MyType);
}

Now currently the MOZ_DECL_CTOR_COUNTER expands to nothing so your code will compile if you forget to add it; however, we reserve the right to change that so please put it in.

What are those macros doing for me anyway?


NS_IMPL_ADDREF has this additional line in it:

NS_LOG_ADDREF(this, mRefCnt, #_class, sizeof(*this));
What this is doing is logging the addref call using xpcom's nsTraceRefcnt class. The implementation of that macro is:
 
#define NS_LOG_ADDREF(_p, _rc, _type, _size) \
  nsTraceRefcnt::LogAddRef((_p), (_rc), (_type), (PRUint32) (_size))
Which as you can see just passes the buck to nsTraceRefcnt. nsTraceRefcnt implements the logging support and will track addref/release/ctor/dtor calls in a database that it builds up as the program is executing. In a similar manner, NS_IMPL_RELEASE uses NS_LOG_RELEASE which uses nsTraceRefcnt::LogRelease.

For the MOZ_DECL_CTOR_COUNTER, MOZ_COUNT_CTOR and MOZ_COUNT_DTOR macros, the expansion boils down to calls to nsTraceRefcnt::LogCtor and nsTraceRefcnt::LogDtor. Again, the type of the object is passed in, as well as the size of the data type.

#define MOZ_COUNT_CTOR(_type)                                 \
PR_BEGIN_MACRO                                                \
  nsTraceRefcnt::LogCtor((void*)this, #_type, sizeof(*this)); \
PR_END_MACRO

#define MOZ_COUNT_DTOR(_type)                                 \
PR_BEGIN_MACRO                                                \
  nsTraceRefcnt::LogDtor((void*)this, #_type, sizeof(*this)); \
PR_END_MACRO
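To see what such a ctor/dtor database amounts to, here is a self-contained sketch that compiles without xpcom. DemoTrace, DEMO_COUNT_CTOR and DEMO_COUNT_DTOR are hypothetical stand-ins invented for this example; they mimic the shape of the macros above but are not the real nsTraceRefcnt or MOZ_COUNT_* definitions.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Stand-in for the logging back end. NOT the real nsTraceRefcnt.
class DemoTrace {
public:
  struct Entry {
    std::size_t size = 0;  // sizeof the type, as passed by the macro
    long created = 0;      // constructor calls seen so far
    long destroyed = 0;    // destructor calls seen so far
  };

  static void LogCtor(void* /*object*/, const char* type, std::size_t size) {
    Entry& e = Table()[type];
    e.size = size;
    ++e.created;
  }

  static void LogDtor(void* /*object*/, const char* type, std::size_t size) {
    Entry& e = Table()[type];
    e.size = size;
    ++e.destroyed;
  }

  // Number of objects of the given type that are still alive.
  static long Remaining(const std::string& type) {
    const Entry& e = Table()[type];
    return e.created - e.destroyed;
  }

private:
  // Per-class statistics, keyed by the stringified type name.
  static std::map<std::string, Entry>& Table() {
    static std::map<std::string, Entry> table;
    return table;
  }
};

// Hypothetical macros with the same shape as the ones shown above.
#define DEMO_COUNT_CTOR(_type) \
  do { DemoTrace::LogCtor((void*)this, #_type, sizeof(*this)); } while (0)
#define DEMO_COUNT_DTOR(_type) \
  do { DemoTrace::LogDtor((void*)this, #_type, sizeof(*this)); } while (0)

class MyType {
public:
  MyType() { DEMO_COUNT_CTOR(MyType); }
  ~MyType() { DEMO_COUNT_DTOR(MyType); }
};
```

Any MyType that is created but never destroyed leaves DemoTrace::Remaining("MyType") above zero, which is the same kind of information the Rem column of a bloat log reports.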