Updating readme to talk about the JP_RTZ_HZ flag, to have examples using the

current output format, and to have a bit more in the way of explanation of the
output and the various JPROF_FLAGS values.  a=jim nance
This commit is contained in:
bzbarsky%mit.edu 2004-03-07 01:15:26 +00:00
parent f46d6b7a30
commit 5578f60f53

View File

@ -8,25 +8,32 @@
<font size="-1">
<a href="mailto:jim_nance%yahoo.com">jim_nance@yahoo.com</a>
</font>
<hr>
<a href="#introduction">Introduction</a> | <a href="#operation">Operation</a> |
<a href="#setup">Setup</a> | <a href="#usage">Usage</a> |
<a href="#interpretation">Interpretation</a>
</center>
<hr>
<h3>Introduction</h3>
<h3><a name="introduction">Introduction</a></h3>
Jprof is a profiling tool. I am writing it because I need to find out
where mozilla is spending its time, and there do not seem to be any
profilers for Linux that can handle threads and/or shared libraries.
This code is based heavily on Kipp Hickman's leaky.
<P> <h3>Operation</h3>
<h3><a name="operation">Operation</a></h3>
Jprof operates by installing a timer which periodically interrupts mozilla.
When this timer goes off, the jprof code inside mozilla walks the function
call stack to determine which code was executing. By collecting a large
number of these call stacks, it is possible to deduce where mozilla is
spending its time.
When this timer goes off, the jprof code inside mozilla walks the function call
stack to determine which code was executing and saves the results into the
<code>jprof-log</code> and <code>jprof-map</code> files. By collecting a large
number of these call stacks, it is possible to deduce where mozilla is spending
its time.
<P> <h3>Usage</h3>
<h3><a name="setup">Setup</a></h3>
First, check out the jprof source code since it is not a part of the
default SeaMonkeyAll CVS tag. To do this do:
@ -39,69 +46,102 @@ Next, enable jprof and rebuild mozilla:
./configure --enable-jprof
make
</pre>
Now you can run jprof. To do this the JPROF_FLAGS environment variable
must be set. As a simple example:
<pre>
cd dist/bin
setenv JPROF_FLAGS JP_START
</pre>
<P>
The JPROF_FLAGS environment variable can be composed of several substrings
Now you can run jprof.
<h3><a name="usage">Usage</a></h3>
The behavior of jprof is determined by the value of the JPROF_FLAGS environment
variable. This environment variable can be composed of several substrings
which have the following meanings:
<ul>
<li> <b>JP_START</b> : Install the signal handler, and start sending the
timer signals.
<li> <b>JP_DEFER</b> : Install the signal handler, but don't start sending
the timer signals. The user must start the signals by sending
the first one (with <code>kill -PROF</code>, or with <code>kill
-ALRM</code> if JP_REALTIME is used).
the timer signals. The user must start the signals by sending the first
one (with <code>kill -PROF</code>, or with <code>kill -ALRM</code> if
JP_REALTIME is used, or with <code>kill -POLL</code> if JP_RTZ_HZ is used).
<li> <b>JP_FIRST=x</b> : Wait x seconds before starting the timer
<li> <b>JP_PERIOD=y</b> : Set timer to interrupt every y seconds
<li> <b>JP_PERIOD=y</b> : Set timer to interrupt every y seconds. Only
values of y strictly greater than 0.001 are supported.
<li> <b>JP_REALTIME</b> : Do the profiling in intervals of real time rather
than intervals of time used by the mozilla process (and the kernel
when doing work for mozilla). This could probably lead to weird
results (you'll see whatever runs when mozilla is waiting for events),
but is needed to see time spent in the X server.
<li> <b>JP_RTC_HZ=freq</b> : This option, only available on Linux if the
kernel is built with RTC support, makes jprof use the RTC timer instead of
using its own timer. This option, like JP_REALTIME, uses intervals of real
time. This option overrides JP_PERIOD. freq is the frequency at which the
timer should fire, measured in Hz. It must be a power of 2. The maximal
frequency allowed by the kernel can be changed by writing to
<code>/proc/sys/dev/rtc/max-user-freq</code>; the maximum value it can be
set to is 8192.
</ul>
For example you could say:
<h4>Examples of JPROF_FLAGS usage</h4>
<ul>
<li>To make the timer start firing 3 seconds after the program is started and
fire every 25 milliseconds of program time use:
<pre>
setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025" </pre>
<li>To make the timer start on your signal and fire every 1.5 milliseconds of
program time use:
<pre>
setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.0015" </pre>
<li>To make the timer start on your signal and fire every 10 milliseconds of
wall-clock time use:
<pre>
setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.010 JP_REALTIME" </pre>
<li>To make the timer start on your signal and fire at 8192 Hz in wall-clock
time use:
<pre>
setenv JPROF_FLAGS "JP_DEFER JP_RTZ_HZ=8192" </pre>
</ul>
<h4>Pausing profiles</h4>
<P>jprof can be paused at any time by sending a SIGUSR1 to mozilla (<code>kill
-USR1</code>). This will cause the timer signals to stop and jprof-map to be
written, but it will not close jprof-log. Combining SIGUSR1 with the JP_DEFER
option allows profiling of one sequence of actions by starting the timer right
before starting the actions and stopping the timer right afterward.
<P>After a SIGUSR1, sending another timer signal (SIGPROF, SIGALRM, or SIGPOLL,
depending on the mode) can be used to continue writing data to the same output.
<h4>Looking at the results</h4>
Now that we have <code>jprof-log</code> and <code>jprof-map</code> files, we
can use the jprof executable is used to turn them into readable output. To do
this jprof needs the name of the mozilla binary and the log file. It deduces
the name of the map file:
<pre>
setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025"
./jprof /home/user/mozilla/debug/dist/bin/mozilla-bin ./jprof-log > tmp.html
</pre>
<P>
When mozilla runs with jprof enabled, it writes to a file called jprof-log
for each timer signal and it writes to jprof-map when it exits.
Sending a SIGUSR1 to mozilla (<code>kill -USR1</code>) will cause the
timer signals to stop and jprof-map to be written, but it will not
close jprof-log.
Combining SIGUSR1 with the JP_DEFER option allows profiling of one
sequence of actions.
(After a SIGUSR1, sending another timer signal (SIGPROF or SIGALRM)
could be used to continue writing data to the same output, although this
has not been tested.)
This will generate the file <code>tmp.html</code> which you should view in a
web browser.
<p>
The jprof executable is used to turn these
files into readable output. To do this jprof needs the name of the mozilla
binary and the log file. It deduces the name of the map file:
<h3><a name="interpretation">Interpretation</a></h3>
<pre>
./jprof mozilla-bin ./jprof-log > tmp.html
</pre>
This will generate the file tmp.html which you should view in a web browser.
<P> <h3>Interpretation</h3>
The Jprof output is split into a flat portion and a hierarchical portion.
There are links to each section at the top of the page. It is typically
easier to analyze the profile by starting with the flat output and following
the links contained in the flat output up to the hierarchical output.
<a name="flat">
<P> <h3>Flat output</h3>
<h4><a name="flat">Flat output</a></h3>
The flat portion of the profile indicates which functions were executing
when the timer was going off. It is displayed as a list of functions names
@ -109,58 +149,91 @@ on the right and the number of times that function was interrupted on the
left. The list is sorted by decreasing interrupt count. For example:
<blockquote> <pre>
Count Function Name
36 <A href="#175458">_init</a>
30 <A href="#186088">SelectorMatches(nsIPresContext *, nsCSSSelector *, nsIContent *, int)</a>
22 <A href="#101380">_init</a>
22 <A href="#1991">_dl_lookup_symbol</a>
Total hit count: 151603
Count %Total Function Name
<a href="#23081">8806 5.8 __libc_poll</a>
<a href="#40008">2254 1.5 __i686.get_pc_thunk.bx</a>
<a href="#21390">2053 1.4 _int_malloc</a>
<a href="#49013">1777 1.2 nsStyleContext::GetStyleData(nsStyleStructID)</a>
<a href="#21380">1600 1.1 __libc_malloc</a>
<a href="#603">1552 1.0 nsCOMPtr_base::~nsCOMPtr_base()</a>
</pre> </blockquote>
This shows that of the 151603 times the timer fired, 1777 (1.2% of the total) were inside nsStyleContext::GetStyleData() and 1552 (1.0% of the total) were in the nsCOMPtr_base destructor.
<p>
In general, the functions with the highest count are the functions which
are taking the most time.
<P>
The function names are linked to the entry for that function in the
hierarchical profile, which is described in the next section.
<a name="hier">
<P> <h3>Hierarchical output</h3>
<h4><a name="hier">Hierarchical output</a></h4>
The hierarchical output is divided up into sections, with each section
corresponding to one function. A typical section looks something like
this:
<blockquote><pre>
639 <A href="#76801">nsWidget::OnMotionNotifySignal(_GdkEventMotion *)</A>
11 <A href="#76810">nsWidget::OnButtonReleaseSignal(_GdkEventButton *)</A>
2 <A href="#76809">nsWidget::OnButtonPressSignal(_GdkEventButton *)</A>
76787 0 652 <A name=76787>nsWidget::DispatchMouseEvent(nsMouseEvent &)</a>
652 <A href="#76779">nsWidget::DispatchWindowEvent(nsGUIEvent *)</A>
<A href="#29355">141300 PL_ProcessPendingEvents</A>
<A href="#29372"> 927 PL_ProcessEventsBeforeID</A>
29358 0 <a name=29358> 142227</a> <b>PL_HandleEvent</b>
<A href="#28546"> 92394 nsInputStreamReadyEvent::EventHandler(PLEvent*)</A>
<A href="#41572"> 49181 HandlePLEvent(ReflowEvent*)</A>
<A href="#29537"> 481 handleTimerEvent(TimerEventType*)</A>
<A href="#34494"> 158 nsTransportStatusEvent::HandleEvent(PLEvent*)</A>
<A href="#29359"> 9 PL_DestroyEvent</A>
<A href="#20319"> 4 __restore_rt</A>
</pre></blockquote>
This block corresponds to the function nsWidget::DispatchMouseEvent.
You can determine this because it is preceded by three numbers while
the other function names are preceded by only one number. These three
numbers have the following meaning. The number on the left (76787)
is the index number, and is not important. The center number (0)
is the number of times this function was interrupted by the timer.
The last number (652) is the number of times this function was in the
call stack when the timer went off. For our example we can see that
our function was in the call stack for 652 interrupt ticks, but we were
never the function that was running when the interrupt arrived.
The information this block tells us is:
<ul>
<li>There were 0 profiler hits <em>in</em> <code>PL_HandleEvent</code>
<li>There were 142227 profiler hits <em>under</em> <code>PL_HandleEvent</code>. Of these:
<ul>
<li>92394 were in or under <code>nsInputStreamReadyEvent::EventHandler</code>
<li>49181 were in or under <code>HandlePLEvent(ReflowEvent*)</code>
<li>481 were in or under <code>handleTimerEvent</code>
<li>158 were in or under <code>nsTransportStatusEvent::HandleEvent</code>
<li>9 were in or under <code>PL_DestroyEvent</code>
<li>4 were in or under <code>__restore_rt</code>
</ul>
<li>Of these 142227 calls into <code>PL_HandleEvent</code>:
<ul>
<li>141300 came from <code>PL_ProcessPendingEvents</code>
<li>927 came from <code>PL_ProcessEventsBeforeID</code>
</ul>
</ul>
The rest of this section explains how to read this information off from the jprof output.
<p>This block corresponds to the function <code>PL_HandleEvent</code>, which is
therefore bolded and not a link. The name of this function is preceded by
three numbers which have the following meaning. The number on the left (29358)
is the index number, and is not important. The center number (0) is the number
of times this function was interrupted by the timer. The last number (142227)
is the number of times this function was in the call stack when the timer went
off. That is, the timer went off while we were in code that was ultimately
called from <code>PL_HandleEvent</code>.
<p>For our example we can see that our function was in the call stack for
142227 interrupt ticks, but we were never the function that was running when
the interrupt arrived.
<P>
The functions listed above the line for nsWidget::DispatchMouseEvent, are
the functions which are calling this function. The numbers to the left of
these function names are the number of times these functions were in the
call stack as callers of nsWidget::DispatchMouseEvent. In our example,
we were called 11 times by the function nsWidget::OnButtonReleaseSignal.
The functions listed above the line for <code>PL_HandleEvent</code> are its
callers. The numbers to the left of these function names are the numbers of
times these functions were in the call stack as callers of
<code>PL_HandleEvent</code>. In our example, we were called 927 times by
<code>PL_ProcessEventsBeforeID</code> and 141300 times by
<code>PL_ProcessPendingEvents</code>.
<P>
The functions listed below the list for nsWidget::DispatchMouseEvent,
are the functions this function was calling when the timer went off.
The numbers to the left of the function names are the number of times
it was in the call stack as a callee of our function when the timer
went off. In our example, each of the 652 times the timer went off
and nsWidget::DispatchMouseEvent was in the call stack it was calling
nsWidget::DispatchWindowEvent.
The functions listed below the line for <code>PL_HandleEvent</code> are its
callees. The numbers to the left of the function names are the numbers of
times these functions were in the callstack as callees of <code>PL_HandleEvent</code>. In our example, of the 142227 profiler hits under <code>PL_HandleEvent</code> 92394 were under <code>nsInputStreamReadyEvent::EventHandler</code>, 49181 were under <code>HandlePLEvent(ReflowEvent*)</code>, and so forth.
<h3>Bugs</h3>
Jprof has only been tested under Red Hat Linux 6.0, 6.1, and 6.2. It does