mirror of
https://github.com/mozilla/gecko-dev.git
synced 2025-05-13 08:57:27 +00:00
Updating readme to talk about the JP_RTZ_HZ flag, to have examples using the
current output format, and to have a bit more in the way of explanation of the output and the various JPROF_FLAGS values. a=jim nance
This commit is contained in:
parent
f46d6b7a30
commit
5578f60f53
@ -8,25 +8,32 @@
|
||||
<font size="-1">
|
||||
<a href="mailto:jim_nance%yahoo.com">jim_nance@yahoo.com</a>
|
||||
</font>
|
||||
<hr>
|
||||
|
||||
<a href="#introduction">Introduction</a> | <a href="#operation">Operation</a> |
|
||||
<a href="#setup">Setup</a> | <a href="#usage">Usage</a> |
|
||||
<a href="#interpretation">Interpretation</a>
|
||||
|
||||
</center>
|
||||
<hr>
|
||||
|
||||
<h3>Introduction</h3>
|
||||
<h3><a name="introduction">Introduction</a></h3>
|
||||
|
||||
Jprof is a profiling tool. I am writing it because I need to find out
|
||||
where mozilla is spending its time, and there do not seem to be any
|
||||
profilers for Linux that can handle threads and/or shared libraries.
|
||||
This code is based heavily on Kipp Hickman's leaky.
|
||||
|
||||
<P> <h3>Operation</h3>
|
||||
<h3><a name="operation">Operation</a></h3>
|
||||
|
||||
Jprof operates by installing a timer which periodically interrupts mozilla.
|
||||
When this timer goes off, the jprof code inside mozilla walks the function
|
||||
call stack to determine which code was executing. By collecting a large
|
||||
number of these call stacks, it is possible to deduce where mozilla is
|
||||
spending its time.
|
||||
When this timer goes off, the jprof code inside mozilla walks the function call
|
||||
stack to determine which code was executing and saves the results into the
|
||||
<code>jprof-log</code> and <code>jprof-map</code> files. By collecting a large
|
||||
number of these call stacks, it is possible to deduce where mozilla is spending
|
||||
its time.
|
||||
|
||||
<P> <h3>Usage</h3>
|
||||
<h3><a name="setup">Setup</a></h3>
|
||||
|
||||
First, check out the jprof source code since it is not a part of the
|
||||
default SeaMonkeyAll CVS tag. To do this do:
|
||||
@ -39,69 +46,102 @@ Next, enable jprof and rebuild mozilla:
|
||||
./configure --enable-jprof
|
||||
make
|
||||
</pre>
|
||||
Now you can run jprof. To do this the JPROF_FLAGS environment variable
|
||||
must be set. As a simple example:
|
||||
<pre>
|
||||
cd dist/bin
|
||||
setenv JPROF_FLAGS JP_START
|
||||
</pre>
|
||||
<P>
|
||||
The JPROF_FLAGS environment variable can be composed of several substrings
|
||||
Now you can run jprof.
|
||||
|
||||
<h3><a name="usage">Usage</a></h3>
|
||||
|
||||
The behavior of jprof is determined by the value of the JPROF_FLAGS environment
|
||||
variable. This environment variable can be composed of several substrings
|
||||
which have the following meanings:
|
||||
<ul>
|
||||
<li> <b>JP_START</b> : Install the signal handler, and start sending the
|
||||
timer signals.
|
||||
|
||||
<li> <b>JP_DEFER</b> : Install the signal handler, but don't start sending
|
||||
the timer signals. The user must start the signals by sending
|
||||
the first one (with <code>kill -PROF</code>, or with <code>kill
|
||||
-ALRM</code> if JP_REALTIME is used).
|
||||
the timer signals. The user must start the signals by sending the first
|
||||
one (with <code>kill -PROF</code>, or with <code>kill -ALRM</code> if
|
||||
JP_REALTIME is used, or with <code>kill -POLL</code> if JP_RTZ_HZ is used).
|
||||
|
||||
<li> <b>JP_FIRST=x</b> : Wait x seconds before starting the timer
|
||||
<li> <b>JP_PERIOD=y</b> : Set timer to interrupt every y seconds
|
||||
|
||||
<li> <b>JP_PERIOD=y</b> : Set timer to interrupt every y seconds. Only
|
||||
values of y strictly greater than 0.001 are supported.
|
||||
|
||||
<li> <b>JP_REALTIME</b> : Do the profiling in intervals of real time rather
|
||||
than intervals of time used by the mozilla process (and the kernel
|
||||
when doing work for mozilla). This could probably lead to weird
|
||||
results (you'll see whatever runs when mozilla is waiting for events),
|
||||
but is needed to see time spent in the X server.
|
||||
|
||||
<li> <b>JP_RTC_HZ=freq</b> : This option, only available on Linux if the
|
||||
kernel is built with RTC support, makes jprof use the RTC timer instead of
|
||||
using its own timer. This option, like JP_REALTIME, uses intervals of real
|
||||
time. This option overrides JP_PERIOD. freq is the frequency at which the
|
||||
timer should fire, measured in Hz. It must be a power of 2. The maximal
|
||||
frequency allowed by the kernel can be changed by writing to
|
||||
<code>/proc/sys/dev/rtc/max-user-freq</code>; the maximum value it can be
|
||||
set to is 8192.
|
||||
</ul>
|
||||
|
||||
For example you could say:
|
||||
<h4>Examples of JPROF_FLAGS usage</h4>
|
||||
<ul>
|
||||
|
||||
<li>To make the timer start firing 3 seconds after the program is started and
|
||||
fire every 25 milliseconds of program time use:
|
||||
<pre>
|
||||
setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025" </pre>
|
||||
|
||||
<li>To make the timer start on your signal and fire every 1.5 milliseconds of
|
||||
program time use:
|
||||
<pre>
|
||||
setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.0015" </pre>
|
||||
|
||||
<li>To make the timer start on your signal and fire every 10 milliseconds of
|
||||
wall-clock time use:
|
||||
<pre>
|
||||
setenv JPROF_FLAGS "JP_DEFER JP_PERIOD=0.010 JP_REALTIME" </pre>
|
||||
|
||||
<li>To make the timer start on your signal and fire at 8192 Hz in wall-clock
|
||||
time use:
|
||||
<pre>
|
||||
setenv JPROF_FLAGS "JP_DEFER JP_RTZ_HZ=8192" </pre>
|
||||
|
||||
</ul>
|
||||
|
||||
<h4>Pausing profiles</h4>
|
||||
|
||||
<P>jprof can be paused at any time by sending a SIGUSR1 to mozilla (<code>kill
|
||||
-USR1</code>). This will cause the timer signals to stop and jprof-map to be
|
||||
written, but it will not close jprof-log. Combining SIGUSR1 with the JP_DEFER
|
||||
option allows profiling of one sequence of actions by starting the timer right
|
||||
before starting the actions and stopping the timer right afterward.
|
||||
|
||||
<P>After a SIGUSR1, sending another timer signal (SIGPROF, SIGALRM, or SIGPOLL,
|
||||
depending on the mode) can be used to continue writing data to the same output.
|
||||
|
||||
<h4>Looking at the results</h4>
|
||||
|
||||
Now that we have <code>jprof-log</code> and <code>jprof-map</code> files, we
|
||||
can use the jprof executable is used to turn them into readable output. To do
|
||||
this jprof needs the name of the mozilla binary and the log file. It deduces
|
||||
the name of the map file:
|
||||
|
||||
<pre>
|
||||
setenv JPROF_FLAGS "JP_START JP_FIRST=3 JP_PERIOD=0.025"
|
||||
./jprof /home/user/mozilla/debug/dist/bin/mozilla-bin ./jprof-log > tmp.html
|
||||
</pre>
|
||||
|
||||
<P>
|
||||
When mozilla runs with jprof enabled, it writes to a file called jprof-log
|
||||
for each timer signal and it writes to jprof-map when it exits.
|
||||
Sending a SIGUSR1 to mozilla (<code>kill -USR1</code>) will cause the
|
||||
timer signals to stop and jprof-map to be written, but it will not
|
||||
close jprof-log.
|
||||
Combining SIGUSR1 with the JP_DEFER option allows profiling of one
|
||||
sequence of actions.
|
||||
(After a SIGUSR1, sending another timer signal (SIGPROF or SIGALRM)
|
||||
could be used to continue writing data to the same output, although this
|
||||
has not been tested.)
|
||||
This will generate the file <code>tmp.html</code> which you should view in a
|
||||
web browser.
|
||||
|
||||
<p>
|
||||
The jprof executable is used to turn these
|
||||
files into readable output. To do this jprof needs the name of the mozilla
|
||||
binary and the log file. It deduces the name of the map file:
|
||||
<h3><a name="interpretation">Interpretation</a></h3>
|
||||
|
||||
<pre>
|
||||
./jprof mozilla-bin ./jprof-log > tmp.html
|
||||
</pre>
|
||||
|
||||
This will generate the file tmp.html which you should view in a web browser.
|
||||
|
||||
<P> <h3>Interpretation</h3>
|
||||
|
||||
The Jprof output is split into a flat portion and a hierarchical portion.
|
||||
There are links to each section at the top of the page. It is typically
|
||||
easier to analyze the profile by starting with the flat output and following
|
||||
the links contained in the flat output up to the hierarchical output.
|
||||
|
||||
<a name="flat">
|
||||
<P> <h3>Flat output</h3>
|
||||
<h4><a name="flat">Flat output</a></h3>
|
||||
|
||||
The flat portion of the profile indicates which functions were executing
|
||||
when the timer was going off. It is displayed as a list of functions names
|
||||
@ -109,58 +149,91 @@ on the right and the number of times that function was interrupted on the
|
||||
left. The list is sorted by decreasing interrupt count. For example:
|
||||
|
||||
<blockquote> <pre>
|
||||
Count Function Name
|
||||
36 <A href="#175458">_init</a>
|
||||
30 <A href="#186088">SelectorMatches(nsIPresContext *, nsCSSSelector *, nsIContent *, int)</a>
|
||||
22 <A href="#101380">_init</a>
|
||||
22 <A href="#1991">_dl_lookup_symbol</a>
|
||||
Total hit count: 151603
|
||||
Count %Total Function Name
|
||||
|
||||
<a href="#23081">8806 5.8 __libc_poll</a>
|
||||
<a href="#40008">2254 1.5 __i686.get_pc_thunk.bx</a>
|
||||
<a href="#21390">2053 1.4 _int_malloc</a>
|
||||
<a href="#49013">1777 1.2 nsStyleContext::GetStyleData(nsStyleStructID)</a>
|
||||
<a href="#21380">1600 1.1 __libc_malloc</a>
|
||||
<a href="#603">1552 1.0 nsCOMPtr_base::~nsCOMPtr_base()</a>
|
||||
</pre> </blockquote>
|
||||
|
||||
This shows that of the 151603 times the timer fired, 1777 (1.2% of the total) were inside nsStyleContext::GetStyleData() and 1552 (1.0% of the total) were in the nsCOMPtr_base destructor.
|
||||
|
||||
<p>
|
||||
In general, the functions with the highest count are the functions which
|
||||
are taking the most time.
|
||||
|
||||
<P>
|
||||
The function names are linked to the entry for that function in the
|
||||
hierarchical profile, which is described in the next section.
|
||||
|
||||
<a name="hier">
|
||||
<P> <h3>Hierarchical output</h3>
|
||||
<h4><a name="hier">Hierarchical output</a></h4>
|
||||
|
||||
The hierarchical output is divided up into sections, with each section
|
||||
corresponding to one function. A typical section looks something like
|
||||
this:
|
||||
|
||||
<blockquote><pre>
|
||||
639 <A href="#76801">nsWidget::OnMotionNotifySignal(_GdkEventMotion *)</A>
|
||||
11 <A href="#76810">nsWidget::OnButtonReleaseSignal(_GdkEventButton *)</A>
|
||||
2 <A href="#76809">nsWidget::OnButtonPressSignal(_GdkEventButton *)</A>
|
||||
76787 0 652 <A name=76787>nsWidget::DispatchMouseEvent(nsMouseEvent &)</a>
|
||||
652 <A href="#76779">nsWidget::DispatchWindowEvent(nsGUIEvent *)</A>
|
||||
<A href="#29355">141300 PL_ProcessPendingEvents</A>
|
||||
<A href="#29372"> 927 PL_ProcessEventsBeforeID</A>
|
||||
29358 0 <a name=29358> 142227</a> <b>PL_HandleEvent</b>
|
||||
<A href="#28546"> 92394 nsInputStreamReadyEvent::EventHandler(PLEvent*)</A>
|
||||
<A href="#41572"> 49181 HandlePLEvent(ReflowEvent*)</A>
|
||||
<A href="#29537"> 481 handleTimerEvent(TimerEventType*)</A>
|
||||
<A href="#34494"> 158 nsTransportStatusEvent::HandleEvent(PLEvent*)</A>
|
||||
<A href="#29359"> 9 PL_DestroyEvent</A>
|
||||
|
||||
<A href="#20319"> 4 __restore_rt</A>
|
||||
</pre></blockquote>
|
||||
|
||||
This block corresponds to the function nsWidget::DispatchMouseEvent.
|
||||
You can determine this because it is preceded by three numbers while
|
||||
the other function names are preceded by only one number. These three
|
||||
numbers have the following meaning. The number on the left (76787)
|
||||
is the index number, and is not important. The center number (0)
|
||||
is the number of times this function was interrupted by the timer.
|
||||
The last number (652) is the number of times this function was in the
|
||||
call stack when the timer went off. For our example we can see that
|
||||
our function was in the call stack for 652 interrupt ticks, but we were
|
||||
never the function that was running when the interrupt arrived.
|
||||
The information this block tells us is:
|
||||
|
||||
<ul>
|
||||
<li>There were 0 profiler hits <em>in</em> <code>PL_HandleEvent</code>
|
||||
<li>There were 142227 profiler hits <em>under</em> <code>PL_HandleEvent</code>. Of these:
|
||||
<ul>
|
||||
<li>92394 were in or under <code>nsInputStreamReadyEvent::EventHandler</code>
|
||||
<li>49181 were in or under <code>HandlePLEvent(ReflowEvent*)</code>
|
||||
<li>481 were in or under <code>handleTimerEvent</code>
|
||||
<li>158 were in or under <code>nsTransportStatusEvent::HandleEvent</code>
|
||||
<li>9 were in or under <code>PL_DestroyEvent</code>
|
||||
<li>4 were in or under <code>__restore_rt</code>
|
||||
</ul>
|
||||
<li>Of these 142227 calls into <code>PL_HandleEvent</code>:
|
||||
<ul>
|
||||
<li>141300 came from <code>PL_ProcessPendingEvents</code>
|
||||
<li>927 came from <code>PL_ProcessEventsBeforeID</code>
|
||||
</ul>
|
||||
</ul>
|
||||
|
||||
|
||||
The rest of this section explains how to read this information off from the jprof output.
|
||||
|
||||
<p>This block corresponds to the function <code>PL_HandleEvent</code>, which is
|
||||
therefore bolded and not a link. The name of this function is preceded by
|
||||
three numbers which have the following meaning. The number on the left (29358)
|
||||
is the index number, and is not important. The center number (0) is the number
|
||||
of times this function was interrupted by the timer. The last number (142227)
|
||||
is the number of times this function was in the call stack when the timer went
|
||||
off. That is, the timer went off while we were in code that was ultimately
|
||||
called from <code>PL_HandleEvent</code>.
|
||||
<p>For our example we can see that our function was in the call stack for
|
||||
142227 interrupt ticks, but we were never the function that was running when
|
||||
the interrupt arrived.
|
||||
<P>
|
||||
The functions listed above the line for nsWidget::DispatchMouseEvent, are
|
||||
the functions which are calling this function. The numbers to the left of
|
||||
these function names are the number of times these functions were in the
|
||||
call stack as callers of nsWidget::DispatchMouseEvent. In our example,
|
||||
we were called 11 times by the function nsWidget::OnButtonReleaseSignal.
|
||||
The functions listed above the line for <code>PL_HandleEvent</code> are its
|
||||
callers. The numbers to the left of these function names are the numbers of
|
||||
times these functions were in the call stack as callers of
|
||||
<code>PL_HandleEvent</code>. In our example, we were called 927 times by
|
||||
<code>PL_ProcessEventsBeforeID</code> and 141300 times by
|
||||
<code>PL_ProcessPendingEvents</code>.
|
||||
<P>
|
||||
The functions listed below the list for nsWidget::DispatchMouseEvent,
|
||||
are the functions this function was calling when the timer went off.
|
||||
The numbers to the left of the function names are the number of times
|
||||
it was in the call stack as a callee of our function when the timer
|
||||
went off. In our example, each of the 652 times the timer went off
|
||||
and nsWidget::DispatchMouseEvent was in the call stack it was calling
|
||||
nsWidget::DispatchWindowEvent.
|
||||
The functions listed below the line for <code>PL_HandleEvent</code> are its
|
||||
callees. The numbers to the left of the function names are the numbers of
|
||||
times these functions were in the callstack as callees of <code>PL_HandleEvent</code>. In our example, of the 142227 profiler hits under <code>PL_HandleEvent</code> 92394 were under <code>nsInputStreamReadyEvent::EventHandler</code>, 49181 were under <code>HandlePLEvent(ReflowEvent*)</code>, and so forth.
|
||||
|
||||
<h3>Bugs</h3>
|
||||
Jprof has only been tested under Red Hat Linux 6.0, 6.1, and 6.2. It does
|
||||
|
Loading…
x
Reference in New Issue
Block a user