Commit Graph

373 Commits

Author SHA1 Message Date
Ingo Molnar
be750231ce Merge branch 'perfcounters/urgent' into perfcounters/core
Conflicts:
	kernel/perf_counter.c

Merge reason: update to latest upstream (-rc6) and resolve
              the conflict with urgent fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15 12:06:12 +02:00
Arnaldo Carvalho de Melo
247648e374 perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available
In old binutils we can't access bfd_demangle(), use
cplus_demangle() just like oprofile.
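
A minimal sketch of how such a fallback could be selected at compile time. The macro names HAVE_BFD_DEMANGLE and HAVE_CPLUS_DEMANGLE and the wrapper demangle_sym() are assumptions made for illustration, not taken from the patch; only bfd_demangle() and cplus_demangle() are real library entry points.

```c
#include <stddef.h>

/* Option bits as defined by libiberty's demangle.h; provided here only
 * if no header has defined them already. */
#ifndef DMGL_PARAMS
#define DMGL_PARAMS (1 << 0)
#define DMGL_ANSI   (1 << 1)
#endif

#if defined(HAVE_BFD_DEMANGLE)      /* assumed macro: binutils has bfd_demangle() */
#include <bfd.h>
/* Returns a malloc'ed demangled name (caller frees) or NULL. */
static char *demangle_sym(const char *name)
{
	return bfd_demangle(NULL, name, DMGL_PARAMS | DMGL_ANSI);
}
#elif defined(HAVE_CPLUS_DEMANGLE)  /* assumed macro: old binutils, libiberty only */
extern char *cplus_demangle(const char *mangled, int options);
/* Returns a malloc'ed demangled name (caller frees) or NULL. */
static char *demangle_sym(const char *name)
{
	return cplus_demangle(name, DMGL_PARAMS | DMGL_ANSI);
}
#else                               /* no demangler available */
/* Always returns NULL so callers keep the mangled name. */
static char *demangle_sym(const char *name)
{
	(void)name;
	return NULL;
}
#endif
```

With neither macro defined the wrapper returns NULL, so the caller simply prints the mangled symbol.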

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Luis Claudio R. Gonçalves <lclaudio@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090811192211.GG18061@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-12 14:10:49 +02:00
Frederic Weisbecker
66e274f3b8 perf tools: Factorize the map helpers
Factorize the dso mapping helpers into a single purpose common file
"util/map.c"

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:37:37 +02:00
Frederic Weisbecker
cd84c2ac6d perf tools: Factorize high level dso helpers
Factorize multiple definitions of high level dso helpers into the
symbol source file.

The side effect is a general export of the verbose and eprintf
debugging helpers into a new file dedicated to debugging purposes.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
2009-08-12 12:02:38 +02:00
Brice Goglin
8d51327090 perf report: Fix and improve the displaying of per-thread event counters
Improve and fix the handling of per-thread counter stats
recorded via perf record -s. Previously we only displayed
it in debug printouts (-D) and even that output was hard
to disambiguate.

I moved everything to util/values.[ch] so that we may reuse
it in perf stat.

We get something like this now:

 #  PID   TID  cache-misses  cache-references
   4658  4659        495581           3238779
   4658  4662        498246           3236823
   4658  4663        499531           3243162

Then it'll be easy to add --pretty=raw to display a single line per thread/event.

By the way, -S was also used for --symbol... So I used -T/--thread here.

perf report: Add -T/--threads to display per-thread counter values

Per-thread arrays of counter values are managed in util/values.[ch]

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 13:04:20 +02:00
Mike Galbraith
183f3b0887 perf_counter tools: Fix libbfd detection for systems with libz dependency
Due to a libz dependency in some distros' binutils packages,
C++ demangle support isn't compiled in despite the necessary
libraries being available.

Fix this by adding a -lz link test to the dependency detection
rules.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1249733655.6929.5.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:47 +02:00
Peter Zijlstra
9424edc2da perf: Auto-detect libelf
Adds autodetection for libelf as well, and simplifies the
libbfd code. Furthermore, fail make with an error when libelf
is not found and warn about the lack of libbfd.

Also provide an option to build a 32bit version even though you
might be running a 64bit kernel.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06 20:25:13 +02:00
Peter Zijlstra
2cdbc46d7b perf: Auto-detect libbfd
Since the C++ demangling isn't needed for everybody and
bfd/iberty aren't widely/easily available on all machines, make
it optional.

It also allows you to forcefully disable demangling by using
NO_DEMANGLE=1 and otherwise tries to detect libbfd/libiberty
combinations that result in a compiling demangler.

Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
LKML-Reference: <20090801082048.GX12579@kernel.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-05 14:12:08 +02:00
Ingo Molnar
2d1b6949d2 perf_counter tools: Fix link errors with older toolchains
On older distros (F8 for example) the perf build could fail
with such missing symbols:

    LINK perf
/usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../lib64/libbfd.a(bfd.o): In function `bfd_demangle':
(.text+0x2b3): undefined reference to `cplus_demangle'
/usr/lib/gcc/x86_64-redhat-linux/4.3.2/../../../../lib64/libbfd.a(bfd.o): In function `bfd_demangle':

Link in -liberty too.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-01 13:15:36 +02:00
Arnaldo Carvalho de Melo
28ac909b49 perf symbol: C++ demangling
[acme@doppio ~]$ perf report -s comm,dso,symbol -C firefox -d /usr/lib64/xulrunner-1.9.1/libxul.so | grep :: | head
     2.21%  [.] nsDeque::Push(void*)
     1.78%  [.] GraphWalker::DoWalk(nsDeque&)
     1.30%  [.] GCGraphBuilder::AddNode(void*, nsCycleCollectionParticipant*)
     1.27%  [.] XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
     1.18%  [.] imgContainer::DrawFrameTo(gfxIImageFrame*, gfxIImageFrame*, nsRect&)
     1.13%  [.] nsDeque::PopFront()
     1.11%  [.] nsGlobalWindow::RunTimeout(nsTimeout*)
     0.97%  [.] nsXPConnect::Traverse(void*, nsCycleCollectionTraversalCallback&)
     0.95%  [.] nsJSEventListener::cycleCollection::Traverse(void*, nsCycleCollectionTraversalCallback&)
     0.95%  [.] nsCOMPtr_base::~nsCOMPtr_base()
[acme@doppio ~]$

Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Suggested-by: Clark Williams <williams@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090720171412.GB10410@ghostprotocols.net>
2009-07-22 18:05:57 +02:00
Mike Galbraith
208b4b4a59 perf_counter tools: Add infrastructure to support loading of kernel module symbols
Add infrastructure for module path discovery and section load addresses.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1246514830.13293.44.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-02 08:42:20 +02:00
Arnaldo Carvalho de Melo
5da5025858 perf_counter tools: Share list.h with the kernel
The copy we were using came from another copy I made for the dwarves
(pahole) package, which came from the kernel years ago.

The only function that is used by the perf tools and that isn't in the
kernel is list_del_range, which I'm leaving in the perf tools only for
now.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20090701174608.GA5823@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 22:37:23 +02:00
Arnaldo Carvalho de Melo
43cbcd8acb perf_counter tools: Share rbtree with the kernel
The tools/perf/util/rbtree.c copy already drifted by three
csets:

 4b324126e0
 4c60117811
 16c047add3

So remove the copy and use the lib/rbtree.c directly, sharing
the source code while still generating a separate object file,
since tools/perf uses a far more aggressive -O6 switch.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090701152837.GG15682@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 22:37:22 +02:00
Ingo Molnar
f37a291c52 perf_counter tools: Add more warnings and fix/annotate them
Enable -Wextra. This found a few real bugs plus a number
of signed/unsigned type mismatches and other unclean code. It
also required a few annotations.

All things considered it was still worth it, so let's try with
this enabled for now.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 12:49:48 +02:00
Anton Blanchard
6717534ddc perf_counter tools: Remove zlib dependency
The zlib devel libraries may not be installed and since we aren't
using zlib we may as well remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
LKML-Reference: <20090630230140.802078956@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 01:25:18 +02:00
Arnaldo Carvalho de Melo
25903407da perf report: Add --dsos parameter
So that we can filter by dso. Symbols in other dsos won't be
accounted for.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246399282-20934-2-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 00:07:09 +02:00
Frederic Weisbecker
8cb76d99d7 perf_counter tools: Prepare a small callchain framework
We plan to display the callchains depending on some user-configurable
parameters.

To gather the callchain stats from the recorded stream quickly,
this patch introduces an ad hoc radix tree adapted for callchains and
an rbtree to sort these callchains once we have gathered every event
from the stream.
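
To illustrate the idea of sharing common callchain prefixes, here is a simplified sketch. It is a plain prefix tree with linked-list children, not the radix tree from the patch, and it omits the rbtree sorting step; struct chain_node and chain_append() are hypothetical names.

```c
#include <stdlib.h>

typedef unsigned long long u64;

/* One node per distinct instruction pointer at a given call depth. */
struct chain_node {
	u64 ip;                       /* address of this frame */
	u64 hits;                     /* callchains ending exactly here */
	struct chain_node *children;  /* first child */
	struct chain_node *sibling;   /* next child of the same parent */
};

/* Find the child of parent holding ip, creating it at the list head
 * if it does not exist yet. Returns NULL on allocation failure. */
static struct chain_node *find_or_add_child(struct chain_node *parent, u64 ip)
{
	struct chain_node *c;

	for (c = parent->children; c; c = c->sibling)
		if (c->ip == ip)
			return c;

	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->ip = ip;
	c->sibling = parent->children;
	parent->children = c;
	return c;
}

/* Record one callchain of nr frames below root.
 * Returns 0 on success, -1 on allocation failure. */
static int chain_append(struct chain_node *root, const u64 *ips, int nr)
{
	struct chain_node *node = root;
	int i;

	for (i = 0; i < nr; i++) {
		node = find_or_add_child(node, ips[i]);
		if (!node)
			return -1;
	}
	node->hits++;
	return 0;
}
```

Two chains that start with the same frames walk the same nodes and only diverge where their addresses differ, which is what keeps gathering cheap before any sorting happens.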

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1246026481-8314-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-26 16:47:00 +02:00
Peter Zijlstra
7c6a1c65bb perf_counter tools: Rework the file format
Create a structured file format that includes the full
perf_counter_attr and all its relevant counter IDs so that
the reporting program has full information.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-25 21:39:04 +02:00
Paul Mackerras
9cffa8d533 perf_counter tools: Define and use our own u64, s64 etc. definitions
On 64-bit powerpc, __u64 is defined to be unsigned long rather than
unsigned long long.  This causes compiler warnings every time we
print a __u64 value with %Lx.

Rather than changing __u64, we define our own u64 to be unsigned long
long on all architectures, and similarly s64 as signed long long.
For consistency we also define u32, s32, u16, s16, u8 and s8.  These
definitions are put in a new header, types.h, because these definitions
are needed in util/string.h and util/symbol.h.

The main change here is the mechanical change of __[us]{64,32,16,8}
to remove the "__".  The other changes are:

* Create types.h
* Include types.h in perf.h, util/string.h and util/symbol.h
* Add types.h to the LIB_H definition in Makefile
* Add (u64) casts in process_overflow_event() and print_sym_table()
  to kill two remaining warnings.
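
A sketch of what such a types.h could look like. The u64 and s64 definitions follow the description above; the underlying types chosen for the narrower names and the format_ip() helper are assumptions for illustration.

```c
#include <stdio.h>

/* u64 is unsigned long long on every architecture, so a single
 * long-long printf format matches everywhere without casts. */
typedef unsigned long long u64;
typedef signed long long   s64;

/* Narrower types, defined for consistency (underlying types assumed). */
typedef unsigned int       u32;
typedef signed int         s32;
typedef unsigned short     u16;
typedef signed short       s16;
typedef unsigned char      u8;
typedef signed char        s8;

/* Hypothetical helper: format a 64-bit address as hex into buf. */
static int format_ip(char *buf, size_t len, u64 ip)
{
	return snprintf(buf, len, "%llx", ip);
}
```

Because u64 no longer depends on how the architecture defines __u64, the same format string compiles without warnings on both x86 and 64-bit powerpc.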

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
LKML-Reference: <19003.33494.495844.956580@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-19 18:25:47 +02:00
Ingo Molnar
b8e6d82972 perf report: Filter to parent set by default
Make it easier to use parent filtering: default to a filtered
output. Also add the parent column so that we get collapsing but
don't display it by default.

Add --no-exclude-other to override this.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 14:32:19 +02:00
Paul Mackerras
e24a72c4d8 perf_counter: tools: Makefile tweaks for 64-bit powerpc
On 64-bit powerpc, perf needs to be built as a 64-bit executable.
This arranges to add the -m64 flag to CFLAGS if we are running on
a 64-bit machine, indicated by the result of uname -m ending in "64".
This means that we'll use -m64 on x86_64 machines as well.
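
The rule itself lives in the Makefile; as an illustration only, the same "ends in 64" test can be written in C. machine_is_64bit() is a hypothetical helper and is not part of the patch.

```c
#include <string.h>

/* Returns 1 if the machine name (as printed by uname -m) ends in "64",
 * which is the condition under which -m64 would be added to CFLAGS. */
static int machine_is_64bit(const char *machine)
{
	size_t len = strlen(machine);

	return len >= 2 && strcmp(machine + len - 2, "64") == 0;
}
```

Under this rule both "ppc64" and "x86_64" select -m64, while a name such as "i686" does not.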

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55666.866148.559620@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 11:11:47 +02:00
Ingo Molnar
ef281a196d perf stat: Enable raw data to be printed
If -vv (very verbose) is specified, print out raw data
in the following format:

$ perf stat -vv -r 3 ./loop_1b_instructions

[ perf stat: executing run #1 ... ]
[ perf stat: executing run #2 ... ]
[ perf stat: executing run #3 ... ]

debug:              runtime[0]: 235871872
debug:             walltime[0]: 236646752
debug:       runtime_cycles[0]: 755150182
debug:            counter/0[0]: 235871872
debug:            counter/1[0]: 235871872
debug:            counter/2[0]: 235871872
debug:               scaled[0]: 0
debug:            counter/0[1]: 2
debug:            counter/1[1]: 235870662
debug:            counter/2[1]: 235870662
debug:               scaled[1]: 0
debug:            counter/0[2]: 1
debug:            counter/1[2]: 235870437
debug:            counter/2[2]: 235870437
debug:               scaled[2]: 0
debug:            counter/0[3]: 140
debug:            counter/1[3]: 235870298
debug:            counter/2[3]: 235870298
debug:               scaled[3]: 0
debug:            counter/0[4]: 755150182
debug:            counter/1[4]: 235870145
debug:            counter/2[4]: 235870145
debug:               scaled[4]: 0
debug:            counter/0[5]: 1001411258
debug:            counter/1[5]: 235868838
debug:            counter/2[5]: 235868838
debug:               scaled[5]: 0
debug:            counter/0[6]: 27897
debug:            counter/1[6]: 235868560
debug:            counter/2[6]: 235868560
debug:               scaled[6]: 0
debug:            counter/0[7]: 2910
debug:            counter/1[7]: 235868151
debug:            counter/2[7]: 235868151
debug:               scaled[7]: 0
debug:              runtime[0]: 235980257
debug:             walltime[0]: 236770942
debug:       runtime_cycles[0]: 755114546
debug:            counter/0[0]: 235980257
debug:            counter/1[0]: 235980257
debug:            counter/2[0]: 235980257
debug:               scaled[0]: 0
debug:            counter/0[1]: 3
debug:            counter/1[1]: 235980049
debug:            counter/2[1]: 235980049
debug:               scaled[1]: 0
debug:            counter/0[2]: 1
debug:            counter/1[2]: 235979907
debug:            counter/2[2]: 235979907
debug:               scaled[2]: 0
debug:            counter/0[3]: 135
debug:            counter/1[3]: 235979780
debug:            counter/2[3]: 235979780
debug:               scaled[3]: 0
debug:            counter/0[4]: 755114546
debug:            counter/1[4]: 235979652
debug:            counter/2[4]: 235979652
debug:               scaled[4]: 0
debug:            counter/0[5]: 1001439771
debug:            counter/1[5]: 235979304
debug:            counter/2[5]: 235979304
debug:               scaled[5]: 0
debug:            counter/0[6]: 23723
debug:            counter/1[6]: 235979050
debug:            counter/2[6]: 235979050
debug:               scaled[6]: 0
debug:            counter/0[7]: 2213
debug:            counter/1[7]: 235978820
debug:            counter/2[7]: 235978820
debug:               scaled[7]: 0
debug:              runtime[0]: 235888002
debug:             walltime[0]: 236700533
debug:       runtime_cycles[0]: 754881504
debug:            counter/0[0]: 235888002
debug:            counter/1[0]: 235888002
debug:            counter/2[0]: 235888002
debug:               scaled[0]: 0
debug:            counter/0[1]: 2
debug:            counter/1[1]: 235887793
debug:            counter/2[1]: 235887793
debug:               scaled[1]: 0
debug:            counter/0[2]: 1
debug:            counter/1[2]: 235887645
debug:            counter/2[2]: 235887645
debug:               scaled[2]: 0
debug:            counter/0[3]: 135
debug:            counter/1[3]: 235887499
debug:            counter/2[3]: 235887499
debug:               scaled[3]: 0
debug:            counter/0[4]: 754881504
debug:            counter/1[4]: 235887368
debug:            counter/2[4]: 235887368
debug:               scaled[4]: 0
debug:            counter/0[5]: 1001401731
debug:            counter/1[5]: 235887024
debug:            counter/2[5]: 235887024
debug:               scaled[5]: 0
debug:            counter/0[6]: 24212
debug:            counter/1[6]: 235886786
debug:            counter/2[6]: 235886786
debug:               scaled[6]: 0
debug:            counter/0[7]: 1824
debug:            counter/1[7]: 235886560
debug:            counter/2[7]: 235886560
debug:               scaled[7]: 0

 Performance counter stats for '/home/mingo/loop_1b_instructions' (3 runs):

     235.913377  task-clock-msecs     #      0.997 CPUs    ( +-   0.011% )
              2  context-switches     #      0.000 M/sec   ( +-   0.000% )
              1  CPU-migrations       #      0.000 M/sec   ( +-   0.000% )
            136  page-faults          #      0.001 M/sec   ( +-   0.730% )
      755048744  cycles               #   3200.534 M/sec   ( +-   0.009% )
     1001417586  instructions         #      1.326 IPC     ( +-   0.001% )
          25277  cache-references     #      0.107 M/sec   ( +-   3.988% )
           2315  cache-misses         #      0.010 M/sec   ( +-   9.845% )

    0.236706075  seconds time elapsed.

This allows the summary stats to be validated.
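
As a quick check of that claim, averaging the raw per-run values should reproduce the summary lines. A minimal sketch follows. mean_u64() is a hypothetical helper, not the perf stat code; reading counter/0[4] as the cycles counter is inferred from the matching values, and the +- noise column is not covered here.

```c
typedef unsigned long long u64;

/* Arithmetic mean of n raw per-run values, using integer division. */
static u64 mean_u64(const u64 *v, int n)
{
	u64 sum = 0;
	int i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return n > 0 ? sum / n : 0;
}

/* runtime[0] from the three runs in the -vv output above (nanoseconds). */
static const u64 runtime_ns[3] = { 235871872ULL, 235980257ULL, 235888002ULL };

/* counter/0[4] from the same three runs (assumed to be cycles). */
static const u64 cycles[3] = { 755150182ULL, 755114546ULL, 754881504ULL };
```

mean_u64(runtime_ns, 3) gives 235913377 ns, which is the 235.913377 task-clock-msecs line, and mean_u64(cycles, 3) gives 755048744, matching the cycles line.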

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-13 15:40:35 +02:00
Ingo Molnar
864709302a perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/
Several people have suggested that 'perf' has become a full-fledged
tool that should be moved out of Documentation/. Move it to the
(new) tools/ directory.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 20:33:43 +02:00