linux/tools/perf
Paul Mackerras 051ae7f734 perf_counter tools: Reduce perf stat measurement overhead/skew
Vince Weaver reported that 'perf stat' has a measurement overhead in
the count of retired instructions, which can inflate the reported
count by about 6000 instructions.

At present, perf stat creates its counters on the perf process.  Thus
the counters count the fork and various other activity in both the
parent and child, such as the resolver overhead for resolving PLT
entries for any libc functions that haven't been called before, such
as execvp.

This patch reduces the overhead by creating the counters on the child
process after the fork.  A couple of pipes synchronize the two
processes so that the child waits until the parent has created the
counters before doing the exec.  To eliminate the PLT resolution
overhead on calling execvp, the child first does a dummy execvp, which
will always fail.
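
The synchronization described above can be sketched roughly as follows.
This is a minimal illustration, not the code from builtin-stat.c: the
names run_synchronized and attach_fn are hypothetical, and the attach
callback merely stands in for creating the counters on the child's pid.
The sketch also sends an explicit "go" byte, which is an assumption made
here so that the child never execs if the parent goes away early.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical hook: stands in for creating counters on the child. */
typedef void (*attach_fn)(pid_t child);

/*
 * Fork, let the parent attach to the child, then release the child to
 * exec argv.  Returns the child's exit status, or -1 on error.
 */
int run_synchronized(char *const argv[], attach_fn attach)
{
	int child_ready[2], go[2], status, ok = 1;
	char buf = 0;
	pid_t pid;

	if (pipe(child_ready) < 0)
		return -1;
	if (pipe(go) < 0) {
		close(child_ready[0]);
		close(child_ready[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(child_ready[0]);
		close(child_ready[1]);
		close(go[0]);
		close(go[1]);
		return -1;
	}

	if (pid == 0) {
		close(child_ready[0]);
		close(go[1]);
		/* Keep the pipe from leaking into the exec'd program. */
		fcntl(go[0], F_SETFD, FD_CLOEXEC);

		/*
		 * Dummy exec with an empty name: it always fails, but it
		 * gets the PLT entry for execvp resolved ahead of time.
		 */
		execvp("", argv);

		/* Closing our write end gives the parent EOF: "ready". */
		close(child_ready[1]);

		/* Block until the parent sends the go byte. */
		if (read(go[0], &buf, 1) != 1)
			_exit(127);

		execvp(argv[0], argv);
		_exit(127);	/* only reached if the real exec failed */
	}

	/* Parent: wait for the child to reach the synchronization point. */
	close(child_ready[1]);
	close(go[0]);
	if (read(child_ready[0], &buf, 1) < 0)
		ok = 0;
	close(child_ready[0]);

	/* The child is now parked just before its exec: attach to it. */
	if (ok && attach)
		attach(pid);

	/* Release the child. */
	if (ok && write(go[1], "g", 1) != 1)
		ok = 0;
	close(go[1]);

	if (waitpid(pid, &status, 0) < 0 || !ok)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_synchronized with an argv such as {"true", NULL} and a
callback that sets up measurement on the given pid means that only the
work between the child's final read() and the start of the new program
remains inside the measured window.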

With this, the overhead of executing a program goes down from over
4800 instructions to about 90 instructions on powerpc (32-bit).
This was measured with a statically-linked program written in
assembler which only does the 3 instructions needed to call _exit(0).

Before:

$ perf stat -e 0:1:u ./three

 Performance counter stats for './three':

           4858  instructions

    0.001274523  seconds time elapsed

After:

$ perf stat -e 0:1:u ./three

 Performance counter stats for './three':

             92  instructions

    0.000468153  seconds time elapsed

Reported-by: Vince Weaver <vince@deater.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <19016.41425.814043.870352@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-29 22:38:09 +02:00
Documentation perf report: Fix help text typo 2009-06-23 16:39:53 +02:00
util perf_counter tools: Remove dead code 2009-06-27 06:06:39 +02:00
.gitignore
builtin-annotate.c perf_counter: Rework the sample ABI 2009-06-25 21:39:08 +02:00
builtin-help.c
builtin-list.c
builtin-record.c perf record: Fix unhandled io return value 2009-06-25 22:25:55 +02:00
builtin-report.c perf report: Print sorted callchains per histogram entries 2009-06-26 16:47:01 +02:00
builtin-stat.c perf_counter tools: Reduce perf stat measurement overhead/skew 2009-06-29 22:38:09 +02:00
builtin-top.c perf_counter: Rework the sample ABI 2009-06-25 21:39:08 +02:00
builtin.h
command-list.txt
CREDITS perf_counter tools: Add CREDITS file for Git contributors 2009-06-24 19:54:29 +02:00
design.txt perf_counter: Start documenting HAVE_PERF_COUNTERS requirements 2009-06-12 19:37:30 +02:00
Makefile perf_counter tools: Prepare a small callchain framework 2009-06-26 16:47:00 +02:00
perf.c
perf.h perf_counter tools: Prepare a small callchain framework 2009-06-26 16:47:00 +02:00