linux

mirror of https://github.com/FEX-Emu/linux.git synced 2024-12-21 08:53:41 +00:00

Author	SHA1	Message	Date
Ingo Molnar	aac898548d	Merge branch 'perf/urgent' into perf/core Conflicts: tools/perf/builtin-record.c tools/perf/builtin-top.c tools/perf/util/hist.h	2013-10-29 11:23:32 +01:00
Jiri Olsa	ae779a6309	perf top: Split -G and --call-graph Splitting -G and --call-graph for record command, so we could use '-G' with no option. The '-G' option now takes NO argument and enables the configured unwind method, which is currently the frame pointers method. It will be possible to configure unwind method via config file in upcoming patches. All current '-G' arguments is overtaken by --call-graph option. NOTE: The documentation for top --call-graph option was wrongly copied from report command. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382797536-32303-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-28 16:06:00 -03:00
Jiri Olsa	09b0fd45ff	perf record: Split -g and --call-graph Splitting -g and --call-graph for record command, so we could use '-g' with no option. The '-g' option now takes NO argument and enables the configured unwind method, which is currently the frame pointers method. It will be possible to configure unwind method via config file in upcoming patches. All current '-g' arguments is overtaken by --call-graph option. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382797536-32303-2-git-send-email-jolsa@redhat.com [ reordered -g/--call-graph on --help and expanded the man page according to comments by David Ahern and Namhyung Kim ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-28 16:05:59 -03:00
Waiman Long	5dbb6e81d8	perf top: Add --max-stack option to limit callchain stack scan When the callgraph function is enabled (-G), it may take a long time to scan all the stack data and merge them accordingly. This patch adds a new --max-stack option to perf-top to limit the depth of callchain stack data to look at to reduce the time it takes for perf-top to finish its processing. It reduces the amount of information provided to the user in exchange for faster speed. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: David Ahern <dsahern@gmail.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Scott J Norton <scott.norton@hp.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382107129-2010-5-git-send-email-Waiman.Long@hp.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-21 17:36:25 -03:00
Waiman Long	91e9561742	perf report: Add --max-stack option to limit callchain stack scan When callgraph data was included in the perf data file, it may take a long time to scan all those data and merge them together especially if the stored callchains are long and the perf data file itself is large, like a Gbyte or so. The callchain stack is currently limited to PERF_MAX_STACK_DEPTH (127). This is a large value. Usually the callgraph data that developers are most interested in are the first few levels, the rests are usually not looked at. This patch adds a new --max-stack option to perf-report to limit the depth of callchain stack data to look at to reduce the time it takes for perf-report to finish its processing. It trades the presence of trailing stack information with faster speed. The following table shows the elapsed time of doing perf-report on a perf.data file of size 985,531,828 bytes. --max_stack Elapsed Time Output data size ----------- ------------ ---------------- not set 88.0s 124,422,651 64 87.5s 116,303,213 32 87.2s 112,023,804 16 86.6s 94,326,380 8 59.9s 33,697,248 4 40.7s 10,116,637 -g none 27.1s 2,555,810 Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Scott J Norton <scott.norton@hp.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382107129-2010-4-git-send-email-Waiman.Long@hp.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-21 17:36:25 -03:00
Arnaldo Carvalho de Melo	c522739d72	perf trace: Use vfs_getname hook if available Initially it tries to find a probe:vfs_getname that should be setup with: perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string' or with slight changes to cope with code flux in the getname_flags code. In the future, if a "vfs:getname" tracepoint becomes available, then it will be preferred. This is not strictly required and more expensive method of reading the /proc/pid/fd/ symlink will be used when the fd->path array entry is not populated by a previous vfs_getname + open syscall ret sequence. As with any other 'perf probe' probe the setup must be done just once and the probe will be left inactive, waiting for users, be it 'perf trace' of any other tool. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ujg8se8glq5izmu8cdkq15po@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-16 11:18:24 -03:00
Adrian Hunter	fc1b691d76	perf buildid-cache: Add ability to add kcore to the cache kcore can be used to view the running kernel object code. However, kcore changes as modules are loaded and unloaded, and when the kernel decides to modify its own code. Consequently it is useful to create a copy of kcore at a particular time. Unlike vmlinux, kcore is not unique for a given build-id. And in addition, the kallsyms and modules files are also needed. The tool therefore creates a directory: ~/.debug/[kernel.kcore]/<build-id>/<YYYYmmddHHMMSShh> which contains: kcore, kallsyms and modules. Note that the copied kcore contains only code sections. See the kcore_copy() function for how that is determined. The tool will not make additional copies of kcore if there is already one with the same modules at the same addresses. Currently, perf tools will not look for kcore in the cache. That is addressed in another patch. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/525BF849.5030405@intel.com [ renamed 'index' to 'idx' to avoid shadowing string.h symbol in f12, use at least one member initializer when initializing a struct to zeros, also to fix the build on f12 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-14 12:20:38 -03:00
David Ahern	bf2575c121	perf trace: Add summary option to dump syscall statistics When enabled dumps a summary of all syscalls by task with the usual statistics -- min, max, average and relative stddev. For example, make - 26341 : 3344 [ 17.4% ] 0.000 ms read : 52 0.000 4.802 0.644 30.08 write : 20 0.004 0.036 0.010 21.72 open : 24 0.003 0.046 0.014 23.68 close : 64 0.002 0.055 0.008 22.53 stat : 2714 0.002 0.222 0.004 4.47 fstat : 18 0.001 0.041 0.006 46.26 mmap : 30 0.003 0.009 0.006 5.71 mprotect : 8 0.006 0.039 0.016 32.16 munmap : 12 0.007 0.077 0.020 38.25 brk : 48 0.002 0.014 0.004 10.18 rt_sigaction : 18 0.002 0.002 0.002 2.11 rt_sigprocmask : 60 0.002 0.128 0.010 32.88 access : 2 0.006 0.006 0.006 0.00 pipe : 12 0.004 0.048 0.013 35.98 vfork : 34 0.448 0.980 0.692 3.04 execve : 20 0.000 0.387 0.046 56.66 wait4 : 34 0.017 9923.287 593.221 68.45 fcntl : 8 0.001 0.041 0.013 48.79 getdents : 48 0.002 0.079 0.013 19.62 getcwd : 2 0.005 0.005 0.005 0.00 chdir : 2 0.070 0.070 0.070 0.00 getrlimit : 2 0.045 0.045 0.045 0.00 arch_prctl : 2 0.002 0.002 0.002 0.00 setrlimit : 2 0.002 0.002 0.002 0.00 openat : 94 0.003 0.005 0.003 2.11 Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1381289214-24885-3-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-14 10:28:50 -03:00
Ramkumar Ramachandra	d366c53e1d	perf timechart: Add example in the documentation While at it, update the synopsis to show both forms. Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@gmail.com> Link: http://lkml.kernel.org/r/1380791716-10325-1-git-send-email-artagnon@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-11 12:18:13 -03:00
Ingo Molnar	8a5411e9a3	perf tools: Implement summary output for 'make install' 'make install' used to show all the install lines, which is way too verbose to be really informative to the user. Implement summary output instead: comet:~/tip/tools/perf> make install BUILD: Doing 'make -j12' parallel build SUBDIR Documentation INSTALL Documentation-man INSTALL binaries INSTALL libexec INSTALL perf-archive INSTALL perl-scripts INSTALL python-scripts INSTALL bash_completion-script INSTALL tests 'make install V=1' will still show the old, detailed output. Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1381312169-17354-5-git-send-email-mingo@kernel.org [ Fixed conflict with libperf-gtk patches in acme/perf/core, cope with 'trace' alias ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-11 12:18:11 -03:00
Ingo Molnar	65fb09922d	tools: Harmonize the various build messages in perf, lib-traceevent, lib-lk The various build lines from libtraceevent and perf mix up during a parallel build and produce unaligned output like: CC builtin-buildid-list.o CC builtin-buildid-cache.o CC builtin-list.o CC FPIC trace-seq.o CC builtin-record.o CC FPIC parse-filter.o CC builtin-report.o CC builtin-stat.o CC FPIC parse-utils.o CC FPIC kbuffer-parse.o CC builtin-timechart.o CC builtin-top.o CC builtin-script.o BUILD STATIC LIB libtraceevent.a CC builtin-probe.o CC builtin-kmem.o CC builtin-lock.o To solve this, harmonize all the build message alignments to be similar to the kernel's kbuild output: prefixed by two spaces and 11-char wide. After the patch the output looks pretty tidy, even if output lines get mixed up: CC builtin-annotate.o FLAGS: * new build flags or cross compiler CC builtin-bench.o AR liblk.a CC bench/sched-messaging.o CC FPIC event-parse.o CC bench/sched-pipe.o CC FPIC trace-seq.o CC bench/mem-memcpy.o CC bench/mem-memset.o CC FPIC parse-filter.o CC builtin-diff.o CC builtin-evlist.o CC builtin-help.o Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381312169-17354-3-git-send-email-mingo@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-11 12:18:08 -03:00
Ingo Molnar	8ec19c0eba	perf tools: Implement summary output for 'make clean' 'make clean' used to show all the rm lines, which isn't really informative in any way and spams the console. Implement summary output: comet:~/tip/tools/perf> make clean CLEAN libtraceevent CLEAN liblk CLEAN config CLEAN core-objs CLEAN core-progs CLEAN core-gen CLEAN Documentation CLEAN python 'make clean V=1' will still show the old, detailed output. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381312169-17354-2-git-send-email-mingo@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-11 12:18:06 -03:00
David Ahern	5e2485b1a2	perf trace: Add record option The record option is a convience alias to include the -e raw_syscalls:* argument to perf-record. All other options are passed to perf-record's handler. Resulting data file can be analyzed by perf-trace -i. Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1380395584-9025-5-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-11 12:17:49 -03:00
Jiri Olsa	27050f530d	perf tools: Add possibility to specify mmap size Adding possibility to specify mmap size via -m/--mmap-pages by appending unit size character (B/K/M/G) to the number, like: $ perf record -m 8K ls $ perf record -m 2M ls The size is rounded up appropriately to follow perf mmap restrictions. If no unit is specified the number provides pages as of now, like: $ perf record -m 8 ls Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1378031796-17892-3-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-09 11:24:20 -03:00
Davidlohr Bueso	f37376cd72	perf lock: Account for lock average wait time While perf-lock currently reports both the total wait time and the number of contentions, it doesn't explicitly show the average wait time. Having this value immediately in the report can be quite useful when looking into performance issues. Furthermore, allowing report to sort by averages is another handy feature to have - and thus do not only print the value, but add it to the lock_stat structure. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1378693159-8747-8-git-send-email-davidlohr@hp.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-09 11:24:01 -03:00
Arnaldo Carvalho de Melo	50c95cbd70	perf trace: Add option to show process COMM Enabled by default, disable with --no-comm, e.g.: 181.821 (0.001 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: PEEK\|TRUNC\|CMSG_CLOEXEC ) = 20 181.824 (0.001 ms): deja-dup-monit/10784 geteuid( ) = 1000 181.825 (0.001 ms): deja-dup-monit/10784 getegid( ) = 1000 181.834 (0.002 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: CMSG_CLOEXEC ) = 20 181.836 (0.001 ms): deja-dup-monit/10784 geteuid( ) = 1000 181.838 (0.001 ms): deja-dup-monit/10784 getegid( ) = 1000 181.705 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK\|TRUNC\|CMSG_CLOEXEC) = 1256 181.710 (0.002 ms): evolution-addr/10924 geteuid( ) = 1000 181.712 (0.001 ms): evolution-addr/10924 getegid( ) = 1000 181.727 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC ) = 1256 181.731 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000 181.734 (0.001 ms): evolution-addr/10924 getegid( ) = 1000 181.908 (0.002 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK\|TRUNC\|CMSG_CLOEXEC) = 20 181.913 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000 181.915 (0.001 ms): evolution-addr/10924 getegid( ) = 1000 181.930 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC ) = 20 181.934 (0.001 ms): evolution-addr/10924 geteuid( ) = 1000 181.937 (0.001 ms): evolution-addr/10924 getegid( ) = 1000 220.718 (0.010 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL ) = 200 220.741 (0.000 ms): dbus-daemon/10711 ... [continued]: epoll_wait()) = 1 220.759 (0.004 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = 200 220.780 (0.002 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = 200 220.788 (0.001 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC ) = -1 EAGAIN Resource temporarily unavailable 220.760 (0.004 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL ) = 200 220.771 (0.023 ms): perf/26347 open(filename: 0xf2e780, mode: 15918976 ) = 19 220.850 (0.002 ms): perf/26347 close(fd: 19 ) = 0 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-6be5jvnkdzjptdrebfn5263n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-09 11:11:33 -03:00
David Ahern	4bb09192d3	perf trace: Add option to show full timestamp Current timestamp shown for output is time relative to firt sample. This patch adds an option to show the absolute perf_clock timestamp which is useful when comparing output across commands (e.g., perf-trace to perf-script). Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1378319865-55695-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-10-09 11:10:49 -03:00
Ingo Molnar	31f6be65e0	tools/perf: Fix double/triple-build of the feature detection logic during 'make install' et al Linus reported the following perf build system bug: 'Another annoyance during that make was that "make install" seems to want to re-make the thing I just built. That's absolutely horrible, [...]' The thing that got re-built were 'only' the (numerous) feature checks, not the whole project - but still it was mighty annoying as the feature checks took 9+ seconds even on reasonably fast boxes. Even with the autodep patches where feature detection is much faster it wastes resources, wastes screen real estate and confuses users if we execute feature detection twice. There were two sources for these unnecessary re-builds of the feature checks: - Unnecessary nested invocations of $(MAKE), apparently to be able to do conditional compilation dependent on documentation tools presence. Use straight dependencies instead, with no nesting. - A direct invocation of $(MAKE) to rebuild the PERF-VERSION-FILE. This is apparently done to be able to include it into the Makefile: -include $(OUTPUT)PERF-VERSION-FILE but that's entirely pointless for two reasons: 1) the version file gets regenerated by the initial build pass anyway, 2) including it is futile, given its contents: #define PERF_VERSION "3.12.rc3.g8510c7" 'make' will interpret that as a comment line... So just remove this part of the doc-generation logic. With these things fixed a 'make install' now rebuilds only what is needed. A repeated 'make install' on an already built tree is super fast now, it finishes in under 0.3 seconds: # # After the patch: # $ time make install ... real 0m0.280s user 0m0.162s sys 0m0.054s Prior all the autodep changes and prior this fix, a repeat 'make install' took 24.1 seconds (!) on the same system: # # Before the patches: # $ time make install ... real 0m24.109s user 0m21.171s sys 0m2.449s Which almost entirely was caused by fixable build system fat. We are now literally ~86 times faster. A fresh rebuild and install now takes just 11.4 seconds: # # After the patch: # $ make clean $ time make -j16 install ... real 0m11.457s user 1m43.411s sys 0m7.610s Without the patches it took 27.8 seconds: # # Before the patches: # $ make clean $ time make -j16 install ... real 0m27.801s user 1m59.242s sys 0m9.749s So even in the complete rebuild case we are now ~2.5 times faster. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/n/tip-x4qjnxjGrgxpribq8sdakfTp@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-09 08:48:51 +02:00
Andi Kleen	475eeab9f3	tools/perf: Add support for record transaction flags Add support for recording and displaying the transaction flags. They are essentially a new sort key. Also display them in a nice way to the user. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:12 +02:00
Andi Kleen	0126d493b6	tools/perf/record: Add abort_tx,no_tx,in_tx branch filter options to perf record -j Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-5-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:10 +02:00
Andi Kleen	f5d05bcec4	tools/perf: Support sorting by in_tx or abort branch flags Extend the perf branch sorting code to support sorting by in_tx or abort_tx qualifiers. Also print out those qualifiers. This also fixes up some of the existing sort key documentation. We do not support no_tx here, because it's simply not showing the in_tx flag. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:09 +02:00
Andi Kleen	4cabc3d1cb	tools/perf/stat: Add perf stat --transaction Add support to perf stat to print the basic transactional execution statistics: Total cycles, Cycles in Transaction, Cycles in aborted transsactions using the in_tx and in_tx_checkpoint qualifiers. Transaction Starts and Elision Starts, to compute the average transaction length. This is a reasonable overview over the success of the transactions. Also support architectures that have a transaction aborted cycles counter like POWER8. Since that is awkward to handle in the kernel abstract handle both cases here. Enable with a new --transaction / -T option. This requires measuring these events in a group, since they depend on each other. This is implemented by using TM sysfs events exported by the kernel Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnaldo Carvalho de Melo <acme@infradead.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:06:07 +02:00
David Ahern	6810fc915f	perf trace: Add option to analyze events in a file versus live Allows capture of raw_syscall:* events and analyzed at a later time. v2: change -i option from inherit to input name for consistency with other perf commands Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1377750593-48046-3-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-29 17:42:34 -03:00
Arnaldo Carvalho de Melo	7c304ee0fc	perf trace: Add --verbose option Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ain6q4u8g3bpnh18yhw24v2x@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-26 17:25:43 -03:00
Arnaldo Carvalho de Melo	b059efdf52	perf trace: Support ! in -e expressions So that we can ask for all but a set of syscalls to be traced. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-9j6hvap23qanyl96wx4mrj9k@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-26 17:25:41 -03:00
David Ahern	ac9be8ee4e	perf trace: Make command line arguments consistent with perf-record Common arguments like thread id, CPU list, mmap pages, etc should be consistent across perf commands. v3: Updated man page v2: rebased to latest core branch Signed-off-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1377018945-21940-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-26 17:25:35 -03:00
Arnaldo Carvalho de Melo	c24ff998fc	perf trace: Implement -o/--output filename To output all 'trace' output to a filename, just like 'strace -ofile' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-6q1homkwoayhmoq64y5vhel6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-26 16:51:31 -03:00
Arnaldo Carvalho de Melo	2ae3a312c0	perf trace: Allow specifying which syscalls to trace Similar to -e in strace, i.e. a comma separated list of syscall names to trace. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-5zku7q5wug3103k1dzn3yy63@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-14 11:44:21 -03:00
David Ahern	9a6d316692	perf kvm: Update documentation with live command Update perf-kvm documentation with new live subcommand. Add -p/--pid option for perf-kvm-stat-report as well. Signed-off-by: David Ahern <dsahern@gmail.com> Requested-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Runzhen Wang <runzhen@linux.vnet.ibm.com> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1375926999-75129-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-12 10:31:05 -03:00
Michael Ellerman	e9a7c41447	perf tools: Add support for pinned modifier This commit adds support for a new modifier "D", which requests that the event, or group of events, be pinned to the PMU. The "p" modifier is already taken for precise, and "P" may be used in future to mean "fully precise". So we use "D", which stands for pinneD - and looks like a padlock, or if you're using the ":D" syntax perf smiles at you. This is an oft-requested feature from our HW folks, who want to be able to run a large number of events, but also want 100% accurate results for instructions per cycle. Comparison of results with and without pinning: $ perf stat -e '{cycles,instructions}:D' -e cycles,instructions,... 79,590,480,683 cycles # 0.000 GHz 166,123,716,524 instructions # 2.09 insns per cycle # 0.11 stalled cycles per insn 79,352,134,463 cycles # 0.000 GHz [11.11%] 165,178,301,818 instructions # 2.08 insns per cycle # 0.11 stalled cycles per insn [11.13%] As you can see although perf does a very good job of scaling the values in the non-pinned case, there is some small discrepancy. The patch is fairly straight forward, the one detail is that we need to make sure we only request pinning for the group leader when we have a group. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1375795686-4226-1-git-send-email-michael@ellerman.id.au [ Use perf_evsel__is_group_leader instead of open coded equivalent, as suggested by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-07 17:35:40 -03:00
Andi Kleen	411916880f	perf stat: Add support for --initial-delay option When measuring workloads the startup phase -- doing page faults, dynamic linking, opening files -- is often very different from the rest of the workload. Especially with smaller kernels and using counter multiplexing this can give significant measurement errors. Multiplexing assumes that the workload is mostly the same over longer periods. But at startup there is typically some spike of activity which is relatively short. If many groups are multiplexing the one group seeing the spike, and which is then scaled up over the time to run all groups, may see a significant error. Also in general it's often not useful to measure the startup, because it is so different from the rest. One way around this is to use interval mode and discard the first sample, but this can be awkward because interval mode doesn't support intervals of less than 100ms, and also a useful interval is not necessarily the same as a useful startup delay. This patch adds a new --initial-delay / -D option to skip measuring for the startup phase. The time can be specified in ms Here's a simple example: perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done' ... 3,721 page-faults ... If we just wait 20 ms the number of page faults is 1/3 less: perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done' ... 2,823 page-faults ... So we filtered out most of the startup noise from bash. Signed-off-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-07 17:35:29 -03:00
Jiri Olsa	3c1763115b	perf tools: Add 'S' event/group modifier to read sample value Adding 'S' event/group modifier to specify that the event value/s are read by PERF_SAMPLE_READ sample type processing, instead of the period value offered by lower layers. There's additional behaviour change for 'S' modifier being specified on event group: Currently all the events within a group makes samples. If user now specifies 'S' within group modifier, only the leader will trigger samples. The rest of events in the group will have sampling disabled. And same as for single events, values of all events within the group (including leader) are read by PERF_SAMPLE_READ sample type processing. Following example will create event group with cycles and cache-misses events, setting the cycles as group leader and the only event to actually sample. Both cycles and cache-misses event period values are read by PERF_SAMPLE_READ sample type processing with PERF_FORMAT_GROUP read format. Example: $ perf record -e '{cycles,cache-misses}:S' ls ... $ perf report --group --show-total-period --stdio ... # Samples: 36 of event 'anon group { cycles, cache-misses }' # Event count (approx.): 12585593 # # Overhead Period Command Shared Object Symbol # .............. .............. ....... ................. .......................... # 19.92% 1.20% 2505936 31 ls [kernel.kallsyms] [k] mark_held_locks 13.74% 0.47% 1729327 12 ls [kernel.kallsyms] [k] sched_clock_local 13.64% 23.72% 1716147 612 ls ld-2.14.90.so [.] check_match.10805 13.12% 23.22% 1650778 599 ls libc-2.14.90.so [.] _nl_intern_locale_data 11.24% 29.19% 1414554 753 ls [kernel.kallsyms] [k] sched_clock_cpu 8.50% 0.35% 1070150 9 ls [kernel.kallsyms] [k] check_chain_key ... Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-iyoinu3axi11mymwnh2b7fxj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-08-07 17:35:22 -03:00
Andi Kleen	99571ab3d9	perf tools: Support callchain sorting based on addresses With programs with very large functions it can be useful to distinguish the callgraph nodes on more than just function names. So for example if you have multiple calls to the same function, it ends up being separate nodes in the chain. This patch adds a new key field to the callgraph options, that allows comparing nodes on functions (as today, default) and addresses. Longer term it would be nice to also handle src lines, but that would need more changes and address is a reasonable proxy for it today. I right now reference the global params, as there was no simple way to register a params pointer. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-0uskktybf0e7wrnoi5e9b9it@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-22 12:42:18 -03:00
Jiri Olsa	5f3f8d3b12	perf diff: Add generic order option for compute sorting Adding option 'o' to allow sorting based on the input file number. By default (without -o option) the output is sorted on baseline. Also removing '+' sorting support from -c option, because it's not needed anymore. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-l7dvhgt0azm7yiqg3fbn4dxw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-12 13:54:16 -03:00
Jiri Olsa	3a3beae81d	perf diff: Update perf diff documentation for multiple data comparison Updating perf diff documentation to include multiple perf data files comparison. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-tr6su3wfm20k2m5npjggyvtw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-12 13:54:13 -03:00
Greg Price	b21484f1a1	perf report/top: Add option to collapse undesired parts of call graph For example, in an application with an expensive function implemented with deeply nested recursive calls, the default call-graph presentation is dominated by the different callchains within that function. By ignoring these callees, we can collect the callchains leading into the function and compactly identify what to blame for expensive calls. For example, in this report the callers of garbage_collect() are scattered across the tree: $ perf report -d ruby 2>- \| grep -m10 ^[^#][a-z] 22.03% ruby [.] gc_mark --- gc_mark \|--59.40%-- mark_keyvalue \| st_foreach \| gc_mark_children \| \|--99.75%-- rb_gc_mark \| \| rb_vm_mark \| \| gc_mark_children \| \| gc_marks \| \| \|--99.00%-- garbage_collect If we ignore the callees of garbage_collect(), its callers are coalesced: $ perf report --ignore-callees garbage_collect -d ruby 2>- \| grep -m10 ^[^#][a-z] 72.92% ruby [.] garbage_collect --- garbage_collect vm_xmalloc \|--47.08%-- ruby_xmalloc \| st_insert2 \| rb_hash_aset \| \|--98.45%-- features_index_add \| \| rb_provide_feature \| \| rb_require_safe \| \| vm_call_method Signed-off-by: Greg Price <price@mit.edu> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130623031720.GW22203@biohazard-cafe.mit.edu Link: http://lkml.kernel.org/r/20130708115746.GO22203@biohazard-cafe.mit.edu Cc: Fengguang Wu <fengguang.wu@intel.com> [ remove spaces at beginning of line, reported by Fengguang Wu ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-12 13:53:55 -03:00
Andi Kleen	dc098b35b5	perf list: List kernel supplied event aliases List the kernel supplied pmu event aliases in perf list It's better when the users can actually see them. Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1366480949-32292-2-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-12 13:53:53 -03:00
Jiri Olsa	4a4d371a4d	perf record: Remove -f/--force option It no longer have any affect on the processing and is marked as obsolete anyway. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-tvwyspiqr4getzfib2lw06ty@git.kernel.org Link: http://lkml.kernel.org/r/1372307120-737-1-git-send-email-namhyung@kernel.org [ combined patch removing the -f usage in various sub-commands, such as 'perf sched', etc, by Namhyung Kim ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-08 17:37:25 -03:00
Jiri Olsa	563aecb2e6	perf record: Remove -A/--append option It's no longer working and needed. Quite straightforward discussion/vote was in here: http://marc.info/?t=137028288300004&r=1&w=2 Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-8fgdva12hl8w3xzzpsvvg7nx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-08 17:37:10 -03:00
Robert Richter	761a0f395d	perf tools: Fix output directory of Documentation/ The OUTPUT directory is wrongly determind leading to: make[3]: *** No rule to make target `.../.build/perf/PERF-VERSION-FILE'. Stop. Fixing this by using the generic approach in script/Makefile.include. Signed-off-by: Robert Richter <robert.richter@calxeda.com> Link: http://lkml.kernel.org/r/1367865614-30876-1-git-send-email-rric@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-08 17:31:49 -03:00
Robert Richter	5125bc22e7	tools: Get only verbose output with V=1 Fix having verbose build with V=0, e.g: make V=0 -C tools/ perf Signed-off-by: Robert Richter <robert.richter@calxeda.com> Link: http://lkml.kernel.org/r/20130503134953.GU8356@rric.localhost Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-07-08 17:31:34 -03:00
Namhyung Kim	fa5df94350	perf top: Add --percent-limit option The --percent-limit option is for not showing small overhead entries in the output. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1368497347-9628-8-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-05-28 16:24:01 +03:00
Namhyung Kim	064f19815c	perf report: Add --percent-limit option The --percent-limit option is for not showing small overhead entries in the output. Maybe we want to set a certain default value like 0.1. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1368497347-9628-7-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-05-28 16:24:01 +03:00
Arnaldo Carvalho de Melo	bc8b8c0d6a	perf archive: Fix typo on Documentation It is analysis, not analisys. Reported-by: William Cohen <wcohen@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-s7476m0irq0naxkzd9iekbr3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-05-28 16:23:55 +03:00
Stephane Eranian	028f12ee6b	perf tools: Add new mem command for memory access profiling This new command is a wrapper on top of perf record and perf report to make it easier to configure for memory access profiling. To record loads: $ perf mem -t load rec ..... To record stores: $ perf mem -t store rec ..... To get the report: $ perf mem -t load rep Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-15-git-send-email-eranian@google.com [ Fixed minor conflict with `66857b5` "Sort command-list.txt alphabetically" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-04-01 12:21:44 -03:00
Andi Kleen	05484298cb	perf tools: Add support for weight v7 (modified) perf record has a new option -W that enables weightened sampling. Add sorting support in top/report for the average weight per sample and the total weight sum. This allows to both compare relative cost per event and the total cost over the measurement period. Add the necessary glue to perf report, record and the library. v2: Merge with new hist refactoring. v3: Fix manpage. Remove value check. Rename global_weight to weight and weight to local_weight. v4: Readd sort keys to manpage v5: Move weight to end v6: Move weight to template v7: Rename weight key. Original patch from Andi modified by Stephane Eranian <eranian@google.com> to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com [ committer note: changed to cope with `fc5871ed` and the hists_link perf test entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-04-01 12:19:43 -03:00
Namhyung Kim	328ccdace8	perf report: Add --no-demangle option It's sometimes useful to see undemangled raw symbol name for example other tools using the perf output to do manipulation of binaries. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: William Cohen <wcohen@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: William Cohen <wcohen@redhat.com> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=55571 Link: http://lkml.kernel.org/r/1364203098-17741-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-03-26 16:38:21 -03:00
Stephane Eranian	12c08a9f59	perf stat: Add per-core aggregation This patch adds the --per-core option to perf stat. This option is used to aggregate system-wide counts on a per physical core basis. On processors with hyperthreading, this means counts of all HT threads running on a physical core are aggregated. This mode is useful to find imblance between physical cores running an uniform workload. Cores are identified by socket: S0-C1, means physical core 1 on socket 0. Note that cores are identified using their physical core id, thus their numbering may not be continuous. Per core aggregation can be combined with interval printing: # perf stat -a --per-core -I 1000 -e cycles sleep 1000 # time core cpus counts events 1.000090030 S0-C0 1 4,765,747 cycles 1.000090030 S0-C1 1 5,580,647 cycles 1.000090030 S0-C2 1 221,181 cycles 1.000090030 S0-C3 1 266,092 cycles Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1360846649-6411-4-git-send-email-eranian@google.com [ committer note: Remove parts already applied on `86ee6e1` to keep bisectability ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-03-25 16:13:26 -03:00
Stephane Eranian	d4304958a2	perf stat: Rename --aggr-socket to --per-socket To make it more obvious what this option does as suggested by Andi on LKML. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1360846649-6411-3-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-03-25 16:09:24 -03:00
Frederik Deweerdt	a7e191c376	perf stat: Introduce --repeat forever The following patch causes 'perf stat --repeat 0' to be interpreted as 'forever', displaying the stats for every run. We act as if a single run was asked, and reset the stats in each iteration. In this mode SIGINT is passed to perf to be able to stop the loop with Ctrl+C. Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130301180227.GA24385@ks398093.ip-192-95-24.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2013-03-15 14:01:26 -03:00

1 2 3 4 5 ...

276 Commits