linux/tools/perf
Ravi Bangoria e47392bf9c perf uprobe: Skip prologue if program compiled without optimization
The function prologue prepares stack and registers before executing
function logic.

When target program is compiled without optimization, function parameter
information is only valid after the prologue.

When we probe entrypc of the function, and try to record a function
parameter, it contains a garbage value.

For example:

  $ vim test.c
    #include <stdio.h>

    void foo(int i)
    {
       printf("i: %d\n", i);
    }

    int main()
    {
      foo(42);
      return 0;
    }

  $ gcc -g test.c -o test
  $ objdump -dl test | less
    foo():
    /home/ravi/test.c:4
      400536:       55                      push   %rbp
      400537:       48 89 e5                mov    %rsp,%rbp
      40053a:       48 83 ec 10             sub    $0x10,%rsp
      40053e:       89 7d fc                mov    %edi,-0x4(%rbp)
    /home/ravi/test.c:5
      400541:       8b 45 fc                mov    -0x4(%rbp),%eax
    ...
    ...
    main():
    /home/ravi/test.c:9
      400558:       55                      push   %rbp
      400559:       48 89 e5                mov    %rsp,%rbp
    /home/ravi/test.c:10
      40055c:       bf 2a 00 00 00          mov    $0x2a,%edi
      400561:       e8 d0 ff ff ff          callq  400536 <foo>

  $ perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
     p:probe_test/foo /home/ravi/test:0x0000000000000536 i=-12(%sp):s32

  $ perf record -e probe_test:foo ./test
  $ perf script
     test  5778 [001]  4918.562027: probe_test:foo: (400536) i=0

Here variable 'i' is only stored on the stack by the instruction at
0x40053e, but we are probing at 0x400536, before that store has executed.

To resolve this issue, we need to probe on the next instruction after the
prologue. gdb and systemtap also do the same thing. I've implemented this
patch based on the approach systemtap uses.
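
The idea of moving the probe past the prologue can be sketched in a few
lines of C. This is a simplified illustration, not the code added by this
patch: the struct line_row type, the skip_prologue() helper and the
test_rows table are hypothetical names, the table holds only the
address/line pairs visible in the objdump output above, and the end of
foo() is assumed to be the start address of main(). The heuristic is that
the first line-table row after the entry address that belongs to a
different source line marks the end of the prologue.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of a simplified, hypothetical line table: address -> source line. */
struct line_row {
	uint64_t addr;
	int line;
};

/* Address/line pairs taken from the objdump listing of test.c above. */
static const struct line_row test_rows[] = {
	{ 0x400536, 4 },	/* foo() entry, prologue */
	{ 0x400541, 5 },	/* first statement of foo() */
	{ 0x400558, 9 },	/* main() entry, prologue */
	{ 0x40055c, 10 },	/* first statement of main() */
};

/*
 * Return the address to probe for a function spanning [entrypc, highpc).
 * Rows must be sorted by address. The first row inside the function that
 * follows the entry row and has a different line number is treated as
 * the end of the prologue. If no such row exists, fall back to entrypc.
 */
static uint64_t skip_prologue(const struct line_row *rows, size_t n,
			      uint64_t entrypc, uint64_t highpc)
{
	int entry_line = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (rows[i].addr == entrypc) {
			entry_line = rows[i].line;
			continue;
		}
		if (entry_line != -1 &&
		    rows[i].addr > entrypc && rows[i].addr < highpc &&
		    rows[i].line != entry_line)
			return rows[i].addr;
	}

	return entrypc;
}
```

Calling skip_prologue(test_rows, 4, 0x400536, 0x400558) on this table
selects 0x400541, the address of the first instruction attributed to line
5 of test.c.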

After applying patch:

  $ perf probe -x ./test 'foo i'
  $ cat /sys/kernel/debug/tracing/uprobe_events
    p:probe_test/foo /home/ravi/test:0x0000000000000541 i=-4(%bp):s32

  $ perf record -e probe_test:foo ./test
  $ perf script
    test  6300 [001]  5877.879327: probe_test:foo: (400541) i=42

There is no need to skip the prologue in the optimized case, since the
debug info is correct for each instruction with -O2 -g. For more details
please visit:

        https://bugzilla.redhat.com/show_bug.cgi?id=612253#c6

Changes in v2:

- Skip the prologue only when at least one ARG is a C variable, $params
  or $vars.

- Probing on a line (:1) may not always be possible. Recommend only the
  address to force a probe on function entry.
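
The first v2 rule can be read as a simple predicate over the probe
arguments. The following is a rough sketch under stated assumptions, not
the check used in the patch: arg_needs_prologue_skip() and
should_skip_prologue() are hypothetical names, and an argument is treated
as a C variable when it starts with a letter or an underscore, so
register-style arguments such as %di do not trigger the skip.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if this single probe argument depends on
 * debug info that is only valid after the prologue.
 */
static bool arg_needs_prologue_skip(const char *arg)
{
	if (strcmp(arg, "$params") == 0 || strcmp(arg, "$vars") == 0)
		return true;

	/* Assumption: a C variable starts with a letter or an underscore. */
	return isalpha((unsigned char)arg[0]) || arg[0] == '_';
}

/* True if any argument in the list requires skipping the prologue. */
static bool should_skip_prologue(const char *const *args, size_t nargs)
{
	size_t i;

	for (i = 0; i < nargs; i++) {
		if (arg_needs_prologue_skip(args[i]))
			return true;
	}

	return false;
}
```

With this sketch, a probe that records only registers stays at the
function entry, while a probe such as 'foo i' is moved past the prologue.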

Committer notes:

Testing it with 'perf trace':

  # perf probe -x ./test foo i
  Added new event:
    probe_test:foo       (on foo in /home/acme/c/test with i)

  You can now use it in all perf tools, such as:

	  perf record -e probe_test:foo -aR sleep 1

  # cat /sys/kernel/debug/tracing/uprobe_events
  p:probe_test/foo /home/acme/c/test:0x0000000000000526 i=-12(%sp):s32
  # trace --no-sys --event probe_*:* ./test
  i: 42
     0.000 probe_test:foo:(400526) i=0)
  #

After the patch:

  # perf probe -d *:*
  Removed event: probe_test:foo
  # perf probe -x ./test foo i
  Target program is compiled without optimization. Skipping prologue.
  Probe on address 0x400526 to force probing at the function entry.

  Added new event:
    probe_test:foo       (on foo in /home/acme/c/test with i)

  You can now use it in all perf tools, such as:

	perf record -e probe_test:foo -aR sleep 1

  # cat /sys/kernel/debug/tracing/uprobe_events
  p:probe_test/foo /home/acme/c/test:0x0000000000000531 i=-4(%bp):s32
  # trace --no-sys --event probe_*:* ./test
  i: 42
     0.000 probe_test:foo:(400531) i=42)
  #

Reported-by: Michael Petlan <mpetlan@redhat.com>
Report-Link: https://www.mail-archive.com/linux-perf-users@vger.kernel.org/msg02348.html
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1299021
Link: http://lkml.kernel.org/r/1470214725-5023-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
[ Rename 'die' to 'cu_die' to avoid shadowing a die() definition on at least centos 5, Debian 7 and ubuntu:12.04.5]
[ Use PRIx64 instead of lx to format a Dwarf_Addr, aka long long unsigned int, fixing the build on 32-bit systems ]
[ dwarf_getsrclines() expects a size_t * argument ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-09-01 12:42:25 -03:00
..
arch perf probe: Support probing on offline cross-arch binary 2016-09-01 12:41:09 -03:00
bench perf bench futex: Use NSEC_PER_USEC 2016-08-23 15:37:33 -03:00
Documentation perf probe: Ignore vmlinux buildid if offline kernel is given 2016-09-01 09:44:14 -03:00
jvmti perf jit: Remove some no-op error handling 2016-07-18 12:20:00 -03:00
python perf python: Add tracepoint example 2016-07-12 16:23:35 -03:00
scripts perf/core improvements and fixes: 2016-08-04 11:02:38 +02:00
tests perf test vmlinux: Tolerate symbol aliases 2016-09-01 12:42:23 -03:00
trace perf trace beauty seccomp: Remove seccomp.h include 2016-07-12 15:20:38 -03:00
ui perf hists browser: Remove superfluous null check on map 2016-08-23 15:37:33 -03:00
util perf uprobe: Skip prologue if program compiled without optimization 2016-09-01 12:42:25 -03:00
.gitignore perf tools: Add arch/*/include/generated/ to .gitignore 2016-05-30 12:41:46 -03:00
Build perf tools: Set and pass DOCDIR to builtin-report.c 2016-01-12 12:42:07 -03:00
builtin-annotate.c perf annotate: Initialize the priv are in symbol__new() 2016-08-30 10:56:34 -03:00
builtin-bench.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-buildid-cache.c tools: Introduce str_error_r() 2016-07-12 15:19:47 -03:00
builtin-buildid-list.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-config.c perf config: Reimplement show_config() using config_set__for_each 2016-06-23 17:23:00 -03:00
builtin-data.c perf data ctf: Add '--all' option for 'perf data convert' 2016-06-28 10:54:57 -03:00
builtin-diff.c perf hists: Add support for header span 2016-08-23 15:37:33 -03:00
builtin-evlist.c perf evlist: Rename for_each() macros to for_each_entry() 2016-06-23 11:26:15 -03:00
builtin-help.c tools: Introduce str_error_r() 2016-07-12 15:19:47 -03:00
builtin-inject.c perf evlist: Rename for_each() macros to for_each_entry() 2016-06-23 11:26:15 -03:00
builtin-kmem.c mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations 2016-07-28 16:07:41 -07:00
builtin-kvm.c perf kvm: Use NSEC_PER_USEC 2016-08-23 15:37:33 -03:00
builtin-list.c perf list: Show SDT and pre-cached events 2016-07-13 23:09:07 -03:00
builtin-lock.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-mem.c perf tools mem: Fix -t store option for record command 2016-08-12 14:39:48 -03:00
builtin-probe.c perf probe: Ignore vmlinux Build-id when offline vmlinux given 2016-09-01 12:42:22 -03:00
builtin-record.c perf record: Fix spelling mistake "Finshed" -> "Finished" 2016-08-23 17:06:40 -03:00
builtin-report.c perf annotate: Initialize the priv are in symbol__new() 2016-08-30 10:56:34 -03:00
builtin-sched.c perf sched: Use linux/time64.h 2016-08-23 15:37:33 -03:00
builtin-script.c tools: Introduce tools/include/linux/time64.h for *SEC_PER_*SEC macros 2016-08-23 15:37:33 -03:00
builtin-stat.c perf stat: Use *SEC_PER_*SEC macros 2016-08-23 15:37:33 -03:00
builtin-timechart.c perf timechart: Use NSEC_PER_U?SEC 2016-08-23 15:37:33 -03:00
builtin-top.c perf symbols: Rename ->ignore to ->idle 2016-08-30 11:15:59 -03:00
builtin-trace.c tools: Introduce tools/include/linux/time64.h for *SEC_PER_*SEC macros 2016-08-23 15:37:33 -03:00
builtin-version.c perf tools: Move cmd_version() to builtin-version.c 2015-12-09 13:42:03 -03:00
builtin.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
command-list.txt perf tools: Do not show trace command if it's not compiled in 2016-01-08 12:46:17 -03:00
CREDITS
design.txt perf tools: Update some code references in design.txt 2014-03-18 18:17:06 -03:00
Makefile perf build tests: Do parallell builds with 'build-test' 2016-02-04 15:57:00 -03:00
Makefile.config perf tools: Move config/Makefile into Makefile.config 2016-08-02 16:33:28 -03:00
Makefile.perf perf tools: Skip running the feature tests for 'make install-doc' 2016-08-23 15:37:33 -03:00
MANIFEST tools: Copy coresight-pmu.h header file needed by perf tools 2016-08-23 15:37:33 -03:00
perf-archive.sh
perf-completion.sh perf tools: Avoid confusion with preloaded bash function for perf bash completion 2015-03-19 13:53:27 -03:00
perf-read-vdso.c perf tools: Build programs to copy 32-bit compatibility 2014-10-29 10:32:48 -02:00
perf-sys.h perf tools: Add missing linux/compiler.h include to perf-sys.h 2016-07-18 17:40:49 -03:00
perf-with-kcore.sh perf tools: Fix perf-with-kcore handling of arguments containing spaces 2015-08-06 16:48:27 -03:00
perf.c perf tools: Just pr_debug() about not being able to read cacheline_size 2016-07-15 10:08:29 -03:00
perf.h tools: Introduce tools/include/linux/time64.h for *SEC_PER_*SEC macros 2016-08-23 15:37:33 -03:00