linux/kernel
Srinivasa D S ef53d9c5e4 kprobes: improve kretprobe scalability with hashed locking
Currently list of kretprobe instances are stored in kretprobe object (as
used_instances,free_instances) and in kretprobe hash table.  We have one
global kretprobe lock to serialise the access to these lists.  This causes
only one kretprobe handler to execute at a time.  Hence affects system
performance, particularly on SMP systems and when return probe is set on
lot of functions (like on all systemcalls).

Solution proposed here gives fine-grain locks that performs better on SMP
system compared to present kretprobe implementation.

Solution:

 1) Instead of having one global lock to protect kretprobe instances
    present in kretprobe object and kretprobe hash table.  We will have
    two locks, one lock for protecting kretprobe hash table and another
    lock for kretporbe object.

 2) We hold lock present in kretprobe object while we modify kretprobe
    instance in kretprobe object and we hold per-hash-list lock while
    modifying kretprobe instances present in that hash list.  To prevent
    deadlock, we never grab a per-hash-list lock while holding a kretprobe
    lock.

 3) We can remove used_instances from struct kretprobe, as we can
    track used instances of kretprobe instances using kretprobe hash
    table.

Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system
with return probes set on all systemcalls looks like this.

cacheline              non-cacheline             Un-patched kernel
aligned patch 	       aligned patch
===============================================================================
real    9m46.784s       9m54.412s                  10m2.450s
user    40m5.715s       40m7.142s                  40m4.273s
sys     2m57.754s       2m58.583s                  3m17.430s
===========================================================

Time duration for kernel compilation ("make -j 8) on the same system, when
kernel is not probed.
=========================
real    9m26.389s
user    40m8.775s
sys     2m7.283s
=========================

Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
..
irq kernel/irq/manage.c: replace a printk + WARN_ON() to a WARN() 2008-07-25 10:53:29 -07:00
power pm: fix try_to_freeze_tasks()'s use of do_div() 2008-07-24 10:47:24 -07:00
time Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-24 12:55:01 -07:00
trace cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr 2008-07-18 22:02:57 +02:00
.gitignore
acct.c
audit_tree.c [PATCH] list_for_each_rcu must die: audit 2008-05-17 03:30:23 -04:00
audit.c [PATCH] remove useless argument type in audit_filter_user() 2008-06-24 23:36:35 -04:00
audit.h
auditfilter.c [PATCH] remove useless argument type in audit_filter_user() 2008-06-24 23:36:35 -04:00
auditsc.c x86_64 syscall audit fast-path 2008-07-23 17:47:32 -07:00
backtracetest.c backtrace: replace timer with tasklet + completions 2008-06-27 18:09:16 +02:00
bounds.c Add kbuild.h that contains common definitions for kbuild users 2008-04-29 08:06:29 -07:00
capability.c security: filesystem capabilities refactor kernel code 2008-07-24 10:47:22 -07:00
cgroup_debug.c
cgroup.c cgroups: remove node_ prefix_from ns subsystem 2008-05-24 09:56:14 -07:00
compat.c ntp: support for TAI 2008-05-01 08:03:59 -07:00
configs.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
cpu.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
cpuset.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
delayacct.c
dma.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
exec_domain.c remove CONFIG_KMOD from core kernel code 2008-07-22 19:24:31 +10:00
exit.c fix dangling zombie when new parent ignores children 2008-07-16 18:02:34 -07:00
extable.c
fork.c clean up duplicated alloc/free_thread_info 2008-07-25 10:53:28 -07:00
futex_compat.c
futex.c futexes: fix fault handling in futex_lock_pi 2008-06-23 13:31:15 +02:00
hrtimer.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
itimer.c
kallsyms.c kallsyms: fix potential overflow in binary search 2008-07-25 10:53:27 -07:00
Kconfig.hz sched: fix hrtick & generic-ipi dependency 2008-07-23 11:18:28 +02:00
Kconfig.preempt
kexec.c kexec: make extended crashkernel= syntax less confusing 2008-05-01 08:04:00 -07:00
kfifo.c
kgdb.c kgdb: sparse fix 2008-06-24 10:52:55 -05:00
kmod.c call_usermodehelper(): increase reliability 2008-07-25 10:53:28 -07:00
kprobes.c kprobes: improve kretprobe scalability with hashed locking 2008-07-25 10:53:30 -07:00
ksysfs.c
kthread.c kthread: reduce stack pressure in create_kthread and kthreadd 2008-07-18 18:46:58 +02:00
latencytop.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
lockdep_internals.h lockdep: add lock_class information to lock_chain and output it 2008-06-24 01:28:20 +02:00
lockdep_proc.c lockdep: add lock_class information to lock_chain and output it 2008-06-24 01:28:20 +02:00
lockdep.c Merge branch 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-14 14:55:13 -07:00
Makefile build kernel/profile.o only when requested 2008-07-25 10:53:27 -07:00
marker.c Markers - remove extra format argument 2008-05-23 22:25:27 +02:00
module.c modules: Take a shortcut for checking if an address is in a module 2008-07-22 19:24:28 +10:00
mutex-debug.c mutex-debug: check mutex magic before owner 2008-05-16 16:53:35 +02:00
mutex-debug.h
mutex.c __mutex_lock_common: use signal_pending_state() 2008-06-10 11:45:09 +02:00
mutex.h
notifier.c ipc: re-enable msgmni automatic recomputing msgmni if set to negative 2008-04-29 08:06:13 -07:00
ns_cgroup.c
nsproxy.c ipc: sysvsem: refuse clone(CLONE_SYSVSEM|CLONE_NEWIPC) 2008-04-29 08:06:14 -07:00
panic.c Add a WARN() macro; this is WARN_ON() + printk arguments 2008-07-25 10:53:29 -07:00
params.c
pid_namespace.c pidns: make pid->level and pid_ns->level unsigned 2008-04-30 08:29:49 -07:00
pid.c rcu: split list.h and move rcu-protected lists into rculist.h 2008-05-19 10:01:37 +02:00
pm_qos_params.c pm_qos_params: BKL pushdown 2008-07-02 15:06:24 -06:00
posix-cpu-timers.c posix-timers: print RT watchdog message 2008-05-24 18:49:22 +02:00
posix-timers.c signals: join send_sigqueue() with send_group_sigqueue() 2008-04-30 08:29:36 -07:00
printk.c printk ratelimiting rewrite 2008-07-25 10:53:29 -07:00
profile.c build kernel/profile.o only when requested 2008-07-25 10:53:27 -07:00
ptrace.c ptrace children revamp 2008-07-16 18:02:33 -07:00
rcuclassic.c Merge branch 'linus' into cpus4096 2008-07-16 00:29:07 +02:00
rcupdate.c Merge branch 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-15 14:12:03 -07:00
rcupreempt_trace.c rcu: remove duplicated include in kernel/rcupreempt_trace.c 2008-05-19 10:03:39 +02:00
rcupreempt.c Merge branch 'linus' into cpus4096 2008-07-16 00:29:07 +02:00
rcutorture.c rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers) 2008-06-26 09:24:33 +02:00
relay.c splice: fix sendfile() issue with relay 2008-05-28 14:49:27 +02:00
res_counter.c
resource.c kernel: use non-racy method for proc entries creation 2008-04-29 08:06:22 -07:00
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c sysdev: Pass the attribute to the low level sysdev show/store function 2008-07-21 21:55:02 -07:00
rtmutex.c
rtmutex.h
rwsem.c
sched_clock.c Merge branch 'sched/clock' into sched/devel 2008-07-14 12:19:13 +02:00
sched_cpupri.c sched: use a 2-d bitmap for searching lowest-pri CPU 2008-06-06 15:19:28 +02:00
sched_cpupri.h sched: fix the cpuprio count really 2008-06-06 15:19:44 +02:00
sched_debug.c sched: add full schedstats to /proc/sched_debug 2008-06-27 14:31:31 +02:00
sched_fair.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
sched_features.h sched: bias effective_load() error towards failing wake_affine(). 2008-06-27 14:31:47 +02:00
sched_idletask.c sched: make rt_sched_class, idle_sched_class static 2008-05-05 23:56:17 +02:00
sched_rt.c Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-24 12:53:51 -07:00
sched_stats.h sched: fix accounting in task delay accounting & migration 2008-07-04 12:50:23 +02:00
sched.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
seccomp.c
semaphore.c mmiotrace broken in linux-next (8-bit writes only) 2008-07-01 10:14:06 +02:00
signal.c posix timers: discard SI_TIMER signals on exec 2008-05-26 10:37:07 -07:00
smp.c generic ipi function calls: wait on alloc failure fallback 2008-07-15 14:12:20 -07:00
softirq.c Merge branch 'linus' into timers/nohz 2008-07-18 19:53:16 +02:00
softlockup.c softlockup: print a module list on being stuck 2008-07-05 08:51:24 +02:00
spinlock.c ftrace: lockdep notrace annotations 2008-05-23 20:39:40 +02:00
srcu.c
stacktrace.c stacktrace: fix modular build, export print_stack_trace and save_stack_trace 2008-06-30 09:20:55 +02:00
stop_machine.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr 2008-07-18 22:02:57 +02:00
sys_ni.c flag parameters: fix compile error of sys_epoll_create1 2008-07-25 10:53:26 -07:00
sys.c call_usermodehelper(): increase reliability 2008-07-25 10:53:28 -07:00
sysctl_check.c
sysctl.c printk ratelimiting rewrite 2008-07-25 10:53:29 -07:00
taskstats.c core: use performance variant for_each_cpu_mask_nr 2008-05-23 18:35:12 +02:00
test_kprobes.c
time.c Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm 2008-07-14 16:06:58 -07:00
tsacct.c
uid16.c
user_namespace.c
user.c alloc_uid: cleanup 2008-04-30 08:29:53 -07:00
utsname_sysctl.c
utsname.c
wait.c
workqueue.c pm: introduce new interfaces schedule_work_on() and queue_work_on() 2008-07-24 10:47:23 -07:00