linux/kernel
Christian Borntraeger 3401a61e16 stop_machine: make stop_machine_run more virtualization friendly
On kvm I have seen some rare hangs in stop_machine when I used more guest
cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the
hang quite often. I could also reproduce the problem on a 4 way z/VM host with
a 64 way guest.

It turned out that the guest was consuming all available cpus mostly for
spinning on scheduler locks like rq->lock. This is expected as the threads are
calling yield all the time.
The problem is now, that the host scheduling decisings together with the guest
scheduling decisions and spinlocks not being fair managed to create an
interesting scenario similar to a live lock. (Sometimes the hang resolved
itself after some minutes)

Changing stop_machine to yield the cpu to the hypervisor when yielding inside
the guest fixed the problem for me. While I am not completely happy with this
patch, I think it causes no harm and it really improves the situation for me.

I used cpu_relax for yielding to the hypervisor, does that work on all
architectures?

p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use
stop_machine_run and both triggered the problem after some retries.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-23 13:09:34 +10:00
..
irq genirq: reenable a nobody cared disabled irq when a new driver arrives 2008-05-02 13:40:34 +02:00
power Merge branches 'release', 'acpica', 'bugzilla-10224', 'bugzilla-9772', 'bugzilla-9916', 'ec', 'eeepc', 'idle', 'misc', 'pm-legacy', 'sysfs-links-2.6.26', 'thermal', 'thinkpad' and 'video' into release 2008-04-30 13:58:00 -04:00
time clocksource: allow read access to available/current_clocksource 2008-05-03 18:11:48 +02:00
.gitignore
acct.c
audit_tree.c [PATCH] list_for_each_rcu must die: audit 2008-05-17 03:30:23 -04:00
audit.c [patch 1/1] audit_send_reply(): fix error-path memory leak 2008-05-17 03:30:22 -04:00
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup_debug.c
cgroup.c mm: bdi: add separate writeback accounting capability 2008-04-30 08:29:50 -07:00
compat.c ntp: support for TAI 2008-05-01 08:03:59 -07:00
configs.c
cpu.c kernel: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cpuset.c Fix cpuset sched_relax_domain_level control file 2008-05-08 10:46:56 -07:00
delayacct.c
dma.c
exec_domain.c
exit.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
extable.c
fork.c [PATCH] dup_fd() fixes, part 1 2008-05-16 17:22:26 -04:00
futex_compat.c
futex.c Removal of FUTEX_FD 2008-05-05 08:18:45 -07:00
hrtimer.c hrtimer: remove duplicate helper function 2008-05-03 18:11:48 +02:00
itimer.c
kallsyms.c
Kconfig.hz
Kconfig.preempt
kexec.c kexec: make extended crashkernel= syntax less confusing 2008-05-01 08:04:00 -07:00
kfifo.c
kgdb.c lib: create common ascii hex array 2008-05-14 19:11:14 -07:00
kmod.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
kprobes.c
ksysfs.c
kthread.c Deprecate find_task_by_pid() 2008-04-30 08:29:48 -07:00
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep.c
Makefile sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
marker.c make marker_debug static 2008-04-30 08:29:49 -07:00
module.c modules: proper cleanup of kobject without CONFIG_SYSFS 2008-05-23 13:09:33 +10:00
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
panic.c
params.c
pid_namespace.c pidns: make pid->level and pid_ns->level unsigned 2008-04-30 08:29:49 -07:00
pid.c pids: introduce change_pid() helper 2008-04-30 08:29:48 -07:00
pm_qos_params.c
posix-cpu-timers.c remove div_long_long_rem 2008-05-01 08:03:58 -07:00
posix-timers.c signals: join send_sigqueue() with send_group_sigqueue() 2008-04-30 08:29:36 -07:00
printk.c printk: don't read beyond string arguments' terminating zero 2008-04-30 08:29:52 -07:00
profile.c
ptrace.c make generic sys_ptrace unconditional 2008-05-01 10:21:54 -07:00
rcuclassic.c
rcupdate.c
rcupreempt_trace.c
rcupreempt.c
rcutorture.c
relay.c Revert "relay: fix splice problem" 2008-05-08 14:06:19 +02:00
res_counter.c
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_clock.c sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
sched_debug.c sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK 2008-05-05 23:56:18 +02:00
sched_fair.c sched: fix weight calculations 2008-05-08 17:00:42 +02:00
sched_features.h
sched_idletask.c sched: make rt_sched_class, idle_sched_class static 2008-05-05 23:56:17 +02:00
sched_rt.c sched: fix RT task-wakeup logic 2008-05-05 23:56:18 +02:00
sched_stats.h
sched.c cgroups: fix compile warning 2008-05-14 19:11:14 -07:00
seccomp.c
semaphore.c Revert "semaphore: fix" 2008-05-10 20:43:22 -07:00
signal.c signals: add set_restore_sigmask 2008-04-30 08:29:37 -07:00
softirq.c Fix cpu hotplug problem in softirq code 2008-05-01 08:03:58 -07:00
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c stop_machine: make stop_machine_run more virtualization friendly 2008-05-23 13:09:34 +10:00
sys_ni.c
sys.c pids: sys_getpgid: fix unsafe *pid usage, s/tasklist/rcu/ 2008-04-30 08:29:49 -07:00
sysctl_check.c
sysctl.c [PATCH] avoid multiplication overflows and signedness issues for max_fds 2008-05-16 17:22:52 -04:00
taskstats.c Use find_task_by_vpid in taskstats 2008-04-30 08:29:48 -07:00
test_kprobes.c
time.c Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c debugobjects: add timer specific object debugging code 2008-04-30 08:29:53 -07:00
tsacct.c
uid16.c
user_namespace.c
user.c alloc_uid: cleanup 2008-04-30 08:29:53 -07:00
utsname_sysctl.c
utsname.c
wait.c
workqueue.c workqueue: remove redundant function invocation 2008-05-01 08:04:02 -07:00