linux/kernel
David S. Miller 0aec63e67c [PATCH] Fix posix-cpu-timers sched_time accumulation
I've spent the past 3 days digging into a glibc testsuite failure in
current CVS, specifically libc/rt/tst-cputimer1.c The thr1 and thr2
timers fire too early in the second pass of this test.  The second
pass is noteworthy because it makes use of intervals, whereas the
first pass does not.

All throughout the posix-cpu-timers.c code, the calculation of the
process sched_time sum is implemented roughly as:

	unsigned long long sum;

	sum = tsk->signal->sched_time;
	t = tsk;
	do {
		sum += t->sched_time;
		t = next_thread(t);
	} while (t != tsk);

In fact this is the exact scheme used by check_process_timers().

In the case of check_process_timers(), current->sched_time has just
been updated (via scheduler_tick(), which is invoked by
update_process_times(), which subsequently invokes
run_posix_cpu_timers()) So there is no special processing necessary
wrt. that.

In other contexts, we have to allot for the fact that tsk->sched_time
might be a bit out of date if we are current.  And the
posix-cpu-timers.c code uses current_sched_time() to deal with that.

Unfortunately it does so in an erroneous and inconsistent manner in
one spot which is what results in the early timer firing.

In cpu_clock_sample_group_locked(), it does this:

		cpu->sched = p->signal->sched_time;
		/* Add in each other live thread.  */
		while ((t = next_thread(t)) != p) {
			cpu->sched += t->sched_time;
		}
		if (p->tgid == current->tgid) {
			/*
			 * We're sampling ourselves, so include the
			 * cycles not yet banked.  We still omit
			 * other threads running on other CPUs,
			 * so the total can always be behind as
			 * much as max(nthreads-1,ncpus) * (NSEC_PER_SEC/HZ).
			 */
			cpu->sched += current_sched_time(current);
		} else {
			cpu->sched += p->sched_time;
		}

The problem is the "p->tgid == current->tgid" test.  If "p" is
not current, and the tgids are the same, we will add the process
t->sched_time twice into cpu->sched and omit "p"'s sched_time
which is very very very wrong.

posix-cpu-timers.c has a helper function, sched_ns(p) which takes care
of this, so my fix is to use that here instead of this special tgid
test.

The fact that current can be one of the sub-threads of "p" points out
that we could make things a little bit more accurate, perhaps by using
sched_ns() on every thread we process in these loops.  It also points
out that we don't use the most accurate value for threads in the group
actively running other cpus (and this is mentioned in the comment).

But that is a future enhancement, and this fix here definitely makes
sense.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 20:23:04 -08:00
..
irq [PATCH] Alpha: convert to generic irq framework (generic part) 2006-01-06 08:33:40 -08:00
power [PATCH] swsusp: save image header first 2006-01-06 08:33:43 -08:00
.gitignore gitignore: ignore more generated files 2006-01-03 11:35:26 +01:00
acct.c [PATCH] s390: cputime_t fixes 2006-01-06 08:33:49 -08:00
audit.c [PATCH] Add try_to_freeze to kauditd 2005-12-12 08:57:43 -08:00
auditsc.c
capability.c
compat.c
configs.c update the email address of Randy Dunlap 2006-01-03 13:37:51 +01:00
cpu.c [PATCH] clean up lock_cpu_hotplug() in cpufreq 2005-11-28 14:42:23 -08:00
cpuset.c [PATCH] cpuset: fix return without releasing semaphore 2005-11-13 18:14:11 -08:00
crash_dump.c
dma.c
exec_domain.c
exit.c [PATCH] m68k: introduce task_thread_info 2005-11-13 18:14:13 -08:00
extable.c
fork.c [PATCH] cpuset fork locking fix 2005-11-28 14:42:24 -08:00
futex.c [PATCH] FRV: Make futex code compilable on nommu [try #2] 2006-01-06 08:33:33 -08:00
intermodule.c
itimer.c
kallsyms.c [PATCH] fix missing includes 2005-10-30 17:37:32 -08:00
Kconfig.hz
Kconfig.preempt
kexec.c
kfifo.c
kmod.c
kprobes.c [PATCH] kprobes: increment kprobe missed count for multiprobes 2005-12-12 08:57:45 -08:00
ksysfs.c [PATCH] kobject_uevent CONFIG_NET=n fix 2006-01-04 16:18:08 -08:00
kthread.c
Makefile
module.c [PATCH] kernel/module.c: removed dead code 2006-01-06 08:33:59 -08:00
panic.c [PATCH] s390: cleanup Kconfig 2006-01-06 08:33:53 -08:00
params.c [PATCH] kernel/params.c: fix sysfs access with CONFIG_MODULES=n 2005-12-20 10:31:33 -08:00
pid.c
posix-cpu-timers.c [PATCH] Fix posix-cpu-timers sched_time accumulation 2006-01-06 20:23:04 -08:00
posix-timers.c [PATCH] timespec: normalize off by one errors 2005-11-13 18:14:17 -08:00
printk.c [PATCH] Fix crash in unregister_console() 2005-11-23 16:08:39 -08:00
profile.c
ptrace.c [PATCH] Fix crash when ptrace poking hugepage areas 2005-11-29 19:47:03 -08:00
rcupdate.c [PATCH] Fix RCU race in access of nohz_cpu_mask 2005-12-12 08:57:42 -08:00
rcutorture.c [PATCH] Fix bug in RCU torture test 2005-12-12 08:57:42 -08:00
resource.c
sched.c [PATCH] m68k: introduce setup_thread_stack() and end_of_stack() 2005-11-13 18:14:13 -08:00
seccomp.c
signal.c [PATCH] signal handling: revert sigkill priority fix 2005-11-13 18:14:15 -08:00
softirq.c [PATCH] cpu hoptlug: avoid usage of smp_processor_id() in preemptible code 2005-11-07 07:53:29 -08:00
softlockup.c [PATCH] quieten softlockup at boot 2005-11-09 07:55:50 -08:00
spinlock.c
stop_machine.c [PATCH] stop_machine() vs. synchronous IPI send deadlock 2005-11-13 18:14:16 -08:00
sys_ni.c
sys.c [PATCH] kprobes: no probes on critical path 2005-12-12 08:57:45 -08:00
sysctl.c [PATCH] s390: cleanup Kconfig 2006-01-06 08:33:53 -08:00
time.c [PATCH] Add getnstimestamp function 2005-12-12 08:57:42 -08:00
timer.c
uid16.c
user.c
wait.c
workqueue.c [PATCH] Fix hardcoded cpu=0 in workqueue for per_cpu_ptr() calls 2005-11-28 14:42:23 -08:00