linux/kernel
Keika Kobayashi 873b477177 per-task-delay-accounting: add memory reclaim delay
Application response times sometimes degrade under heavy memory load,
because tasks take some time to reclaim memory.  Statistics on how long
memory reclaim takes are useful for evaluating memory usage.

This patch adds memory reclaim accounting to per-task-delay-accounting.
It accounts the time spent in do_try_to_free_pages().
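
The accounting idea can be sketched in userspace C as follows. This is only
an illustration of the start/end accumulation pattern, not the kernel code:
the struct, the function names and the use of clock_gettime() are
assumptions made for this example.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Hypothetical userspace stand-in for the per-task reclaim counters. */
struct reclaim_delay {
	uint64_t start_ns;	/* timestamp taken when reclaim begins */
	uint64_t total_ns;	/* cumulative time spent in reclaim */
	uint32_t count;		/* number of completed reclaim intervals */
};

/* Monotonic clock in nanoseconds (assumed time source for the sketch). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Call just before the reclaim work starts. */
static void reclaim_start(struct reclaim_delay *d)
{
	d->start_ns = now_ns();
}

/* Call right after the reclaim work ends: add the interval, bump the count. */
static void reclaim_end(struct reclaim_delay *d)
{
	d->total_ns += now_ns() - d->start_ns;
	d->count++;
}
```

Wrapping a call with reclaim_start() and reclaim_end() yields a
"count" / "delay total" pair like the RECLAIM row in the output below.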

Example:

- When the system is under low memory load,
  memory reclaim may not occur.

$ free
             total       used       free     shared    buffers     cached
Mem:       8197800    1577300    6620500          0       4808    1516724
-/+ buffers/cache:      55768    8142032
Swap:     16386292          0   16386292

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 5069748  10612 3014060    0    0     0     0    3   26  0  0 100  0
 0  0      0 5069748  10612 3014060    0    0     0     0    4   22  0  0 100  0
 0  0      0 5069748  10612 3014060    0    0     0     0    3   18  0  0 100  0

Measure the time taken by the tar command.

$ ls -s test.dat
1501472 test.dat

$ time tar cvf test.tar test.dat
real    0m13.388s
user    0m0.116s
sys     0m5.304s

$ ./delayget -d -p <pid>
CPU             count     real total  virtual total    delay total
                  428     5528345500     5477116080       62749891
IO              count    delay total
                  338     8078977189
SWAP            count    delay total
                    0              0
RECLAIM         count    delay total
                    0              0

- When the system is under heavy memory load,
  memory reclaim may occur.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 7159032  49724   1812   3012    0    0     0     0    3   24  0  0 100  0
 0  0 7159032  49724   1812   3012    0    0     0     0    4   24  0  0 100  0
 0  0 7159032  49848   1812   3012    0    0     0     0    3   22  0  0 100  0

In this case, one process uses more than 8G of memory
by calling malloc() and memset().

$ time tar cvf test.tar test.dat
real    1m38.563s        <-  increased by 85 sec
user    0m0.140s
sys     0m7.060s

$ ./delayget -d -p <pid>
CPU             count     real total  virtual total    delay total
                 9021     7140446250     7315277975      923201824
IO              count    delay total
                 8965    90466349669
SWAP            count    delay total
                    3       21036367
RECLAIM         count    delay total
                  740    61011951153

In the latter case, the value of RECLAIM has increased.
So taskstats can show how much memory reclaim influences turnaround time (TAT).
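
To put the RECLAIM row in perspective, a small helper can turn the raw
counters into a per-event average. This is a sketch that assumes the
"delay total" column is in nanoseconds, as taskstats delay totals are; the
function names are hypothetical.

```c
#include <stdint.h>

/* Mean delay per event in nanoseconds; returns 0 when nothing was counted. */
static uint64_t mean_delay_ns(uint64_t delay_total_ns, uint64_t count)
{
	return count ? delay_total_ns / count : 0;
}

/* Convert a nanosecond total to seconds for comparison with time(1). */
static double ns_to_sec(uint64_t ns)
{
	return (double)ns / 1e9;
}
```

For the heavy-load run above, mean_delay_ns(61011951153, 740) gives
82448582 ns, roughly 82 ms per reclaim, and ns_to_sec(61011951153) is
about 61 s, against an increase of about 85 s in elapsed time.  The delay
totals of different rows can overlap, so this is an indication rather than
an exact breakdown.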

Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujistu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:47 -07:00
irq kernel/irq/manage.c: replace a printk + WARN_ON() to a WARN() 2008-07-25 10:53:29 -07:00
power pm: fix try_to_freeze_tasks()'s use of do_div() 2008-07-24 10:47:24 -07:00
time Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-24 12:55:01 -07:00
trace markers: fix sparse integer as NULL pointer warning 2008-07-25 10:53:45 -07:00
.gitignore
acct.c bsdacct: fix and add comments around acct_process() 2008-07-25 10:53:47 -07:00
audit_tree.c
audit.c
audit.h
auditfilter.c
auditsc.c x86_64 syscall audit fast-path 2008-07-23 17:47:32 -07:00
backtracetest.c
bounds.c
capability.c security: filesystem capabilities refactor kernel code 2008-07-24 10:47:22 -07:00
cgroup_debug.c
cgroup.c cgroup_clone: use pid of newly created task for new cgroup 2008-07-25 10:53:37 -07:00
compat.c
configs.c
cpu.c workqueues: make get_online_cpus() useable for work->func() 2008-07-25 10:53:40 -07:00
cpuset.c cpuset: two minor code-cleanups 2008-07-25 10:53:38 -07:00
delayacct.c per-task-delay-accounting: add memory reclaim delay 2008-07-25 10:53:47 -07:00
dma.c
exec_domain.c
exit.c task IO accounting: provide distinct tgid/tid I/O statistics 2008-07-25 10:53:47 -07:00
extable.c
fork.c task IO accounting: provide distinct tgid/tid I/O statistics 2008-07-25 10:53:47 -07:00
futex_compat.c
futex.c
hrtimer.c
itimer.c
kallsyms.c kallsyms: fix potential overflow in binary search 2008-07-25 10:53:27 -07:00
Kconfig.hz sched: fix hrtick & generic-ipi dependency 2008-07-23 11:18:28 +02:00
Kconfig.preempt
kexec.c
kfifo.c
kgdb.c
kmod.c call_usermodehelper(): increase reliability 2008-07-25 10:53:28 -07:00
kprobes.c kprobes: remove redundant config check 2008-07-25 10:53:30 -07:00
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep.c
Makefile build kernel/profile.o only when requested 2008-07-25 10:53:27 -07:00
marker.c markers: use rcu_barrier_sched() and call_rcu_sched() 2008-07-25 10:53:45 -07:00
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c cgroup_clone: use pid of newly created task for new cgroup 2008-07-25 10:53:37 -07:00
nsproxy.c cgroup_clone: use pid of newly created task for new cgroup 2008-07-25 10:53:37 -07:00
panic.c Add a WARN() macro; this is WARN_ON() + printk arguments 2008-07-25 10:53:29 -07:00
params.c
pid_namespace.c bsdacct: switch from global bsd_acct_struct instance to per-pidns one 2008-07-25 10:53:47 -07:00
pid.c pidns: remove now unused find_pid function. 2008-07-25 10:53:45 -07:00
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c posix timers: release_posix_timer: kill the bogus put_task_struct(->it_process); 2008-07-25 10:53:38 -07:00
printk.c printk ratelimiting rewrite 2008-07-25 10:53:29 -07:00
profile.c build kernel/profile.o only when requested 2008-07-25 10:53:27 -07:00
ptrace.c
rcuclassic.c
rcupdate.c
rcupreempt_trace.c
rcupreempt.c
rcutorture.c
relay.c
res_counter.c cgroup files: convert res_counter_write() to be a cgroups write_string() handler 2008-07-25 10:53:36 -07:00
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c
sched_fair.c Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-23 19:36:53 -07:00
sched_features.h
sched_idletask.c
sched_rt.c Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-07-24 12:53:51 -07:00
sched_stats.h
sched.c accounting: account for user time when updating memory integrals 2008-07-25 10:53:46 -07:00
seccomp.c
semaphore.c
signal.c pidns: remove now unused kill_proc function 2008-07-25 10:53:45 -07:00
smp.c
softirq.c
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys_ni.c flag parameters: fix compile error of sys_epoll_create1 2008-07-25 10:53:26 -07:00
sys.c unexport uts_sem 2008-07-25 10:53:45 -07:00
sysctl_check.c sysctl: check for bogus modes 2008-07-25 10:53:45 -07:00
sysctl.c printk ratelimiting rewrite 2008-07-25 10:53:29 -07:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c
tsacct.c tsacct: fix bacct_add_tsk()'s use of do_div() 2008-07-25 10:53:47 -07:00
uid16.c
user_namespace.c
user.c
utsname_sysctl.c
utsname.c
wait.c
workqueue.c workqueues: do CPU_UP_CANCELED if CPU_UP_PREPARE fails 2008-07-25 10:53:41 -07:00