linux/tools
Yunlong Song cb06ac256a perf sched replay: Alloc the memory of pid_to_task dynamically to adapt to the unexpected change of pid_max
struct task_desc *pid_to_task[MAX_PID] is currently a fixed-size array
whose size is set at compile time, which causes two problems:

Problem 1: If pid_max, the maximum number of pids in the system, is much
smaller than MAX_PID (1024*1000), the array wastes stack memory. This
can happen when the number of CPU cores is much smaller than 1000.

Problem 2: If pid_max is raised from its default to a value larger than
MAX_PID, perf sched replay fails an assertion as soon as it sees a pid
that is not below MAX_PID. pid_max can be set as high as pid_max_max
(see pidmap_init in kernel/pid.c), which equals PID_MAX_LIMIT. On
x86_64, PID_MAX_LIMIT is 4*1024*1024 (defined in
include/linux/threads.h). That is much larger than MAX_PID, and a
pid_to_task array of that size would take 32768 Kbytes
(4*1024*1024*8/1024), far more than the default 8192 Kbytes stack size
of the calling process.
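
The size arithmetic above can be checked with a small sketch. The helper
name pid_table_kbytes and the macro PID_MAX_LIMIT_X86_64 are made up for
this illustration and are not taken from perf; the 8-byte slot size
assumes 64-bit pointers, as the calculation in the text does.

```c
#include <assert.h>
#include <stddef.h>

/* MAX_PID as quoted in the commit message. */
#define MAX_PID (1024 * 1000)
/* PID_MAX_LIMIT on x86_64 as quoted above (illustrative macro name). */
#define PID_MAX_LIMIT_X86_64 (4 * 1024 * 1024)

/*
 * Hypothetical helper: size in Kbytes of a table holding one slot of
 * slot_size bytes per pid.
 */
static size_t pid_table_kbytes(size_t nr_pids, size_t slot_size)
{
	assert(slot_size > 0);
	return nr_pids * slot_size / 1024;
}
```

With 8-byte slots, pid_table_kbytes(PID_MAX_LIMIT_X86_64, 8) gives 32768
and pid_table_kbytes(MAX_PID, 8) gives 8000, so the fixed-size table is
already close to the 8192 Kbytes default stack size.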

To fix both problems, allocate pid_to_task dynamically with calloc.
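
A minimal sketch of the idea, not the code from this patch: read the
current pid_max at run time and size the table from it. The helpers
read_int_file and alloc_pid_table are hypothetical names introduced for
this example, and struct task_desc is left opaque because only pointers
to it are stored.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct task_desc;	/* opaque here; the table only stores pointers */

/*
 * Hypothetical helper: read one integer (such as pid_max) from a file.
 * Returns 0 on success and -1 if the file cannot be opened or parsed.
 */
static int read_int_file(const char *path, int *value)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (!fp)
		return -1;
	ok = (fscanf(fp, "%d", value) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: allocate a zeroed pid -> task table with one
 * slot per possible pid. Returns NULL if the allocation fails.
 */
static struct task_desc **alloc_pid_table(int pid_max)
{
	assert(pid_max > 0);
	return calloc((size_t)pid_max, sizeof(struct task_desc *));
}
```

A caller would pass "/proc/sys/kernel/pid_max" to read_int_file, size
the table with the value it returns, and free the table on exit. Because
the table lives on the heap, its size is no longer bounded by the stack
limit of the calling process.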

Example:

Test environment: x86_64 with 160 cores

 $ cat /proc/sys/kernel/pid_max
 163840
 $ echo 1025000 > /proc/sys/kernel/pid_max
 $ cat /proc/sys/kernel/pid_max
 1025000

Run some applications until the pid of some process is greater than
the value of MAX_PID (1024*1000).

Before this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55480 nsecs
 the run test took 1000008 nsecs
 the sleep test took 1063151 nsecs
 perf: builtin-sched.c:330: register_pid: Assertion `!(pid >= 1024000)' failed.
 Aborted

After this patch:

 $ perf sched replay
 run measurement overhead: 221 nsecs
 sleep measurement overhead: 55435 nsecs
 the run test took 1000004 nsecs
 the sleep test took 1059312 nsecs
 nr_run_events:        10
 nr_sleep_events:      1562
 nr_wakeup_events:     5
 task      0 (                  :1:         1), nr_events: 1
 task      1 (                  :2:         2), nr_events: 1
 task      2 (                  :3:         3), nr_events: 1
 task      3 (                  :5:         5), nr_events: 1
 ...

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427809596-29559-4-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-04-08 09:07:22 -03:00
build tools build: Add feature check for lzma library 2015-03-21 14:53:39 -03:00
cgroup
firewire
hv Tools: hv: do not add redundant '/' in hv_start_fcopy() 2015-01-25 09:17:57 -08:00
include tools: Remove bitops/hweight usage of bits in tools/perf 2015-01-16 17:49:29 -03:00
lguest tools/lguest: don't use legacy definitions for net device in example launcher. 2015-02-13 17:15:55 +10:30
lib tools lib traceevent: Honor operator priority 2015-04-08 09:07:09 -03:00
net
nfsd
perf perf sched replay: Alloc the memory of pid_to_task dynamically to adapt to the unexpected change of pid_max 2015-04-08 09:07:22 -03:00
power Revert "cpupower Makefile change to help run the tool without 'make install'" 2015-03-11 21:56:49 +01:00
scripts
testing selftests: Fix build failures when invoked from kselftest target 2015-03-19 09:54:55 -06:00
thermal/tmon tools/thermal: tmon: silence 'set but not used' warnings 2015-02-28 13:52:48 +08:00
time
usb tools: ffs-aio-example: use endpoint addresses from descriptors 2015-01-15 09:41:49 -06:00
virtio tools/virtio: add virtio 1.0 in vringh_test 2014-12-15 23:49:22 +02:00
vm mm:add KPF_ZERO_PAGE flag for /proc/kpageflags 2015-02-11 17:06:00 -08:00
Makefile