linux/kernel/trace
Steven Rostedt d1a291437f tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.c
The event structures used by the trace events are mostly persistent,
but they are also allocated by kmalloc, which is not the best at
allocating space for what is used. By converting these kmallocs
into kmem_cache_allocs, we can save over 50K of space that is
permanently allocated.

After boot we have:

 slab name          active allocated size
 ---------          ------ --------- ----
ftrace_event_file    979   1005     56   67    1
ftrace_event_field   2301   2310     48   77    1

The ftrace_event_file has at boot up 979 active objects out of
1005 allocated in the slabs. Each object is 56 bytes. In a normal
kmalloc, that would allocate 64 bytes for each object.

 1005 - 979  = 26 objects not used
 26 * 56 = 1456 bytes wasted

But if we used kmalloc:

 64 - 56 = 8 bytes unused per allocation
 8 * 979 = 7832 bytes wasted

 7832 - 1456 = 6376 bytes in savings

Doing the same for ftrace_event_field where there's 2301 objects
allocated in a slab that can hold 2310 with 48 bytes each we have:

 2310 - 2301 = 9 objects not used
 9 * 48 = 432 bytes wasted

A kmalloc would also use 64 bytes per object:

 64 - 48 = 16 bytes unused per allocation
 16 * 2301 = 36816 bytes wasted!

 36816 - 432 = 36384 bytes in savings

This change gives us a total of 42760 bytes in savings. At least
on my machine, but as there's a lot of these persistent objects
for all configurations that use trace points, this is a net win.

Thanks to Ezequiel Garcia for his trace_analyze presentation which
pointed out the wasted space in my code.

Cc: Ezequiel Garcia <elezegarcia@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15 00:34:46 -04:00
..
blktrace.c tracing: Use this_cpu_ptr per-cpu helper 2013-01-21 13:22:30 -05:00
ftrace.c tracing: Fix free of probe entry by calling call_rcu_sched() 2013-03-13 17:57:44 -04:00
Kconfig ftrace: Update the kconfig for DYNAMIC_FTRACE 2013-02-27 21:58:01 -05:00
Makefile trace: Stop compiling in trace_clock unconditionally 2012-09-13 22:52:08 -04:00
power-traces.c perf: Clean up power events by introducing new, more generic ones 2011-01-04 08:16:54 +01:00
ring_buffer_benchmark.c tracing: Use NUMA allocation for per-cpu ring buffer pages 2011-06-14 22:04:39 -04:00
ring_buffer.c ring-buffer: Add stats field for amount read from trace ring buffer 2013-01-30 11:01:53 -05:00
rpm-traces.c PM / Runtime: Introduce trace points for tracing rpm_* functions 2011-09-27 22:53:27 +02:00
trace_branch.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_clock.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 17:49:41 -08:00
trace_entries.h Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent 2012-03-24 08:19:09 +01:00
trace_event_perf.c perf/core improvements and fixes: 2012-08-21 11:27:00 +02:00
trace_events_filter_test.h tracing/filter: Add startup tests for events filter 2011-08-19 14:35:59 -04:00
trace_events_filter.c tracing: Separate out trace events from global variables 2013-03-15 00:34:40 -04:00
trace_events.c tracing: Use kmem_cache_alloc instead of kmalloc in trace_events.c 2013-03-15 00:34:46 -04:00
trace_export.c tracing: Do not enable function event with enable 2012-05-10 15:55:43 -04:00
trace_functions_graph.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_functions.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_irqsoff.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_kdb.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_kprobe.c tracing: Use irq_work for wake ups and remove *_nowake_*() functions 2012-11-02 10:21:52 -04:00
trace_mmiotrace.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_nop.c
trace_output.c tracing: Format non-nanosec times from tsc clock without a decimal point. 2012-11-13 15:48:40 -05:00
trace_output.h
trace_printk.c tracing: Add percpu buffers for trace_printk() 2012-04-23 21:15:55 -04:00
trace_probe.c tracing: Replace strict_strto* with kstrto* 2012-10-31 16:45:23 -04:00
trace_probe.h uprobes/tracing: Introduce is_trace_uprobe_enabled() 2013-02-08 18:24:30 +01:00
trace_sched_switch.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_sched_wakeup.c tracing: Replace the static global per_cpu arrays with allocated per_cpu 2013-03-15 00:34:43 -04:00
trace_selftest_dynamic.c ftrace: Add self-tests for multiple function trace users 2011-05-18 19:24:51 -04:00
trace_selftest.c ftrace: Fix function tracing recursion self test 2013-01-22 23:37:58 -05:00
trace_stack.c tracing: Remove unneeded checks from the stack tracer 2012-11-19 15:07:13 -05:00
trace_stat.c
trace_stat.h
trace_syscalls.c tracing: Make syscall events suitable for multiple buffers 2013-03-15 00:34:44 -04:00
trace_uprobe.c uprobes/perf: Avoid uprobe_apply() whenever possible 2013-02-08 18:28:08 +01:00
trace.c tracing: Add rmdir to remove multibuffer instances 2013-03-15 00:34:45 -04:00
trace.h tracing: Add rmdir to remove multibuffer instances 2013-03-15 00:34:45 -04:00