mirror of
https://github.com/FEX-Emu/linux.git
synced 2024-12-17 06:17:35 +00:00
64db4cfff9
This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
944 lines
29 KiB
Plaintext
944 lines
29 KiB
Plaintext
config ARCH
|
|
string
|
|
option env="ARCH"
|
|
|
|
config KERNELVERSION
|
|
string
|
|
option env="KERNELVERSION"
|
|
|
|
config DEFCONFIG_LIST
|
|
string
|
|
depends on !UML
|
|
option defconfig_list
|
|
default "/lib/modules/$UNAME_RELEASE/.config"
|
|
default "/etc/kernel-config"
|
|
default "/boot/config-$UNAME_RELEASE"
|
|
default "$ARCH_DEFCONFIG"
|
|
default "arch/$ARCH/defconfig"
|
|
|
|
menu "General setup"
|
|
|
|
config EXPERIMENTAL
|
|
bool "Prompt for development and/or incomplete code/drivers"
|
|
---help---
|
|
Some of the various things that Linux supports (such as network
|
|
drivers, file systems, network protocols, etc.) can be in a state
|
|
of development where the functionality, stability, or the level of
|
|
testing is not yet high enough for general use. This is usually
|
|
known as the "alpha-test" phase among developers. If a feature is
|
|
currently in alpha-test, then the developers usually discourage
|
|
uninformed widespread use of this feature by the general public to
|
|
avoid "Why doesn't this work?" type mail messages. However, active
|
|
testing and use of these systems is welcomed. Just be aware that it
|
|
may not meet the normal level of reliability or it may fail to work
|
|
in some special cases. Detailed bug reports from people familiar
|
|
with the kernel internals are usually welcomed by the developers
|
|
(before submitting bug reports, please read the documents
|
|
<file:README>, <file:MAINTAINERS>, <file:REPORTING-BUGS>,
|
|
<file:Documentation/BUG-HUNTING>, and
|
|
<file:Documentation/oops-tracing.txt> in the kernel source).
|
|
|
|
This option will also make obsoleted drivers available. These are
|
|
drivers that have been replaced by something else, and/or are
|
|
scheduled to be removed in a future kernel release.
|
|
|
|
Unless you intend to help test and develop a feature or driver that
|
|
falls into this category, or you have a situation that requires
|
|
using these features, you should probably say N here, which will
|
|
cause the configurator to present you with fewer choices. If
|
|
you say Y here, you will be offered the choice of using features or
|
|
drivers that are currently considered to be in the alpha-test phase.
|
|
|
|
config BROKEN
|
|
bool
|
|
|
|
config BROKEN_ON_SMP
|
|
bool
|
|
depends on BROKEN || !SMP
|
|
default y
|
|
|
|
config LOCK_KERNEL
|
|
bool
|
|
depends on SMP || PREEMPT
|
|
default y
|
|
|
|
config INIT_ENV_ARG_LIMIT
|
|
int
|
|
default 32 if !UML
|
|
default 128 if UML
|
|
help
|
|
Maximum of each of the number of arguments and environment
|
|
variables passed to init from the kernel command line.
|
|
|
|
|
|
config LOCALVERSION
|
|
string "Local version - append to kernel release"
|
|
help
|
|
Append an extra string to the end of your kernel version.
|
|
This will show up when you type uname, for example.
|
|
The string you set here will be appended after the contents of
|
|
any files with a filename matching localversion* in your
|
|
object and source tree, in that order. Your total string can
|
|
be a maximum of 64 characters.
|
|
|
|
config LOCALVERSION_AUTO
|
|
bool "Automatically append version information to the version string"
|
|
default y
|
|
help
|
|
This will try to automatically determine if the current tree is a
|
|
release tree by looking for git tags that belong to the current
|
|
top of tree revision.
|
|
|
|
A string of the format -gxxxxxxxx will be added to the localversion
|
|
if a git-based tree is found. The string generated by this will be
|
|
appended after any matching localversion* files, and after the value
|
|
set in CONFIG_LOCALVERSION.
|
|
|
|
(The actual string used here is the first eight characters produced
|
|
by running the command:
|
|
|
|
$ git rev-parse --verify HEAD
|
|
|
|
which is done within the script "scripts/setlocalversion".)
|
|
|
|
config SWAP
|
|
bool "Support for paging of anonymous memory (swap)"
|
|
depends on MMU && BLOCK
|
|
default y
|
|
help
|
|
This option allows you to choose whether you want to have support
|
|
for so called swap devices or swap files in your kernel that are
|
|
used to provide more virtual memory than the actual RAM present
|
|
in your computer. If unsure say Y.
|
|
|
|
config SYSVIPC
|
|
bool "System V IPC"
|
|
---help---
|
|
Inter Process Communication is a suite of library functions and
|
|
system calls which let processes (running programs) synchronize and
|
|
exchange information. It is generally considered to be a good thing,
|
|
and some programs won't run unless you say Y here. In particular, if
|
|
you want to run the DOS emulator dosemu under Linux (read the
|
|
DOSEMU-HOWTO, available from <http://www.tldp.org/docs.html#howto>),
|
|
you'll need to say Y here.
|
|
|
|
You can find documentation about IPC with "info ipc" and also in
|
|
section 6.4 of the Linux Programmer's Guide, available from
|
|
<http://www.tldp.org/guides.html>.
|
|
|
|
config SYSVIPC_SYSCTL
|
|
bool
|
|
depends on SYSVIPC
|
|
depends on SYSCTL
|
|
default y
|
|
|
|
config POSIX_MQUEUE
|
|
bool "POSIX Message Queues"
|
|
depends on NET && EXPERIMENTAL
|
|
---help---
|
|
POSIX variant of message queues is a part of IPC. In POSIX message
|
|
queues every message has a priority which decides about succession
|
|
of receiving it by a process. If you want to compile and run
|
|
programs written e.g. for Solaris with use of its POSIX message
|
|
queues (functions mq_*) say Y here.
|
|
|
|
POSIX message queues are visible as a filesystem called 'mqueue'
|
|
and can be mounted somewhere if you want to do filesystem
|
|
operations on message queues.
|
|
|
|
If unsure, say Y.
|
|
|
|
config BSD_PROCESS_ACCT
|
|
bool "BSD Process Accounting"
|
|
help
|
|
If you say Y here, a user level program will be able to instruct the
|
|
kernel (via a special system call) to write process accounting
|
|
information to a file: whenever a process exits, information about
|
|
that process will be appended to the file by the kernel. The
|
|
information includes things such as creation time, owning user,
|
|
command name, memory usage, controlling terminal etc. (the complete
|
|
list is in the struct acct in <file:include/linux/acct.h>). It is
|
|
up to the user level program to do useful things with this
|
|
information. This is generally a good idea, so say Y.
|
|
|
|
config BSD_PROCESS_ACCT_V3
|
|
bool "BSD Process Accounting version 3 file format"
|
|
depends on BSD_PROCESS_ACCT
|
|
default n
|
|
help
|
|
If you say Y here, the process accounting information is written
|
|
in a new file format that also logs the process IDs of each
|
|
process and it's parent. Note that this file format is incompatible
|
|
with previous v0/v1/v2 file formats, so you will need updated tools
|
|
for processing it. A preliminary version of these tools is available
|
|
at <http://www.gnu.org/software/acct/>.
|
|
|
|
config TASKSTATS
|
|
bool "Export task/process statistics through netlink (EXPERIMENTAL)"
|
|
depends on NET
|
|
default n
|
|
help
|
|
Export selected statistics for tasks/processes through the
|
|
generic netlink interface. Unlike BSD process accounting, the
|
|
statistics are available during the lifetime of tasks/processes as
|
|
responses to commands. Like BSD accounting, they are sent to user
|
|
space on task exit.
|
|
|
|
Say N if unsure.
|
|
|
|
config TASK_DELAY_ACCT
|
|
bool "Enable per-task delay accounting (EXPERIMENTAL)"
|
|
depends on TASKSTATS
|
|
help
|
|
Collect information on time spent by a task waiting for system
|
|
resources like cpu, synchronous block I/O completion and swapping
|
|
in pages. Such statistics can help in setting a task's priorities
|
|
relative to other tasks for cpu, io, rss limits etc.
|
|
|
|
Say N if unsure.
|
|
|
|
config TASK_XACCT
|
|
bool "Enable extended accounting over taskstats (EXPERIMENTAL)"
|
|
depends on TASKSTATS
|
|
help
|
|
Collect extended task accounting data and send the data
|
|
to userland for processing over the taskstats interface.
|
|
|
|
Say N if unsure.
|
|
|
|
config TASK_IO_ACCOUNTING
|
|
bool "Enable per-task storage I/O accounting (EXPERIMENTAL)"
|
|
depends on TASK_XACCT
|
|
help
|
|
Collect information on the number of bytes of storage I/O which this
|
|
task has caused.
|
|
|
|
Say N if unsure.
|
|
|
|
config AUDIT
|
|
bool "Auditing support"
|
|
depends on NET
|
|
help
|
|
Enable auditing infrastructure that can be used with another
|
|
kernel subsystem, such as SELinux (which requires this for
|
|
logging of avc messages output). Does not do system-call
|
|
auditing without CONFIG_AUDITSYSCALL.
|
|
|
|
config AUDITSYSCALL
|
|
bool "Enable system-call auditing support"
|
|
depends on AUDIT && (X86 || PPC || PPC64 || S390 || IA64 || UML || SPARC64|| SUPERH)
|
|
default y if SECURITY_SELINUX
|
|
help
|
|
Enable low-overhead system-call auditing infrastructure that
|
|
can be used independently or with another kernel subsystem,
|
|
such as SELinux. To use audit's filesystem watch feature, please
|
|
ensure that INOTIFY is configured.
|
|
|
|
config AUDIT_TREE
|
|
def_bool y
|
|
depends on AUDITSYSCALL && INOTIFY
|
|
|
|
config IKCONFIG
|
|
tristate "Kernel .config support"
|
|
---help---
|
|
This option enables the complete Linux kernel ".config" file
|
|
contents to be saved in the kernel. It provides documentation
|
|
of which kernel options are used in a running kernel or in an
|
|
on-disk kernel. This information can be extracted from the kernel
|
|
image file with the script scripts/extract-ikconfig and used as
|
|
input to rebuild the current kernel or to build another kernel.
|
|
It can also be extracted from a running kernel by reading
|
|
/proc/config.gz if enabled (below).
|
|
|
|
config IKCONFIG_PROC
|
|
bool "Enable access to .config through /proc/config.gz"
|
|
depends on IKCONFIG && PROC_FS
|
|
---help---
|
|
This option enables access to the kernel configuration file
|
|
through /proc/config.gz.
|
|
|
|
config LOG_BUF_SHIFT
|
|
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
|
|
range 12 21
|
|
default 17
|
|
help
|
|
Select kernel log buffer size as a power of 2.
|
|
Examples:
|
|
17 => 128 KB
|
|
16 => 64 KB
|
|
15 => 32 KB
|
|
14 => 16 KB
|
|
13 => 8 KB
|
|
12 => 4 KB
|
|
|
|
config CGROUPS
|
|
bool "Control Group support"
|
|
help
|
|
This option will let you use process cgroup subsystems
|
|
such as Cpusets
|
|
|
|
Say N if unsure.
|
|
|
|
config CGROUP_DEBUG
|
|
bool "Example debug cgroup subsystem"
|
|
depends on CGROUPS
|
|
default n
|
|
help
|
|
This option enables a simple cgroup subsystem that
|
|
exports useful debugging information about the cgroups
|
|
framework
|
|
|
|
Say N if unsure
|
|
|
|
config CGROUP_NS
|
|
bool "Namespace cgroup subsystem"
|
|
depends on CGROUPS
|
|
help
|
|
Provides a simple namespace cgroup subsystem to
|
|
provide hierarchical naming of sets of namespaces,
|
|
for instance virtual servers and checkpoint/restart
|
|
jobs.
|
|
|
|
config CGROUP_FREEZER
|
|
bool "control group freezer subsystem"
|
|
depends on CGROUPS
|
|
help
|
|
Provides a way to freeze and unfreeze all tasks in a
|
|
cgroup.
|
|
|
|
config CGROUP_DEVICE
|
|
bool "Device controller for cgroups"
|
|
depends on CGROUPS && EXPERIMENTAL
|
|
help
|
|
Provides a cgroup implementing whitelists for devices which
|
|
a process in the cgroup can mknod or open.
|
|
|
|
config CPUSETS
|
|
bool "Cpuset support"
|
|
depends on SMP && CGROUPS
|
|
help
|
|
This option will let you create and manage CPUSETs which
|
|
allow dynamically partitioning a system into sets of CPUs and
|
|
Memory Nodes and assigning tasks to run only within those sets.
|
|
This is primarily useful on large SMP or NUMA systems.
|
|
|
|
Say N if unsure.
|
|
|
|
#
|
|
# Architectures with an unreliable sched_clock() should select this:
|
|
#
|
|
config HAVE_UNSTABLE_SCHED_CLOCK
|
|
bool
|
|
|
|
config GROUP_SCHED
|
|
bool "Group CPU scheduler"
|
|
depends on EXPERIMENTAL
|
|
default n
|
|
help
|
|
This feature lets CPU scheduler recognize task groups and control CPU
|
|
bandwidth allocation to such task groups.
|
|
|
|
config FAIR_GROUP_SCHED
|
|
bool "Group scheduling for SCHED_OTHER"
|
|
depends on GROUP_SCHED
|
|
default GROUP_SCHED
|
|
|
|
config RT_GROUP_SCHED
|
|
bool "Group scheduling for SCHED_RR/FIFO"
|
|
depends on EXPERIMENTAL
|
|
depends on GROUP_SCHED
|
|
default n
|
|
help
|
|
This feature lets you explicitly allocate real CPU bandwidth
|
|
to users or control groups (depending on the "Basis for grouping tasks"
|
|
setting below. If enabled, it will also make it impossible to
|
|
schedule realtime tasks for non-root users until you allocate
|
|
realtime bandwidth for them.
|
|
See Documentation/scheduler/sched-rt-group.txt for more information.
|
|
|
|
choice
|
|
depends on GROUP_SCHED
|
|
prompt "Basis for grouping tasks"
|
|
default USER_SCHED
|
|
|
|
config USER_SCHED
|
|
bool "user id"
|
|
help
|
|
This option will choose userid as the basis for grouping
|
|
tasks, thus providing equal CPU bandwidth to each user.
|
|
|
|
config CGROUP_SCHED
|
|
bool "Control groups"
|
|
depends on CGROUPS
|
|
help
|
|
This option allows you to create arbitrary task groups
|
|
using the "cgroup" pseudo filesystem and control
|
|
the cpu bandwidth allocated to each such task group.
|
|
Refer to Documentation/cgroups.txt for more information
|
|
on "cgroup" pseudo filesystem.
|
|
|
|
endchoice
|
|
|
|
config CGROUP_CPUACCT
|
|
bool "Simple CPU accounting cgroup subsystem"
|
|
depends on CGROUPS
|
|
help
|
|
Provides a simple Resource Controller for monitoring the
|
|
total CPU consumed by the tasks in a cgroup
|
|
|
|
config RESOURCE_COUNTERS
|
|
bool "Resource counters"
|
|
help
|
|
This option enables controller independent resource accounting
|
|
infrastructure that works with cgroups
|
|
depends on CGROUPS
|
|
|
|
config MM_OWNER
|
|
bool
|
|
|
|
config CGROUP_MEM_RES_CTLR
|
|
bool "Memory Resource Controller for Control Groups"
|
|
depends on CGROUPS && RESOURCE_COUNTERS
|
|
select MM_OWNER
|
|
help
|
|
Provides a memory resource controller that manages both anonymous
|
|
memory and page cache. (See Documentation/controllers/memory.txt)
|
|
|
|
Note that setting this option increases fixed memory overhead
|
|
associated with each page of memory in the system. By this,
|
|
20(40)bytes/PAGE_SIZE on 32(64)bit system will be occupied by memory
|
|
usage tracking struct at boot. Total amount of this is printed out
|
|
at boot.
|
|
|
|
Only enable when you're ok with these trade offs and really
|
|
sure you need the memory resource controller. Even when you enable
|
|
this, you can set "cgroup_disable=memory" at your boot option to
|
|
disable memory resource controller and you can avoid overheads.
|
|
(and lose benefits of memory resource contoller)
|
|
|
|
This config option also selects MM_OWNER config option, which
|
|
could in turn add some fork/exit overhead.
|
|
|
|
config SYSFS_DEPRECATED
|
|
bool
|
|
|
|
config SYSFS_DEPRECATED_V2
|
|
bool "Create deprecated sysfs files"
|
|
depends on SYSFS
|
|
default y
|
|
select SYSFS_DEPRECATED
|
|
help
|
|
This option creates deprecated symlinks such as the
|
|
"device"-link, the <subsystem>:<name>-link, and the
|
|
"bus"-link. It may also add deprecated key in the
|
|
uevent environment.
|
|
None of these features or values should be used today, as
|
|
they export driver core implementation details to userspace
|
|
or export properties which can't be kept stable across kernel
|
|
releases.
|
|
|
|
If enabled, this option will also move any device structures
|
|
that belong to a class, back into the /sys/class hierarchy, in
|
|
order to support older versions of udev and some userspace
|
|
programs.
|
|
|
|
If you are using a distro with the most recent userspace
|
|
packages, it should be safe to say N here.
|
|
|
|
config PROC_PID_CPUSET
|
|
bool "Include legacy /proc/<pid>/cpuset file"
|
|
depends on CPUSETS
|
|
default y
|
|
|
|
config RELAY
|
|
bool "Kernel->user space relay support (formerly relayfs)"
|
|
help
|
|
This option enables support for relay interface support in
|
|
certain file systems (such as debugfs).
|
|
It is designed to provide an efficient mechanism for tools and
|
|
facilities to relay large amounts of data from kernel space to
|
|
user space.
|
|
|
|
If unsure, say N.
|
|
|
|
config NAMESPACES
|
|
bool "Namespaces support" if EMBEDDED
|
|
default !EMBEDDED
|
|
help
|
|
Provides the way to make tasks work with different objects using
|
|
the same id. For example same IPC id may refer to different objects
|
|
or same user id or pid may refer to different tasks when used in
|
|
different namespaces.
|
|
|
|
config UTS_NS
|
|
bool "UTS namespace"
|
|
depends on NAMESPACES
|
|
help
|
|
In this namespace tasks see different info provided with the
|
|
uname() system call
|
|
|
|
config IPC_NS
|
|
bool "IPC namespace"
|
|
depends on NAMESPACES && SYSVIPC
|
|
help
|
|
In this namespace tasks work with IPC ids which correspond to
|
|
different IPC objects in different namespaces
|
|
|
|
config USER_NS
|
|
bool "User namespace (EXPERIMENTAL)"
|
|
depends on NAMESPACES && EXPERIMENTAL
|
|
help
|
|
This allows containers, i.e. vservers, to use user namespaces
|
|
to provide different user info for different servers.
|
|
If unsure, say N.
|
|
|
|
config PID_NS
|
|
bool "PID Namespaces (EXPERIMENTAL)"
|
|
default n
|
|
depends on NAMESPACES && EXPERIMENTAL
|
|
help
|
|
Support process id namespaces. This allows having multiple
|
|
process with the same pid as long as they are in different
|
|
pid namespaces. This is a building block of containers.
|
|
|
|
Unless you want to work with an experimental feature
|
|
say N here.
|
|
|
|
config BLK_DEV_INITRD
|
|
bool "Initial RAM filesystem and RAM disk (initramfs/initrd) support"
|
|
depends on BROKEN || !FRV
|
|
help
|
|
The initial RAM filesystem is a ramfs which is loaded by the
|
|
boot loader (loadlin or lilo) and that is mounted as root
|
|
before the normal boot procedure. It is typically used to
|
|
load modules needed to mount the "real" root file system,
|
|
etc. See <file:Documentation/initrd.txt> for details.
|
|
|
|
If RAM disk support (BLK_DEV_RAM) is also included, this
|
|
also enables initial RAM disk (initrd) support and adds
|
|
15 Kbytes (more on some other architectures) to the kernel size.
|
|
|
|
If unsure say Y.
|
|
|
|
if BLK_DEV_INITRD
|
|
|
|
source "usr/Kconfig"
|
|
|
|
endif
|
|
|
|
config CC_OPTIMIZE_FOR_SIZE
|
|
bool "Optimize for size"
|
|
default y
|
|
help
|
|
Enabling this option will pass "-Os" instead of "-O2" to gcc
|
|
resulting in a smaller kernel.
|
|
|
|
If unsure, say Y.
|
|
|
|
config SYSCTL
|
|
bool
|
|
|
|
menuconfig EMBEDDED
|
|
bool "Configure standard kernel features (for small systems)"
|
|
help
|
|
This option allows certain base kernel options and settings
|
|
to be disabled or tweaked. This is for specialized
|
|
environments which can tolerate a "non-standard" kernel.
|
|
Only use this if you really know what you are doing.
|
|
|
|
config UID16
|
|
bool "Enable 16-bit UID system calls" if EMBEDDED
|
|
depends on ARM || BLACKFIN || CRIS || FRV || H8300 || X86_32 || M68K || (S390 && !64BIT) || SUPERH || SPARC32 || (SPARC64 && COMPAT) || UML || (X86_64 && IA32_EMULATION)
|
|
default y
|
|
help
|
|
This enables the legacy 16-bit UID syscall wrappers.
|
|
|
|
config SYSCTL_SYSCALL
|
|
bool "Sysctl syscall support" if EMBEDDED
|
|
default y
|
|
select SYSCTL
|
|
---help---
|
|
sys_sysctl uses binary paths that have been found challenging
|
|
to properly maintain and use. The interface in /proc/sys
|
|
using paths with ascii names is now the primary path to this
|
|
information.
|
|
|
|
Almost nothing using the binary sysctl interface so if you are
|
|
trying to save some space it is probably safe to disable this,
|
|
making your kernel marginally smaller.
|
|
|
|
If unsure say Y here.
|
|
|
|
config KALLSYMS
|
|
bool "Load all symbols for debugging/ksymoops" if EMBEDDED
|
|
default y
|
|
help
|
|
Say Y here to let the kernel print out symbolic crash information and
|
|
symbolic stack backtraces. This increases the size of the kernel
|
|
somewhat, as all symbols have to be loaded into the kernel image.
|
|
|
|
config KALLSYMS_ALL
|
|
bool "Include all symbols in kallsyms"
|
|
depends on DEBUG_KERNEL && KALLSYMS
|
|
help
|
|
Normally kallsyms only contains the symbols of functions, for nicer
|
|
OOPS messages. Some debuggers can use kallsyms for other
|
|
symbols too: say Y here to include all symbols, if you need them
|
|
and you don't care about adding 300k to the size of your kernel.
|
|
|
|
Say N.
|
|
|
|
config KALLSYMS_EXTRA_PASS
|
|
bool "Do an extra kallsyms pass"
|
|
depends on KALLSYMS
|
|
help
|
|
If kallsyms is not working correctly, the build will fail with
|
|
inconsistent kallsyms data. If that occurs, log a bug report and
|
|
turn on KALLSYMS_EXTRA_PASS which should result in a stable build.
|
|
Always say N here unless you find a bug in kallsyms, which must be
|
|
reported. KALLSYMS_EXTRA_PASS is only a temporary workaround while
|
|
you wait for kallsyms to be fixed.
|
|
|
|
|
|
config HOTPLUG
|
|
bool "Support for hot-pluggable devices" if EMBEDDED
|
|
default y
|
|
help
|
|
This option is provided for the case where no hotplug or uevent
|
|
capabilities is wanted by the kernel. You should only consider
|
|
disabling this option for embedded systems that do not use modules, a
|
|
dynamic /dev tree, or dynamic device discovery. Just say Y.
|
|
|
|
config PRINTK
|
|
default y
|
|
bool "Enable support for printk" if EMBEDDED
|
|
help
|
|
This option enables normal printk support. Removing it
|
|
eliminates most of the message strings from the kernel image
|
|
and makes the kernel more or less silent. As this makes it
|
|
very difficult to diagnose system problems, saying N here is
|
|
strongly discouraged.
|
|
|
|
config BUG
|
|
bool "BUG() support" if EMBEDDED
|
|
default y
|
|
help
|
|
Disabling this option eliminates support for BUG and WARN, reducing
|
|
the size of your kernel image and potentially quietly ignoring
|
|
numerous fatal conditions. You should only consider disabling this
|
|
option for embedded systems with no facilities for reporting errors.
|
|
Just say Y.
|
|
|
|
config ELF_CORE
|
|
default y
|
|
bool "Enable ELF core dumps" if EMBEDDED
|
|
help
|
|
Enable support for generating core dumps. Disabling saves about 4k.
|
|
|
|
config PCSPKR_PLATFORM
|
|
bool "Enable PC-Speaker support" if EMBEDDED
|
|
depends on ALPHA || X86 || MIPS || PPC_PREP || PPC_CHRP || PPC_PSERIES
|
|
default y
|
|
help
|
|
This option allows to disable the internal PC-Speaker
|
|
support, saving some memory.
|
|
|
|
config COMPAT_BRK
|
|
bool "Disable heap randomization"
|
|
default y
|
|
help
|
|
Randomizing heap placement makes heap exploits harder, but it
|
|
also breaks ancient binaries (including anything libc5 based).
|
|
This option changes the bootup default to heap randomization
|
|
disabled, and can be overriden runtime by setting
|
|
/proc/sys/kernel/randomize_va_space to 2.
|
|
|
|
On non-ancient distros (post-2000 ones) N is usually a safe choice.
|
|
|
|
config BASE_FULL
|
|
default y
|
|
bool "Enable full-sized data structures for core" if EMBEDDED
|
|
help
|
|
Disabling this option reduces the size of miscellaneous core
|
|
kernel data structures. This saves memory on small machines,
|
|
but may reduce performance.
|
|
|
|
config FUTEX
|
|
bool "Enable futex support" if EMBEDDED
|
|
default y
|
|
select RT_MUTEXES
|
|
help
|
|
Disabling this option will cause the kernel to be built without
|
|
support for "fast userspace mutexes". The resulting kernel may not
|
|
run glibc-based applications correctly.
|
|
|
|
config ANON_INODES
|
|
bool
|
|
|
|
config EPOLL
|
|
bool "Enable eventpoll support" if EMBEDDED
|
|
default y
|
|
select ANON_INODES
|
|
help
|
|
Disabling this option will cause the kernel to be built without
|
|
support for epoll family of system calls.
|
|
|
|
config SIGNALFD
|
|
bool "Enable signalfd() system call" if EMBEDDED
|
|
select ANON_INODES
|
|
default y
|
|
help
|
|
Enable the signalfd() system call that allows to receive signals
|
|
on a file descriptor.
|
|
|
|
If unsure, say Y.
|
|
|
|
config TIMERFD
|
|
bool "Enable timerfd() system call" if EMBEDDED
|
|
select ANON_INODES
|
|
default y
|
|
help
|
|
Enable the timerfd() system call that allows to receive timer
|
|
events on a file descriptor.
|
|
|
|
If unsure, say Y.
|
|
|
|
config EVENTFD
|
|
bool "Enable eventfd() system call" if EMBEDDED
|
|
select ANON_INODES
|
|
default y
|
|
help
|
|
Enable the eventfd() system call that allows to receive both
|
|
kernel notification (ie. KAIO) or userspace notifications.
|
|
|
|
If unsure, say Y.
|
|
|
|
config SHMEM
|
|
bool "Use full shmem filesystem" if EMBEDDED
|
|
default y
|
|
depends on MMU
|
|
help
|
|
The shmem is an internal filesystem used to manage shared memory.
|
|
It is backed by swap and manages resource limits. It is also exported
|
|
to userspace as tmpfs if TMPFS is enabled. Disabling this
|
|
option replaces shmem and tmpfs with the much simpler ramfs code,
|
|
which may be appropriate on small systems without swap.
|
|
|
|
config AIO
|
|
bool "Enable AIO support" if EMBEDDED
|
|
default y
|
|
help
|
|
This option enables POSIX asynchronous I/O which may by used
|
|
by some high performance threaded applications. Disabling
|
|
this option saves about 7k.
|
|
|
|
config VM_EVENT_COUNTERS
|
|
default y
|
|
bool "Enable VM event counters for /proc/vmstat" if EMBEDDED
|
|
help
|
|
VM event counters are needed for event counts to be shown.
|
|
This option allows the disabling of the VM event counters
|
|
on EMBEDDED systems. /proc/vmstat will only show page counts
|
|
if VM event counters are disabled.
|
|
|
|
config PCI_QUIRKS
|
|
default y
|
|
bool "Enable PCI quirk workarounds" if EMBEDDED
|
|
depends on PCI
|
|
help
|
|
This enables workarounds for various PCI chipset
|
|
bugs/quirks. Disable this only if your target machine is
|
|
unaffected by PCI quirks.
|
|
|
|
config SLUB_DEBUG
|
|
default y
|
|
bool "Enable SLUB debugging support" if EMBEDDED
|
|
depends on SLUB && SYSFS
|
|
help
|
|
SLUB has extensive debug support features. Disabling these can
|
|
result in significant savings in code size. This also disables
|
|
SLUB sysfs support. /sys/slab will not exist and there will be
|
|
no support for cache validation etc.
|
|
|
|
choice
|
|
prompt "Choose SLAB allocator"
|
|
default SLUB
|
|
help
|
|
This option allows to select a slab allocator.
|
|
|
|
config SLAB
|
|
bool "SLAB"
|
|
help
|
|
The regular slab allocator that is established and known to work
|
|
well in all environments. It organizes cache hot objects in
|
|
per cpu and per node queues.
|
|
|
|
config SLUB
|
|
bool "SLUB (Unqueued Allocator)"
|
|
help
|
|
SLUB is a slab allocator that minimizes cache line usage
|
|
instead of managing queues of cached objects (SLAB approach).
|
|
Per cpu caching is realized using slabs of objects instead
|
|
of queues of objects. SLUB can use memory efficiently
|
|
and has enhanced diagnostics. SLUB is the default choice for
|
|
a slab allocator.
|
|
|
|
config SLOB
|
|
depends on EMBEDDED
|
|
bool "SLOB (Simple Allocator)"
|
|
help
|
|
SLOB replaces the stock allocator with a drastically simpler
|
|
allocator. SLOB is generally more space efficient but
|
|
does not perform as well on large systems.
|
|
|
|
endchoice
|
|
|
|
config PROFILING
|
|
bool "Profiling support (EXPERIMENTAL)"
|
|
help
|
|
Say Y here to enable the extended profiling support mechanisms used
|
|
by profilers such as OProfile.
|
|
|
|
#
|
|
# Place an empty function call at each tracepoint site. Can be
|
|
# dynamically changed for a probe function.
|
|
#
|
|
config TRACEPOINTS
|
|
bool
|
|
|
|
config MARKERS
|
|
bool "Activate markers"
|
|
help
|
|
Place an empty function call at each marker site. Can be
|
|
dynamically changed for a probe function.
|
|
|
|
source "arch/Kconfig"
|
|
|
|
endmenu # General setup
|
|
|
|
config HAVE_GENERIC_DMA_COHERENT
|
|
bool
|
|
default n
|
|
|
|
config SLABINFO
|
|
bool
|
|
depends on PROC_FS
|
|
depends on SLAB || SLUB_DEBUG
|
|
default y
|
|
|
|
config RT_MUTEXES
|
|
boolean
|
|
select PLIST
|
|
|
|
config TINY_SHMEM
|
|
default !SHMEM
|
|
bool
|
|
|
|
config BASE_SMALL
|
|
int
|
|
default 0 if BASE_FULL
|
|
default 1 if !BASE_FULL
|
|
|
|
menuconfig MODULES
|
|
bool "Enable loadable module support"
|
|
help
|
|
Kernel modules are small pieces of compiled code which can
|
|
be inserted in the running kernel, rather than being
|
|
permanently built into the kernel. You use the "modprobe"
|
|
tool to add (and sometimes remove) them. If you say Y here,
|
|
many parts of the kernel can be built as modules (by
|
|
answering M instead of Y where indicated): this is most
|
|
useful for infrequently used options which are not required
|
|
for booting. For more information, see the man pages for
|
|
modprobe, lsmod, modinfo, insmod and rmmod.
|
|
|
|
If you say Y here, you will need to run "make
|
|
modules_install" to put the modules under /lib/modules/
|
|
where modprobe can find them (you may need to be root to do
|
|
this).
|
|
|
|
If unsure, say Y.
|
|
|
|
if MODULES
|
|
|
|
config MODULE_FORCE_LOAD
|
|
bool "Forced module loading"
|
|
default n
|
|
help
|
|
Allow loading of modules without version information (ie. modprobe
|
|
--force). Forced module loading sets the 'F' (forced) taint flag and
|
|
is usually a really bad idea.
|
|
|
|
config MODULE_UNLOAD
|
|
bool "Module unloading"
|
|
help
|
|
Without this option you will not be able to unload any
|
|
modules (note that some modules may not be unloadable
|
|
anyway), which makes your kernel smaller, faster
|
|
and simpler. If unsure, say Y.
|
|
|
|
config MODULE_FORCE_UNLOAD
|
|
bool "Forced module unloading"
|
|
depends on MODULE_UNLOAD && EXPERIMENTAL
|
|
help
|
|
This option allows you to force a module to unload, even if the
|
|
kernel believes it is unsafe: the kernel will remove the module
|
|
without waiting for anyone to stop using it (using the -f option to
|
|
rmmod). This is mainly for kernel developers and desperate users.
|
|
If unsure, say N.
|
|
|
|
config MODVERSIONS
|
|
bool "Module versioning support"
|
|
help
|
|
Usually, you have to use modules compiled with your kernel.
|
|
Saying Y here makes it sometimes possible to use modules
|
|
compiled for different kernels, by adding enough information
|
|
to the modules to (hopefully) spot any changes which would
|
|
make them incompatible with the kernel you are running. If
|
|
unsure, say N.
|
|
|
|
config MODULE_SRCVERSION_ALL
|
|
bool "Source checksum for all modules"
|
|
help
|
|
Modules which contain a MODULE_VERSION get an extra "srcversion"
|
|
field inserted into their modinfo section, which contains a
|
|
sum of the source files which made it. This helps maintainers
|
|
see exactly which source was used to build a module (since
|
|
others sometimes change the module source without updating
|
|
the version). With this option, such a "srcversion" field
|
|
will be created for all modules. If unsure, say N.
|
|
|
|
config KMOD
|
|
def_bool y
|
|
help
|
|
This is being removed soon. These days, CONFIG_MODULES
|
|
implies CONFIG_KMOD, so use that instead.
|
|
|
|
endif # MODULES
|
|
|
|
config STOP_MACHINE
|
|
bool
|
|
default y
|
|
depends on (SMP && MODULE_UNLOAD) || HOTPLUG_CPU
|
|
help
|
|
Need stop_machine() primitive.
|
|
|
|
source "block/Kconfig"
|
|
|
|
config PREEMPT_NOTIFIERS
|
|
bool
|
|
|
|
config TREE_RCU_TRACE
|
|
def_bool RCU_TRACE && TREE_RCU
|
|
select DEBUG_FS
|
|
help
|
|
This option provides tracing for the TREE_RCU implementation,
|
|
permitting Makefile to trivially select kernel/rcutree_trace.c.
|
|
|
|
config PREEMPT_RCU_TRACE
|
|
def_bool RCU_TRACE && PREEMPT_RCU
|
|
select DEBUG_FS
|
|
help
|
|
This option provides tracing for the PREEMPT_RCU implementation,
|
|
permitting Makefile to trivially select kernel/rcupreempt_trace.c.
|