mirror of
https://github.com/FEX-Emu/linux.git
synced 2024-12-11 02:02:27 +00:00
6471384af2
Patch series "add init_on_alloc/init_on_free boot options", v10. Provide init_on_alloc and init_on_free boot options. These are aimed at preventing possible information leaks and making the control-flow bugs that depend on uninitialized values more deterministic. Enabling either of the options guarantees that the memory returned by the page allocator and SL[AU]B is initialized with zeroes. SLOB allocator isn't supported at the moment, as its emulation of kmem caches complicates handling of SLAB_TYPESAFE_BY_RCU caches correctly. Enabling init_on_free also guarantees that pages and heap objects are initialized right after they're freed, so it won't be possible to access stale data by using a dangling pointer. As suggested by Michal Hocko, right now we don't let the heap users to disable initialization for certain allocations. There's not enough evidence that doing so can speed up real-life cases, and introducing ways to opt-out may result in things going out of control. This patch (of 2): The new options are needed to prevent possible information leaks and make control-flow bugs that depend on uninitialized values more deterministic. This is expected to be on-by-default on Android and Chrome OS. And it gives the opportunity for anyone else to use it under distros too via the boot args. (The init_on_free feature is regularly requested by folks where memory forensics is included in their threat models.) init_on_alloc=1 makes the kernel initialize newly allocated pages and heap objects with zeroes. Initialization is done at allocation time at the places where checks for __GFP_ZERO are performed. init_on_free=1 makes the kernel initialize freed pages and heap objects with zeroes upon their deletion. This helps to ensure sensitive data doesn't leak via use-after-free accesses. Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator returns zeroed memory. The two exceptions are slab caches with constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never zero-initialized to preserve their semantics. Both init_on_alloc and init_on_free default to zero, but those defaults can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and CONFIG_INIT_ON_FREE_DEFAULT_ON. If either SLUB poisoning or page poisoning is enabled, those options take precedence over init_on_alloc and init_on_free: initialization is only applied to unpoisoned allocations. Slowdown for the new features compared to init_on_free=0, init_on_alloc=0: hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%) hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%) Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%) Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%) Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%) Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%) The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline is within the standard error. The new features are also going to pave the way for hardware memory tagging (e.g. arm64's MTE), which will require both on_alloc and on_free hooks to set the tags for heap objects. With MTE, tagging will have the same cost as memory initialization. Although init_on_free is rather costly, there are paranoid use-cases where in-memory data lifetime is desired to be minimized. There are various arguments for/against the realism of the associated threat models, but given that we'll need the infrastructure for MTE anyway, and there are people who want wipe-on-free behavior no matter what the performance cost, it seems reasonable to include it in this series. [glider@google.com: v8] Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com [glider@google.com: v9] Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com [glider@google.com: v10] Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts Acked-by: James Morris <jamorris@linux.microsoft.com>] Cc: Christoph Lameter <cl@linux.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Sandeep Patil <sspatil@android.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
195 lines
7.6 KiB
Plaintext
195 lines
7.6 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only
|
|
menu "Kernel hardening options"
|
|
|
|
config GCC_PLUGIN_STRUCTLEAK
|
|
bool
|
|
help
|
|
While the kernel is built with warnings enabled for any missed
|
|
stack variable initializations, this warning is silenced for
|
|
anything passed by reference to another function, under the
|
|
occasionally misguided assumption that the function will do
|
|
the initialization. As this regularly leads to exploitable
|
|
flaws, this plugin is available to identify and zero-initialize
|
|
such variables, depending on the chosen level of coverage.
|
|
|
|
This plugin was originally ported from grsecurity/PaX. More
|
|
information at:
|
|
* https://grsecurity.net/
|
|
* https://pax.grsecurity.net/
|
|
|
|
menu "Memory initialization"
|
|
|
|
config CC_HAS_AUTO_VAR_INIT
|
|
def_bool $(cc-option,-ftrivial-auto-var-init=pattern)
|
|
|
|
choice
|
|
prompt "Initialize kernel stack variables at function entry"
|
|
default GCC_PLUGIN_STRUCTLEAK_BYREF_ALL if COMPILE_TEST && GCC_PLUGINS
|
|
default INIT_STACK_ALL if COMPILE_TEST && CC_HAS_AUTO_VAR_INIT
|
|
default INIT_STACK_NONE
|
|
help
|
|
This option enables initialization of stack variables at
|
|
function entry time. This has the possibility to have the
|
|
greatest coverage (since all functions can have their
|
|
variables initialized), but the performance impact depends
|
|
on the function calling complexity of a given workload's
|
|
syscalls.
|
|
|
|
This chooses the level of coverage over classes of potentially
|
|
uninitialized variables. The selected class will be
|
|
initialized before use in a function.
|
|
|
|
config INIT_STACK_NONE
|
|
bool "no automatic initialization (weakest)"
|
|
help
|
|
Disable automatic stack variable initialization.
|
|
This leaves the kernel vulnerable to the standard
|
|
classes of uninitialized stack variable exploits
|
|
and information exposures.
|
|
|
|
config GCC_PLUGIN_STRUCTLEAK_USER
|
|
bool "zero-init structs marked for userspace (weak)"
|
|
depends on GCC_PLUGINS
|
|
select GCC_PLUGIN_STRUCTLEAK
|
|
help
|
|
Zero-initialize any structures on the stack containing
|
|
a __user attribute. This can prevent some classes of
|
|
uninitialized stack variable exploits and information
|
|
exposures, like CVE-2013-2141:
|
|
https://git.kernel.org/linus/b9e146d8eb3b9eca
|
|
|
|
config GCC_PLUGIN_STRUCTLEAK_BYREF
|
|
bool "zero-init structs passed by reference (strong)"
|
|
depends on GCC_PLUGINS
|
|
select GCC_PLUGIN_STRUCTLEAK
|
|
help
|
|
Zero-initialize any structures on the stack that may
|
|
be passed by reference and had not already been
|
|
explicitly initialized. This can prevent most classes
|
|
of uninitialized stack variable exploits and information
|
|
exposures, like CVE-2017-1000410:
|
|
https://git.kernel.org/linus/06e7e776ca4d3654
|
|
|
|
config GCC_PLUGIN_STRUCTLEAK_BYREF_ALL
|
|
bool "zero-init anything passed by reference (very strong)"
|
|
depends on GCC_PLUGINS
|
|
select GCC_PLUGIN_STRUCTLEAK
|
|
help
|
|
Zero-initialize any stack variables that may be passed
|
|
by reference and had not already been explicitly
|
|
initialized. This is intended to eliminate all classes
|
|
of uninitialized stack variable exploits and information
|
|
exposures.
|
|
|
|
config INIT_STACK_ALL
|
|
bool "0xAA-init everything on the stack (strongest)"
|
|
depends on CC_HAS_AUTO_VAR_INIT
|
|
help
|
|
Initializes everything on the stack with a 0xAA
|
|
pattern. This is intended to eliminate all classes
|
|
of uninitialized stack variable exploits and information
|
|
exposures, even variables that were warned to have been
|
|
left uninitialized.
|
|
|
|
endchoice
|
|
|
|
config GCC_PLUGIN_STRUCTLEAK_VERBOSE
|
|
bool "Report forcefully initialized variables"
|
|
depends on GCC_PLUGIN_STRUCTLEAK
|
|
depends on !COMPILE_TEST # too noisy
|
|
help
|
|
This option will cause a warning to be printed each time the
|
|
structleak plugin finds a variable it thinks needs to be
|
|
initialized. Since not all existing initializers are detected
|
|
by the plugin, this can produce false positive warnings.
|
|
|
|
config GCC_PLUGIN_STACKLEAK
|
|
bool "Poison kernel stack before returning from syscalls"
|
|
depends on GCC_PLUGINS
|
|
depends on HAVE_ARCH_STACKLEAK
|
|
help
|
|
This option makes the kernel erase the kernel stack before
|
|
returning from system calls. This has the effect of leaving
|
|
the stack initialized to the poison value, which both reduces
|
|
the lifetime of any sensitive stack contents and reduces
|
|
potential for uninitialized stack variable exploits or information
|
|
exposures (it does not cover functions reaching the same stack
|
|
depth as prior functions during the same syscall). This blocks
|
|
most uninitialized stack variable attacks, with the performance
|
|
impact being driven by the depth of the stack usage, rather than
|
|
the function calling complexity.
|
|
|
|
The performance impact on a single CPU system kernel compilation
|
|
sees a 1% slowdown, other systems and workloads may vary and you
|
|
are advised to test this feature on your expected workload before
|
|
deploying it.
|
|
|
|
This plugin was ported from grsecurity/PaX. More information at:
|
|
* https://grsecurity.net/
|
|
* https://pax.grsecurity.net/
|
|
|
|
config STACKLEAK_TRACK_MIN_SIZE
|
|
int "Minimum stack frame size of functions tracked by STACKLEAK"
|
|
default 100
|
|
range 0 4096
|
|
depends on GCC_PLUGIN_STACKLEAK
|
|
help
|
|
The STACKLEAK gcc plugin instruments the kernel code for tracking
|
|
the lowest border of the kernel stack (and for some other purposes).
|
|
It inserts the stackleak_track_stack() call for the functions with
|
|
a stack frame size greater than or equal to this parameter.
|
|
If unsure, leave the default value 100.
|
|
|
|
config STACKLEAK_METRICS
|
|
bool "Show STACKLEAK metrics in the /proc file system"
|
|
depends on GCC_PLUGIN_STACKLEAK
|
|
depends on PROC_FS
|
|
help
|
|
If this is set, STACKLEAK metrics for every task are available in
|
|
the /proc file system. In particular, /proc/<pid>/stack_depth
|
|
shows the maximum kernel stack consumption for the current and
|
|
previous syscalls. Although this information is not precise, it
|
|
can be useful for estimating the STACKLEAK performance impact for
|
|
your workloads.
|
|
|
|
config STACKLEAK_RUNTIME_DISABLE
|
|
bool "Allow runtime disabling of kernel stack erasing"
|
|
depends on GCC_PLUGIN_STACKLEAK
|
|
help
|
|
This option provides 'stack_erasing' sysctl, which can be used in
|
|
runtime to control kernel stack erasing for kernels built with
|
|
CONFIG_GCC_PLUGIN_STACKLEAK.
|
|
|
|
config INIT_ON_ALLOC_DEFAULT_ON
|
|
bool "Enable heap memory zeroing on allocation by default"
|
|
help
|
|
This has the effect of setting "init_on_alloc=1" on the kernel
|
|
command line. This can be disabled with "init_on_alloc=0".
|
|
When "init_on_alloc" is enabled, all page allocator and slab
|
|
allocator memory will be zeroed when allocated, eliminating
|
|
many kinds of "uninitialized heap memory" flaws, especially
|
|
heap content exposures. The performance impact varies by
|
|
workload, but most cases see <1% impact. Some synthetic
|
|
workloads have measured as high as 7%.
|
|
|
|
config INIT_ON_FREE_DEFAULT_ON
|
|
bool "Enable heap memory zeroing on free by default"
|
|
help
|
|
This has the effect of setting "init_on_free=1" on the kernel
|
|
command line. This can be disabled with "init_on_free=0".
|
|
Similar to "init_on_alloc", when "init_on_free" is enabled,
|
|
all page allocator and slab allocator memory will be zeroed
|
|
when freed, eliminating many kinds of "uninitialized heap memory"
|
|
flaws, especially heap content exposures. The primary difference
|
|
with "init_on_free" is that data lifetime in memory is reduced,
|
|
as anything freed is wiped immediately, making live forensics or
|
|
cold boot memory attacks unable to recover freed memory contents.
|
|
The performance impact varies by workload, but is more expensive
|
|
than "init_on_alloc" due to the negative cache effects of
|
|
touching "cold" memory areas. Most cases see 3-5% impact. Some
|
|
synthetic workloads have measured as high as 8%.
|
|
|
|
endmenu
|
|
|
|
endmenu
|