linux/kernel
Paul E. McKenney 52d7e48b86 rcu: Narrow early boot window of illegal synchronous grace periods
The current preemptible RCU implementation goes through three phases
during bootup.  In the first phase, there is only one CPU that is running
with preemption disabled, so that a no-op is a synchronous grace period.
In the second mid-boot phase, the scheduler is running, but RCU has
not yet gotten its kthreads spawned (and, for expedited grace periods,
workqueues are not yet running.  During this time, any attempt to do
a synchronous grace period will hang the system (or complain bitterly,
depending).  In the third and final phase, RCU is fully operational and
everything works normally.

This has been OK for some time, but there has recently been some
synchronous grace periods showing up during the second mid-boot phase.
This code worked "by accident" for awhile, but started failing as soon
as expedited RCU grace periods switched over to workqueues in commit
8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue").
Note that the code was buggy even before this commit, as it was subject
to failure on real-time systems that forced all expedited grace periods
to run as normal grace periods (for example, using the rcu_normal ksysfs
parameter).  The callchain from the failure case is as follows:

early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited

The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
which caused the code to try using workqueues before they were
initialized, which did not go well.

This commit therefore reworks RCU to permit synchronous grace periods
to proceed during this mid-boot phase.  This commit is therefore a
fix to a regression introduced in v4.9, and is therefore being put
forward post-merge-window in v4.10.

This commit sets a flag from the existing rcu_scheduler_starting()
function which causes all synchronous grace periods to take the expedited
path.  The expedited path now checks this flag, using the requesting task
to drive the expedited grace period forward during the mid-boot phase.
Finally, this flag is updated by a core_initcall() function named
rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.

Note that this arrangement assumes that tasks are not sent POSIX signals
(or anything similar) from the time that the first task is spawned
through core_initcall() time.

Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Stan Kain <stan.kain@gmail.com>
Tested-by: Ivan <waffolz@hotmail.com>
Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
Tested-by: Bruno Pesavento <bpesavento@infinito.it>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Frederic Bezies <fredbezies@gmail.com>
Cc: <stable@vger.kernel.org> # 4.9.0-
2017-01-14 21:23:48 -08:00
..
bpf bpf: fix mark_reg_unknown_value for spilled regs on map value marking 2016-12-17 21:27:44 -05:00
configs config: android: enable CONFIG_SECCOMP 2016-10-11 15:06:32 -07:00
debug kdb: call vkdb_printf() from vprintk_default() only when wanted 2016-12-14 16:04:08 -08:00
events Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-23 16:49:12 -08:00
gcov
irq genirq/affinity: Fix node generation from cpumask 2016-12-15 12:32:35 +01:00
livepatch
locking Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
power Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
printk Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
rcu rcu: Narrow early boot window of illegal synchronous grace periods 2017-01-14 21:23:48 -08:00
sched ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
time ktime: Cleanup ktime_set() usage 2016-12-25 17:21:22 +01:00
trace clocksource: Use a plain u64 instead of cycle_t 2016-12-25 11:04:12 +01:00
.gitignore
acct.c
async.c
audit_fsnotify.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-17 18:44:00 -08:00
audit_tree.c Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit 2017-01-05 23:06:06 -08:00
audit_watch.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-17 18:44:00 -08:00
audit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-17 18:44:00 -08:00
audit.h audit_log_{name,link_denied}: constify struct path 2016-12-05 19:00:38 -05:00
auditfilter.c audit: add support for session ID user filter 2016-11-29 15:10:12 -05:00
auditsc.c Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit 2016-12-14 14:06:40 -08:00
backtracetest.c
bounds.c
capability.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
cgroup_freezer.c
cgroup_pids.c
cgroup.c cgroup: add support for eBPF programs 2016-11-25 16:25:52 -05:00
compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
configs.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
context_tracking.c
cpu_pm.c
cpu.c smp/hotplug: Undo tglxs brainfart 2016-12-26 17:30:24 -08:00
cpuset.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
extable.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
fork.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
freezer.c
futex_compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
futex.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
groups.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hung_task.c hung_task: decrement sysctl_hung_task_warnings only if it is positive 2016-12-12 18:55:09 -08:00
irq_work.c
jump_label.c
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/mutex: Allow MUTEX_SPIN_ON_OWNER when DEBUG_MUTEXES 2016-10-25 11:31:51 +02:00
Kconfig.preempt
kcov.c kcov: make kcov work properly with KASLR enabled 2016-12-20 09:48:47 -08:00
kexec_core.c kexec: add cond_resched into kimage_alloc_crash_control_pages 2016-12-14 16:04:07 -08:00
kexec_file.c ima: on soft reboot, save the measurement list 2016-12-20 09:48:44 -08:00
kexec_internal.h kexec_file: Allow arch-specific memory walking for kexec_add_buffer 2016-11-30 23:14:57 +11:00
kexec.c
kmod.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
kprobes.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
ksysfs.c
kthread.c kthread: add __printf attributes 2016-12-12 18:55:06 -08:00
latencytop.c
Makefile Merge branch 'akpm' (patches from Andrew) 2016-12-14 17:25:18 -08:00
membarrier.c
memremap.c
module_signing.c
module-internal.h
module.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
notifier.c
nsproxy.c
padata.c padata: Remove unused but set variables 2016-10-25 11:08:10 +08:00
panic.c taint/module: Clean up global and module taint flags handling 2016-11-26 11:18:01 -08:00
params.c
pid_namespace.c
pid.c
profile.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
ptrace.c ptrace: Don't allow accessing an undumpable mm 2016-11-22 12:57:38 -06:00
range.c
reboot.c
relay.c relay: check array offset before using it 2016-12-14 16:04:08 -08:00
resource.c
seccomp.c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-12-14 13:57:44 -08:00
signal.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
smp.c kernel/smp: Tell the user we're bringing up secondary CPUs 2016-10-26 12:02:35 +02:00
smpboot.c kthread/smpboot: do not park in kthread_create_on_cpu() 2016-10-11 15:06:33 -07:00
smpboot.h
softirq.c softirq: Display IRQ_POLL for irq-poll statistics 2016-10-21 15:45:47 -06:00
stacktrace.c
stop_machine.c locking/core, stop_machine: Yield the CPU during stop machine() 2016-11-16 10:15:09 +01:00
sys_ni.c move aio compat to fs/aio.c 2016-12-22 22:58:37 -05:00
sys.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
sysctl_binary.c sysctl: add KERN_CONT to deprecated_sysctl_warning() 2016-12-14 16:04:07 -08:00
sysctl.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
task_work.c
taskstats.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
test_kprobes.c
torture.c
tracepoint.c tracing: Have the reg function allow to fail 2016-12-09 09:13:30 -05:00
tsacct.c
ucount.c
uid16.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog_hld.c kernel/watchdog.c: move hardlockup detector to separate file 2016-12-14 16:04:08 -08:00
watchdog.c kernel/watchdog.c: move hardlockup detector to separate file 2016-12-14 16:04:08 -08:00
workqueue_internal.h
workqueue.c Merge branch 'for-4.9' into for-4.10 2016-10-19 12:12:40 -04:00