linux/arch/x86/kernel
Roland McGrath 83bd01024b x86: protect against sigaltstack wraparound
cf http://lkml.org/lkml/2007/10/3/41

To summarize: on Linux, SA_ONSTACK decides whether you are already on the
signal stack based on the value of the SP at the time of a signal.  If
you are not already inside the range, you are not "on the signal stack"
and so the new signal handler frame starts over at the base of the signal
stack.

sigaltstack (and sigstack before it) was invented in BSD.  There, the
SA_ONSTACK behavior has always been different.  It uses a kernel state
flag to decide, rather than the SP value.  When you first take an
SA_ONSTACK signal and switch to the alternate signal stack, it sets the
SS_ONSTACK flag in the thread's sigaltstack state in the kernel.
Thereafter you are "on the signal stack" and don't switch SP before
pushing a handler frame no matter what the SP value is.  Only when you
sigreturn from the original handler context do you clear the SS_ONSTACK
flag so that a new handler frame will start over at the base of the
alternate signal stack.

The undesireable effect of the Linux behavior is that an overflow of the
alternate signal stack can not only go undetected, but lead to a ring
buffer effect of clobbering the original handler frame at the base of the
signal stack for each successive signal that comes just after the
overflow.  This is what Shi Weihua's test case demonstrates.  Normally
this does not come up because of the signal mask, but the test case uses
SA_NODEFER for its SIGSEGV handler.

The other subtle part of the existing Linux semantics is that a simple
longjmp out of a signal handler serves to take you off the signal stack
in a safe and reliable fashion without having used sigreturn (nor having
just returned from the handler normally, which means the same).  After
the longjmp (or even informal stack switching not via any proper libc or
kernel interface), the alternate signal stack stands ready to be used
again.

A paranoid program would allocate a PROT_NONE red zone around its
alternate signal stack.  Then a small overflow would trigger a SIGSEGV in
handler setup, and be fatal (core dump) whether or not SIGSEGV is
blocked.  As with thread stack red zones, that cannot catch all overflows
(or underflows).  e.g., a local array as large as page size allocated in
a function called from a handler, but not actually touched before more
calls push more stack, could cause an overflow that silently pushes into
some unrelated allocated pages.

The BSD behavior does not do anything in particular about overflow.  But
it does at least avoid the wraparound or "ring buffer effect", so you'll
just get a straightforward all-out overflow down your address space past
the low end of the alternate signal stack.  I don't know what the BSD
behavior is for longjmp out of an SA_ONSTACK handler.

The POSIX wording relating to sigaltstack is pretty minimal.  I don't
think it speaks to this issue one way or another.  (The program that
overflows its stack is clearly in undefined behavior territory of one
sort or another anyhow.)

Given the longjmp issue and the potential for highly subtle complications
in existing programs relying on this in arcane ways deep in their code, I
am very dubious about changing the behavior to the BSD style persistent
flag.  I think Shi Weihua's patches have a similar effect by tracking the
SP used in the last handler setup.

I think it would be sensible for the signal handler setup code to detect
when it would itself be causing a stack overflow.  Maybe something like
the following patch (untested).  This issue exists in the same way on all
machines, so ideally they would all do a similar check.

When it's the handler function itself or its callees that cause the
overflow, rather than the signal handler frame setup alone crossing the
boundary, this still won't help.  But I don't see any way to distinguish
that from the valid longjmp case.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:06 +01:00
..
acpi ACPI: suspend: old debugging hacks sneaked back 2007-12-06 16:03:06 -05:00
cpu kobj: fix threshold_init_device/kobject_uevent_env oops 2008-01-30 13:29:58 +01:00
.gitignore .gitignore update for x86 arch 2007-10-17 21:19:04 +02:00
alternative.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
aperture_64.c x86 gart: rename symbols only used for the GART implementation 2007-10-30 00:22:22 +01:00
apic_32.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
apic_64.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
apm_32.c PM: ACPI and APM must not be enabled at the same time 2008-01-11 12:26:47 -05:00
asm-offsets_32.c Boot with virtual == physical to get closer to native Linux. 2007-10-23 15:49:54 +10:00
asm-offsets_64.c x86: Fix boot protocol KEEP_SEGMENTS check. 2007-10-27 20:57:43 +02:00
asm-offsets.c
audit_64.c
bootflag.c
bugs_64.c
cpuid.c PM: Acquire device locks on suspend 2008-01-24 20:40:04 -08:00
crash_dump_32.c kmap leak fix for x86_32 kdump 2007-10-19 11:53:33 -07:00
crash_dump_64.c
crash.c x86: disable hpet legacy replacement for kdump 2007-12-03 17:17:10 +01:00
doublefault_32.c
e820_32.c kexec: add BSS to resource tree 2007-10-22 08:13:19 -07:00
e820_64.c kexec: add BSS to resource tree 2007-10-22 08:13:19 -07:00
early_printk.c [x86] remove uses of magic macros for boot_params access 2007-10-16 17:38:31 -07:00
early-quirks.c x86 gart: rename symbols only used for the GART implementation 2007-10-30 00:22:22 +01:00
efi_32.c kexec: add BSS to resource tree 2007-10-22 08:13:19 -07:00
efi_stub_32.S
entry_32.S Merge branch 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen 2007-10-17 11:10:11 -07:00
entry_64.S sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
genapic_64.c x86: convert cpu_to_apicid to be a per cpu variable 2007-10-19 20:35:03 +02:00
genapic_flat_64.c x86: convert cpu_to_apicid to be a per cpu variable 2007-10-19 20:35:03 +02:00
geode_32.c
head64.c x86: use descriptor's functions instead of inline assembly 2007-10-19 20:35:03 +02:00
head_32.S fix lguest rmmod "bad pgd" 2008-01-01 11:30:35 -08:00
head_64.S
hpet.c x86: assign IRQs to HPET timers, fix 2008-01-30 13:30:03 +01:00
i386_ksyms_32.c x86: export the symbol empty_zero_page on the 32-bit x86 architecture 2007-11-26 20:42:19 +01:00
i387_32.c
i387_64.c x86: fix taking DNA during 64bit sigreturn 2007-11-12 11:09:33 -08:00
i8237.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
i8253.c x86: unregister PIT clocksource when PIT is disabled 2008-01-30 13:30:03 +01:00
i8259_32.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
i8259_64.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
init_task.c x86: merge init_task_32/64.c 2007-10-19 20:35:02 +02:00
io_apic_32.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
io_apic_64.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
io_delay.c x86: add DMI quirk for io-delay hangs on Compaq Presario V6000 laptops 2008-01-30 13:30:05 +01:00
ioport_32.c
ioport_64.c
irq_32.c x86: also show non-zero IRQ counts for vectors that currently don't have a handler 2007-10-17 20:16:54 +02:00
irq_64.c x86: also show non-zero IRQ counts for vectors that currently don't have a handler 2007-10-17 20:16:54 +02:00
k8.c
kprobes_32.c x86: jprobe bugfix 2007-12-18 18:05:58 +01:00
kprobes_64.c x86: kprobes bugfix 2007-12-18 18:05:58 +01:00
ldt_32.c x86: convert mm_context_t semaphore to a mutex 2007-10-17 20:17:05 +02:00
ldt_64.c x86: convert mm_context_t semaphore to a mutex 2007-10-17 20:17:00 +02:00
machine_kexec_32.c Use extended crashkernel command line on i386 2007-10-19 11:53:49 -07:00
machine_kexec_64.c x86: Dump filtering supports x86_64 sparsemem 2007-10-27 20:57:43 +02:00
Makefile x86: delete vsyscall files during make clean 2007-10-17 21:56:01 +02:00
Makefile_32 x86: provide a DMI based port 0x80 I/O delay override. 2008-01-30 13:30:05 +01:00
Makefile_64 x86: provide a DMI based port 0x80 I/O delay override. 2008-01-30 13:30:05 +01:00
mca_32.c
mfgpt_32.c x86: GEODE fix a race condition in the MFGPT timer tick 2008-01-22 23:30:16 +01:00
microcode.c cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus() 2008-01-25 21:08:02 +01:00
module_32.c
module_64.c
mpparse_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
mpparse_64.c x86: acpi use cpu_physical_id 2007-10-19 20:35:03 +02:00
msr.c PM: Acquire device locks on suspend 2008-01-24 20:40:04 -08:00
nmi_32.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
nmi_64.c Driver core: change sysdev classes to use dynamic kobject names 2008-01-24 20:40:40 -08:00
numaq_32.c
paravirt_32.c x86/paravirt: revert exports to restore old behaviour 2007-11-29 09:24:55 -08:00
pci-calgary_64.c x86 gart: rename iommu.h to gart.h 2007-10-30 00:22:22 +01:00
pci-dma_32.c i386: Clean up duplicate includes in arch/i386/kernel/ 2007-10-17 20:15:51 +02:00
pci-dma_64.c x86: turn off iommu merge by default 2007-11-26 20:42:19 +01:00
pci-gart_64.c x86 gart: rename symbols only used for the GART implementation 2007-10-30 00:22:22 +01:00
pci-nommu_64.c x86 gart: rename iommu.h to gart.h 2007-10-30 00:22:22 +01:00
pci-swiotlb_64.c x86 gart: rename iommu.h to gart.h 2007-10-30 00:22:22 +01:00
pcspeaker.c
pmtimer_64.c
process_32.c Kick CPUS that might be sleeping in cpus_idle_wait 2008-01-14 08:52:22 -08:00
process_64.c time: more timer related cleanups 2008-01-30 13:30:00 +01:00
ptrace_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
ptrace_64.c x86: convert mm_context_t semaphore to a mutex 2007-10-17 20:17:00 +02:00
quirks.c x86: Add HPET force support for MCP55 (nForce 5) chipsets 2007-10-23 22:37:25 +02:00
reboot_32.c x86: disable hpet on shutdown 2007-12-03 17:17:10 +01:00
reboot_64.c x86: disable hpet on shutdown 2007-12-03 17:17:10 +01:00
reboot_fixups_32.c x86: reboot fixup for wrap2c board 2007-11-17 16:27:02 +01:00
relocate_kernel_32.S
relocate_kernel_64.S
scx200_32.c
setup64.c x86: use descriptor's functions instead of inline assembly 2007-10-19 20:35:03 +02:00
setup_32.c x86: provide a DMI based port 0x80 I/O delay override. 2008-01-30 13:30:05 +01:00
setup_64.c x86: provide a DMI based port 0x80 I/O delay override. 2008-01-30 13:30:05 +01:00
sigframe_32.h
signal_32.c x86: protect against sigaltstack wraparound 2008-01-30 13:30:06 +01:00
signal_64.c sched: high-res preemption tick 2008-01-25 21:08:29 +01:00
smp_32.c x86: export smp_ops to allow modular build of KVM 2007-10-27 20:57:43 +02:00
smp_64.c x86: implement missing x86_64 function smp_call_function_mask() 2007-10-19 20:35:03 +02:00
smpboot_32.c x86 smpboot_32.c section fixes 2007-12-19 23:20:18 +01:00
smpboot_64.c x86: fix do_fork_idle section mismatch 2008-01-08 16:10:35 -08:00
smpcommon_32.c
srat_32.c
stacktrace.c sched: latencytop support 2008-01-25 21:08:34 +01:00
summit_32.c spelling fixes: arch/i386/ 2007-10-20 01:13:56 +02:00
suspend_64.c x86: hibernation: document __save_processor_state() on x86 2008-01-30 13:30:04 +01:00
suspend_asm_64.S x86: Save registers in saved_context during suspend and hibernation 2007-10-23 22:37:24 +02:00
sys_i386_32.c remove include/asm-*/ipc.h 2007-10-17 08:42:55 -07:00
sys_x86_64.c
syscall_64.c
syscall_table_32.S
sysenter_32.c
tce_64.c x86: Create clflush() inline, remove hardcoded wbinvd 2007-10-17 20:16:12 +02:00
time_32.c
time_64.c x86: on x86_64, correct reading of PC RTC when update in progress in time_64.c 2007-11-17 16:27:01 +01:00
topology.c x86: arch_register_cpu() section fix 2007-12-04 17:19:07 +01:00
trampoline_32.S x86: misc. constifications 2007-10-17 20:16:08 +02:00
trampoline_64.S x86: misc. constifications 2007-10-17 20:16:08 +02:00
traps_32.c lockdep: more hardirq annotations for notify_die() 2008-01-16 09:51:59 +01:00
traps_64.c lockdep: more hardirq annotations for notify_die() 2008-01-16 09:51:59 +01:00
tsc_32.c x86: fix more TSC clock source calibration errors 2007-10-23 22:37:22 +02:00
tsc_64.c x86: convert cpuinfo_x86 array to a per_cpu array 2007-10-19 20:35:04 +02:00
tsc_sync.c x86: fix: s2ram + P4 + tsc = annoyance 2008-01-30 13:30:04 +01:00
verify_cpu_64.S
vm86_32.c
vmi_32.c paravirt: clean up lazy mode handling 2007-10-16 11:51:29 -07:00
vmiclock_32.c
vmlinux_32.lds.S all archs: consolidate init and exit sections in vmlinux.lds.h 2008-01-28 23:21:17 +01:00
vmlinux_64.lds.S all archs: consolidate init and exit sections in vmlinux.lds.h 2008-01-28 23:21:17 +01:00
vmlinux.lds.S
vsmp_64.c
vsyscall_32.lds.S
vsyscall_32.S
vsyscall_64.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2007-10-19 20:36:17 -07:00
vsyscall-int80_32.S
vsyscall-note_32.S
vsyscall-sigreturn_32.S
vsyscall-sysenter_32.S
x8664_ksyms_64.c