linux/arch/x86/mm
Dave Young 22ef882e6b x86/mm/numa: Fix kernel stack corruption in numa_init()->numa_clear_kernel_node_hotplug()
I got below kernel panic during kdump test on Thinkpad T420
laptop:

[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff]
[    0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81d21910
 ...
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff817c2a26>] dump_stack+0x45/0x57
[    0.000000]  [<ffffffff817bc8d2>] panic+0xd0/0x204
[    0.000000]  [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2
[    0.000000]  [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20
[    0.000000]  [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2
[    0.000000]  [<ffffffff81d21e5d>] numa_init+0x1a5/0x520
[    0.000000]  [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d
[    0.000000]  [<ffffffff81d22460>] initmem_init+0x9/0xb
[    0.000000]  [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff817bd0bb>] ? printk+0x55/0x6b
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c
[    0.000000]  [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184
[    0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta

This is caused by writing over the end of numa mask bitmap
in numa_clear_kernel_node().

numa_clear_kernel_node() tries to set the node id in a mask bitmap,
by iterating all reserved regions and assuming that every region
has a valid nid.

This assumption is not true because there's an exception for some
graphic memory quirks. See trim_snb_memory() in arch/x86/kernel/setup.c

It is easily to reproduce the bug in the kdump kernel because kdump
kernel use pre-reserved memory instead of the whole memory, but
kexec pass other reserved memory ranges to 2nd kernel as well.
like below in my test:

kdump kernel ram 0x2d000000 - 0x37bfffff
One of the reserved regions: 0x40000000 - 0x40100000 which
includes 0x40004000, a page excluded in trim_snb_memory(). For
this memblock reserved region the nid is not set, it is still
default value MAX_NUMNODES. later node_set will set bit
MAX_NUMNODES thus stack corruption happen.

This also happens when booting with mem= kernel commandline
during my test.

Fixing it by adding a check, do not call node_set in case nid is
MAX_NUMNODES.

Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bhe@redhat.com
Cc: qiuxishi@huawei.com
Link: http://lkml.kernel.org/r/20150407134132.GA23522@dhcp-16-198.nay.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-07 16:01:19 +02:00
..
kmemcheck x86: Replace __get_cpu_var uses 2014-08-26 13:45:49 -04:00
amdtopology.c x86/mm/numa: Simplify some bit mangling 2013-04-10 19:06:26 +02:00
dump_pagetables.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-12-10 13:59:34 -08:00
extable.c
fault.c x86: Store a per-cpu shadow copy of CR4 2015-02-04 12:10:42 +01:00
gup.c mm: convert p[te|md]_numa users to p[te|md]_protnone_numa 2015-02-12 18:54:08 -08:00
highmem_32.c mm: accurately calculate zone->managed_pages for highmem zones 2013-07-03 16:07:33 -07:00
hugetlbpage.c mm/hugetlb: pmd_huge() returns true for non-present hugepage 2015-02-11 17:06:01 -08:00
init_32.c x86: remove the Xen-specific _PAGE_IOMAP PTE flag 2014-09-23 13:36:20 +00:00
init_64.c Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-12-10 14:24:20 -08:00
init.c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-02-21 10:41:29 -08:00
iomap_32.c x86: Use new cache mode type in mm/iomap_32.c 2014-11-16 11:04:25 +01:00
ioremap.c Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-12-10 13:59:34 -08:00
kasan_init_64.c kasan: enable instrumentation of global variables 2015-02-13 21:21:42 -08:00
kmmio.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
Makefile x86_64: add KASan support 2015-02-13 21:21:41 -08:00
memtest.c x86/mm: memblock: switch to use NUMA_NO_NODE 2014-01-21 16:19:47 -08:00
mm_internal.h x86: Enable PAT to use cache mode translation tables 2014-11-16 11:04:26 +01:00
mmap.c x86, mm/ASLR: Fix stack randomization on 64-bit systems 2015-02-19 12:21:36 +01:00
mmio-mod.c x86: delete __cpuinit usage from all x86 files 2013-07-14 19:36:56 -04:00
mpx.c x86, mpx: Explicitly disable 32-bit MPX support on 64-bit kernels 2015-01-22 21:11:06 +01:00
numa_32.c x86: Fix the initialization of physnode_map 2014-02-01 22:15:51 -08:00
numa_64.c x86, mm: kill numa_free_all_bootmem() 2012-11-17 11:59:47 -08:00
numa_emulation.c x86: delete __cpuinit usage from all x86 files 2013-07-14 19:36:56 -04:00
numa_internal.h x86-32, mm: Rip out x86_32 NUMA remapping code 2013-01-31 14:12:30 -08:00
numa.c x86/mm/numa: Fix kernel stack corruption in numa_init()->numa_clear_kernel_node_hotplug() 2015-04-07 16:01:19 +02:00
pageattr-test.c x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels 2014-06-04 16:53:55 -07:00
pageattr.c xen: additional features for 3.19-rc0 2014-12-16 13:23:03 -08:00
pat_internal.h x86: Use new cache mode type in memtype related functions 2014-11-16 11:04:26 +01:00
pat_rbtree.c x86: Use new cache mode type in memtype related functions 2014-11-16 11:04:26 +01:00
pat.c x86: Don't rely on VMWare emulating PAT MSR correctly 2015-01-20 14:33:45 +01:00
pf_in.c
pf_in.h
pgtable_32.c x86: Remove set_pmd_pfn 2014-09-01 10:15:31 +02:00
pgtable.c mm: account pmd page tables to the process 2015-02-11 17:06:04 -08:00
physaddr.c x86, mm: Make DEBUG_VIRTUAL work earlier in boot 2013-01-25 16:33:22 -08:00
physaddr.h
setup_nx.c x86: delete __cpuinit usage from all x86 files 2013-07-14 19:36:56 -04:00
srat.c x86/mm: Avoid duplicated pxm_to_node() calls 2014-02-09 15:32:31 +01:00
testmmiotrace.c
tlb.c x86: Store a per-cpu shadow copy of CR4 2015-02-04 12:10:42 +01:00