linux/mm
Juergen Gross 895f7b8e90 mm: don't defer struct page initialization for Xen pv guests
Commit f7f99100d8 ("mm: stop zeroing memory during allocation in
vmemmap") broke Xen pv domains in some configurations, as the "Pinned"
information in struct page of early page tables could get lost.

This will lead to the kernel trying to write directly into the page
tables instead of asking the hypervisor to do so.  The result is a crash
like the following:

  BUG: unable to handle kernel paging request at ffff8801ead19008
  IP: xen_set_pud+0x4e/0xd0
  PGD 1c0a067 P4D 1c0a067 PUD 23a0067 PMD 1e9de0067 PTE 80100001ead19065
  Oops: 0003 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-default+ #271
  Hardware name: Dell Inc. Latitude E6440/0159N7, BIOS A07 06/26/2014
  task: ffffffff81c10480 task.stack: ffffffff81c00000
  RIP: e030:xen_set_pud+0x4e/0xd0
  Call Trace:
   __pmd_alloc+0x128/0x140
   ioremap_page_range+0x3f4/0x410
   __ioremap_caller+0x1c3/0x2e0
   acpi_os_map_iomem+0x175/0x1b0
   acpi_tb_acquire_table+0x39/0x66
   acpi_tb_validate_table+0x44/0x7c
   acpi_tb_verify_temp_table+0x45/0x304
   acpi_reallocate_root_table+0x12d/0x141
   acpi_early_init+0x4d/0x10a
   start_kernel+0x3eb/0x4a1
   xen_start_kernel+0x528/0x532
  Code: 48 01 e8 48 0f 42 15 a2 fd be 00 48 01 d0 48 ba 00 00 00 00 00 ea ff ff 48 c1 e8 0c 48 c1 e0 06 48 01 d0 48 8b 00 f6 c4 02 75 5d <4c> 89 65 00 5b 5d 41 5c c3 65 8b 05 52 9f fe 7e 89 c0 48 0f a3
  RIP: xen_set_pud+0x4e/0xd0 RSP: ffffffff81c03cd8
  CR2: ffff8801ead19008
  ---[ end trace 38eca2e56f1b642e ]---

Avoid this problem by not deferring struct page initialization when
running as Xen pv guest.

Pavel said:

: This is unique for Xen, so this particular issue won't effect other
: configurations.  I am going to investigate if there is a way to
: re-enable deferred page initialization on xen guests.

[akpm@linux-foundation.org: explicitly include xen.h]
Link: http://lkml.kernel.org/r/20180216154101.22865-1-jgross@suse.com
Fixes: f7f99100d8 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Bob Picco <bob.picco@oracle.com>
Cc: <stable@vger.kernel.org>	[4.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21 15:35:43 -08:00
..
kasan kasan: fix prototype author email address 2018-02-06 18:32:43 -08:00
backing-dev.c Revert "bdi: add error handle for bdi_debug_register" 2017-12-21 10:01:30 -07:00
balloon_compaction.c virtio_balloon: fix deadlock on OOM 2017-11-14 23:57:38 +02:00
bootmem.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
cleancache.c
cma_debug.c
cma.c mm/cma.c: change pr_info to pr_err for cma_alloc fail log 2017-11-15 18:21:03 -08:00
cma.h
compaction.c mm/compaction.c: fix comment for try_to_compact_pages() 2018-01-31 17:18:39 -08:00
debug_page_ref.c
debug.c mm/debug.c: provide useful debugging information for VM_BUG 2018-01-04 16:45:09 -08:00
dmapool.c
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2017-12-11 14:54:44 +01:00
fadvise.c mm/fadvise: discard partial page if endbyte is also EOF 2018-01-31 17:18:39 -08:00
failslab.c
filemap.c mm/filemap.c: remove include of hardirq.h 2018-01-31 17:18:36 -08:00
frame_vector.c mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' 2017-12-14 16:00:48 -08:00
frontswap.c
gup_benchmark.c mm: add infrastructure for get_user_pages_fast() benchmarking 2017-11-17 16:10:04 -08:00
gup.c libnvdimm for 4.16 2018-02-06 10:41:33 -08:00
highmem.c
hmm.c libnvdimm for 4.16 2018-02-06 10:41:33 -08:00
huge_memory.c mm/thp: remove pmd_huge_split_prepare() 2018-01-31 17:18:38 -08:00
hugetlb_cgroup.c
hugetlb.c hugetlb, mbind: fall back to default policy if vma is NULL 2018-01-31 17:18:40 -08:00
hwpoison-inject.c mm/memory_failure: Remove unused trapno from memory_failure 2018-01-23 12:17:42 -06:00
init-mm.c
internal.h Revert "mm, thp: Do not make pmd/pud dirty without a reason" 2017-11-29 09:01:01 -08:00
interval_tree.c mm/interval_tree.c: use vma_pages() helper 2018-01-31 17:18:37 -08:00
Kconfig mm: relax deferred struct page requirements 2018-01-31 17:18:36 -08:00
Kconfig.debug kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
khugepaged.c mm: thp: use down_read_trylock() in khugepaged to avoid long block 2018-01-31 17:18:38 -08:00
kmemleak-test.c
kmemleak.c mm: kmemleak: remove unused hardirq.h 2018-01-31 17:18:36 -08:00
ksm.c mm: docs: fixup punctuation 2018-02-06 18:32:48 -08:00
list_lru.c mm/list_lru.c: mark expected switch fall-through 2017-11-15 18:21:07 -08:00
maccess.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
madvise.c mm/memory_failure: Remove unused trapno from memory_failure 2018-01-23 12:17:42 -06:00
Makefile mm: add infrastructure for get_user_pages_fast() benchmarking 2017-11-17 16:10:04 -08:00
memblock.c mm/memblock: memblock_is_map/region_memory can be boolean 2018-02-06 18:32:47 -08:00
memcontrol.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
memory_hotplug.c libnvdimm for 4.16 2018-02-06 10:41:33 -08:00
memory-failure.c x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages 2018-02-13 16:25:06 +01:00
memory.c mm: hide a #warning for COMPILE_TEST 2018-02-16 09:41:36 -08:00
mempolicy.c hugetlb, mbind: fall back to default policy if vma is NULL 2018-01-31 17:18:40 -08:00
mempool.c kasan: detect invalid frees for large mempool objects 2018-02-06 18:32:43 -08:00
memtest.c
migrate.c mm, hugetlb: do not rely on overcommit limit during migration 2018-01-31 17:18:40 -08:00
mincore.c
mlock.c mm, mlock, vmscan: no more skipping pagevecs 2018-02-21 15:35:42 -08:00
mm_init.c
mmap.c mm, oom_reaper: fix memory corruption 2017-12-14 16:00:49 -08:00
mmu_context.c
mmu_notifier.c mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks 2018-01-31 17:18:38 -08:00
mmzone.c
mprotect.c mm: numa: do not trap faults on shared data section pages. 2018-01-31 17:18:40 -08:00
mremap.c
msync.c
nobootmem.c
nommu.c mm: docs: fixup punctuation 2018-02-06 18:32:48 -08:00
oom_kill.c mm, oom: avoid reaping only for mm's with blockable invalidate callbacks 2018-01-31 17:18:38 -08:00
page_alloc.c mm: don't defer struct page initialization for Xen pv guests 2018-02-21 15:35:43 -08:00
page_counter.c
page_ext.c mm/page_ext.c: make page_ext_init a noop when CONFIG_PAGE_EXTENSION but nothing uses it 2018-01-31 17:18:39 -08:00
page_idle.c
page_io.c block: convert to bio_first_bvec_all & bio_first_page_all 2018-01-06 09:18:00 -07:00
page_isolation.c mm: distinguish CMA and MOVABLE isolation in has_unmovable_pages() 2017-11-15 18:21:02 -08:00
page_owner.c mm/page_owner.c: clean up init_pages_in_zone() 2018-01-31 17:18:39 -08:00
page_poison.c
page_vma_mapped.c mm, page_vma_mapped: Introduce pfn_in_hpage() 2018-01-22 12:15:57 -08:00
page-writeback.c Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" 2017-11-29 18:40:43 -08:00
pagewalk.c mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors 2018-02-06 18:32:48 -08:00
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c mm: remove __GFP_COLD 2017-11-15 18:21:06 -08:00
percpu.c percpu: hack to let the CRIS architecture to boot until they clean up 2017-11-27 12:53:12 -08:00
pgtable-generic.c mm: do not lose dirty and accessed bits in pmdp_invalidate() 2018-01-31 17:18:38 -08:00
process_vm_access.c mm: docs: add blank lines to silence sphinx "Unexpected indentation" errors 2018-02-06 18:32:48 -08:00
quicklist.c
readahead.c
rmap.c mm: remove cold parameter from free_hot_cold_page* 2017-11-15 18:21:06 -08:00
rodata_test.c
shmem.c shmem: add sealing support to hugetlb-backed memfd 2018-01-31 17:18:39 -08:00
slab_common.c Currently, hardened usercopy performs dynamic bounds checking on slab 2018-02-03 16:25:42 -08:00
slab.c kasan: don't use __builtin_return_address(1) 2018-02-06 18:32:43 -08:00
slab.h Currently, hardened usercopy performs dynamic bounds checking on slab 2018-02-03 16:25:42 -08:00
slob.c slab, slub, slob: add slab_flags_t 2017-11-15 18:21:01 -08:00
slub.c kasan: don't use __builtin_return_address(1) 2018-02-06 18:32:43 -08:00
sparse-vmemmap.c mm: merge vmem_altmap_alloc into altmap_alloc_block_buf 2018-01-08 11:46:23 -08:00
sparse.c libnvdimm for 4.16 2018-02-06 10:41:33 -08:00
swap_cgroup.c
swap_slots.c mm/swap_slots.c: fix race conditions in swap_slots cache init 2017-11-15 18:21:03 -08:00
swap_state.c mm: remove cold parameter for release_pages 2017-11-15 18:21:06 -08:00
swap.c mm/swap.c: make functions and their kernel-doc agree (again) 2018-02-21 15:35:43 -08:00
swapfile.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
truncate.c mm: add unmap_mapping_pages() 2018-01-31 17:18:37 -08:00
usercopy.c usercopy: WARN() on slab cache usercopy region violations 2018-01-15 12:07:48 -08:00
userfaultfd.c mm/userfaultfd.c: remove duplicate include 2018-02-06 18:32:47 -08:00
util.c new primitive: vmemdup_user() 2018-01-07 13:06:15 -05:00
vmacache.c
vmalloc.c vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems 2018-02-21 15:35:43 -08:00
vmpressure.c
vmscan.c mm, mlock, vmscan: no more skipping pagevecs 2018-02-21 15:35:42 -08:00
vmstat.c mm, sysctl: make NUMA stats configurable 2017-11-15 18:21:07 -08:00
workingset.c mm, truncate: do not check mapping for every page being truncated 2017-11-15 18:21:06 -08:00
z3fold.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
zbud.c mm: docs: fix parameter names mismatch 2018-02-06 18:32:48 -08:00
zpool.c mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc 2018-02-21 15:35:43 -08:00
zsmalloc.c zsmalloc: use U suffix for negative literals being shifted 2018-01-31 17:18:39 -08:00
zswap.c mm, swap, frontswap: fix THP swap if frontswap enabled 2018-02-21 15:35:43 -08:00