linux/mm
Gerald Schaefer 770a537022 mm: thp: broken page count after commit aa88b68c3b
Christian Borntraeger reported a kernel panic after corrupt page counts,
and it turned out to be a regression introduced with commit aa88b68c3b
("thp: keep huge zero page pinned until tlb flush"), at least on s390.

put_huge_zero_page() was moved over from zap_huge_pmd() to
release_pages(), and it was replaced by tlb_remove_page().  However,
release_pages() might not always be triggered by (the arch-specific)
tlb_remove_page().

On s390 we call free_page_and_swap_cache() from tlb_remove_page(), and
not tlb_flush_mmu() -> free_pages_and_swap_cache() like the generic
version, because we don't use the MMU-gather logic.  Although both
functions have very similar names, they are doing very unsimilar things,
in particular free_page_xxx is just doing a put_page(), while
free_pages_xxx calls release_pages().

This of course results in very harmful put_page()s on the huge zero
page, on architectures where tlb_remove_page() is implemented in this
way.  It seems to affect only s390 and sh, but sh doesn't have THP
support, so the problem (currently) probably only exists on s390.

The following quick hack fixed the issue:

Link: http://lkml.kernel.org/r/20160602172141.75c006a9@thinkpad
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org>	[4.6.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-09 14:23:11 -07:00
..
kasan kasan: change memory hot-add error messages to info messages 2016-06-09 14:23:11 -07:00
backing-dev.c mm: throttle on IO only when there are too many dirty and writeback pages 2016-05-20 17:58:30 -07:00
balloon_compaction.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-03-17 21:38:27 -07:00
bootmem.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
cleancache.c cleancache: constify cleancache_ops structure 2016-01-27 09:09:57 -05:00
cma_debug.c
cma.c mm/cma: silence warnings due to max() usage 2016-05-27 14:49:37 -07:00
cma.h
compaction.c mm/compaction.c: fix zoneindex in kcompactd() 2016-05-20 17:58:30 -07:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
debug.c mm: introduce page reference manipulation functions 2016-03-17 15:09:34 -07:00
dmapool.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
early_ioremap.c
fadvise.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
failslab.c mm: fault-inject take over bootstrap kmem_cache check 2016-03-15 16:55:16 -07:00
filemap.c Filesystem DAX locking for 4.7 2016-05-26 20:00:28 -07:00
frame_vector.c mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm 2016-02-16 10:11:12 +01:00
frontswap.c
gup.c Merge branch 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-04-14 19:31:34 -07:00
highmem.c mm/highmem: make nr_free_highpages() handles all highmem zones by itself 2016-05-19 19:12:14 -07:00
huge_memory.c libnvdimm for 4.7 2016-05-23 11:18:01 -07:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hugetlb.c mm/hugetlb: fix huge page reserve accounting for private mappings 2016-06-09 14:23:11 -07:00
hwpoison-inject.c
init-mm.c
internal.h mm: make vm_mmap killable 2016-05-23 17:04:14 -07:00
interval_tree.c
Kconfig mm: disable DEFERRED_STRUCT_PAGE_INIT on !NO_BOOTMEM 2016-05-27 14:49:37 -07:00
Kconfig.debug mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
kmemcheck.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c mm: coalesce split strings 2016-03-17 15:09:34 -07:00
ksm.c ksm: fix conflict between mmput and scan_get_next_rmap_item 2016-05-12 15:52:50 -07:00
list_lru.c mm: memcontrol: move kmem accounting code to CONFIG_MEMCG 2016-01-20 17:09:18 -08:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c mm: make mmap_sem for write waits killable for mm syscalls 2016-05-23 17:04:14 -07:00
Makefile z3fold: the 3-fold allocator for compressed pages 2016-05-20 17:58:30 -07:00
memblock.c mm/memblock.c: remove unnecessary always-true comparison 2016-05-20 17:58:30 -07:00
memcontrol.c revert "mm: memcontrol: fix possible css ref leak on oom" 2016-06-09 14:23:11 -07:00
memory_hotplug.c mm: fix section mismatch warning 2016-05-27 15:23:32 -07:00
memory-failure.c mm/memory-failure.c: replace "MCE" with "Memory failure" 2016-05-20 17:58:30 -07:00
memory.c Filesystem DAX locking for 4.7 2016-05-26 20:00:28 -07:00
mempolicy.c mm, page_alloc: avoid looking up the first zone in a zonelist twice 2016-05-19 19:12:14 -07:00
mempool.c mm: kasan: initial memory quarantine implementation 2016-05-20 17:58:30 -07:00
memtest.c
migrate.c mm, migrate: increment fail count on ENOMEM 2016-05-20 17:58:30 -07:00
mincore.c mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
mlock.c mm: make mmap_sem for write waits killable for mm syscalls 2016-05-23 17:04:14 -07:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c mm: remove more IS_ERR_VALUE abuses 2016-05-27 15:57:31 -07:00
mmu_context.c mm/mmu_context, sched/core: Fix mmu_context.h assumption 2016-04-28 11:44:19 +02:00
mmu_notifier.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
mmzone.c mm, page_alloc: inline the fast path of the zonelist iterator 2016-05-19 19:12:14 -07:00
mprotect.c mm: make mmap_sem for write waits killable for mm syscalls 2016-05-23 17:04:14 -07:00
mremap.c mm: make mmap_sem for write waits killable for mm syscalls 2016-05-23 17:04:14 -07:00
msync.c
nobootmem.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
nommu.c mm: remove more IS_ERR_VALUE abuses 2016-05-27 15:57:31 -07:00
oom_kill.c mm, oom_reaper: do not use siglock in try_oom_reaper() 2016-06-03 16:02:56 -07:00
page_alloc.c mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies 2016-06-03 16:02:57 -07:00
page_counter.c
page_ext.c mm: use early_pfn_to_nid in page_ext_init 2016-05-27 14:49:37 -07:00
page_idle.c mm: add page_check_address_transhuge() helper 2016-01-15 17:56:32 -08:00
page_io.c Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-17 15:05:23 -07:00
page_isolation.c mm/memory_hotplug: add comment to some functions related to memory hotplug 2016-05-19 19:12:14 -07:00
page_owner.c mm: check the return value of lookup_page_ext for all call sites 2016-06-03 15:06:22 -07:00
page_poison.c mm: check the return value of lookup_page_ext for all call sites 2016-06-03 15:06:22 -07:00
page-writeback.c MM: increase safety margin provided by PF_LESS_THROTTLE 2016-05-20 17:58:30 -07:00
pagewalk.c thp: rename split_huge_page_pmd() to split_huge_pmd() 2016-01-15 17:56:32 -08:00
percpu-km.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
percpu-vm.c
percpu.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
pgtable-generic.c mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range 2016-03-17 15:09:34 -07:00
process_vm_access.c mm/gup: Introduce get_user_pages_remote() 2016-02-16 10:04:09 +01:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
rmap.c mm: thp: avoid false positive VM_BUG_ON_PAGE in page_move_anon_rmap() 2016-05-27 14:49:37 -07:00
shmem.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
slab_common.c mm: kasan: initial memory quarantine implementation 2016-05-20 17:58:30 -07:00
slab.c mm, kasan: don't call kasan_krealloc() from ksize(). 2016-05-20 17:58:30 -07:00
slab.h mm: kasan: initial memory quarantine implementation 2016-05-20 17:58:30 -07:00
slob.c mm: slab: free kmem_cache_node after destroy sysfs file 2016-02-18 16:23:24 -08:00
slub.c mm, kasan: don't call kasan_krealloc() from ksize(). 2016-05-20 17:58:30 -07:00
sparse-vmemmap.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
sparse.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_cgroup.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_state.c mm: thp: broken page count after commit aa88b68c3b 2016-06-09 14:23:11 -07:00
swap.c mm/swap.c: put activate_page_pvecs and other pagevecs together 2016-05-20 17:58:30 -07:00
swapfile.c mm: thp: calculate the mapcount correctly for THP pages during WP faults 2016-05-12 15:52:50 -07:00
truncate.c dax: New fault locking 2016-05-19 15:20:54 -06:00
userfaultfd.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
util.c mm: make vm_mmap killable 2016-05-23 17:04:14 -07:00
vmacache.c
vmalloc.c mm: fix overflow in vm_map_ram() 2016-06-03 15:06:22 -07:00
vmpressure.c mm/vmpressure.c: fix subtree pressure detection 2016-02-03 08:28:43 -08:00
vmscan.c mm, oom: rework oom detection 2016-05-20 17:58:30 -07:00
vmstat.c mm: check the return value of lookup_page_ext for all call sites 2016-06-03 15:06:22 -07:00
workingset.c mm: workingset: make shadow node shrinker memcg aware 2016-03-17 15:09:34 -07:00
z3fold.c mm/z3fold.c: avoid modifying HEADLESS page and minor cleanup 2016-06-03 16:02:55 -07:00
zbud.c mm/zbud.c: use list_last_entry() instead of list_tail_entry() 2016-01-15 11:40:52 -08:00
zpool.c
zsmalloc.c update "mm/zsmalloc: don't fail if can't create debugfs info" 2016-05-26 15:35:44 -07:00
zswap.c mm/zswap: use workqueue to destroy pool 2016-05-20 17:58:30 -07:00