linux/mm
Hugh Dickins f2a07f40db tmpfs mempolicy: fix /proc/mounts corrupting memory
Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA
mempolicy testing.  Very nasty.  Reading /proc/mounts, /proc/pid/mounts
or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often
in a page table (causing "Bad swap" or "Bad page map" warning or "Bad
pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere
worse.  "mpol=prefer" and "mpol=prefer:Node" are equally toxic.

Recent NUMA enhancements are not to blame: this dates back to 2.6.35,
when commit e17f74af35 "mempolicy: don't call mpol_set_nodemask() when
no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(),
which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags.
With slab poisoning, you can then rely on mpol_to_str() to set the bit
for node 0x6b6b, probably in the next page above the caller's stack.

mpol_parse_str() is only called from shmem_parse_options(): no_context
is always true, so call it unused for now, and remove !no_context code.
Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might
expect.  Then mpol_to_str() can ignore its no_context argument also,
the mpol being appropriately initialized whether contextualized or not.
Rename its no_context unused too, and let subsequent patch remove them
(that's not needed for stable backporting, which would involve rejects).

I don't understand why MPOL_LOCAL is described as a pseudo-policy:
it's a reasonable policy which suffers from a confusing implementation
in terms of MPOL_PREFERRED with MPOL_F_LOCAL.  I believe this would be
much more robust if MPOL_LOCAL were recognized in switch statements
throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly
empty) nodes mask like everyone else, instead of its preferred_node
variant (I presume an optimization from the days before MPOL_LOCAL).
But that would take me too long to get right and fully tested.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-02 09:27:10 -08:00
..
backing-dev.c Revert "bdi: add a user-tunable cpu_list for the bdi flusher threads" 2012-12-17 11:29:09 -08:00
balloon_compaction.c mm: introduce a common interface for balloon pages mobility 2012-12-11 17:22:26 -08:00
bootmem.c mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic() 2012-12-12 17:38:35 -08:00
bounce.c
cleancache.c
compaction.c compaction: fix build error in CMA && !COMPACTION 2012-12-20 17:40:18 -08:00
debug-pagealloc.c
dmapool.c dmapool: make DMAPOOL_DEBUG detect corruption of free marker 2012-12-11 17:22:24 -08:00
fadvise.c
failslab.c
filemap_xip.c
filemap.c
fremap.c
frontswap.c
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm: clean up transparent hugepage sysfs error messages 2012-12-20 17:40:20 -08:00
hugetlb_cgroup.c mm/hugetlb: create hugetlb cgroup file in hugetlb_init 2012-12-18 15:02:15 -08:00
hugetlb.c mm/hugetlb: create hugetlb cgroup file in hugetlb_init 2012-12-18 15:02:15 -08:00
hwpoison-inject.c
init-mm.c
internal.h Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
interval_tree.c
Kconfig memory-hotplug: document and enable CONFIG_MOVABLE_NODE 2012-12-18 15:02:12 -08:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm/kmemleak.c: remove obsolete simple_strtoul 2012-12-18 15:02:15 -08:00
ksm.c ksm: make rmap walks more scalable 2012-12-20 07:06:56 -08:00
maccess.c
madvise.c
Makefile mm: introduce a common interface for balloon pages mobility 2012-12-11 17:22:26 -08:00
memblock.c
memcontrol.c memcg: don't register hotcpu notifier from ->css_alloc() 2012-12-20 17:40:20 -08:00
memory_hotplug.c mm/memory_hotplug.c: improve comments 2012-12-18 15:02:15 -08:00
memory-failure.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
memory.c mm: use kbasename() 2012-12-17 17:15:17 -08:00
mempolicy.c tmpfs mempolicy: fix /proc/mounts corrupting memory 2013-01-02 09:27:10 -08:00
mempool.c
migrate.c mm,numa: fix update_mmu_cache_pmd call 2012-12-17 19:37:03 -08:00
mincore.c
mlock.c
mm_init.c
mmap.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
mmu_context.c
mmu_notifier.c
mmzone.c memcg: fix hotplugged memory zone oops 2012-11-16 14:33:04 -08:00
mprotect.c mm/mprotect.c: coding-style cleanups 2012-12-18 15:02:15 -08:00
mremap.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
msync.c
nobootmem.c mm: introduce new field "managed_pages" to struct zone 2012-12-12 17:38:34 -08:00
nommu.c mm: export a function to get vm committed memory 2012-11-15 15:41:22 -08:00
oom_kill.c mm, oom: remove redundant sleep in pagefault oom handler 2012-12-12 17:38:34 -08:00
page_alloc.c mm: cma: WARN if freed memory is still in use 2012-12-20 17:40:19 -08:00
page_cgroup.c memcontrol: use N_MEMORY instead N_HIGH_MEMORY 2012-12-12 17:38:32 -08:00
page_io.c
page_isolation.c memory-hotplug: skip HWPoisoned page when offlining pages 2012-12-11 17:22:22 -08:00
page-writeback.c mm: fix calculation of dirtyable memory 2012-12-20 17:40:18 -08:00
pagewalk.c thp: change split_huge_page_pmd() interface 2012-12-12 17:38:31 -08:00
percpu-km.c
percpu-vm.c
percpu.c mm, percpu: Make sure percpu_alloc early parameter has an argument 2012-12-02 06:23:04 -08:00
pgtable-generic.c mm: Only flush the TLB when clearing an accessible pte 2012-12-11 14:28:34 +00:00
process_vm_access.c
quicklist.c
readahead.c
rmap.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
shmem.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
slab_common.c slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slab.c memcg: add comments clarifying aspects of cache attribute propagation 2012-12-18 15:02:15 -08:00
slab.h slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slob.c sl[au]b: always get the cache from its page in kmem_cache_free() 2012-12-18 15:02:14 -08:00
slub.c slub: drop mutex before deleting sysfs entry 2012-12-18 15:02:15 -08:00
sparse-vmemmap.c
sparse.c memory-hotplug, mm/sparse.c: clear the memory to store struct page 2012-12-11 17:22:23 -08:00
swap_state.c
swap.c
swapfile.c mm, oom: fix race when specifying a thread as the oom origin 2012-12-11 17:22:27 -08:00
truncate.c mm: drop vmtruncate 2012-12-20 18:46:29 -05:00
util.c
vmalloc.c mm: use IS_ENABLED(CONFIG_NUMA) instead of NUMA_BUILD 2012-12-11 17:22:22 -08:00
vmscan.c mm: fix null pointer dereference in wait_iff_congested() 2012-12-28 08:42:39 -08:00
vmstat.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00