linux/mm
David Rientjes 32f8516a8c mm, mempolicy: fix printing stack contents in numa_maps
When reading /proc/pid/numa_maps, it's possible to return the contents of
the stack where the mempolicy string should be printed if the policy gets
freed from beneath us.

This happens because mpol_to_str() may return an error the
stack-allocated buffer is then printed without ever being stored.

There are two possible error conditions in mpol_to_str():

 - if the buffer allocated is insufficient for the string to be stored,
   and

 - if the mempolicy has an invalid mode.

The first error condition is not triggered in any of the callers to
mpol_to_str(): at least 50 bytes is always allocated on the stack and this
is sufficient for the string to be written.  A future patch should convert
this into BUILD_BUG_ON() since we know the maximum strlen possible, but
that's not -rc material.

The second error condition is possible if a race occurs in dropping a
reference to a task's mempolicy causing it to be freed during the read().
The slab poison value is then used for the mode and mpol_to_str() returns
-EINVAL.

This race is only possible because get_vma_policy() believes that
mm->mmap_sem protects task->mempolicy, which isn't true.  The exit path
does not hold mm->mmap_sem when dropping the reference or setting
task->mempolicy to NULL: it uses task_lock(task) instead.

Thus, it's required for the caller of a task mempolicy to hold
task_lock(task) while grabbing the mempolicy and reading it.  Callers with
a vma policy store their mempolicy earlier and can simply increment the
reference count so it's guaranteed not to be freed.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-16 18:00:50 -07:00
..
backing-dev.c backing-dev: use kstrto* in preference to simple_strtoul 2012-08-25 16:58:14 +08:00
bootmem.c mm: fix-up zone present pages 2012-10-09 16:22:54 +09:00
bounce.c bounce: allow use of bounce pool via config option 2012-07-18 16:40:35 -04:00
cleancache.c
compaction.c CMA: migrate mlocked pages 2012-10-09 16:23:00 +09:00
debug-pagealloc.c
dmapool.c
fadvise.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
failslab.c
filemap_xip.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
filemap.c readahead: fault retry breaks mmap file read random detection 2012-10-09 16:22:47 +09:00
fremap.c mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
frontswap.c frontswap: support exclusive gets if tmem backend is capable 2012-09-21 10:38:12 -04:00
highmem.c mm: add support for direct_IO to highmem pages 2012-07-31 18:42:47 -07:00
huge_memory.c mm: huge_memory: Fix build error. 2012-10-15 07:59:15 -07:00
hugetlb_cgroup.c hugetlb/cgroup: remove exclude and wakeup rmdir calls from migrate 2012-07-31 18:42:41 -07:00
hugetlb.c mm: document PageHuge somewhat 2012-10-09 16:23:03 +09:00
hwpoison-inject.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
init-mm.c
internal.h mm, thp: fix mlock statistics 2012-10-09 16:23:03 +09:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
Kconfig mm: enable CONFIG_COMPACTION by default 2012-10-09 16:22:53 +09:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: use rbtree instead of prio tree 2012-10-09 16:22:39 +09:00
ksm.c mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end 2012-10-09 16:22:58 +09:00
maccess.c
madvise.c mm: prepare VM_DONTDUMP for using in drivers 2012-10-09 16:22:18 +09:00
Makefile mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
memblock.c mm: avoid section mismatch warning for memblock_type_name 2012-10-09 16:23:01 +09:00
memcontrol.c memcg: move mem_cgroup_is_root upwards 2012-10-09 16:22:55 +09:00
memory_hotplug.c memory-hotplug: suppress "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>" warning 2012-10-09 16:23:04 +09:00
memory-failure.c mm anon rmap: replace same_anon_vma linked list with an interval tree. 2012-10-09 16:22:41 +09:00
memory.c mm, thp: fix mapped pages avoiding unevictable list on mlock 2012-10-09 16:23:02 +09:00
mempolicy.c mm, mempolicy: fix printing stack contents in numa_maps 2012-10-16 18:00:50 -07:00
mempool.c
migrate.c mm: memcg: fix compaction/migration failing due to memcg limits 2012-07-31 18:42:48 -07:00
mincore.c
mlock.c mm, thp: fix mlock statistics 2012-10-09 16:23:03 +09:00
mm_init.c
mmap.c mm: avoid taking rmap locks in move_ptes() 2012-10-09 16:22:42 +09:00
mmu_context.c
mmu_notifier.c mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end 2012-10-09 16:22:58 +09:00
mmzone.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
mprotect.c
mremap.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
msync.c
nobootmem.c mm: fix-up zone present pages 2012-10-09 16:22:54 +09:00
nommu.c mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
oom_kill.c oom: remove deprecated oom_adj 2012-10-09 16:22:24 +09:00
page_alloc.c cma: decrease cc.nr_migratepages after reclaiming pagelist 2012-10-09 16:23:01 +09:00
page_cgroup.c memcg: rename config variables 2012-07-31 18:42:43 -07:00
page_io.c mm: add support for direct_IO to highmem pages 2012-07-31 18:42:47 -07:00
page_isolation.c mm/page_alloc: refactor out __alloc_contig_migrate_alloc() 2012-10-09 16:22:52 +09:00
page-writeback.c CPU hotplug, writeback: Don't call writeback_set_ratelimit() too often during hotplug 2012-09-28 20:27:49 +08:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c sections: fix section conflicts in mm/percpu.c 2012-10-06 03:04:44 +09:00
pgtable-generic.c thp: introduce pmdp_invalidate() 2012-10-09 16:22:29 +09:00
process_vm_access.c
quicklist.c
readahead.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
rmap.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
shmem.c tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking 2012-10-09 23:33:55 -04:00
slab_common.c mm, slab: release slab_mutex earlier in kmem_cache_destroy() 2012-10-10 09:25:08 +03:00
slab.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2012-10-07 07:53:13 +09:00
slab.h Revert "mm/sl[aou]b: Move sysfs_slab_add to common" 2012-09-05 12:07:44 +03:00
slob.c Merge branch 'slab/tracing' into slab/for-linus 2012-10-03 09:57:17 +03:00
slub.c Merge branch 'slab/common-for-cgroups' into slab/for-linus 2012-10-03 09:56:37 +03:00
sparse-vmemmap.c
sparse.c mm/sparse: remove index_init_lock 2012-07-31 18:42:49 -07:00
swap_state.c mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages 2012-07-31 18:42:47 -07:00
swap.c mm: remove vma arg from page_evictable 2012-10-09 16:22:55 +09:00
swapfile.c vfs: make path_openat take a struct filename pointer 2012-10-12 20:15:09 -04:00
truncate.c mm: use clear_page_mlock() in page_remove_rmap() 2012-10-09 16:22:56 +09:00
util.c mm: Use __do_krealloc to do the krealloc job 2012-09-04 10:22:58 +03:00
vmalloc.c mm: use %pK for /proc/vmallocinfo 2012-10-09 16:23:03 +09:00
vmscan.c CMA: migrate mlocked pages 2012-10-09 16:23:00 +09:00
vmstat.c mm: remove unevictable_pgs_mlockfreed 2012-10-09 16:22:59 +09:00