linux

mirror of https://github.com/FEX-Emu/linux.git synced 2024-12-26 19:36:41 +00:00

Johannes Weiner 96f1c58d85 mm: memcg: fix race condition between memcg teardown and swapin

There is a race condition between a memcg being torn down and a swapin triggered from a different memcg of a page that was recorded to belong to the exiting memcg on swapout (with CONFIG_MEMCG_SWAP extension). The result is unreclaimable pages pointing to dead memcgs, which can lead to anything from endless loops in later memcg teardown (the page is charged to all hierarchical parents but is not on any LRU list) or crashes from following the dangling memcg pointer.

Memcgs with tasks in them can not be torn down and usually charges don't show up in memcgs without tasks. Swapin with the CONFIG_MEMCG_SWAP extension is the notable exception because it charges the cgroup that was recorded as owner during swapout, which may be empty and in the process of being torn down when a task in another memcg triggers the swapin:

  teardown:                 swapin:

                            lookup_swap_cgroup_id()
                            rcu_read_lock()
                            mem_cgroup_lookup()
                            css_tryget()
                            rcu_read_unlock()
  disable css_tryget()
  call_rcu()
    offline_css()
      reparent_charges()
                            res_counter_charge() (hierarchical!)
                            css_put()
                              css_free()
                            pc->mem_cgroup = dead memcg
                            add page to dead lru

Add a final reparenting step into css_free() to make sure any such raced charges are moved out of the memcg before it's finally freed.

In the longer term it would be cleaner to have the css_tryget() and the res_counter charge under the same RCU lock section so that the charge reparenting is deferred until the last charge whose tryget succeeded is visible. But this will require more invasive changes that will be harder to evaluate and backport into stable, so better defer them to a separate change set.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2013-12-12 18:19:26 -08:00

backing-dev.c	mm/backing-dev.c: check user buffer length before copying data to the related user buffer	2013-09-11 15:58:03 -07:00
balloon_compaction.c	mm: introduce a common interface for balloon pages mobility	2012-12-11 17:22:26 -08:00
bootmem.c	mm/bootmem.c: remove unused local `map'	2013-11-13 12:09:09 +09:00
bounce.c	mm/bounce.c: fix a regression where MS_SNAP_STABLE (stable pages snapshotting) was ignored	2013-09-30 14:31:02 -07:00
cleancache.c	mm: cleancache: clean up cleancache_enabled	2013-04-30 17:04:01 -07:00
compaction.c	mm/compaction.c: update comment about zone lock in isolate_freepages_block	2013-11-13 12:09:03 +09:00
debug-pagealloc.c	mm, x86: Remove debug_pagealloc_enabled	2011-12-06 09:24:07 +01:00
dmapool.c	dmapool: make DMAPOOL_DEBUG detect corruption of free marker	2012-12-11 17:22:24 -08:00
fadvise.c	teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long	2013-03-03 22:46:22 -05:00
failslab.c	switch debugfs to umode_t	2012-01-03 22:54:56 -05:00
filemap_xip.c	seqcount: Add lockdep functionality to seqcount/seqlock structures	2013-11-06 12:40:26 +01:00
filemap.c	mm: drop actor argument of do_generic_file_read()	2013-11-15 09:32:13 +09:00
fremap.c	mm: save soft-dirty bits on file pages	2013-08-13 17:57:48 -07:00
frontswap.c	frontswap: fix incorrect zeroing and allocation size for frontswap_map	2013-06-12 16:29:46 -07:00
highmem.c	Some nice cleanups, and even a patch my wife did as a "live" demo for	2012-12-20 08:37:05 -08:00
huge_memory.c	thp: move preallocated PTE page table on move_huge_pmd()	2013-12-12 18:19:26 -08:00
hugetlb_cgroup.c	cgroup: pass around cgroup_subsys_state instead of cgroup in file methods	2013-08-08 20:11:24 -04:00
hugetlb.c	mm: hugetlbfs: fix hugetlbfs optimization	2013-11-21 16:42:27 -08:00
hwpoison-inject.c	mm/hwpoison: fix the lack of one reference count against poisoned page	2013-09-30 14:31:03 -07:00
init-mm.c
internal.h	mm: vmscan: fix do_try_to_free_pages() livelock	2013-09-11 15:58:01 -07:00
interval_tree.c	mm: add CONFIG_DEBUG_VM_RB build option	2012-10-09 16:22:42 +09:00
Kconfig	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial	2013-11-15 16:47:22 -08:00
Kconfig.debug	mm: more intensive memory corruption debugging	2012-01-10 16:30:42 -08:00
kmemcheck.c
kmemleak-test.c
kmemleak.c	mm: kmemleak: avoid false negatives on vmalloc'ed objects	2013-11-13 12:09:07 +09:00
ksm.c	ksm: remove redundant __GFP_ZERO from kcalloc	2013-11-13 12:09:02 +09:00
list_lru.c	mm: list_lru: fix almost infinite loop causing effective livelock	2013-10-30 12:57:46 -07:00
maccess.c
madvise.c	mm/hwpoison: fix traversal of hugetlbfs pages to avoid printk flood	2013-09-30 14:31:02 -07:00
Makefile	list: add a new LRU list type	2013-09-10 18:56:30 -04:00
memblock.c	mm/memblock.c: introduce bottom-up allocation mode	2013-11-13 12:09:08 +09:00
memcontrol.c	mm: memcg: fix race condition between memcg teardown and swapin	2013-12-12 18:19:26 -08:00
memory_hotplug.c	mem-hotplug: introduce movable_node boot option	2013-11-13 12:09:09 +09:00
memory-failure.c	kfifo API type safety	2013-11-15 09:32:23 +09:00
memory.c	Revert "mm: create a separate slab for page->ptl allocation"	2013-11-20 14:41:47 -08:00
mempolicy.c	mm, mempolicy: silence gcc warning	2013-11-21 16:42:27 -08:00
mempool.c	mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)	2013-09-11 15:58:14 -07:00
migrate.c	mm: thp: give transparent hugepage code a separate copy_page	2013-11-21 16:42:27 -08:00
mincore.c	swap: make each swap partition have one address_space	2013-02-23 17:50:17 -08:00
mlock.c	mm/mlock.c: prevent walking off the end of a pagetable in no-pmd configuration	2013-09-30 14:31:02 -07:00
mm_init.c	mm: numa: Change page last {nid,pid} into {cpu,pid}	2013-10-09 14:47:45 +02:00
mmap.c	mm: convert mm->nr_ptes to atomic_long_t	2013-11-15 09:32:14 +09:00
mmu_context.c	mm: remove old aio use_mm() comment	2013-05-07 18:38:27 -07:00
mmu_notifier.c	treewide: relase -> release	2013-06-28 14:34:33 +02:00
mmzone.c	mm: numa: Change page last {nid,pid} into {cpu,pid}	2013-10-09 14:47:45 +02:00
mprotect.c	mm: numa: return the number of base pages altered by protection changes	2013-11-13 12:09:11 +09:00
mremap.c	mm: revert mremap pud_free anti-fix	2013-10-16 21:35:53 -07:00
msync.c
nobootmem.c	mm/nobootmem.c: have __free_pages_memory() free in larger chunks.	2013-11-13 12:09:04 +09:00
nommu.c	Merge branch 'akpm' (patches from Andrew Morton)	2013-11-13 15:45:43 +09:00
oom_kill.c	mm: convert mm->nr_ptes to atomic_long_t	2013-11-15 09:32:14 +09:00
page_alloc.c	mm/page_alloc.c: fix comment in zlc_setup()	2013-11-13 12:09:11 +09:00
page_cgroup.c	memcontrol: use N_MEMORY instead N_HIGH_MEMORY	2012-12-12 17:38:32 -08:00
page_io.c	aio: Kill aio_rw_vect_retry()	2013-07-30 11:53:12 -04:00
page_isolation.c	mm: memory-hotplug: enable memory hotplug to handle hugepage	2013-09-11 15:57:48 -07:00
page-writeback.c	writeback: fix negative bdi max pause	2013-10-16 21:35:53 -07:00
pagewalk.c	mm/pagewalk.c: fix walk_page_range() access of wrong PTEs	2013-10-30 14:27:03 -07:00
percpu-km.c
percpu-vm.c	mm: fix kernel-doc warnings	2012-06-20 14:39:36 -07:00
percpu.c	percpu: fix bootmem error handling in pcpu_page_first_chunk()	2013-09-23 10:51:45 -04:00
pgtable-generic.c	mm: convert the rest to new page table lock api	2013-11-15 09:32:15 +09:00
process_vm_access.c	Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys	2013-03-12 11:05:45 -07:00
quicklist.c
readahead.c	readahead: fix sequential read cache miss detection	2013-11-13 12:09:09 +09:00
rmap.c	mm, hugetlb: convert hugetlbfs to use split pmd lock	2013-11-15 09:32:14 +09:00
shmem.c	security: shmem: implement kernel private shmem inodes	2013-12-02 11:24:19 +00:00
slab_common.c	memcg, kmem: rename cache_from_memcg to cache_from_memcg_idx	2013-11-13 12:09:10 +09:00
slab.c	Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux	2013-11-22 08:10:34 -08:00
slab.h	memcg, kmem: rename cache_from_memcg to cache_from_memcg_idx	2013-11-13 12:09:10 +09:00
slob.c	mm/sl[aou]b: Move kmallocXXX functions to common code	2013-09-04 20:51:33 +03:00
slub.c	Merge branch 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux	2013-11-22 08:10:34 -08:00
sparse-vmemmap.c	sparse-vmemmap: specify vmemmap population range in bytes	2013-04-29 15:54:35 -07:00
sparse.c	mm/sparsemem: fix a bug in free_map_bootmem when CONFIG_SPARSEMEM_VMEMMAP	2013-11-13 12:09:06 +09:00
swap_state.c	lib/radix-tree.c: make radix_tree_node_alloc() work correctly within interrupt	2013-09-11 15:59:36 -07:00
swap.c	mm: hugetlbfs: fix hugetlbfs optimization	2013-11-21 16:42:27 -08:00
swapfile.c	frontswap: enable call to invalidate area on swapoff	2013-11-13 12:09:07 +09:00
truncate.c	truncate: drop 'oldsize' truncate_pagecache() parameter	2013-09-12 15:38:02 -07:00
util.c	mm: factor commit limit calculation	2013-11-13 12:09:11 +09:00
vmalloc.c	mm: kmemleak: avoid false negatives on vmalloc'ed objects	2013-11-13 12:09:07 +09:00
vmpressure.c	Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2013-09-03 18:25:03 -07:00
vmscan.c	mm/vmscan.c: don't forget to free shrinker->nr_deferred	2013-10-16 21:35:52 -07:00
vmstat.c	mm: numa: return the number of base pages altered by protection changes	2013-11-13 12:09:11 +09:00
zbud.c	mm/zbud: fix some trivial typos in comments	2013-09-11 15:57:35 -07:00
zswap.c	mm/zswap: refactor the get/put routines	2013-11-13 12:09:11 +09:00