linux/mm
Michal Hocko a394cb8ee6 memcg,vmscan: do not break out targeted reclaim without reclaimed pages
Targeted (hard resp. soft) reclaim has traditionally tried to scan one
group with decreasing priority until nr_to_reclaim (SWAP_CLUSTER_MAX
pages) is reclaimed or all priorities are exhausted.  The reclaim is
then retried until the limit is met.

This approach, however, doesn't work well with deeper hierarchies where
groups higher in the hierarchy do not have any or only very few pages
(this usually happens if those groups do not have any tasks and they
have only re-parented pages after some of their children are removed).
Those groups are reclaimed with decreasing priority pointlessly as there
is nothing to reclaim from them.

The easiest fix is to break out of the memcg iteration loop in
shrink_zone only if the whole hierarchy has been visited or sufficient
pages have been reclaimed.  This is also more natural because the
reclaimer expects that the hierarchy under the given root is reclaimed.
As a result we can simplify the soft limit reclaim which does its own
iteration.
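
The break-out rule described above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not the kernel code: the
struct toy_group type, the toy_shrink_group() and toy_shrink_hierarchy()
helpers, the flat array standing in for the hierarchy walk, and the
"pages >> priority" scan size are all hypothetical simplifications, and
the SWAP_CLUSTER_MAX value of 32 is assumed here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the commit only names the constant. */
#define SWAP_CLUSTER_MAX 32UL

/* Hypothetical stand-in for a memory cgroup: just a reclaimable page count. */
struct toy_group {
	unsigned long pages;
};

/*
 * Hypothetical per-group reclaim: a numerically lower priority scans a
 * larger share of the group. Every scanned page counts as reclaimed.
 */
static unsigned long toy_shrink_group(struct toy_group *g, int priority)
{
	unsigned long scanned;

	assert(priority >= 0 && priority < 32);
	scanned = g->pages >> priority;
	g->pages -= scanned;
	return scanned;
}

/*
 * One pass over a hierarchy given as groups[0..n-1] in walk order, with
 * the (possibly empty) upper-level groups first. The loop is left early
 * only once nr_to_reclaim pages have been reclaimed; otherwise every
 * group is visited, so empty groups near the root cannot end the pass.
 * The number of groups visited is stored in *visited.
 */
static unsigned long toy_shrink_hierarchy(struct toy_group *groups, size_t n,
					  int priority,
					  unsigned long nr_to_reclaim,
					  size_t *visited)
{
	unsigned long nr_reclaimed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		nr_reclaimed += toy_shrink_group(&groups[i], priority);
		if (nr_reclaimed >= nr_to_reclaim) {
			i++;	/* count the group that met the target */
			break;
		}
	}
	*visited = i;
	return nr_reclaimed;
}
```

In this model, a hierarchy whose two upper groups are empty and whose
leaf holds 4096 pages is walked all the way down to the leaf in a single
pass, instead of the empty groups being retried at decreasing priority.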

[yinghan@google.com: break out of the hierarchy loop only if nr_reclaimed exceeded nr_to_reclaim]
[akpm@linux-foundation.org: use conventional comparison order]
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Ying Han <yinghan@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:10 -08:00
backing-dev.c bdi: allow block devices to say that they require stable page writes 2013-02-21 17:22:19 -08:00
balloon_compaction.c
bootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
bounce.c block: optionally snapshot page contents to provide stable pages during write 2013-02-21 17:22:20 -08:00
cleancache.c
compaction.c mm: compaction: do not accidentally skip pageblocks in the migrate scanner 2013-02-23 17:50:10 -08:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c
filemap.c mm: only enforce stable page writes if the backing device requires it 2013-02-21 17:22:19 -08:00
fremap.c
frontswap.c
highmem.c Some nice cleanups, and even a patch my wife did as a "live" demo for 2012-12-20 08:37:05 -08:00
huge_memory.c mm/huge_memory.c: use new hashtable implementation 2013-02-23 17:50:10 -08:00
hugetlb_cgroup.c mm/hugetlb: create hugetlb cgroup file in hugetlb_init 2012-12-18 15:02:15 -08:00
hugetlb.c mm/hugetlb.c: convert to pr_foo() 2013-02-23 17:50:09 -08:00
hwpoison-inject.c
init-mm.c
internal.h mm: compaction: partially revert capture of suitable high-order page 2013-01-11 14:54:56 -08:00
interval_tree.c
Kconfig Merge branch 'akpm' (incoming from Andrew) 2013-02-21 17:38:49 -08:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm/kmemleak.c: remove obsolete simple_strtoul 2012-12-18 15:02:15 -08:00
ksm.c mm/ksm.c: use new hashtable implementation 2013-02-23 17:50:10 -08:00
maccess.c
madvise.c
Makefile
memblock.c memblock: Add memblock_mem_size() 2013-01-29 19:32:57 -08:00
memcontrol.c mm/memcontrol.c: convert printk(KERN_FOO) to pr_foo() 2013-02-23 17:50:09 -08:00
memory_hotplug.c mm/memory_hotplug.c: improve comments 2012-12-18 15:02:15 -08:00
memory-failure.c
memory.c mm: reduce rmap overhead for ex-KSM page copies created on swap faults 2013-02-23 17:50:09 -08:00
mempolicy.c mm: mempolicy: Convert shared_policy mutex to spinlock 2013-01-02 17:32:13 -08:00
mempool.c
migrate.c mm/hugetlb: set PTE as huge in hugetlb_change_protection and remove_migration_pte 2013-02-05 20:38:47 +11:00
mincore.c
mlock.c mm: don't overwrite mm->def_flags in do_mlockall() 2013-02-12 14:34:00 -08:00
mm_init.c
mmap.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-19 18:19:48 -08:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c mm/mprotect.c: coding-style cleanups 2012-12-18 15:02:15 -08:00
mremap.c sched: Move sched.h sysctl bits into separate header 2013-02-07 20:50:54 +01:00
msync.c
nobootmem.c mm: Add alloc_bootmem_low_pages_nopanic() 2013-01-29 19:32:59 -08:00
nommu.c sched: Move sched.h sysctl bits into separate header 2013-02-07 20:50:54 +01:00
oom_kill.c memcg, oom: provide more precise dump info while memcg oom happening 2013-02-23 17:50:08 -08:00
page_alloc.c mm/page_alloc.c:__setup_per_zone_wmarks: make min_pages unsigned long 2013-02-23 17:50:10 -08:00
page_cgroup.c
page_io.c
page_isolation.c mm: fix zone_watermark_ok_safe() accounting of isolated pages 2013-01-04 16:11:46 -08:00
page-writeback.c block: optionally snapshot page contents to provide stable pages during write 2013-02-21 17:22:20 -08:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c s390/mm: implement software dirty bits 2013-02-14 15:55:23 +01:00
shmem.c mempolicy: remove arg from mpol_parse_str, mpol_to_str 2013-01-02 09:27:10 -08:00
slab_common.c slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slab.c memcg: add comments clarifying aspects of cache attribute propagation 2012-12-18 15:02:15 -08:00
slab.h slab: propagate tunable values 2012-12-18 15:02:14 -08:00
slob.c
slub.c slub: drop mutex before deleting sysfs entry 2012-12-18 15:02:15 -08:00
sparse-vmemmap.c
sparse.c
swap_state.c
swap.c
swapfile.c
truncate.c mm: drop vmtruncate 2012-12-20 18:46:29 -05:00
util.c
vmalloc.c
vmscan.c memcg,vmscan: do not break out targeted reclaim without reclaimed pages 2013-02-23 17:50:10 -08:00
vmstat.c