linux/mm
Ying Han 889976dbcb memcg: reclaim memory from nodes in round-robin order
Presently, memory cgroup's direct reclaim frees memory from the current
node.  But this has some troubles.  Usually when a set of threads works in
a cooperative way, they tend to operate on the same node.  So if they hit
limits under memcg they will reclaim memory from themselves, damaging the
active working set.

For example, assume 2 node system which has Node 0 and Node 1 and a memcg
which has 1G limit.  After some work, file cache remains and the usages
are

   Node 0:  1M
   Node 1:  998M.

and run an application on Node 0, it will eat its foot before freeing
unnecessary file caches.

This patch adds round-robin for NUMA and adds equal pressure to each node.
When using cpuset's spread memory feature, this will work very well.

But yes, a better algorithm is needed.

[akpm@linux-foundation.org: comment editing]
[kamezawa.hiroyu@jp.fujitsu.com: fix time comparisons]
Signed-off-by: Ying Han <yinghan@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:35 -07:00
..
backing-dev.c backing-dev: Kill set but not used var in bdi_debug_stats_show() 2011-05-20 21:23:37 +02:00
bootmem.c
bounce.c
cleancache.c mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
compaction.c
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem 2011-05-26 10:50:56 -07:00
fremap.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
highmem.c
huge_memory.c mm: thp: optimize memcg charge in khugepaged 2011-05-25 08:39:21 -07:00
hugetlb.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
hwpoison-inject.c
init-mm.c mm: convert mm->cpu_vm_cpumask into cpumask_var_t 2011-05-25 08:39:21 -07:00
internal.h mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
Kconfig mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: Do not return a pointer to an object that kmemleak did not get 2011-05-19 17:35:28 +01:00
ksm.c oom: replace PF_OOM_ORIGIN with toggling oom_score_adj 2011-05-25 08:39:10 -07:00
maccess.c
madvise.c
Makefile mm: cleancache core ops functions and config 2011-05-26 10:01:36 -06:00
memblock.c
memcontrol.c memcg: reclaim memory from nodes in round-robin order 2011-05-26 17:12:35 -07:00
memory_hotplug.c mm: remove dependency on CONFIG_FLATMEM from online_page() 2011-05-25 08:39:28 -07:00
memory-failure.c vmscan: change shrinker API by passing shrink_control struct 2011-05-25 08:39:26 -07:00
memory.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
mempolicy.c mm: proc: move show_numa_map() to fs/proc/task_mmu.c 2011-05-25 08:39:34 -07:00
mempool.c
migrate.c mm: use refcounts for page_lock_anon_vma() 2011-05-25 08:39:19 -07:00
mincore.c
mlock.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
mm_init.c
mmap.c mm: don't access vm_flags as 'int' 2011-05-26 09:20:31 -07:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c mm: Convert i_mmap_lock to a mutex 2011-05-25 08:39:18 -07:00
msync.c
nobootmem.c memblock/nobootmem: remove unneeded code from alloc_bootmem_node_high() 2011-05-25 08:39:31 -07:00
nommu.c nommu: add page alignment to mmap 2011-05-25 08:39:38 -07:00
oom_kill.c oom: replace PF_OOM_ORIGIN with toggling oom_score_adj 2011-05-25 08:39:10 -07:00
page_alloc.c mm/page_alloc.c: prevent unending loop in __alloc_pages_slowpath() 2011-05-25 08:39:36 -07:00
page_cgroup.c memcg: move page-freeing code out of lock 2011-05-26 17:12:35 -07:00
page_io.c
page_isolation.c
page-writeback.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2011-05-24 11:53:42 -07:00
pgtable-generic.c
prio_tree.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
quicklist.c
readahead.c readahead: readahead page allocations are OK to fail 2011-05-25 08:39:25 -07:00
rmap.c mm: optimize page_lock_anon_vma() fast-path 2011-05-25 08:39:20 -07:00
shmem.c tmpfs: implement generic xattr support 2011-05-25 08:39:31 -07:00
slab.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
slob.c
slub.c slub: remove no-longer used 'unlock_out' label 2011-05-25 18:06:54 -07:00
sparse-vmemmap.c
sparse.c
swap_state.c
swap.c mm: batch activate_page() to reduce lock contention 2011-05-25 08:39:37 -07:00
swapfile.c oom: replace PF_OOM_ORIGIN with toggling oom_score_adj 2011-05-25 08:39:10 -07:00
thrash.c
truncate.c mm/fs: add hooks to support cleancache 2011-05-26 10:01:43 -06:00
util.c mm: nommu: sort mm->mmap list properly 2011-05-25 08:39:05 -07:00
vmalloc.c mm: print vmalloc() state after allocation failures 2011-05-25 08:39:22 -07:00
vmscan.c memcg: reclaim memory from nodes in round-robin order 2011-05-26 17:12:35 -07:00
vmstat.c mm, mem-hotplug: update pcp->stat_threshold when memory hotplug occur 2011-05-25 08:39:09 -07:00