linux/mm
Greg Thelen 2bff24a370 memcg: fix multiple large threshold notifications
A memory cgroup with (1) multiple threshold notifications and (2) at least
one threshold >=2G was not reliable.  Specifically the notifications would
either not fire or would not fire in the proper order.

The __mem_cgroup_threshold() signaling logic depends on keeping 64 bit
thresholds in sorted order.  mem_cgroup_usage_register_event() sorts them
with compare_thresholds(), which returns the difference of two 64 bit
thresholds as an int.  If the difference is positive but has bit[31] set,
then sort() treats the difference as negative and breaks sort order.

This fix compares the two arbitrary 64 bit thresholds returning the
classic -1, 0, 1 result.

The test below sets two notifications (at 0x1000 and 0x81001000):
  cd /sys/fs/cgroup/memory
  mkdir x
  for x in 4096 2164264960; do
    cgroup_event_listener x/memory.usage_in_bytes $x | sed "s/^/$x listener:/" &
  done
  echo $$ > x/cgroup.procs
  anon_leaker 500M

v3.11-rc7 fails to signal the 4096 event listener:
  Leaking...
  Done leaking pages.

Patched v3.11-rc7 properly notifies:
  Leaking...
  4096 listener:2013:8:31:14:13:36
  Done leaking pages.

The fixed bug is old.  It appears to date back to the introduction of
memcg threshold notifications in v2.6.34-rc1-116-g2e72b6347c94 "memcg:
implement memory thresholds"

Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:58:15 -07:00
..
backing-dev.c mm/backing-dev.c: check user buffer length before copying data to the related user buffer 2013-09-11 15:58:03 -07:00
balloon_compaction.c
bootmem.c
bounce.c
cleancache.c
compaction.c mm: compaction: do not compact pgdat for order-0 2013-09-11 15:57:55 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c
filemap.c
fremap.c
frontswap.c
highmem.c
huge_memory.c mm/huge_memory.c: fix potential NULL pointer dereference 2013-09-11 15:57:19 -07:00
hugetlb_cgroup.c
hugetlb.c mm: prepare to remove /proc/sys/vm/hugepages_treat_as_movable 2013-09-11 15:57:49 -07:00
hwpoison-inject.c mm/hwpoison-inject.c: change permission of corrupt-pfn/unpoison-pfn to 0200 2013-09-11 15:58:11 -07:00
init-mm.c
internal.h mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
interval_tree.c
Kconfig
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c mm: replace strict_strtoul() with kstrtoul() 2013-09-11 15:57:11 -07:00
ksm.c mm: replace strict_strtoul() with kstrtoul() 2013-09-11 15:57:11 -07:00
maccess.c
madvise.c mm/madvise.c:madvise_hwpoison(): remove local `ret' 2013-09-11 15:58:13 -07:00
Makefile
memblock.c memblock, numa: binary search node id 2013-09-11 15:57:51 -07:00
memcontrol.c memcg: fix multiple large threshold notifications 2013-09-11 15:58:15 -07:00
memory_hotplug.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
memory-failure.c mm/memory-failure.c: fix bug triggered by unpoisoning empty zero page 2013-09-11 15:58:12 -07:00
memory.c mm: migrate: add hugepage migration code to move_pages() 2013-09-11 15:57:48 -07:00
mempolicy.c mbind: add BUG_ON(!vma) in new_vma_page() 2013-09-11 15:57:50 -07:00
mempool.c mm/mempool.c: convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 15:58:14 -07:00
migrate.c mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
mincore.c
mlock.c mm: munlock: manual pte walk in fast path instead of follow_page_mask() 2013-09-11 15:58:01 -07:00
mm_init.c
mmap.c mm/mmap: remove unnecessary assignment 2013-09-11 15:58:13 -07:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c mm/mremap.c: call pud_free() after fail calling pmd_alloc() 2013-09-11 15:58:03 -07:00
msync.c
nobootmem.c
nommu.c
oom_kill.c
page_alloc.c mm: correct the comment about the value for buddy _mapcount 2013-09-11 15:58:06 -07:00
page_cgroup.c
page_io.c
page_isolation.c mm: memory-hotplug: enable memory hotplug to handle hugepage 2013-09-11 15:57:48 -07:00
page-writeback.c mm/page-writeback.c: add strictlimit feature 2013-09-11 15:58:04 -07:00
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c mm: move pgtable related functions to right place 2013-09-11 15:57:30 -07:00
process_vm_access.c
quicklist.c
readahead.c readahead: make context readahead more conservative 2013-09-11 15:57:39 -07:00
rmap.c
shmem.c
slab_common.c
slab.c
slab.h
slob.c
slub.c mm: replace strict_strtoul() with kstrtoul() 2013-09-11 15:57:11 -07:00
sparse-vmemmap.c
sparse.c mm/sparse: introduce alloc_usemap_and_memmap 2013-09-11 15:58:01 -07:00
swap_state.c
swap.c mm: fix aio performance regression for database caused by THP 2013-09-11 15:57:55 -07:00
swapfile.c swap: make cluster allocation per-cpu 2013-09-11 15:57:17 -07:00
truncate.c
util.c swap: clean-up #ifdef in page_mapping() 2013-09-11 15:57:31 -07:00
vmalloc.c mm/vmalloc: use wrapper function get_vm_area_size to caculate size of vm area 2013-09-11 15:58:02 -07:00
vmpressure.c
vmscan.c mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
vmstat.c mm: vmscan: fix do_try_to_free_pages() livelock 2013-09-11 15:58:01 -07:00
zbud.c mm/zbud: fix some trivial typos in comments 2013-09-11 15:57:35 -07:00
zswap.c mm/zswap.c: get swapper address_space by using macro 2013-09-11 15:57:08 -07:00