From be09102b4190561b67e3809b07a7fd29c9774152 Mon Sep 17 00:00:00 2001
From: Tejun Heo
Date: Thu, 7 Jun 2018 17:09:21 -0700
Subject: [PATCH] mm: memcg: allow lowering memory.swap.max below the current usage

Currently an attempt to set swap.max to a value lower than the actual
swap usage fails, which causes configuration problems as there's no way of
lowering the configuration below the current usage short of turning off
swap entirely. This makes swap.max difficult to use and allows delegatees
to lock the delegator out of reducing swap allocation.

This patch updates swap_max_write() so that the limit can be lowered below
the current usage. It doesn't implement active reclaiming of swap entries
for the following reasons.

* mem_cgroup_swap_full() already tells the swap machinery to
  aggressively reclaim swap entries if the usage is above 50% of the
  limit, so simply lowering the limit automatically triggers gradual
  reclaim.

* Forcing back swapped out pages is likely to heavily impact the
  workload and mess up the working set. Given that swap usually is a
  lot less valuable and less scarce, letting the existing usage
  dissipate over time through the above gradual reclaim and as the
  pages are faulted back in is likely the better behavior.

Link: http://lkml.kernel.org/r/20180523185041.GR1718769@devbig577.frc2.facebook.com
Signed-off-by: Tejun Heo
Acked-by: Roman Gushchin
Acked-by: Rik van Riel
Acked-by: Johannes Weiner
Cc: Michal Hocko
Cc: Shaohua Li
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 Documentation/admin-guide/cgroup-v2.rst | 5 +++++
 mm/memcontrol.c                         | 6 +-----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index e34d3c938729..8a2c52d5c53b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1245,6 +1245,11 @@ PAGE_SIZE multiple when read back.
 		because of running out of swap system-wide or max
 		limit.
 
+	When reduced under the current usage, the existing swap
+	entries are reclaimed gradually and the swap usage may stay
+	higher than the limit for an extended period of time. This
+	reduces the impact on the workload and memory management.
+
 Usage Guidelines
 ~~~~~~~~~~~~~~~~
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e3d56927a724..c1e64d60ed02 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6280,11 +6280,7 @@ static ssize_t swap_max_write(struct kernfs_open_file *of,
 	if (err)
 		return err;
 
-	mutex_lock(&memcg_max_mutex);
-	err = page_counter_set_max(&memcg->swap, max);
-	mutex_unlock(&memcg_max_mutex);
-	if (err)
-		return err;
+	xchg(&memcg->swap.max, max);
 
 	return nbytes;
 }
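
Illustrative note (not part of the patch): the behaviour change can be
modelled in userspace C. This is a rough sketch under stated assumptions,
not kernel code. struct swap_counter, set_max_strict(), set_max_relaxed(),
swap_full() and demo() are hypothetical names; the -EBUSY return is an
assumption about how the old path failed; and the "usage * 2 >= max" test
is a simplified reading of the 50% rule from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the memcg swap page counter. */
struct swap_counter {
	atomic_ulong usage;	/* pages currently charged */
	atomic_ulong max;	/* configured limit, in pages */
};

/* Old behaviour (simplified): refuse a limit below the current usage. */
int set_max_strict(struct swap_counter *c, unsigned long max)
{
	if (atomic_load(&c->usage) > max)
		return -EBUSY;	/* assumed error code for this model */
	atomic_store(&c->max, max);
	return 0;
}

/* New behaviour: store the limit unconditionally, like the xchg(). */
void set_max_relaxed(struct swap_counter *c, unsigned long max)
{
	atomic_exchange(&c->max, max);
}

/* Model of the hint that tells reclaim to free swap entries eagerly. */
bool swap_full(struct swap_counter *c)
{
	return atomic_load(&c->usage) * 2 >= atomic_load(&c->max);
}

/* Walk through the scenario from the commit message. */
int demo(void)
{
	struct swap_counter c;

	atomic_init(&c.usage, 300);
	atomic_init(&c.max, 1000);

	assert(!swap_full(&c));				/* 600 < 1000 */
	assert(set_max_strict(&c, 200) == -EBUSY);	/* old path rejects */
	assert(atomic_load(&c.max) == 1000);		/* limit unchanged */

	set_max_relaxed(&c, 200);			/* new path accepts */
	assert(atomic_load(&c.max) == 200);
	assert(atomic_load(&c.usage) == 300);		/* usage not reclaimed */
	assert(swap_full(&c));				/* 600 >= 200 */
	return 0;
}
```

Calling demo() from a test harness exercises both paths. In this model,
lowering the limit leaves the usage untouched and only flips swap_full()
to true, which mirrors the gradual reclaim the commit message describes.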