linux/fs
Vladimir Davydov d86133bd39 pipe: account to kmemcg
Pipes can consume a significant amount of system memory, hence they
should be accounted to kmemcg.

This patch marks pipe_inode_info and anonymous pipe buffer page
allocations as __GFP_ACCOUNT so that they would be charged to kmemcg.
Note, since a pipe buffer page can be "stolen" and get reused for other
purposes, including mapping to userspace, we clear PageKmemcg thus
resetting page->_mapcount and uncharge it in anon_pipe_buf_steal, which
is introduced by this patch.

A note regarding anon_pipe_buf_steal implementation.  We allow to steal
the page if its ref count equals 1.  It looks racy, but it is correct
for anonymous pipe buffer pages, because:

 - We lock out all other pipe users, because ->steal is called with
   pipe_lock held, so the page can't be spliced to another pipe from
   under us.

 - The page is not on LRU and it never was.

 - Thus a parallel thread can access it only by PFN. Although this is
   quite possible (e.g. see page_idle_get_page and balloon_page_isolate)
   this is not dangerous, because all such functions do is increase page
   ref count, check if the page is the one they are looking for, and
   decrease ref count if it isn't. Since our page is clean except for
   PageKmemcg mark, which doesn't conflict with other _mapcount users,
   the worst that can happen is we see page_count > 2 due to a transient
   ref, in which case we false-positively abort ->steal, which is still
   fine, because ->steal is not guaranteed to succeed.

Link: http://lkml.kernel.org/r/20160527150313.GD26059@esperanza
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
..
9p Use the right predicate in ->atomic_open() instances 2016-07-05 16:02:23 -04:00
adfs fs/adfs/adfs.h: tidy up comments 2016-01-20 17:09:18 -08:00
affs affs: fix remount failure when there are no options changed 2016-05-28 16:50:24 -07:00
afs remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
autofs4 autofs: don't get stuck in a loop if vfs_write() returns an error 2016-06-24 17:23:52 -07:00
befs fs/befs/io.c:befs_bread(): remove unneeded initialization to NULL 2016-05-23 17:04:14 -07:00
bfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
btrfs Merge branch 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-06-25 08:53:38 -07:00
cachefiles FS-Cache: make check_consistency callback return int 2016-06-01 10:29:39 +02:00
ceph Use the right predicate in ->atomic_open() instances 2016-07-05 16:02:23 -04:00
cifs Use the right predicate in ->atomic_open() instances 2016-07-05 16:02:23 -04:00
coda introduce a parallel variant of ->iterate() 2016-05-02 19:49:29 -04:00
configfs configfs: Remove ppos increment in configfs_write_bin_file 2016-06-30 11:28:55 +02:00
cramfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
crypto fscrypto/f2fs: allow fs-specific key prefix for fs encryption 2016-05-07 10:32:33 -07:00
debugfs debugfs: open_proxy_open(): avoid double fops release 2016-06-15 04:56:35 -07:00
devpts devpts: Make each mount of devpts an independent filesystem. 2016-06-05 10:36:01 -07:00
dlm mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
ecryptfs ecryptfs: don't allow mmap when the lower fs doesn't support it 2016-07-08 10:35:28 -05:00
efivarfs fs/efivarfs/inode.c: use generic UUID library 2016-05-20 17:58:30 -07:00
efs fs/efs/super.c: fix return value 2016-05-20 17:58:30 -07:00
exofs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-05-17 17:05:30 -07:00
exportfs introduce a parallel variant of ->iterate() 2016-05-02 19:49:29 -04:00
ext2 dax: remote unused fault wrappers 2016-07-26 16:19:19 -07:00
ext4 dax: remote unused fault wrappers 2016-07-26 16:19:19 -07:00
f2fs switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
fat Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-17 15:05:23 -07:00
freevxfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
fscache FS-Cache: wake write waiter after invalidating writes 2016-06-01 10:29:09 +02:00
fuse Use the right predicate in ->atomic_open() instances 2016-07-05 16:02:23 -04:00
gfs2 We've got ten patches this time, half of which are related to a plethora 2016-07-24 16:07:52 -07:00
hfs switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
hfsplus switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
hostfs hostfs: switch to ->iterate_shared() 2016-05-12 19:49:30 -04:00
hpfs hpfs: implement the show_options method 2016-05-28 16:50:24 -07:00
hugetlbfs mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
isofs Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
jbd2 jbd2: get rid of superfluous __GFP_REPEAT 2016-06-24 17:23:52 -07:00
jffs2 switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
jfs switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
kernfs switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
lockd lockd: unregister notifier blocks if the service fails to come up completely 2016-06-30 16:35:07 -04:00
logfs logfs: no need to lock directory in lseek 2016-05-09 11:42:19 -04:00
minix simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
ncpfs mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
nfs nfs_atomic_open(): prevent parallel nfs_lookup() on a negative hashed 2016-07-05 16:02:31 -04:00
nfs_common
nfsd nfsd: check permissions when setting ACLs 2016-06-24 12:11:52 -04:00
nilfs2 fs/nilfs2: fix potential underflow in call to crc32_le 2016-06-24 17:23:52 -07:00
nls
notify fsnotify: avoid spurious EMFILE errors from inotify_init() 2016-05-19 19:12:14 -07:00
ntfs fs: simplify the generic_write_sync prototype 2016-05-01 19:58:39 -04:00
ocfs2 ocfs2/cluster: clean up unnecessary assignment for 'ret' 2016-07-26 16:19:19 -07:00
omfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
openpromfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
orangefs switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
overlayfs ovl: verify upper dentry in ovl_remove_and_whiteout() 2016-07-22 10:54:20 +02:00
proc Merge branch 'stacking-fixes' (vfs stacking fixes from Jann) 2016-06-10 12:10:02 -07:00
pstore mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
qnx4 more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
qnx6 more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
quota fs/quota: use nla_put_u64_64bit() 2016-04-26 12:00:48 -04:00
ramfs tmpfs/ramfs: fix VM_MAYSHARE mappings for NOMMU 2016-05-20 17:58:30 -07:00
reiserfs Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2016-06-19 07:05:14 -10:00
romfs romfs, squashfs: switch to ->iterate_shared() 2016-05-09 11:41:15 -04:00
squashfs romfs, squashfs: switch to ->iterate_shared() 2016-05-09 11:41:15 -04:00
sysfs platform/chrome: Branch for v4.4 2015-11-13 21:53:18 -08:00
sysv simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
tracefs wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ubifs UBIFS: Implement ->migratepage() 2016-06-23 00:29:53 +02:00
udf udf: Use correct partition reference number for metadata 2016-05-19 13:00:35 +02:00
ufs simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
xfs dax: remote unused fault wrappers 2016-07-26 16:19:19 -07:00
aio.c aio: make aio_setup_ring killable 2016-05-23 17:04:14 -07:00
anon_inodes.c
attr.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
bad_inode.c switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
binfmt_aout.c fs: fix binfmt_aout.c build error 2016-05-28 16:34:59 -07:00
binfmt_elf_fdpic.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
binfmt_elf.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
binfmt_em86.c
binfmt_flat.c remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
binfmt_misc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
binfmt_script.c
block_dev.c DAX error handling for 4.7 2016-05-26 19:34:26 -07:00
buffer.c fs: export __block_write_full_page 2016-06-27 09:58:40 -05:00
char_dev.c chardev: add missing line break in pr_warn 2016-07-14 16:21:53 +09:00
compat_binfmt_elf.c
compat_ioctl.c Merge 4.5-rc4 into char-misc-next 2016-02-14 14:25:59 -08:00
compat.c Fix a number of bugs, most notably a potential stale data exposure 2016-05-24 12:55:26 -07:00
coredump.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
dax.c dax: remote unused fault wrappers 2016-07-26 16:19:19 -07:00
dcache.c fix idiotic braino in d_alloc_parallel() 2016-06-20 10:07:42 -04:00
dcookies.c
direct-io.c direct-io: fix direct write stale data exposure from concurrent buffered read 2016-05-27 14:49:37 -07:00
drop_caches.c
eventfd.c eventfd: document lockless access in eventfd_poll 2016-03-22 15:36:02 -07:00
eventpoll.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
exec.c exec: make exec path waiting for mmap_sem killable 2016-05-23 17:04:14 -07:00
fcntl.c fcntl: allow to set O_DIRECT flag on pipe 2016-01-09 02:55:37 -05:00
fhandle.c fs/coredump: prevent fsuid=0 dumps into user-controlled directories 2016-03-22 15:36:02 -07:00
file_table.c
file.c give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() 2016-05-02 19:49:28 -04:00
filesystems.c find_filesystem(): simplify comparison 2016-01-19 12:02:23 -05:00
fs_pin.c
fs_struct.c
fs-writeback.c fs/fs-writeback.c: inode writeback list tracking tracepoints 2016-07-26 16:19:19 -07:00
inode.c fs/fs-writeback.c: add a new writeback list for sync 2016-07-26 16:19:19 -07:00
internal.h much milder d_walk() race 2016-06-10 11:32:47 -04:00
ioctl.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
Kconfig dax: Make huge page handling depend of CONFIG_BROKEN 2016-05-19 15:13:17 -06:00
Kconfig.binfmt ELF/MIPS build fix 2016-05-23 17:04:14 -07:00
libfs.c lockless next_positive() 2016-06-20 17:11:29 -04:00
locks.c locks: use file_inode() 2016-07-01 10:24:18 -04:00
Makefile Merge tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux 2016-03-26 12:59:04 -07:00
mbcache.c mbcache: add reusable flag to cache entries 2016-02-22 22:44:04 -05:00
mount.h
mpage.c mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
namei.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-06-07 20:41:36 -07:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-01 15:20:11 -07:00
no-block.c
nsfs.c
open.c Merge branch 'work.const-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-17 14:41:03 -07:00
pipe.c pipe: account to kmemcg 2016-07-26 16:19:19 -07:00
pnode.c propogate_mnt: Handle the first propogated copy being a slave 2016-05-05 09:54:45 -05:00
pnode.h
posix_acl.c posix_acl: Add set_posix_acl 2016-06-24 12:11:34 -04:00
proc_namespace.c vfs: show_vfsstat: do not ignore errors from show_devname method 2016-03-16 13:09:08 -04:00
read_write.c x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2 2016-07-15 10:30:26 +02:00
readdir.c restore killability of old mutex_lock_killable(&inode->i_mutex) users 2016-05-26 00:13:25 -04:00
select.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
seq_file.c Make file credentials available to the seqfile interfaces 2016-04-14 12:56:09 -07:00
signalfd.c
splice.c Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
stack.c
stat.c fs/stat.c: drop the last new_valid_dev check 2016-01-16 11:17:23 -08:00
statfs.c
super.c fs/fs-writeback.c: add a new writeback list for sync 2016-07-26 16:19:19 -07:00
sync.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
timerfd.c timerfd: Reject ALARM timerfds without CAP_WAKE_ALARM 2016-06-09 23:42:38 +02:00
userfaultfd.c userfaultfd: don't pin the user memory in userfaultfd_file_create() 2016-05-20 17:58:30 -07:00
utimes.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
xattr.c switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00