linux/fs
Ross Zwisler 4636e70bb0 dax: prevent invalidation of mapped DAX entries
Patch series "mm,dax: Fix data corruption due to mmap inconsistency",
v4.

This series fixes data corruption that can happen for DAX mounts when
page faults race with write(2) and as a result page tables get out of
sync with block mappings in the filesystem and thus data seen through
mmap is different from data seen through read(2).

The series passes testing with t_mmap_stale test program from Ross and
also other mmap related tests on DAX filesystem.

This patch (of 4):

dax_invalidate_mapping_entry() currently removes DAX exceptional entries
only if they are clean and unlocked.  This is done via:

  invalidate_mapping_pages()
    invalidate_exceptional_entry()
      dax_invalidate_mapping_entry()

However, for page cache pages removed in invalidate_mapping_pages()
there is an additional criteria which is that the page must not be
mapped.  This is noted in the comments above invalidate_mapping_pages()
and is checked in invalidate_inode_page().

For DAX entries this means that we can can end up in a situation where a
DAX exceptional entry, either a huge zero page or a regular DAX entry,
could end up mapped but without an associated radix tree entry.  This is
inconsistent with the rest of the DAX code and with what happens in the
page cache case.

We aren't able to unmap the DAX exceptional entry because according to
its comments invalidate_mapping_pages() isn't allowed to block, and
unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem.

Since we essentially never have unmapped DAX entries to evict from the
radix tree, just remove dax_invalidate_mapping_entry().

Fixes: c6dcf52c23 ("mm: Invalidate DAX radix tree entries only if appropriate")
Link: http://lkml.kernel.org/r/20170510085419.27601-2-jack@suse.cz
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>    [4.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-12 15:57:15 -07:00
..
9p 9p: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
adfs
affs fs/affs: add rename exchange 2017-05-05 15:24:52 -04:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
autofs4 Fix dead URLs to ftp.kernel.org 2017-03-28 16:16:52 +02:00
befs befs: make export work with cold dcache 2017-05-05 11:35:35 +01:00
bfs Tigran has moved 2017-05-12 15:57:15 -07:00
btrfs Merge branch 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-05-10 08:33:17 -07:00
cachefiles sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
ceph The two main items are support for disabling automatic rbd exclusive 2017-05-10 08:42:33 -07:00
cifs fs: cifs: replace CURRENT_TIME by other appropriate apis 2017-05-08 17:15:15 -07:00
coda coda: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
configfs
cramfs
crypto fscrypt: introduce helper function for filename matching 2017-05-04 11:44:37 -04:00
debugfs fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
devpts
dlm net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
ecryptfs ecryptfs: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
efivarfs
efs
exofs fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
exportfs Merge branch 'rebased-statx' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-03-03 11:38:56 -08:00
ext2 libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
ext4 Merge branch 'akpm' (patches from Andrew) 2017-05-08 18:17:56 -07:00
f2fs Merge branch 'akpm' (patches from Andrew) 2017-05-08 18:17:56 -07:00
fat fat: fix using uninitialized fields of fat_inode/fsinfo_inode 2017-03-09 17:01:10 -08:00
freevxfs
fscache KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() 2017-03-02 10:09:00 +11:00
fuse NFS client updates for Linux 4.12 2017-05-10 13:03:38 -07:00
gfs2 gfs2: replace CURRENT_TIME with current_time 2017-05-08 17:15:15 -07:00
hfs fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
hfsplus fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
hostfs
hpfs sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
hugetlbfs hugetlbfs: fix offset overflow in hugetlbfs mmap 2017-04-13 18:24:21 -07:00
isofs sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
jbd2 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-10 10:30:46 -07:00
jffs2 jffs2: fix spelling mistake: "requestied" -> "requested" 2017-04-19 11:35:55 -07:00
jfs jfs: Remove jfs_get_inode_flags() 2017-04-19 14:21:23 +02:00
kernfs kernfs: Check KERNFS_HAS_RELEASE before calling kernfs_release_file() 2017-03-17 10:25:59 +09:00
lockd Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
minix statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
ncpfs ncpfs: Convert to separately allocated bdi 2017-04-20 12:09:55 -06:00
nfs Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
nfs_common
nfsd Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
nilfs2 fs: Remove SB_I_DYNBDI flag 2017-04-20 12:09:55 -06:00
nls
notify fanotify: don't expose EOPENSTALE to userspace 2017-04-25 15:48:06 +02:00
ntfs sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
ocfs2 fs/ocfs2/cluster: use offset_in_page() macro 2017-05-03 15:52:07 -07:00
omfs sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
openpromfs
orangefs Some cleanups: 2017-05-05 13:36:10 -07:00
overlayfs Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs 2017-05-10 09:03:48 -07:00
proc proc: try to remove use of FOLL_FORCE entirely 2017-05-09 08:45:16 -07:00
pstore Annotation of module parameters that specify device settings 2017-05-10 19:13:03 -07:00
qnx4
qnx6
quota quota: Remove dquot_quotactl_ops 2017-04-19 14:21:23 +02:00
ramfs
reiserfs reiserfs: use designated initializers 2017-05-08 17:15:11 -07:00
romfs
squashfs fs/pstore: fs/squashfs: change usage of LZ4 to work with new LZ4 version 2017-02-24 17:46:57 -08:00
sysfs sysfs: be careful of error returns from ops->show() 2017-04-08 17:33:32 +02:00
sysv statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
tracefs fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
ubifs Merge branch 'akpm' (patches from Andrew) 2017-05-08 18:17:56 -07:00
udf udf: use kmap_atomic for memcpy copying 2017-04-24 16:28:02 +02:00
ufs fs: ufs: use ktime_get_real_ts64() for birthtime 2017-05-08 17:15:15 -07:00
xfs scripts/spelling.txt: add "intialise(d)" pattern and fix typo instances 2017-05-08 17:15:13 -07:00
aio.c Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
anon_inodes.c
attr.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
bad_inode.c statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
binfmt_aout.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
binfmt_elf_fdpic.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
binfmt_elf.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
binfmt_em86.c
binfmt_flat.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
binfmt_misc.c fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
binfmt_script.c
block_dev.c libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
buffer.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
char_dev.c chardev: add helper function to register char devs with a struct device 2017-03-21 06:44:32 +01:00
compat_binfmt_elf.c
compat_ioctl.c fs: compat: Remove warning from COMPATIBLE_IOCTL 2017-04-29 17:47:19 -04:00
compat.c fs/compat.c: trim unused includes 2017-04-17 12:52:27 -04:00
coredump.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
dax.c dax: prevent invalidation of mapped DAX entries 2017-05-12 15:57:15 -07:00
dcache.c fs: don't set *REFERENCED on single use objects 2017-05-03 11:47:05 -04:00
dcookies.c
direct-io.c fs: add i_blocksize() 2017-02-27 18:43:46 -08:00
drop_caches.c
eventfd.c sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> 2017-03-02 08:42:32 +01:00
eventpoll.c epoll: Add busy poll support to epoll with socket fds. 2017-03-24 20:49:31 -07:00
exec.c x86/arch_prctl: Add ARCH_[GET|SET]_CPUID 2017-03-20 16:10:34 +01:00
fcntl.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
fhandle.c fhandle: move compat syscalls from compat.c 2017-04-17 12:52:26 -04:00
file_table.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
file.c mm, vmalloc: use __GFP_HIGHMEM implicitly 2017-05-08 17:15:13 -07:00
filesystems.c
fs_pin.c
fs_struct.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
fs-writeback.c writeback: fix memory leak in wb_queue_work() 2017-03-13 08:27:34 -06:00
inode.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
internal.h Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-09 09:12:53 -07:00
ioctl.c sched/headers: Prepare for the reduction of <linux/sched.h>'s signal API dependency 2017-03-02 08:42:37 +01:00
iomap.c fs: semove set but not checked AOP_FLAG_UNINTERRUPTIBLE flag 2017-05-08 17:15:14 -07:00
Kconfig
Kconfig.binfmt
libfs.c fs: constify tree_descr arrays passed to simple_fill_super() 2017-04-26 23:54:06 -04:00
locks.c locks: Set FL_CLOSE when removing flock locks on close() 2017-04-21 10:45:01 -04:00
Makefile
mbcache.c
mount.h fsnotify: Free fsnotify_mark_connector when there is no mark attached 2017-04-10 17:37:35 +02:00
mpage.c fs: add i_blocksize() 2017-02-27 18:43:46 -08:00
namei.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
namespace.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
no-block.c
nsfs.c ns: allow ns_entries to have custom symlink content 2017-05-08 17:15:12 -07:00
open.c Merge branch 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-12 11:39:59 -07:00
pipe.c
pnode.c mnt: Tuck mounts under others instead of creating shadow/side mounts. 2017-02-04 00:01:06 +13:00
pnode.h mnt: Tuck mounts under others instead of creating shadow/side mounts. 2017-02-04 00:01:06 +13:00
posix_acl.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
proc_namespace.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
read_write.c move compat_rw_copy_check_uvector() over to fs/read_write.c 2017-04-17 12:52:26 -04:00
readdir.c readdir: move compat syscalls from compat.c 2017-04-17 12:52:24 -04:00
select.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
seq_file.c mm: introduce kv[mz]alloc helpers 2017-05-08 17:15:12 -07:00
signalfd.c mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU 2017-04-18 11:42:36 -07:00
splice.c Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:38:06 -07:00
stack.c
stat.c Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:54:26 -07:00
statfs.c statfs: move compat syscalls from compat.c 2017-04-17 12:52:23 -04:00
super.c bdi: Drop 'parent' argument from bdi_register[_va]() 2017-04-20 12:09:55 -06:00
sync.c vfs: use helper for calling f_op->fsync() 2017-02-20 16:51:23 +01:00
timerfd.c timerfd: Only check CAP_WAKE_ALARM when it is needed 2017-03-01 12:53:44 +01:00
userfaultfd.c userfaultfd: report actual registered features in fdinfo 2017-04-08 00:47:48 -07:00
utimes.c utimes: move compat syscalls from compat.c 2017-04-17 12:52:23 -04:00
xattr.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00