linux/fs
Miklos Szeredi eaf5f90735 fix shrink_dcache_parent() livelock
Two (or more) concurrent calls of shrink_dcache_parent() on the same dentry may
cause shrink_dcache_parent() to loop forever.

Here's what appears to happen:

1 - CPU0: select_parent(P) finds C and puts it on dispose list, returns 1

2 - CPU1: select_parent(P) locks P->d_lock

3 - CPU0: shrink_dentry_list() locks C->d_lock
   dentry_kill(C) tries to lock P->d_lock but fails, unlocks C->d_lock

4 - CPU1: select_parent(P) locks C->d_lock,
         moves C from dispose list being processed on CPU0 to the new
dispose list, returns 1

5 - CPU0: shrink_dentry_list() finds dispose list empty, returns

6 - Goto 2 with CPU0 and CPU1 switched

Basically select_parent() steals the dentry from shrink_dentry_list() and thinks
it found a new one, causing shrink_dentry_list() to think it's making progress
and loop over and over.

One way to trigger this is to make udev calls stat() on the sysfs file while it
is going away.

Having a file in /lib/udev/rules.d/ with only this one rule seems to the trick:

ATTR{vendor}=="0x8086", ATTR{device}=="0x10ca", ENV{PCI_SLOT_NAME}="%k", ENV{MATCHADDR}="$attr{address}", RUN+="/bin/true"

Then execute the following loop:

while true; do
        echo -bond0 > /sys/class/net/bonding_masters
        echo +bond0 > /sys/class/net/bonding_masters
        echo -bond1 > /sys/class/net/bonding_masters
        echo +bond1 > /sys/class/net/bonding_masters
done

One fix would be to check all callers and prevent concurrent calls to
shrink_dcache_parent().  But I think a better solution is to stop the
stealing behavior.

This patch adds a new dentry flag that is set when the dentry is added to the
dispose list.  The flag is cleared in dentry_lru_del() in case the dentry gets a
new reference just before being pruned.

If the dentry has this flag, select_parent() will skip it and let
shrink_dentry_list() retry pruning it.  With select_parent() skipping those
dentries there will not be the appearance of progress (new dentries found) when
there is none, hence shrink_dcache_parent() will not loop forever.

Set the flag is also set in prune_dcache_sb() for consistency as suggested by
Linus.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-10 13:06:32 -05:00
..
9p 9p: propagate umode_t 2012-01-03 22:55:01 -05:00
adfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
affs affs: propagate umode_t 2012-01-03 22:55:04 -05:00
afs switch ->create() to umode_t 2012-01-03 22:54:53 -05:00
autofs4 vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
befs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
bfs switch ->create() to umode_t 2012-01-03 22:54:53 -05:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
cachefiles fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
ceph ceph: d_alloc_root() may fail 2012-01-09 16:36:12 -05:00
cifs Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
coda coda: switch coda_cnode_make() to sane API as well, clean coda_lookup() 2012-01-10 11:13:16 -05:00
configfs configfs: convert to umode_t 2012-01-03 22:54:57 -05:00
cramfs Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
debugfs Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
devpts devpts: fix double-free on mount failure 2012-01-08 20:19:03 -05:00
dlm net: remove ipv6_addr_copy() 2011-11-22 16:43:32 -05:00
ecryptfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
efs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
exofs Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2012-01-09 12:51:01 -08:00
exportfs
ext2 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ext3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-09 17:37:37 -08:00
fat Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
freevxfs fs: propagate umode_t, misc bits 2012-01-03 22:55:10 -05:00
fscache
fuse vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
gfs2 Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
hfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hfsplus vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hostfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hpfs switch ->mknod() to umode_t 2012-01-03 22:54:54 -05:00
hppfs vfs: for usbfs, etc. internal vfsmounts ->mnt_sb->s_root == ->mnt_root 2012-01-03 22:52:41 -05:00
hugetlbfs hugetlbfs: propagate umode_t 2012-01-03 22:55:05 -05:00
isofs isofs: inode leak on mount failure 2012-01-09 10:48:11 -05:00
jbd Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
jbd2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
jffs2 vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
jfs Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
lockd vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
logfs logfs: propagate umode_t 2012-01-03 22:55:06 -05:00
minix Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
ncpfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
nfs Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
nfs_common
nfsd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
nilfs2 Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
nls NLS: raname "maxlen" to "maxout" in UTF conversion routines 2011-11-26 19:58:47 -08:00
notify vfs: move fsnotify junk to struct mount 2012-01-03 22:57:12 -05:00
ntfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
ocfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
omfs omfs: propagate umode_t 2012-01-03 22:55:01 -05:00
openpromfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
proc Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
pstore pstore: gracefully handle NULL pstore_info functions 2011-11-18 13:49:00 -08:00
qnx4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
quota vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
ramfs pohmelfs: propagate umode_t 2012-01-03 22:55:07 -05:00
reiserfs Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
romfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
squashfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
sysfs sysfs: propagate umode_t 2012-01-03 22:55:03 -05:00
sysv vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
ubifs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
udf Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ufs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
xfs Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs 2012-01-09 12:50:15 -08:00
aio.c aio: allocate kiocbs in batches 2011-11-02 16:07:03 -07:00
anon_inodes.c
attr.c switch is_sxid() to umode_t 2012-01-03 22:55:11 -05:00
bad_inode.c switch ->mknod() to umode_t 2012-01-03 22:54:54 -05:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c binfmt_elf: fix PIE execution with randomization disabled 2011-11-02 16:06:58 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
binfmt_script.c
binfmt_som.c
bio-integrity.c
bio.c bio: change some signed vars to unsigned 2011-11-16 09:21:50 +01:00
block_dev.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
buffer.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
char_dev.c char_dev.c: fix up some whitespace errors 2011-12-13 11:18:17 -08:00
compat_binfmt_elf.c
compat_ioctl.c vfs: fix up ENOIOCTLCMD error handling 2012-01-05 15:40:12 -08:00
compat.c switch open and mkdir syscalls to umode_t 2012-01-03 22:55:19 -05:00
dcache.c fix shrink_dcache_parent() livelock 2012-01-10 13:06:32 -05:00
dcookies.c
direct-io.c
drop_caches.c
eventfd.c
eventpoll.c epoll: fix spurious lockdep warnings 2011-10-31 17:30:57 -07:00
exec.c trim fs/internal.h 2012-01-03 22:52:35 -05:00
fcntl.c
fhandle.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
fifo.c
file_table.c vfs: prevent remount read-only if pending removes 2012-01-06 23:20:13 -05:00
file.c
filesystems.c vfs: convert fs_supers to hlist 2012-01-03 22:52:39 -05:00
fs_struct.c
fs-writeback.c Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
generic_acl.c
inode.c vfs: count unlinked inodes 2012-01-06 23:20:12 -05:00
internal.h vfs: protect remounting superblock read-only 2012-01-06 23:20:12 -05:00
ioctl.c vfs: fix up ENOIOCTLCMD error handling 2012-01-05 15:40:12 -08:00
ioprio.c
Kconfig Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2012-01-09 12:51:01 -08:00
Kconfig.binfmt
libfs.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
locks.c vfs: fix handling of lock allocation failure in lease-break case 2011-12-26 10:25:26 -08:00
Makefile Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
mbcache.c
mount.h vfs: keep list of mounts for each superblock 2012-01-06 23:20:12 -05:00
mpage.c
namei.c Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
no-block.c
open.c switch security_path_chmod() to struct path * 2012-01-06 23:16:53 -05:00
pipe.c vfs: pipe.c is really non-modular 2012-01-03 22:52:41 -05:00
pnode.c vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
pnode.h vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
posix_acl.c
proc_namespace.c vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
read_write.c
read_write.h
readdir.c
select.c
seq_file.c constify seq_file stuff 2012-01-03 22:52:40 -05:00
signalfd.c
splice.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
stack.c filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
stat.c readlinkat: ensure we return ENOENT for the empty pathname for normal lookups 2011-11-02 12:53:42 +01:00
statfs.c vfs: new helper - vfs_ustat() 2012-01-03 22:53:07 -05:00
super.c vfs: prevent remount read-only if pending removes 2012-01-06 23:20:13 -05:00
sync.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
timerfd.c
utimes.c
xattr_acl.c
xattr.c vfs: mnt_drop_write_file() 2012-01-03 22:52:40 -05:00