linux

mirror of https://github.com/FEX-Emu/linux.git synced 2024-12-15 13:22:55 +00:00

Author	SHA1	Message	Date
Al Viro	1fa97e8b1f	constify file_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:39:00 -04:00
Al Viro	19d860a140	handle suicide on late failure exits in execve() in search_binary_handler() ... rather than doing that in the guts of ->load_binary(). [updated to fix the bug spotted by Shentino - for SIGSEGV we really need something stronger than send_sig_info(); again, better do that in one place] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:39:00 -04:00
Al Viro	2926620145	dcache.c: call ->d_prune() regardless of d_unhashed() the only in-tree instance checks d_unhashed() anyway, out-of-tree code can preserve the current behaviour by adding such check if they want it and we get an ability to use it in cases where we want to be notified of killing being inevitable before ->d_lock is dropped, whether it's unhashed or not. In particular, autofs would benefit from that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:59 -04:00
Al Viro	29355c3904	d_prune_alias(): just lock the parent and call __dentry_kill() The only reason for games with ->d_prune() was __d_drop(), which was needed only to force dput() into killing the sucker off. Note that lock_parent() can be called under ->i_lock and won't drop it, so dentry is safe from somebody managing to kill it under us - it won't happen while we are holding ->i_lock. __dentry_kill() is called only with ->d_lockref.count being 0 (here and when picked from shrink list) or 1 (dput() and dropping the ancestors in shrink_dentry_list()), so it will never be called twice - the first thing it's doing is making ->d_lockref.count negative and once that happens, nothing will increment it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:59 -04:00
Eric W. Biederman	bbd5192412	proc: Update proc_flush_task_mnt to use d_invalidate Now that d_invalidate always succeeds and flushes mount points use it in stead of a combination of shrink_dcache_parent and d_drop in proc_flush_task_mnt. This removes the danger of a mount point under /proc/<pid>/... becoming unreachable after the d_drop. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:58 -04:00
Eric W. Biederman	c143c2333c	vfs: Remove d_drop calls from d_revalidate implementations Now that d_invalidate always succeeds it is not longer necessary or desirable to hard code d_drop calls into filesystem specific d_revalidate implementations. Remove the unnecessary d_drop calls and rely on d_invalidate to drop the dentries. Using d_invalidate ensures that paths to mount points will not be dropped. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:58 -04:00
Eric W. Biederman	5542aa2fa7	vfs: Make d_invalidate return void Now that d_invalidate can no longer fail, stop returning a useless return code. For the few callers that checked the return code update remove the handling of d_invalidate failure. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:57 -04:00
Eric W. Biederman	1ffe46d11c	vfs: Merge check_submounts_and_drop and d_invalidate Now that d_invalidate is the only caller of check_submounts_and_drop, expand check_submounts_and_drop inline in d_invalidate. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:57 -04:00
Eric W. Biederman	9b053f3207	vfs: Remove unnecessary calls of check_submounts_and_drop Now that check_submounts_and_drop can not fail and is called from d_invalidate there is no longer a need to call check_submounts_and_drom from filesystem d_revalidate methods so remove it. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:56 -04:00
Eric W. Biederman	8ed936b567	vfs: Lazily remove mounts on unlinked files and directories. With the introduction of mount namespaces and bind mounts it became possible to access files and directories that on some paths are mount points but are not mount points on other paths. It is very confusing when rm -rf somedir returns -EBUSY simply because somedir is mounted somewhere else. With the addition of user namespaces allowing unprivileged mounts this condition has gone from annoying to allowing a DOS attack on other users in the system. The possibility for mischief is removed by updating the vfs to support rename, unlink and rmdir on a dentry that is a mountpoint and by lazily unmounting mountpoints on deleted dentries. In particular this change allows rename, unlink and rmdir system calls on a dentry without a mountpoint in the current mount namespace to succeed, and it allows rename, unlink, and rmdir performed on a distributed filesystem to update the vfs cache even if when there is a mount in some namespace on the original dentry. There are two common patterns of maintaining mounts: Mounts on trusted paths with the parent directory of the mount point and all ancestory directories up to / owned by root and modifiable only by root (i.e. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu, cpuacct, ...}, /usr, /usr/local). Mounts on unprivileged directories maintained by fusermount. In the case of mounts in trusted directories owned by root and modifiable only by root the current parent directory permissions are sufficient to ensure a mount point on a trusted path is not removed or renamed by anyone other than root, even if there is a context where the there are no mount points to prevent this. In the case of mounts in directories owned by less privileged users races with users modifying the path of a mount point are already a danger. fusermount already uses a combination of chdir, /proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races. The removable of global rename, unlink, and rmdir protection really adds nothing new to consider only a widening of the attack window, and fusermount is already safe against unprivileged users modifying the directory simultaneously. In principle for perfect userspace programs returning -EBUSY for unlink, rmdir, and rename of dentires that have mounts in the local namespace is actually unnecessary. Unfortunately not all userspace programs are perfect so retaining -EBUSY for unlink, rmdir and rename of dentries that have mounts in the current mount namespace plays an important role of maintaining consistency with historical behavior and making imperfect userspace applications hard to exploit. v2: Remove spurious old_dentry. v3: Optimized shrink_submounts_and_drop Removed unsued afs label v4: Simplified the changes to check_submounts_and_drop Do not rename check_submounts_and_drop shrink_submounts_and_drop Document what why we need atomicity in check_submounts_and_drop Rely on the parent inode mutex to make d_revalidate and d_invalidate an atomic unit. v5: Refcount the mountpoint to detach in case of simultaneous renames. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:56 -04:00
Eric W. Biederman	80b5dce8c5	vfs: Add a function to lazily unmount all mounts from any dentry. The new function detach_mounts comes in two pieces. The first piece is a static inline test of d_mounpoint that returns immediately without taking any locks if d_mounpoint is not set. In the common case when mountpoints are absent this allows the vfs to continue running with it's same cacheline foot print. The second piece of detach_mounts __detach_mounts actually does the work and it assumes that a mountpoint is present so it is slow and takes namespace_sem for write, and then locks the mount hash (aka mount_lock) after a struct mountpoint has been found. With those two locks held each entry on the list of mounts on a mountpoint is selected and lazily unmounted until all of the mount have been lazily unmounted. v7: Wrote a proper change description and removed the changelog documenting deleted wrong turns. Signed-off-by: Eric W. Biederman <ebiederman@twitter.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:55 -04:00
Eric W. Biederman	e2dfa93546	vfs: factor out lookup_mountpoint from new_mountpoint I am shortly going to add a new user of struct mountpoint that needs to look up existing entries but does not want to create a struct mountpoint if one does not exist. Therefore to keep the code simple and easy to read split out lookup_mountpoint from new_mountpoint. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:55 -04:00
Eric W. Biederman	0a5eb7c818	vfs: Keep a list of mounts on a mount point To spot any possible problems call BUG if a mountpoint is put when it's list of mounts is not empty. AV: use hlist instead of list_head Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Eric W. Biederman <ebiederman@twitter.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:54 -04:00
Eric W. Biederman	7af1364ffa	vfs: Don't allow overwriting mounts in the current mount namespace In preparation for allowing mountpoints to be renamed and unlinked in remote filesystems and in other mount namespaces test if on a dentry there is a mount in the local mount namespace before allowing it to be renamed or unlinked. The primary motivation here are old versions of fusermount unmount which is not safe if the a path can be renamed or unlinked while it is verifying the mount is safe to unmount. More recent versions are simpler and safer by simply using UMOUNT_NOFOLLOW when unmounting a mount in a directory owned by an arbitrary user. Miklos Szeredi <miklos@szeredi.hu> reports this is approach is good enough to remove concerns about new kernels mixed with old versions of fusermount. A secondary motivation for restrictions here is that it removing empty directories that have non-empty mount points on them appears to violate the rule that rmdir can not remove empty directories. As Linus Torvalds pointed out this is useful for programs (like git) that test if a directory is empty with rmdir. Therefore this patch arranges to enforce the existing mount point semantics for local mount namespace. v2: Rewrote the test to be a drop in replacement for d_mountpoint v3: Use bool instead of int as the return type of is_local_mountpoint Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:54 -04:00
Eric W. Biederman	bafc9b754f	vfs: More precise tests in d_invalidate The current comments in d_invalidate about what and why it is doing what it is doing are wildly off-base. Which is not surprising as the comments date back to last minute bug fix of the 2.2 kernel. The big fat lie of a comment said: If it's a directory, we can't drop it for fear of somebody re-populating it with children (even though dropping it would make it unreachable from that root, we still might repopulate it if it was a working directory or similar). [AV] What we really need to avoid is multiple dentry aliases of the same directory inode; on all filesystems that have ->d_revalidate() we either declare all positive dentries always valid (and thus never fed to d_invalidate()) or use d_materialise_unique() and/or d_splice_alias(), which take care of alias prevention. The current rules are: - To prevent mount point leaks dentries that are mount points or that have childrent that are mount points may not be be unhashed. - All dentries may be unhashed. - Directories may be rehashed with d_materialise_unique check_submounts_and_drop implements this already for well maintained remote filesystems so implement the current rules in d_invalidate by just calling check_submounts_and_drop. The one difference between d_invalidate and check_submounts_and_drop is that d_invalidate must respect it when a d_revalidate method has earlier called d_drop so preserve the d_unhashed check in d_invalidate. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:54 -04:00
Eric W. Biederman	3ccb354d64	vfs: Document the effect of d_revalidate on d_find_alias d_drop or check_submounts_and_drop called from d_revalidate can result in renamed directories with child dentries being unhashed. These renamed and drop directory dentries can be rehashed after d_materialise_unique uses d_find_alias to find them. Reviewed-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:53 -04:00
Al Viro	9ea459e110	delayed mntput On final mntput() we want fs shutdown to happen before return to userland; however, the only case where we want it happen right there (i.e. where task_work_add won't do) is MNT_INTERNAL victim. Those have to be fully synchronous - failure halfway through module init might count on having vfsmount killed right there. Fortunately, final mntput on MNT_INTERNAL vfsmounts happens on shallow stack. So we handle those synchronously and do an analog of delayed fput logics for everything else. As the result, we are guaranteed that fs shutdown will always happen on shallow stack. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:53 -04:00
Ian Kent	b3ca406f27	autofs - remove obsolete d_invalidate() from expire Biederman's umount-on-rmdir series changes d_invalidate() to sumarily remove mounts under the passed in dentry regardless of whether they are busy or not. So calling this in fs/autofs4/expire.c:autofs4_tree_busy() is definitely the wrong thing to do becuase it will silently umount entries instead of just cleaning stale dentrys. But this call shouldn't be needed and testing shows that automounting continues to function without it. As Al Viro correctly surmises the original intent of the call was to perform what shrink_dcache_parent() does. If at some time in the future I see stale dentries accumulating following failed mounts I'll revisit the issue and possibly add a shrink_dcache_parent() call if needed. Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:52 -04:00
Al Viro	8d85b4845a	Allow sharing external names after __d_move() * external dentry names get a small structure prepended to them (struct external_name). * it contains an atomic refcount, matching the number of struct dentry instances that have ->d_name.name pointing to that external name. The first thing free_dentry() does is decrementing refcount of external name, so the instances that are between the call of free_dentry() and RCU-delayed actual freeing do not contribute. * __d_move(x, y, false) makes the name of x equal to the name of y, external or not. If y has an external name, extra reference is grabbed and put into x->d_name.name. If x used to have an external name, the reference to the old name is dropped and, should it reach zero, freeing is scheduled via kfree_rcu(). * free_dentry() in dentry with external name decrements the refcount of that name and, should it reach zero, does RCU-delayed call that will free both the dentry and external name. Otherwise it does what it used to do, except that __d_free() doesn't even look at ->d_name.name; it simply frees the dentry. All non-RCU accesses to dentry external name are safe wrt freeing since they all should happen before free_dentry() is called. RCU accesses might run into a dentry seen by free_dentry() or into an old name that got already dropped by __d_move(); however, in both cases dentry must have been alive and refer to that name at some point after we'd done rcu_read_lock(), which means that any freeing must be still pending. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-10-09 02:38:41 -04:00
Al Viro	6d13f69444	missing data dependency barrier in prepend_name() AFAICS, prepend_name() is broken on SMP alpha. Disclaimer: I don't have SMP alpha boxen to reproduce it on. However, it really looks like the race is real. CPU1: d_path() on /mnt/ramfs/<255-character>/foo CPU2: mv /mnt/ramfs/<255-character> /mnt/ramfs/<63-character> CPU2 does d_alloc(), which allocates an external name, stores the name there including terminating NUL, does smp_wmb() and stores its address in dentry->d_name.name. It proceeds to d_add(dentry, NULL) and d_move() old dentry over to that. ->d_name.name value ends up in that dentry. In the meanwhile, CPU1 gets to prepend_name() for that dentry. It fetches ->d_name.name and ->d_name.len; the former ends up pointing to new name (64-byte kmalloc'ed array), the latter - 255 (length of the old name). Nothing to force the ordering there, and normally that would be OK, since we'd run into the terminating NUL and stop. Except that it's alpha, and we'd need a data dependency barrier to guarantee that we see that store of NUL __d_alloc() has done. In a similar situation dentry_cmp() would survive; it does explicit smp_read_barrier_depends() after fetching ->d_name.name. prepend_name() doesn't and it risks walking past the end of kmalloc'ed object and possibly oops due to taking a page fault in kernel mode. Cc: stable@vger.kernel.org # 3.12+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-29 14:46:30 -04:00
Linus Torvalds	1e3827bf8a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes + unifying __d_move() and __d_materialise_dentry() + minimal regression fix for d_path() of victims of overwriting rename() ported on top of that" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Don't exchange "short" filenames unconditionally. fold swapping ->d_name.hash into switch_names() fold unlocking the children into dentry_unlock_parents_for_move() kill __d_materialise_dentry() __d_materialise_dentry(): flip the order of arguments __d_move(): fold manipulations with ->d_child/->d_subdirs don't open-code d_rehash() in d_materialise_unique() pull rehashing and unlocking the target dentry into __d_materialise_dentry() ufs: deal with nfsd/iget races fuse: honour max_read and max_write in direct_io mode shmem: fix nlink for rename overwrite directory	2014-09-27 17:05:14 -07:00
Linus Torvalds	6111da3432	Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "This is quite late but these need to be backported anyway. This is the fix for a long-standing cpuset bug which existed from 2009. cpuset makes use of PF_SPREAD_{PAGE\|SLAB} flags to modify the task's memory allocation behavior according to the settings of the cpuset it belongs to; unfortunately, when those flags have to be changed, cpuset did so directly even whlie the target task is running, which is obviously racy as task->flags may be modified by the task itself at any time. This obscure bug manifested as corrupt PF_USED_MATH flag leading to a weird crash. The bug is fixed by moving the flag to task->atomic_flags. The first two are prepatory ones to help defining atomic_flags accessors and the third one is the actual fix" * 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags sched: add macros to define bitops for task atomic flags sched: fix confusing PFA_NO_NEW_PRIVS constant	2014-09-27 16:45:33 -07:00
Linus Torvalds	8369289864	ARM: SoC fixes for 3.17 Here's our last set of fixes for 3.17. Most of these are for TI platforms, fixing some noisy Kconfig issues, runtime clock and power issues on several platforms and NAND timings on DRA7. There are also a couple of bug fixes for i.MX, one for QCOM and a small fix to avoid section mismatch noise on PXA. Diffstat looks large, partially due to some tables being updated and thus touching many lines. The qcom gsbi change also restructures clock management a bit and thus touches a bunch of lines. All in all, a bit more changes than we'd like at this point, but nothing stands out as risky either so it seems like the right thing to send it up now instead of holding it to the merge window. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJUJxVMAAoJEIwa5zzehBx3VVoP/3WeftI/+vncYhMmPCaUxOso B/rNY1CW2ZYr9yWEvREQtMQCkLWYPifeyHa+fXHeFfLGWlMP1wU4LP78RrvaMnSs V0d2wYmfTkSIlVwqRMuArY9KwnOTRSiDfhQpl2BQ84u1IaZM5/IRw9oNICTao8jI A7NsLAnss3exKCT06R3CcG7+fq3zVc19aI1QJG61BFqTIVItf71NTm/lcjsL3Tss Tr/ITTgZM6UGkEnTUuRCl3gpMn/TVvO/qE94xU6vY0jqDQKUl1cxUCx6gRcSDRu4 PvLvPS7d4p99dHmLxVUuLBT7AGtRCxfdAoVE3D3rmGfcthDt1nFBgJfp6ekQZAM9 ZfJnrvfHRLjl/lxQvWWkpuugu0z7GCFeXRFHN6aLsD6aRD4JmYoRuSeA0aXmTKyp oDcduXqYOImTcbUQ8G8n1YeK8BAVlL6PEZKvaIhjmxUWHVeGdpesz9s7TFBqGBBd F1EeCPtAczBpNJP4E/dRDzWYjp+lGyQs4dQEU+YpRe9drzJpw6GsDuaF78QP8A5a TEcc3y3o2FSNbGCw9qQ7pkgm76aS1YhLKMQb+2JXJptgwKMw3G6abMr+iomlm3Id DY8+WIBggx/gB5k/onFseZvjNxVKqQUeh31UT5e1v/9M4bCJvEcY+KeKcgjbPpy7 GnGoXEvCnwZ7kPokqH0D =K6xV -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Here's our last set of fixes for 3.17. Most of these are for TI platforms, fixing some noisy Kconfig issues, runtime clock and power issues on several platforms and NAND timings on DRA7. There are also a couple of bug fixes for i.MX, one for QCOM and a small fix to avoid section mismatch noise on PXA. Diffstat looks large, partially due to some tables being updated and thus touching many lines. The qcom gsbi change also restructures clock management a bit and thus touches a bunch of lines. All in all, a bit more changes than we'd like at this point, but nothing stands out as risky either so it seems like the right thing to send it up now instead of holding it to the merge window" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: drivers/soc: qcom: do not disable the iface clock in probe ARM: imx: fix .is_enabled() of shared gate clock ARM: OMAP3: Fix I/O chain clock line assertion timed out error ARM: keystone: dts: fix bindings for pcie and usb clock nodes bus: omap_l3_noc: Fix connID for OMAP4 ARM: DT: imx53: fix lvds channel 1 port ARM: dts: cm-t54: fix serial console power supply. ARM: dts: dra7-evm: Fix NAND GPMC timings ARM: pxa: fix section mismatch warning for pxa_timer_nodt_init ARM: OMAP: Fix Kconfig warning for omap1	2014-09-27 14:58:59 -07:00
Linus Torvalds	74807afd3f	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS fixes from Ralf Baechle: "The final round of fixes. One corner case in the math emulator and another one in the mcount function for ftrace" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: mcount: Adjust stack pointer for static trace in MIPS32 MIPS: Fix MFC1 & MFHC1 emulation for 64-bit MIPS systems	2014-09-27 14:42:18 -07:00
Linus Torvalds	cd40fab6db	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This has: - EFI revert to fix a boot regression - early_ioremap() fix for boot failure - KASLR fix for possible boot failures - EFI fix for corrupted string printing - remove a misleading EFI bootup 'failed!' error message Unfortunately it's all rather close to the merge window" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Truncate 64-bit values when calling 32-bit OutputString() x86/efi: Delete misleading efi_printk() error message Revert "efi/x86: efistub: Move shared dependencies to <asm/efi.h>" x86/kaslr: Avoid the setup_data area when picking location x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8	2014-09-27 14:23:13 -07:00
Mikhail Efremov	d2fa4a8476	vfs: Don't exchange "short" filenames unconditionally. Only exchange source and destination filenames if flags contain RENAME_EXCHANGE. In case if executable file was running and replaced by other file /proc/PID/exe should still show correct file name, not the old name of the file by which it was replaced. The scenario when this bug manifests itself was like this: * ALT Linux uses rpm and start-stop-daemon; * during a package upgrade rpm creates a temporary file for an executable to rename it upon successful unpacking; * start-stop-daemon is run subsequently and it obtains the (nonexistant) temporary filename via /proc/PID/exe thus failing to identify the running process. Note that "long" filenames (> DNAiME_INLINE_LEN) are still exchanged without RENAME_EXCHANGE and this behaviour exists long enough (should be fixed too apparently). So this patch is just an interim workaround that restores behavior for "short" names as it was before changes introduced by commit `da1ce0670c` ("vfs: add cross-rename"). See https://lkml.org/lkml/2014/9/7/6 for details. AV: the comments about being more careful with ->d_name.hash than with ->d_name.name are from back in 2.3.40s; they became obsolete by 2.3.60s, when we started to unhash the target instead of swapping hash chain positions followed by d_delete() as we used to do when dcache was first introduced. Acked-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `da1ce0670c` "vfs: add cross-rename" Signed-off-by: Mikhail Efremov <sem@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-27 15:59:39 -04:00
Linus Torvalds	a28ddb87cd	fold swapping ->d_name.hash into switch_names() and do it along with ->d_name.len there Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-27 15:59:11 -04:00
Al Viro	986c01942a	fold unlocking the children into dentry_unlock_parents_for_move() ... renaming it into dentry_unlock_for_move() and making it more symmetric with dentry_lock_for_move(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 23:11:15 -04:00
Al Viro	63cf427a57	kill __d_materialise_dentry() it folds into __d_move() now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 23:06:14 -04:00
Al Viro	4453641fe8	__d_materialise_dentry(): flip the order of arguments ... thus making it much closer to (now unreachable, BTW) IS_ROOT(dentry) case in __d_move(). A bit more and it'll fold in. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 22:54:02 -04:00
Al Viro	9d8cd306a8	__d_move(): fold manipulations with ->d_child/->d_subdirs list_del() + list_add() is a slightly pessimised list_move() list_del() + INIT_LIST_HEAD() is a slightly pessimised list_del_init() Interleaving those makes the resulting code even worse. And harder to follow... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:34:01 -04:00
Al Viro	8527dd7187	don't open-code d_rehash() in d_materialise_unique() ... and get rid of duplicate BUG_ON() there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:26:50 -04:00
Al Viro	5cc3821b57	pull rehashing and unlocking the target dentry into __d_materialise_dentry() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:25:35 -04:00
Al Viro	e4502c63f5	ufs: deal with nfsd/iget races Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:17:52 -04:00
Miklos Szeredi	2c80929c4c	fuse: honour max_read and max_write in direct_io mode The third argument of fuse_get_user_pages() "nbytesp" refers to the number of bytes a caller asked to pack into fuse request. This value may be lesser than capacity of fuse request or iov_iter. So fuse_get_user_pages() must ensure that *nbytesp won't grow. Now, when helper iov_iter_get_pages() performs all hard work of extracting pages from iov_iter, it can be done by passing properly calculated "maxsize" to the helper. The other caller of iov_iter_get_pages() (dio_refill_pages()) doesn't need this capability, so pass LONG_MAX as the maxsize argument here. Fixes: `c9c37e2e63` ("fuse: switch to iov_iter_get_pages()") Reported-by: Werner Baumann <werner.baumann@onlinehome.de> Tested-by: Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:16:51 -04:00
Miklos Szeredi	b928095b0a	shmem: fix nlink for rename overwrite directory If overwriting an empty directory with rename, then need to drop the extra nlink. Test prog: #include <stdio.h> #include <fcntl.h> #include <err.h> #include <sys/stat.h> int main(void) { const char test_dir1 = "test-dir1"; const char test_dir2 = "test-dir2"; int res; int fd; struct stat statbuf; res = mkdir(test_dir1, 0777); if (res == -1) err(1, "mkdir(\"%s\")", test_dir1); res = mkdir(test_dir2, 0777); if (res == -1) err(1, "mkdir(\"%s\")", test_dir2); fd = open(test_dir2, O_RDONLY); if (fd == -1) err(1, "open(\"%s\")", test_dir2); res = rename(test_dir1, test_dir2); if (res == -1) err(1, "rename(\"%s\", \"%s\")", test_dir1, test_dir2); res = fstat(fd, &statbuf); if (res == -1) err(1, "fstat(%i)", fd); if (statbuf.st_nlink != 0) { fprintf(stderr, "nlink is %lu, should be 0\n", statbuf.st_nlink); return 1; } return 0; } Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-09-26 21:16:42 -04:00
Linus Torvalds	c6ff6486e5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fix from Dmitry Torokhov: "A small fixup to i8042 adding Asus X450LCP to the nomux list" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: i8042 - fix Asus X450LCP touchpad detection	2014-09-26 11:04:31 -07:00
Linus Torvalds	d2865c7d4e	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A CONFIG_STACK_GROWSUP=y fix, and a hotplug llc CPU mask fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix unreleased llc_shared_mask bit during CPU hotplug sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP	2014-09-26 08:38:09 -07:00
Linus Torvalds	8207649c41	Merge branch 'akpm' (fixes from Andrew Morton) Merge fixes from Andrew Morton: "9 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: softdirty: keep bit when zapping file pte fs/cachefiles: add missing \n to kerror conversions genalloc: fix device node resource counter drivers/rtc/rtc-efi.c: add missing module alias mm, slab: initialize object alignment on cache creation mm: softdirty: addresses before VMAs in PTE holes aren't softdirty ocfs2/dlm: do not get resource spinlock if lockres is new nilfs2: fix data loss with mmap() ocfs2: free vol_label in ocfs2_delete_osb()	2014-09-26 08:11:43 -07:00
Peter Feiner	dbab31aa2c	mm: softdirty: keep bit when zapping file pte This fixes the same bug as `b43790eedd` ("mm: softdirty: don't forget to save file map softdiry bit on unmap") and `9aed8614af` ("mm/memory.c: don't forget to set softdirty on file mapped fault") where the return value of pte_*mksoft_dirty was being ignored. To be sure that no other pte/pmd "mk" function return values were being ignored, I annotated the functions in arch/x86/include/asm/pgtable.h with __must_check and rebuilt. The userspace effect of this bug is that the softdirty mark might be lost if a file mapped pte get zapped. Signed-off-by: Peter Feiner <pfeiner@google.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Jamie Liu <jamieliu@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:35 -07:00
Fabian Frederick	6ff66ac77a	fs/cachefiles: add missing \n to kerror conversions Commit `0227d6abb3` ("fs/cachefiles: replace kerror by pr_err") didn't include newline featuring in original kerror definition Signed-off-by: Fabian Frederick <fabf@skynet.be> Reported-by: David Howells <dhowells@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> [3.16.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:35 -07:00
Vladimir Zapolskiy	6f3aabd183	genalloc: fix device node resource counter Decrement the np_pool device_node refcount, which was incremented on the preceding of_parse_phandle() call. Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Olof Johansson <olof@lixom.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:35 -07:00
Pali Rohár	451ff6d409	drivers/rtc/rtc-efi.c: add missing module alias Without proper alias kernel module is not loaded for rtc-efi driver. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Cc: dann frazier <dannf@dannf.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:35 -07:00
David Rientjes	d4a5fca592	mm, slab: initialize object alignment on cache creation Since commit `4590685546` ("mm/sl[aou]b: Common alignment code"), the "ralign" automatic variable in __kmem_cache_create() may be used as uninitialized. The proper alignment defaults to BYTES_PER_WORD and can be overridden by SLAB_RED_ZONE or the alignment specified by the caller. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=85031 Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Andrei Elovikov <a.elovikov@gmail.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:35 -07:00
Peter Feiner	87e6d49a00	mm: softdirty: addresses before VMAs in PTE holes aren't softdirty In PTE holes that contain VM_SOFTDIRTY VMAs, unmapped addresses before VM_SOFTDIRTY VMAs are reported as softdirty by /proc/pid/pagemap. This bug was introduced in commit `68b5a65248` ("mm: softdirty: respect VM_SOFTDIRTY in PTE holes"). That commit made /proc/pid/pagemap look at VM_SOFTDIRTY in PTE holes but neglected to observe the start of VMAs returned by find_vma. Tested: Wrote a selftest that creates a PMD-sized VMA then unmaps the first page and asserts that the page is not softdirty. I'm going to send the pagemap selftest in a later commit. Signed-off-by: Peter Feiner <pfeiner@google.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Jamie Liu <jamieliu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:35 -07:00
Joseph Qi	5760a97c71	ocfs2/dlm: do not get resource spinlock if lockres is new There is a deadlock case which reported by Guozhonghua: https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html This case is caused by &res->spinlock and &dlm->master_lock misordering in different threads. It was introduced by commit `8d400b81cc` ("ocfs2/dlm: Clean up refmap helpers"). Since lockres is new, it doesn't not require the &res->spinlock. So remove it. Fixes: `8d400b81cc` ("ocfs2/dlm: Clean up refmap helpers") Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Reported-by: Guozhonghua <guozhonghua@h3c.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:34 -07:00
Andreas Rohner	56d7acc792	nilfs2: fix data loss with mmap() This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: ----------------[BEGIN SCRIPT]-------------------- mkfs.nilfs2 -f /dev/sdb mount /dev/sdb /mnt dd if=/dev/zero bs=1M count=30 of=/mnt/testfile umount /mnt mount /dev/sdb /mnt CHECKSUM_BEFORE="$(md5sum /mnt/testfile)" /root/mmaptest/mmaptest /mnt/testfile 30 10 5 sync CHECKSUM_AFTER="$(md5sum /mnt/testfile)" umount /mnt mount /dev/sdb /mnt CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)" umount /mnt echo "BEFORE MMAP:\t$CHECKSUM_BEFORE" echo "AFTER MMAP:\t$CHECKSUM_AFTER" echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT" ----------------[END SCRIPT]-------------------- The mmaptest tool looks something like this (very simplified, with error checking removed): ----------------[BEGIN mmaptest]-------------------- data = mmap(NULL, file_size - file_offset, PROT_READ \| PROT_WRITE, MAP_SHARED, fd, file_offset); for (i = 0; i < write_count; ++i) { memcpy(data + i * 4096, buf, sizeof(buf)); msync(data, file_size - file_offset, MS_SYNC)) } ----------------[END mmaptest]-------------------- The output of the script looks something like this: BEFORE MMAP: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile So it is clear, that the changes done using mmap() do not survive a remount. This can be reproduced a 100% of the time. The problem was introduced in commit `136e8770cd` ("nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary"). If the page was read with mpage_readpage() or mpage_readpages() for example, then it has no buffers attached to it. In that case page_has_buffers(page) in nilfs_set_page_dirty() will be false. Therefore nilfs_set_file_dirty() is never called and the pages are never collected and never written to disk. This patch fixes the problem by also calling nilfs_set_file_dirty() if the page has no buffers attached to it. [akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/] Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net> Tested-by: Andreas Rohner <andreas.rohner@gmx.net> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:34 -07:00
Joseph Qi	f13a568e5a	ocfs2: free vol_label in ocfs2_delete_osb() osb->vol_label is malloced in ocfs2_initialize_super but not freed if error occurs or during umount, thus causing a memory leak. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-09-26 08:10:34 -07:00
Markos Chandras	8a574cfa26	MIPS: mcount: Adjust stack pointer for static trace in MIPS32 Every mcount() call in the MIPS 32-bit kernel is done as follows: [...] move at, ra jal _mcount addiu sp, sp, -8 [...] but upon returning from the mcount() function, the stack pointer is not adjusted properly. This is explained in details in `58b69401c7` (MIPS: Function tracer: Fix broken function tracing). Commit `ad8c396936` ("MIPS: Unbreak function tracer for 64-bit kernel.) fixed the stack manipulation for 64-bit but it didn't fix it completely for MIPS32. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: <stable@vger.kernel.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7792/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2014-09-26 11:41:17 +02:00
Paul Burton	c8c0da6bdf	MIPS: Fix MFC1 & MFHC1 emulation for 64-bit MIPS systems Commit `bbd426f542` "MIPS: Simplify FP context access" modified the SIFROMREG & SIFROMHREG macros such that they return unsigned rather than signed 32b integers. I had believed that to be fine, but inadvertently missed the MFC1 & MFHC1 cases which write to a struct pt_regs regs element. On MIPS32 this is fine, but on 64 bit those saved regs' fields are 64 bit wide. Using unsigned values caused the 32 bit value from the FP register to be zero rather than sign extended as the architecture specifies, causing incorrect emulation of the MFC1 & MFHc1 instructions. Fix by reintroducing the casts to signed integers, and therefore the sign extension. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: stable@vger.kernel.org # v3.15+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/7848/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2014-09-26 11:33:11 +02:00

1 2 3 4 5 ...

469892 Commits