15114 Commits

Author SHA1 Message Date
Eric Paris
52cef7555a inotify: seperate new watch creation updating existing watches
There is nothing known wrong with the inotify watch addition/modification
but this patch seperates the two code paths to make them each easy to
verify as correct.

Signed-off-by: Eric Paris <eparis@redhat.com>
2009-08-27 08:02:04 -04:00
Steven Whitehouse
307cf6e63c GFS2: Rename eattr.[ch] as xattr.[ch]
Use the more conventional name for the extended attribute
support code. Update all the places which care.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-26 18:51:04 +01:00
Steven Whitehouse
40b78a3223 GFS2: Clean up of extended attribute support
This has been on my list for some time. We need to change the way
in which we handle extended attributes to allow faster file creation
times (by reducing the number of transactions required) and the
extended attribute code is the main obstacle to this.

In addition to that, the VFS provides a way to demultiplex the xattr
calls which we ought to be using, rather than rolling our own. This
patch changes the GFS2 code to use that VFS feature and as a result
the code shrinks by a couple of hundred lines or so, and becomes
easier to read.

I'm planning on doing further clean up work in this area, but this
patch is a good start. The cleaned up code also uses the more usual
"xattr" shorthand, I plan to eliminate the use of "eattr" eventually
and in the mean time it serves as a flag as to which bits of the code
have been updated.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-26 18:41:32 +01:00
Linus Torvalds
e9cab24cf3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
  ext3: Improve error message that changing journaling mode on remount is not possible
  ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED
2009-08-25 09:47:36 -07:00
Trond Myklebust
7111dc7392 NFSv4: Fix an infinite looping problem with the nfs4_state_manager
Commit 76db6d9500caeaa774a3e32a997eba30bbdc176b (nfs41: add session setup
to the state manager) introduces an infinite loop possibility in the NFSv4
state manager. By first checking nfs4_has_session() before clearing the
NFS4CLNT_SESSION_SETUP flag, it allows for a situation where someone sets
that flag, but it never gets cleared, and so the state manager loops.

In fact commit c3fad1b1aaf850bf692642642ace7cd0d64af0a3 (nfs41: add session
reset to state manager) causes this to happen every time we get a network
partition error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-24 16:28:42 -07:00
Linus Torvalds
2584e7986f Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
  ocfs2/dlm: Wait on lockres instead of erroring cancel requests
  ocfs2: Add missing lock name
  ocfs2: Don't oops in ocfs2_kill_sb on a failed mount
  ocfs2: release the buffer head in ocfs2_do_truncate.
  ocfs2: Handle quota file corruption more gracefully
2009-08-24 14:41:28 -07:00
Hugh Dickins
353d5c30c6 mm: fix hugetlb bug due to user_shm_unlock call
2.6.30's commit 8a0bdec194c21c8fdef840989d0d7b742bb5d4bc removed
user_shm_lock() calls in hugetlb_file_setup() but left the
user_shm_unlock call in shm_destroy().

In detail:
Assume that can_do_hugetlb_shm() returns true and hence user_shm_lock()
is not called in hugetlb_file_setup(). However, user_shm_unlock() is
called in any case in shm_destroy() and in the following
atomic_dec_and_lock(&up->__count) in free_uid() is executed and if
up->__count gets zero, also cleanup_user_struct() is scheduled.

Note that sched_destroy_user() is empty if CONFIG_USER_SCHED is not set.
However, the ref counter up->__count gets unexpectedly non-positive and
the corresponding structs are freed even though there are live
references to them, resulting in a kernel oops after a lots of
shmget(SHM_HUGETLB)/shmctl(IPC_RMID) cycles and CONFIG_USER_SCHED set.

Hugh changed Stefan's suggested patch: can_do_hugetlb_shm() at the
time of shm_destroy() may give a different answer from at the time
of hugetlb_file_setup().  And fixed newseg()'s no_id error path,
which has missed user_shm_unlock() ever since it came in 2.6.9.

Reported-by: Stefan Huber <shuber2@gmail.com>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Tested-by: Stefan Huber <shuber2@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-24 12:53:01 -07:00
Jan Kara
3c4cec6527 ext3: Improve error message that changing journaling mode on remount is not possible
This patch makes the error message about changing journaling mode on remount
more descriptive. Some people are going to hit this error now due to commit
bbae8bcc49bc4d002221dab52c79a50a82e7cd1f if they configure a kernel to default
to data=writeback mode. The problem happens if they have data=ordered set for
the root filesystem in /etc/fstab but not in the kernel command line (and they
don't use initrd). Their filesystem then gets mounted as data=writeback by
kernel but then their boot fails because init scripts won't be able to remount
the filesystem rw. Better error message will hopefully make it easier for them
to find the error in their setup and bother us less with error reports :).

Signed-off-by: Jan Kara <jack@suse.cz>
2009-08-24 16:48:45 +02:00
Theodore Ts'o
6d41807614 ext3: Update Kconfig description of EXT3_DEFAULTS_TO_ORDERED
The old description for this configuration option was perhaps not
completely balanced in terms of describing the tradeoffs of using a
default of data=writeback vs. data=ordered.  Despite the fact that old
description very strongly recomended disabling this feature, all of
the major distributions have elected to preserve the existing 'legacy'
default, which is a strong hint that it perhaps wasn't telling the
whole story.

This revised description has been vetted by a number of ext3
developers as being better at informing the user about the tradeoffs
of enabling or disabling this configuration feature.

Cc: linux-ext4@vger.kernel.org
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
2009-08-24 16:48:32 +02:00
Bob Peterson
d34843d0c4 GFS2: Add "-o errors=panic|withdraw" mount options
This patch adds "-o errors=panic" and "-o errors=withdraw" to the
gfs2 mount options.  The "errors=withdraw" option is today's
current behaviour, meaning to withdraw from the file system if a
non-serious gfs2 error occurs.  The new "errors=panic" option
tells gfs2 to force a kernel panic if a non-serious gfs2 file
system error occurs.  This may be useful, for example, where
fabric-level fencing is used that has no way to reboot (such as
fence_scsi).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-24 10:44:18 +01:00
Roel Kluin
cd0120751d GFS2: jumping to wrong label?
Also a gfs2_glock_dq() is required here.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-24 10:41:44 +01:00
Mimi Zohar
6777d773a4 kernel_read: redefine offset type
vfs_read() offset is defined as loff_t, but kernel_read()
offset is only defined as unsigned long. Redefine
kernel_read() offset as loff_t.

Cc: stable@kernel.org
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-24 14:58:23 +10:00
Chuck Lever
5eecfde615 NFS: Handle a zero-length auth flavor list
Some releases of Linux rpc.mountd (nfs-utils 1.1.4 and later) return an
empty auth flavor list if no sec= was specified for the export.  This is
notably broken server behavior.

The new auth flavor list checking added in a recent commit rejects this
case.  The OpenSolaris client does too.

The broken mountd implementation is already widely deployed.  To avoid
a behavioral regression, the kernel's mount client skips flavor checking
(ie reverts to the pre-2.6.32 behavior) if mountd returns an empty
flavor list.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-23 23:43:57 -04:00
Linus Torvalds
8e9d78edea Re-introduce page mapping check in mark_buffer_dirty()
In commit a8e7d49aa7be728c4ae241a75a2a124cdcabc0c5 ("Fix race in
create_empty_buffers() vs __set_page_dirty_buffers()"), I removed a test
for a NULL page mapping unintentionally when some of the code inside
__set_page_dirty() was moved to the callers.

That removal generally didn't matter, since a filesystem would serialize
truncation (which clears the page mapping) against writing (which marks
the buffer dirty), so locking at a higher level (either per-page or an
inode at a time) should mean that the buffer page would be stable.  And
indeed, nothing bad seemed to happen.

Except it turns out that apparently reiserfs does something odd when
under load and writing out the journal, and we have a number of bugzilla
entries that look similar:

	http://bugzilla.kernel.org/show_bug.cgi?id=13556
	http://bugzilla.kernel.org/show_bug.cgi?id=13756
	http://bugzilla.kernel.org/show_bug.cgi?id=13876

and it looks like reiserfs depended on that check (the common theme
seems to be "data=journal", and a journal writeback during a truncate).

I suspect reiserfs should have some additional locking, but in the
meantime this should get us back to the pre-2.6.29 behavior.

Pattern-pointed-out-by: Roland Kletzing <devzero@web.de>
Cc: stable@kernel.org (2.6.29 and 2.6.30)
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-21 17:40:08 -07:00
Linus Torvalds
b57f92157e Merge branch 'btrfs' of git://git.kernel.dk/linux-2.6-block
* 'btrfs' of git://git.kernel.dk/linux-2.6-block:
  btrfs: fix inode rbtree corruption
2009-08-21 09:56:55 -07:00
From: Nick Piggin
03e860bd9f btrfs: fix inode rbtree corruption
Node may not be inserted over existing node. This causes inode tree
corruption and I was seeing crashes in inode_tree_del which I can not
reproduce after this patch.

The other way to fix this would be to tie inode lifetime in the rbtree
with inode while not in freeing state. I had a look at this but it is
not so trivial at this point. At least this patch gets things working again.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Acked-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-08-21 10:09:44 +02:00
Amerigo Wang
939a9421eb vfs: allow file truncations when both suid and write permissions set
When suid is set and the non-owner user has write permission, any writing
into this file should be allowed and suid should be removed after that.

However, current kernel only allows writing without truncations, when we
do truncations on that file, we get EPERM.  This is a bug.

Steps to reproduce this bug:

% ls -l rootdir/file1
-rwsrwsrwx 1 root root 3 Jun 25 15:42 rootdir/file1
% echo h > rootdir/file1
zsh: operation not permitted: rootdir/file1
% ls -l rootdir/file1
-rwsrwsrwx 1 root root 3 Jun 25 15:42 rootdir/file1
% echo h >> rootdir/file1
% ls -l rootdir/file1
-rwxrwxrwx 1 root root 5 Jun 25 16:34 rootdir/file1

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Eric Sandeen <esandeen@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <jmorris@namei.org>
2009-08-21 14:25:48 +10:00
Goldwyn Rodrigues
c795b33ba1 ocfs2/dlm: Wait on lockres instead of erroring cancel requests
In case a downconvert is queued, and a flock receives a signal,
BUG_ON(lockres->l_action != OCFS2_AST_INVALID) is triggered
because a lock cancel triggers a dlmunlock while an AST is
scheduled.

To avoid this, allow a LKM_CANCEL to pass through, and let it
wait on __dlm_wait_on_lockres().

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
Acked-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-08-20 18:42:34 -07:00
Jan Kara
a8b88d3d49 ocfs2: Add missing lock name
There is missing name for NFSSync cluster lock. This makes lockdep unhappy
because we end up passing NULL to lockdep when initializing lock key. Fix it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-08-20 16:41:53 -07:00
Jan Kara
e1af88a1ad nfs: Remove reference to generic_osync_inode from a comment
generic_file_direct_write() no longer calls generic_osync_inode() so remove the
comment.

CC: linux-nfs@vger.kernel.org
CC: Neil Brown <neilb@suse.de>
CC: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-19 19:48:08 -04:00
James Morris
ece13879e7 Merge branch 'master' into next
Conflicts:
	security/Kconfig

Manual fix.

Signed-off-by: James Morris <jmorris@namei.org>
2009-08-20 09:18:42 +10:00
Trond Myklebust
7d7ea88289 NFS: Use the DNS resolver in the mount code.
In the referral code, use it to look up the new server's ip address if the
fs_locations attribute contains a hostname.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-19 18:22:15 -04:00
Trond Myklebust
e571cbf1a4 NFS: Add a dns resolver for use with NFSv4 referrals and migration
The NFSv4 and NFSv4.1 protocols both allow for the redirection of a client
from one server to another in order to support filesystem migration and
replication. For full protocol support, we need to add the ability to
convert a DNS host name into an IP address that we can feed to the RPC
client.

We'll reuse the sunrpc cache, now that it has been converted to work with
rpc_pipefs.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-19 18:22:15 -04:00
Trond Myklebust
6a396f67d2 Merge branch 'nfsv4_xdr_cleanups-for-2.6.32' into nfs-for-2.6.32
Conflicts:
	fs/nfs/nfs4xdr.c
2009-08-19 18:21:52 -04:00
Linus Torvalds
6c30c53fd5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix oopses with doubly mounted snapshots
  nilfs2: missing a read lock for segment writer in nilfs_attach_checkpoint()
2009-08-19 10:40:24 -07:00
KOSAKI Motohiro
0753ba01e1 mm: revert "oom: move oom_adj value"
The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to
the mm_struct.  It was a very good first step for sanitize OOM.

However Paul Menage reported the commit makes regression to his job
scheduler.  Current OOM logic can kill OOM_DISABLED process.

Why? His program has the code of similar to the following.

	...
	set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom */
	...
	if (vfork() == 0) {
		set_oom_adj(0); /* Invoked child can be killed */
		execve("foo-bar-cmd");
	}
	....

vfork() parent and child are shared the same mm_struct.  then above
set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also
change oom_adj for vfork() parent.  Then, vfork() parent (job scheduler)
lost OOM immune and it was killed.

Actually, fork-setting-exec idiom is very frequently used in userland program.
We must not break this assumption.

Then, this patch revert commit 2ff05b2b and related commit.

Reverted commit list
---------------------
- commit 2ff05b2b4e (oom: move oom_adj value from task_struct to mm_struct)
- commit 4d8b9135c3 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE)
- commit 8123681022 (oom: only oom kill exiting tasks with attached memory)
- commit 933b787b57 (mm: copy over oom_adj value at fork time)

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:13 -07:00
Jeff Layton
89a4eb4b66 vfs: make get_sb_pseudo set s_maxbytes to value that can be cast to signed
get_sb_pseudo sets s_maxbytes to ~0ULL which becomes negative when cast
to a signed value.  Fix it to use MAX_LFS_FILESIZE which casts properly
to a positive signed value.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Steve French <smfrench@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-18 16:31:12 -07:00
Ryusuke Konishi
a924586036 nilfs2: fix oopses with doubly mounted snapshots
will fix kernel oopses like the following:

 # mount -t nilfs2 -r -o cp=20 /dev/sdb1 /test1
 # mount -t nilfs2 -r -o cp=20 /dev/sdb1 /test2
 # umount /test1
 # umount /test2

BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1069
in_atomic(): 0, irqs_disabled(): 1, pid: 3886, name: umount.nilfs2
1 lock held by umount.nilfs2/3886:
 #0:  (&type->s_umount_key#31){+.+...}, at: [<c10b398a>] deactivate_super+0x52/0x6c
irq event stamp: 1219
hardirqs last  enabled at (1219): [<c135c774>] __mutex_unlock_slowpath+0xf8/0x119
hardirqs last disabled at (1218): [<c135c6d5>] __mutex_unlock_slowpath+0x59/0x119
softirqs last  enabled at (1214): [<c1033316>] __do_softirq+0x1a5/0x1ad
softirqs last disabled at (1205): [<c1033354>] do_softirq+0x36/0x5a
Pid: 3886, comm: umount.nilfs2 Not tainted 2.6.31-rc6 #55
Call Trace:
 [<c1023549>] __might_sleep+0x107/0x10e
 [<c13603c0>] do_page_fault+0x246/0x397
 [<c136017a>] ? do_page_fault+0x0/0x397
 [<c135e753>] error_code+0x6b/0x70
 [<c136017a>] ? do_page_fault+0x0/0x397
 [<c104f805>] ? __lock_acquire+0x91/0x12fd
 [<c1050a62>] ? __lock_acquire+0x12ee/0x12fd
 [<c1050a62>] ? __lock_acquire+0x12ee/0x12fd
 [<c1050b2b>] lock_acquire+0xba/0xdd
 [<d0d17d3f>] ? nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2]
 [<c135d4fe>] down_write+0x2a/0x46
 [<d0d17d3f>] ? nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2]
 [<d0d17d3f>] nilfs_detach_segment_constructor+0x2f/0x2fa [nilfs2]
 [<c104ea2c>] ? mark_held_locks+0x43/0x5b
 [<c104ecb1>] ? trace_hardirqs_on_caller+0x10b/0x133
 [<c104ece4>] ? trace_hardirqs_on+0xb/0xd
 [<d0d09ac1>] nilfs_put_super+0x2f/0xca [nilfs2]
 [<c10b3352>] generic_shutdown_super+0x49/0xb8
 [<c10b33de>] kill_block_super+0x1d/0x31
 [<c10e6599>] ? vfs_quota_off+0x0/0x12
 [<c10b398f>] deactivate_super+0x57/0x6c
 [<c10c4bc3>] mntput_no_expire+0x8c/0xb4
 [<c10c5094>] sys_umount+0x27f/0x2a4
 [<c10c50c6>] sys_oldumount+0xd/0xf
 [<c10031a4>] sysenter_do_call+0x12/0x38
 ...

This turns out to be a bug brought by an -rc1 patch ("nilfs2: simplify
remaining sget() use").

In the patch, a new "put resource" function, nilfs_put_sbinfo()
was introduced to delay freeing nilfs_sb_info struct.

But the nilfs_put_sbinfo() mistakenly used atomic_dec_and_test()
function to check the reference count, and it caused the nilfs_sb_info
was freed when user mounted a snapshot twice.

This bug also suggests there was unseen memory leak in usual mount
/umount operations for nilfs.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-08-19 02:10:13 +09:00
Wengang Wang
970343cd49 GFS2: free disk inode which is deleted by remote node -V2
this patch is for the same problem that Benjamin Marzinski fixes at commit
b94a170e96dc416828af9d350ae2e34b70ae7347

quotation of the original problem:

---cut here---
When a file is deleted from a gfs2 filesystem on one node, a dcache
entry for it may still exist on other nodes in the cluster. If this
happens, gfs2 will be unable to free this file on disk. Because of this,
it's possible to have a gfs2 filesystem with no files on it and no free
space. With this patch, when a node receives a callback notifying it
that the file is being deleted on another node, it schedules a new
workqueue thread to remove the file's dcache entry.
---end cut---

after applying Benjamin's patch, I think there is still a case in which the disk
inode remains even when "no space" is hit. the case is that when running
d_prune_aliases() against the inode, there are one or more dentries(aliases)
which have reference count number > 0. in this case the dentries won't be pruned.
and even later, the reference count becomes to 0, the dentries can still be
cached in memory. unfortunately, no callback come again, things come back to
the state before the callback runs. thus the on disk inode remains there until
in memoryinode is removed for some other reason(shrinking inode cache or unmount
the volume..).

this patch is to remove those dentries when their reference count becomes to 0 and
the inode is deleted by remote node. for implementation, gfs2_dentry_delete() is
added as dentry_operations.d_delete. the function returns true when the inode is
deleted by remote node. in dput(), gfs2_dentry_delete() is called and since it
returns true, the dentry is unhashed from dcache and then removed. when all dentries
are removed, the in memory inode get removed so that the on disk inode is freed.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-18 10:29:39 +01:00
Zhang Qiang
1154ecbd2f nilfs2: missing a read lock for segment writer in nilfs_attach_checkpoint()
'ns_cno' of structure 'the_nilfs' must be protected from segment
writer, in other words, the caller of nilfs_get_checkpoint should hold
read lock for nilfs->ns_segctor_sem.  This patch adds the lock/unlock
operations in nilfs_attach_checkpoint() when calling
nilfs_cpfile_get_checkpoint().

Signed-off-by: Zhang Qiang <zhangqiang.buaa@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-08-18 17:32:27 +09:00
Abhishek Kulkarni
4b53e4b500 9p: remove unnecessary v9fses->options which duplicates the mount string
The mount options string is saved in sb->s_options. This patch removes
the redundant duplicating of the mount options. Also, since we are not
displaying anything special in show options, we replace v9fs_show_options
with generic_show_options for now.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:42:28 -05:00
Abhishek Kulkarni
48559b4c30 9p: Add missing cast for the error return value in v9fs_get_inode
Cast the error return value (ENOMEM) in v9fs_get_inode() to its
correct type using ERR_PTR.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:35:08 -05:00
Jan Kara
5fd1318937 ocfs2: Don't oops in ocfs2_kill_sb on a failed mount
If we fail to mount the filesystem, we have to be careful not to dereference
uninitialized structures in ocfs2_kill_sb.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-08-17 14:32:24 -07:00
Abhishek Kulkarni
4d3297ca5b 9p: Remove redundant inode uid/gid assignment
Remove a redundant update of inode's i_uid and i_gid
after v9fs_get_inode() since the latter already sets up
a new inode and sets the proper uid and gid values.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:58 -05:00
Abhishek Kulkarni
1b5ab3e867 9p: Fix possible regressions when ->get_sb fails.
->get_sb can fail causing some badness. this patch fixes
   * clear sb->fs_s_info in kill_sb.
   * deactivate_locked_super() calls kill_sb (v9fs_kill_super) which closes the
     destroys the client, clunks all its fids and closes the v9fs session.
     Attempting to do it twice will cause an oops.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:57 -05:00
Abhishek Kulkarni
4f4038328d 9p: Fix v9fs show_options
Add the delimiter ',' before the options when they are passed
and check if no option parameters are passed to prevent displaying
NULL in /proc/mounts.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:57 -05:00
Abhishek Kulkarni
02bc35672b 9p: Fix possible memleak in v9fs_inode_from fid.
Add missing p9stat_free in v9fs_inode_from_fid to avoid
any possible leaks.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:57 -05:00
Abhishek Kulkarni
0e15597ebf 9p: minor comment fixes
Fix the comments -- mostly the improper and/or missing descriptions
of function parameters.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:57 -05:00
Abhishek Kulkarni
2bb541157f 9p: Fix possible inode leak in v9fs_get_inode.
Add a missing iput when cleaning up if v9fs_get_inode
fails after returning a valid inode.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:57 -05:00
Abhishek Kulkarni
50fb6d2bd7 9p: Check for error in return value of v9fs_fid_add
Check if v9fs_fid_add was successful or not based on its
return value.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2009-08-17 16:27:57 -05:00
Linus Torvalds
c58afec8b2 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: fix locking in xfs_iget_cache_hit
2009-08-17 13:39:30 -07:00
Eric Paris
08e53fcb0d inotify: start watch descriptor count at 1
The inotify_add_watch man page specifies that inotify_add_watch() will
return a non-negative integer.  However, historically the inotify
watches started at 1, not at 0.

Turns out that the inotifywait program provided by the inotify-tools
package doesn't properly handle a 0 watch descriptor.  In 7e790dd5 we
changed from starting at 1 to starting at 0.  This patch starts at 1,
just like in previous kernels, but also just like in previous kernels
it's possible for it to wrap back to 0.  This preserves the kernel
functionality exactly like it was before the patch (neither method broke
the spec)

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-17 13:37:37 -07:00
Eric Paris
cd94c8bbef inotify: tail drop inotify q_overflow events
In f44aebcc the tail drop logic of events with no file backing
(q_overflow and in_ignored) was reversed so IN_IGNORED events would
never be tail dropped.  This now means that Q_OVERFLOW events are NOT
tail dropped.  The fix is to not tail drop IN_IGNORED, but to tail drop
Q_OVERFLOW.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-17 13:37:37 -07:00
Eric Paris
eef3a116be notify: unused event private race
inotify decides if private data it passed to get added to an event was
used by checking list_empty().  But it's possible that the event may
have been dequeued and the private event removed so it would look empty.

The fix is to use the return code from fsnotify_add_notify_event rather
than looking at the list.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-08-17 13:37:37 -07:00
Tao Ma
60e2ec4866 ocfs2: release the buffer head in ocfs2_do_truncate.
In ocfs2_do_truncate, we forget to release last_eb_bh which
will cause memleak. So call brelse in the end.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-08-17 12:50:35 -07:00
Jan Kara
ada508274b ocfs2: Handle quota file corruption more gracefully
ocfs2_read_virt_blocks() does BUG when we try to read a block from a file
beyond its end. Since this can happen due to filesystem corruption, it
is not really an appropriate answer. Make ocfs2_read_quota_block() check
the condition and handle it by calling ocfs2_error() and returning EIO.

[ Modified to print ip_blkno in the error - Joel ]

Reported-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-08-17 12:50:12 -07:00
Steven Whitehouse
31e54b01f3 GFS2: Add sysfs link to device
This adds a link from the per-gfs2 sb sysfs directory to
the block device upon which the filesystem is mounted. The
link is called "device", strangely enough :-)

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17 11:11:18 +01:00
Steven Whitehouse
05164e5b37 GFS2: Replace assertion with proper error handling
One fewer assert, one more place we can recover gracefully
if there is an error.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17 11:06:43 +01:00
Steven Whitehouse
6050b9c74f GFS2: Improve error handling in inode allocation
A little while back, block allocation was given some improved
error handling which meant that -EIO was returned in the case
of there being a problem in the resource group data. In addition
a message is printed explaning what went wrong and how to fix it.
This extends that error handling so that it also covers inode
allocation too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17 11:05:31 +01:00
Steven Whitehouse
440d6da207 GFS2: Add some more info to uevents
With each uevent, we now always include the journal ID. We
can't call it JID since that is already in use by some of
the individual events relating to recovery, so we use
JOURNALID instead. We don't send the JOURNALID for spectator
mounts, since there isn't one.

Also the ADD event now has both RDONLY and SPECTATOR information
to match that of the ONLINE event.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-08-17 11:05:00 +01:00