10116 Commits

Author SHA1 Message Date
Andi Kleen
39d80c33a0 ext4: Avoid double dirtying of super block in ext4_put_super()
While reading code I noticed that ext4_put_super() dirties the 
superblock bh twice. It is always done in ext4_commit_super()
too. Remove the redundant dirty operation.
Should be a nop semantically.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-10-06 21:37:44 -04:00
Eric Sandeen
6873fa0de1 Hook ext4 to the vfs fiemap interface.
ext4_ext_walk_space() was reinstated to be used for iterating over file
extents with a callback; it is used by the ext4 fiemap implementation.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
2008-10-07 00:46:36 -04:00
Jeff Layton
6b37faa175 [CIFS] fix some settings of cifsAttrs after calling SetFileInfo and SetPathInfo
We only need to set them when we call SetFileInfo or SetPathInfo
directly, and as soon as possible after then. We had one place setting
it where it didn't need to be, and another place where it was missing.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-10-06 21:54:41 +00:00
Josef Bacik
68c9d702bb generic block based fiemap implementation
Any block based fs (this patch includes ext3) just has to declare its own
fiemap() function and then call this generic function with its own
get_block_t. This works well for block based filesystems that will map
multiple contiguous blocks at one time, but will work for filesystems that
only map one block at a time, you will just end up with an "extent" for each
block. One gotcha is this will not play nicely where there is hole+data
after the EOF. This function will assume its hit the end of the data as soon
as it hits a hole after the EOF, so if there is any data past that it will
not pick that up. AFAIK no block based fs does this anyway, but its in the
comments of the function anyway just in case.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
2008-10-03 17:32:43 -04:00
Mark Fasheh
00dc417fa3 ocfs2: fiemap support
Plug ocfs2 into ->fiemap. Some portions of ocfs2_get_clusters() had to be
refactored so that the extent cache can be skipped in favor of going
directly to the on-disk records. This makes it easier for us to determine
which extent is the last one in the btree. Also, I'm not sure we want to be
caching fiemap lookups anyway as they're not directly related to data
read/write.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: ocfs2-devel@oss.oracle.com
Cc: linux-fsdevel@vger.kernel.org
2008-10-03 17:32:11 -04:00
Mark Fasheh
c4b929b85b vfs: vfs-level fiemap interface
Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
inode operation.

Userspace can get extent information on a file via fiemap ioctl. As input,
the fiemap ioctl takes a struct fiemap which includes an array of struct
fiemap_extent (fm_extents). Size of the extent array is passed as
fm_extent_count and number of extents returned will be written into
fm_mapped_extents. Offset and length fields on the fiemap structure
(fm_start, fm_length) describe a logical range which will be searched for
extents. All extents returned will at least partially contain this range.
The actual extent offsets and ranges returned will be unmodified from their
offset and range on-disk.

The fiemap ioctl returns '0' on success. On error, -1 is returned and errno
is set. If errno is equal to EBADR, then fm_flags will contain those flags
which were passed in which the kernel did not understand. On all other
errors, the contents of fm_extents is undefined.

As fiemap evolved, there have been many authors of the vfs patch. As far as
I can tell, the list includes:
Kalpak Shah <kalpak.shah@sun.com>
Andreas Dilger <adilger@sun.com>
Eric Sandeen <sandeen@redhat.com>
Mark Fasheh <mfasheh@suse.com>

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: linux-api@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
2008-10-08 19:44:18 -04:00
Kalpak Shah
4d20c685fa ext4: fix xattr deadlock
ext4_xattr_set_handle() eventually ends up calling
ext4_mark_inode_dirty() which tries to expand the inode by shifting
the EAs.  This leads to the xattr_sem being downed again and leading
to a deadlock.

This patch makes sure that if ext4_xattr_set_handle() is in the
call-chain, ext4_mark_inode_dirty() will not expand the inode.

Signed-off-by: Kalpak Shah <kalpak.shah@sun.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08 23:21:54 -04:00
Theodore Ts'o
45a90bfd90 jbd2: Fix buffer head leak when writing the commit block
Also make sure the buffer heads are marked clean before submitting bh
for writing.  The previous code was marking the buffer head dirty,
which would have forced an unneeded write (and seek) to the journal
for no good reason.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-06 12:04:02 -04:00
Theodore Ts'o
ede86cc473 ext4: Add debugging markers that can be used by systemtap
This debugging markers are designed to debug problems such as the
random filesystem latency problems reported by Arjan.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-05 20:50:06 -04:00
Duane Griffin
23f8b79eae jbd2: abort instead of waiting for nonexistent transaction
The __jbd2_log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal. 
However, if there are no transactions to be processed (e.g.  because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.

Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.

This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Cc: Sami Liedes <sliedes@cc.hut.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-08 23:28:31 -04:00
Frederic Bohe
c806e68f56 ext4: fix initialization of UNINIT bitmap blocks
This fixes a bug which caused on-line resizing of filesystems with a
1k blocksize to fail.  The root cause of this bug was the fact that if
an uninitalized bitmap block gets read in by userspace (which
e2fsprogs does try to avoid, but can happen when the blocksize is less
than the pagesize and an adjacent blocks is read into memory)
ext4_read_block_bitmap() was erroneously depending on the buffer
uptodate flag to decide whether it needed to initialize the bitmap
block in memory --- i.e., to set the standard set of blocks in use by
a block group (superblock, bitmaps, inode table, etc.).  Essentially,
ext4_read_block_bitmap() assumed it was the only routine that might
try to read a block containing a block bitmap, which is simply not
true.  

To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap()
must always initialize uninitialized bitmap blocks.  Once a block or
inode is allocated out of that bitmap, it will be marked as
initialized in the block group descriptor, so in general this won't
result any extra unnecessary work.

Signed-off-by: Frederic Bohe <frederic.bohe@bull.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10 08:09:18 -04:00
Theodore Ts'o
c2ea3fde61 ext4: Remove old legacy block allocator
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10 09:40:52 -04:00
Theodore Ts'o
240799cdf2 ext4: Use readahead when reading an inode from the inode table
With modern hard drives, reading 64k takes roughly the same time as
reading a 4k block.  So request readahead for adjacent inode table
blocks to reduce the time it takes when iterating over directories
(especially when doing this in htree sort order) in a cold cache case.
With this patch, the time it takes to run "git status" on a kernel
tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches"
is reduced by 21%.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-09 23:53:47 -04:00
Nick Piggin
4b19de6d1c mm: tiny-shmem nommu fix
The previous patch db203d53d474aa068984e409d807628f5841da1b ("mm:
tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock
ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU
architectures because it was using the expanding truncate to signal ramfs
to allocate a physically contiguous RAM backing the inode (otherwise it is
unusable for "memory mapping" it to userspace).

However do_truncate is what caused the lock ordering error, due to it
taking i_mutex.  In this case, we can actually just call ramfs directly to
allocate memory for the mapping, rather than go via truncate.

Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-02 15:53:13 -07:00
Nick Piggin
16dbc6c961 inotify: fix lock ordering wrt do_page_fault's mmap_sem
Fix inotify lock order reversal with mmap_sem due to holding locks over
copy_to_user.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reported-by: "Daniel J Blueman" <daniel.blueman@gmail.com>
Tested-by: "Daniel J Blueman" <daniel.blueman@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-02 15:53:13 -07:00
Balbir Singh
31a78f23ba mm owner: fix race between swapoff and exit
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on.  The condition occurs when
try_to_unuse() runs in parallel with an exiting task.  A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.

CPU0                                    CPU1
                                        try_to_unuse
                                        looks at mm = task0->mm
                                        increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
                                        mmput(mm) decrements mm->mm_users
task0 freed
                                        dereferencing mm->owner fails

The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.

Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.

Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.

Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same.  And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.

Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29 08:41:47 -07:00
Linus Torvalds
d0185c0882 Fix NULL pointer dereference in proc_sys_compare
The VFS interface for the 'd_compare()' is a bit special (read: 'odd'),
because it really just essentially replaces a memcmp().  The filesystem
is supposed to just compare the two names with whatever case-independent
or other function.

And when I say 'is supposed to', I obviously mean that 'procfs does odd
things, and actually looks at the dentry that we don't even pass down,
rather than just the name'.  Which results in problems, because we
actually call d_compare before we have even verified that the dentry is
still hashed at all.

And that causes a problm since the inode that procfs looks at may have
been free'd and the d_inode pointer is NULL.  procfs just assumes that
all dentries are positive, since procfs itself never generates a
negative one.  But memory pressure will still result in the dentry
getting torn down, and as it is removed by RCU, it still remains visible
on some lists - and to d_compare.

If the filesystem just did a name comparison, we wouldn't care.  And we
could just fix procfs to know about negative dentries too.  But rather
than have the low-level filesystems know about internal VFS details,
just move the check for a unhashed dentry up a bit, so that we will only
call d_compare on dentries that are still active.

The actual oops this caused didn't look like a NULL pointer dereference
because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)'
to get at its internal proc_inode information from the inode pointer,
and accessed a field below the inode. So the oops would look something
like

	BUG: unable to handle kernel paging request at fffffffffffffff0
	IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50

and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and
ppc64 (Hugh Dickins).

Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-09-29 07:42:57 -07:00
Linus Torvalds
ec4d90287e Merge git://oss.sgi.com:8090/xfs/linux-2.6
* git://oss.sgi.com:8090/xfs/linux-2.6:
  [XFS] Remove xfs_iext_irec_compact_full()
  [XFS] Fix extent list corruption in xfs_iext_irec_compact_full().
2008-09-26 08:49:34 -07:00
Linus Torvalds
bde40fe071 Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6
* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6:
  UBIFS: fix printk format warnings
  UBIFS: remove incorrect assert
  UBIFS: TNC / GC race fixes
  UBIFS: create the name of the background thread in every case
2008-09-26 08:20:26 -07:00
Steven Whitehouse
254db57f9b GFS2: Support for I/O barriers
This patch adds barrier support to GFS2. There is not a lot of change
really... we just add the barrier flag when we write journal header
blocks. If the underlying device refuses to support them, we fall back
to the previous way of doing things (wait for the I/O and hope) since
there is nothing else we can do. There is no user configuration,
barriers will always be on unless the device refuses to support them.
This seems a reasonable solution to me since this is a correctness
issue.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-26 10:23:22 +01:00
Lachlan McIlroy
71a8c87fb3 [XFS] Remove xfs_iext_irec_compact_full()
Yet another bug was found in xfs_iext_irec_compact_full() and while the
source of the bug was found it wasn't an easy task to track it down
because the conditions are very difficult to reproduce.

A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their
significant effort in tracking down the source of this corruption.

xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost
identical - they both compact indirect extent lists by moving extents from
subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only
moves extents if all of the extents in the next buffer will fit into the
empty space in the buffer before it. xfs_iext_irec_compact_full() will go
a step further and move part of the next buffer if all the extents wont
fit. It will then shift the remaining extents in the next buffer up to the
start of the buffer. The bug here was that we did not update er_extoff and
this caused extent list corruption.

It does not appear that this extra functionality gains us much. Calling
xfs_iext_irec_compact_pages() instead will do a good enough job at
compacting the indirect list and will be quicker too.

For the case in xfs_iext_indirect_to_direct() the total number of extents
in the indirect list will fit into one buffer so we will never need the
extra functionality of xfs_iext_irec_compact_full() there.

Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the
buffers will never overlap) so we don't want the performance hit that can
incur.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32166a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
2008-09-26 12:17:57 +10:00
Lachlan McIlroy
f1ccd29551 [XFS] Fix extent list corruption in xfs_iext_irec_compact_full().
If we don't move all the records from the next buffer into the current
buffer then we need to update the er_extoff field of the next buffer as we
shift the remaining records to the start of the buffer.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32165a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Russell Cattelan <cattelan@thebarn.com>
2008-09-26 12:16:46 +10:00
Julien Brunel
62aa528e02 9p: use an IS_ERR test rather than a NULL test
In case of error, the function p9_client_walk returns an ERR pointer, but
never returns a NULL pointer.  So a NULL test that comes after an IS_ERR
test should be deleted.

The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@match_bad_null_test@
expression x, E;
statement S1,S2;
@@
x = p9_client_walk(...)
... when != x = E
*  if (x != NULL)
S1 else S2
// </smpl>

Signed-off-by: Julien Brunel <brunel@diku.dk>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-09-24 16:22:22 -05:00
Jeff Layton
dfd15c46a6 cifs: explicitly revoke SPNEGO key after session setup
cifs: explicitly revoke SPNEGO key after session setup

The SPNEGO blob returned by an upcall can only be used once. Explicitly
revoke it to make sure that we never pick it up again after session
setup exits.

This doesn't seem to be that big an issue on more recent kernels, but
older kernels seem to link keys into the session keyring by default.
That said, explicitly revoking the key seems like a reasonable thing
to do here.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24 20:59:37 +00:00
Nick Piggin
d9414774dc cifs: Convert cifs to new aops.
cifs: Convert cifs to new aops.

This patch is based on the one originally posted by Nick Piggin. His
patch was very close, but had a couple of small bugs. Nick's original
comments follow:

This is another relatively naive conversion. Always do the read upfront
when the page is not uptodate (unless we're in the writethrough path).

Fix an uninitialized data exposure where SetPageUptodate was called
before the page was uptodate.

SetPageUptodate and switch to writeback mode in the case that the full
page was dirtied.

Acked-by: Shaggy <shaggy@austin.ibm.com>
Acked-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24 19:32:56 +00:00
Steve French
d388908ec4 [CIFS] update DOS attributes in cifsInode if we successfully changed them
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24 19:22:52 +00:00
Jeff Layton
391e575556 cifs: remove NULL termination from rename target in CIFSSMBRenameOpenFIle
cifs: remove NULL termination from rename target in CIFSSMBRenameOpenFIle

The rename destination isn't supposed to be null terminated. Also,
change the name string arg to be const.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24 19:07:36 +00:00
Jeff Layton
7ce86d5a93 cifs: work around samba returning -ENOENT on SetFileDisposition call
cifs: work around samba returning -ENOENT on SetFileDisposition call

Samba seems to return STATUS_OBJECT_NAME_NOT_FOUND when we try to set
the delete on close bit after doing a rename by filehandle. This looks
like a samba bug to me, but a lot of servers will do this. For now,
pretend an -ENOENT return is a success.

Samba does however seem to respect the CREATE_DELETE_ON_CLOSE bit
when opening files that already exist. Windows will ignore it, but
so adding it to the open flags should be harmless.

We're also currently ignoring the return code on the rename by
filehandle, so no need to set rc based on it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24 18:59:20 +00:00
Jeff Layton
74553b1b6a cifs: fix inverted NULL check after kmalloc
cifs: fix inverted NULL check after kmalloc

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-24 18:55:11 +00:00
Theodore Ts'o
5e8814f2f7 ext4: Combine proc file handling into a single set of functions
Previously mballoc created a separate set of functions for each proc
file.  This combines the tunables into a single set of functions which
gets used for all of the per-superblock proc files, saving
approximately 2k of compiled object code.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-23 18:07:35 -04:00
Steve French
9d81523480 [CIFS] clean up upcall handling for dns_resolver keys
We're given the datalen in the downcall, so there's no need to do any
calls to strlen(). Just keep track of the datalen in the key. Finally,
add a sanity check of the data in the downcall to make sure that it
looks like a real IP address.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23 18:46:07 +00:00
Steve French
ee2fd967fb [CIFS] fix busy-file renames and refactor cifs_rename logic
Break out the code that does the actual renaming into a separate
function and have cifs_rename call that. That function will attempt a
path based rename first and then do a filehandle based one if it looks
like the source is busy.

The existing logic tried a path based rename first, but if we needed to
remove the destination then it only attempted a filehandle based rename
afterward. Not all servers support renaming by filehandle, so we need to
always attempt path rename first and fall back to filehandle rename if
it doesn't work.

This also fixes renames of open files on windows servers (at least when
the source and destination directories are the same).

CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23 18:23:33 +00:00
Jeff Layton
6d22f09896 cifs: add function to set file disposition
cifs: add function to set file disposition

The proper way to set the delete on close bit on an already existing
file is to use SET_FILE_INFO with an infolevel of
SMB_FILE_DISPOSITION_INFO. Add a function to do that and have the
silly-rename code use it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23 17:39:28 +00:00
Steve French
7c9c3760b3 [CIFS] add constants for string lengths of keynames in SPNEGO upcall string
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23 17:23:09 +00:00
Jeff Layton
a12a1ac7a4 cifs: move rename and delete-on-close logic into helper function
cifs: move rename and delete-on-close logic into helper function

When a file is still open on the server, we attempt to set the
DELETE_ON_CLOSE bit and rename it to a new filename. When the
last opener closes the file, the server should delete it.

This patch moves this mechanism into a helper function and has
the two places in cifs_unlink that do this procedure call it. It
also fixes the open flags to be correct.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23 17:11:03 +00:00
Theodore Ts'o
9f6200bbfc ext4: move /proc setup and teardown out of mballoc.c
...and into the core setup/teardown code in fs/ext4/super.c so that
other parts of ext4 can define tuning parameters.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-23 09:18:24 -04:00
Jeff Layton
2846d38647 cifs: have find_writeable_file prefer filehandles opened by same task
When the CIFS client goes to write out pages, it needs to pick a
filehandle to write to. find_writeable_file however just picks the
first filehandle that it finds. This can cause problems when a lock
is issued against a particular filehandle and we pick a different
filehandle to write to.

This patch tries to avert this situation by having find_writable_file
prefer filehandles that have a pid that matches the current task.
This seems to fix lock test 11 from the connectathon test suite when
run against a windows server.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-23 02:24:13 +00:00
Pekka Enberg
232087cb73 cifs: don't use GFP_KERNEL with GFP_NOFS
GFP_KERNEL and GFP_NOFS are mutually exclusive. If you combine them, you end up
with plain GFP_KERNEL which can deadlock in cases where you really want
GFP_NOFS.

Cc: Steve French <sfrench@samba.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-09-22 22:23:56 +00:00
Theodore Ts'o
f702ba0fd7 ext4: Don't use 'struct dentry' for internal lookups
This is a port of a patch from Linus which fixes a 200+ byte stack
usage problem in ext4_get_parent().

It's more efficient to pass down only the actual parts of the dentry
that matter: the parent inode and the name, instead of allocating a
struct dentry on the stack.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-09-22 15:21:01 -04:00
Theodore Ts'o
914258bf2c ext4/jbd2: Avoid WARN() messages when failing to write to the superblock
This fixes some very common warnings reported by kerneloops.org

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-06 21:35:40 -04:00
Steven Whitehouse
719ee34467 GFS2: high time to take some time over atime
Until now, we've used the same scheme as GFS1 for atime. This has failed
since atime is a per vfsmnt flag, not a per fs flag and as such the
"noatime" flag was not getting passed down to the filesystems. This
patch removes all the "special casing" around atime updates and we
simply use the VFS's atime code.

The net result is that GFS2 will now support all the same atime related
mount options of any other filesystem on a per-vfsmnt basis. We do lose
the "lazy atime" updates, but we gain "relatime". We could add lazy
atime to the VFS at a later date, if there is a requirement for that
variant still - I suspect relatime will be enough.

Also we lose about 100 lines of code after this patch has been applied,
and I have a suspicion that it will speed things up a bit, even when
atime is "on". So it seems like a nice clean up as well.

From a user perspective, everything stays the same except the loss of
the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very
least, and to be honest I don't think anybody ever used it) and that a
number of options which were ignored before now work correctly.

Please let me know if you've got any comments. I'm pushing this out
early so that you can all see what my plans are.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-18 13:53:59 +01:00
Steven Whitehouse
37ec89e83c GFS2: The war on bloat
The following patch shrinks the gfs2_args structure which is embedded in
every GFS2 superblock. It cuts down the size of the options to a single
unsigned int (the 13 bits of bitfields will be rounded up to that size
by the compiler) from the current 11 unsigned ints. So on x86 thats 44
bytes shrinking to 4 bytes, in each and every GFS2 superblock.

Signed-off-by: Steven Whitehouse <swhitho@redhat.com>
2008-09-18 13:49:32 +01:00
Alexander Beregalov
7424bac82f UBIFS: fix printk format warnings
fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long
unsigned int', but argument 5 has type 'long unsigned int'

fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long
unsigned int', but argument 2 has type 'long unsigned int'

Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-18 09:57:57 +03:00
Adrian Hunter
6e14968c86 UBIFS: remove incorrect assert
The assert was not valid because one of the variables
'taken_empty_lebs' has transient values out of sync
with the other variables.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17 14:47:09 +03:00
Adrian Hunter
6dcfac4f13 UBIFS: TNC / GC race fixes
- update GC sequence number if any nodes may have been moved
even if GC did not finish the LEB
- don't ignore error return when reading

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17 14:23:26 +03:00
Sebastian Siewior
0855f310df UBIFS: create the name of the background thread in every case
If the ubifs partition is mounted RO and then remounted RW we end
up with no thread name in ubifs_remount_rw() and the thread appears
nameless.

Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-09-17 10:07:56 +03:00
Lachlan McIlroy
2fd6f6ec64 [XFS] Don't do I/O beyond eof when unreserving space
When unreserving space with boundaries that are not block aligned we round
up the start and round down the end boundaries and then use this function,
xfs_zero_remaining_bytes(), to zero the parts of the blocks that got
dropped during the rounding. The problem is we don't consider if these
blocks are beyond eof. Worse still is if we encounter delayed allocations
beyond eof we will try to use the magic delayed allocation block number as
a real block number. If the file size is ever extended to expose these
blocks then we'll go through xfs_zero_eof() to zero them anyway.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:32055a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17 16:52:50 +10:00
Lachlan McIlroy
e1f5dbd707 [XFS] Fix use-after-free with buffers
We have a use-after-free issue where log completions access buffers via
the buffer log item and the buffer has already been freed. Fix this by
taking a reference on the buffer when attaching the buffer log item and
release the hold when the buffer log item is detached and we no longer
need the buffer. Also create a new function xfs_buf_item_free() to combine
some common code.

SGI-PV: 985757

SGI-Modid: xfs-linux-melb:xfs-kern:32025a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-09-17 16:52:13 +10:00
David Chinner
f9114eba1e [XFS] Prevent lockdep false positives when locking two inodes.
If we call xfs_lock_two_inodes() to grab both the iolock and the ilock,
then drop the ilocks on both inodes, then grab them again (as
xfs_swap_extents() does) then lockdep will report a locking order problem.
This is a false positive.

To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks
at once - force calers to make two separate calls. This means that nested
dropping and regaining of the ilocks will retain the same lockdep subclass
and so lockdep will not see anything wrong with this code.

SGI-PV: 986238

SGI-Modid: xfs-linux-melb:xfs-kern:31999a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17 16:51:21 +10:00
David Chinner
b5b8c9acd5 [XFS] Fix barrier status change detection.
The current code in xlog_iodone() uses the wrong macro to check if the
barrier has been cleared due to an EOPNOTSUPP error form the lower layer.

SGI-PV: 986143

SGI-Modid: xfs-linux-melb:xfs-kern:31984a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net>
Signed-off-by: Peter Leckie <pleckie@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-09-17 16:50:50 +10:00