linux/fs
Dave Chinner a1e16c2666 xfs: limit speculative prealloc size on sparse files
Speculative preallocation based on the current file size works well
for contiguous files, but is sub-optimal for sparse files where the
EOF preallocation can fill holes and result in large amounts of
zeros being written when it is not necessary.

The algorithm is modified to prevent EOF speculative preallocation
from triggering larger allocations on IO patterns of
truncate--to-zero-seek-write-seek-write-....  which results in
non-sparse files for large files. This, unfortunately, is the way cp
now behaves when copying sparse files and so needs to be fixed.

What this code does is that it looks at the existing extent adjacent
to the current EOF and if it determines that it is a hole we disable
speculative preallocation altogether. To avoid the next write from
doing a large prealloc, it takes the size of subsequent
preallocations from the current size of the existing EOF extent.
IOWs, if you leave a hole in the file, it resets preallocation
behaviour to the same as if it was a zero size file.

Example new behaviour:

$ xfs_io -f -c "pwrite 0 31m" \
            -c "pwrite 33m 1m" \
            -c "pwrite 128m 1m" \
            -c "fiemap -v" /mnt/scratch/blah
wrote 32505856/32505856 bytes at offset 0
31 MiB, 7936 ops; 0.0000 sec (1.608 GiB/sec and 421432.7439 ops/sec)
wrote 1048576/1048576 bytes at offset 34603008
1 MiB, 256 ops; 0.0000 sec (1.462 GiB/sec and 383233.5329 ops/sec)
wrote 1048576/1048576 bytes at offset 134217728
1 MiB, 256 ops; 0.0000 sec (1.719 GiB/sec and 450704.2254 ops/sec)
/mnt/scratch/blah:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..65535]:      96..65631        65536   0x0
   1: [65536..67583]:  hole              2048
   2: [67584..69631]:  67680..69727      2048   0x0
   3: [69632..262143]: hole             192512
   4: [262144..264191]: 262240..264287    2048   0x1

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-02-14 17:21:32 -06:00
..
9p The following changes since commit 4cbe5a555f: 2012-10-12 09:59:23 +09:00
adfs adfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
affs affs: drop vmtruncate 2012-12-20 14:00:01 -05:00
afs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
autofs4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
befs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
bfs bfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-12-20 18:14:31 -08:00
cachefiles FS-Cache: Mark cancellation of in-progress operation 2012-12-20 22:34:00 +00:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2012-12-20 14:00:13 -08:00
cifs cifs: eliminate cifsERROR variable 2012-12-20 11:27:17 -06:00
coda fs: push rcu_barrier() from deactivate_locked_super() to filesystems 2012-10-02 21:35:55 -04:00
configfs lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
cramfs
debugfs fs/debugsfs: remove unnecessary inode->i_private initialization 2012-11-15 17:46:42 -08:00
devpts TTY: devpts, document devpts inode operations 2012-10-22 16:50:13 -07:00
dlm dlm: fix lvb invalidation conditions 2012-11-16 11:20:42 -06:00
ecryptfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
efs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
exofs exofs: don't leak io_state and pages on read error 2012-12-14 12:17:32 +02:00
exportfs Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux 2012-12-20 14:04:11 -08:00
ext2 ext2: fix return values on parse_options() failure 2012-10-09 23:23:53 +02:00
ext3 lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
ext4 lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
f2fs f2fs: fix tracking parent inode number 2012-12-11 13:43:45 +09:00
fat fat: fix incorrect function comment 2012-12-20 17:40:20 -08:00
freevxfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
fscache FS-Cache: Clear remaining page count on retrieval cancellation 2012-12-20 22:35:15 +00:00
fuse Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-17 20:58:12 -08:00
gfs2 lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
hfs hfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
hfsplus Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-20 20:00:43 -08:00
hostfs Merge branch 'for-linus-37rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2012-10-10 11:15:20 +09:00
hpfs hpfs: drop vmtruncate 2012-12-20 18:40:00 -05:00
hppfs pidns: Use task_active_pid_ns where appropriate 2012-11-19 05:59:09 -08:00
hugetlbfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
isofs tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking 2012-10-09 23:33:55 -04:00
jbd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-12-13 12:00:02 -08:00
jbd2 There are two major features for this merge window. The first is 2012-12-16 17:33:01 -08:00
jffs2 jffs2: hold erase_completion_lock on exit 2012-11-18 11:59:01 +02:00
jfs jfs: drop vmtruncate 2012-12-20 18:40:52 -05:00
lockd lockd: Remove BUG_ON()s from fs/lockd/clntproc.c 2012-11-04 14:43:40 -05:00
logfs logfs: drop vmtruncate 2012-12-20 18:40:53 -05:00
minix minix: drop vmtruncate 2012-12-20 18:40:53 -05:00
ncpfs ncpfs: drop vmtruncate 2012-12-20 18:40:54 -05:00
nfs NFS: Kill fscache warnings when mounting without -ofsc 2012-12-21 08:32:09 -08:00
nfs_common
nfsd Revert "nfsd: warn on odd reply state in nfsd_vfs_read" 2012-12-21 17:07:45 -08:00
nilfs2 nilfs2: drop vmtruncate 2012-12-20 18:40:54 -05:00
nls
notify Merge branch 'for-next' of git://git.infradead.org/users/eparis/notify 2012-12-20 20:11:52 -08:00
ntfs ntfs: drop vmtruncate 2012-12-20 18:40:55 -05:00
ocfs2 ocfs2: drop vmtruncate 2012-12-20 14:00:01 -05:00
omfs omfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
openpromfs fs: push rcu_barrier() from deactivate_locked_super() to filesystems 2012-10-02 21:35:55 -04:00
proc Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-20 20:00:43 -08:00
pstore lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
qnx4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
qnx6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
quota quota: Use the pre-processor to compile out quotactl_cmd_write when !CONFIG_BLOCK 2012-12-13 16:33:24 +01:00
ramfs
reiserfs reiserfs: drop vmtruncate 2012-12-20 14:00:01 -05:00
romfs fs: push rcu_barrier() from deactivate_locked_super() to filesystems 2012-10-02 21:35:55 -04:00
squashfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-10-02 20:25:04 -07:00
sysfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
sysv sysv: drop vmtruncate 2012-12-20 14:00:01 -05:00
ubifs ubifs: use prandom_bytes 2012-12-17 17:15:26 -08:00
udf udf: remove un-needed variable from inode_getblk 2012-12-13 16:33:23 +01:00
ufs ufs: drop vmtruncate 2012-12-20 14:00:01 -05:00
xfs xfs: limit speculative prealloc size on sparse files 2013-02-14 17:21:32 -06:00
aio.c
anon_inodes.c
attr.c userns: Allow chown and setgid preservation 2012-11-20 04:17:24 -08:00
bad_inode.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
binfmt_aout.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
binfmt_elf_fdpic.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
binfmt_elf.c binfmt_elf: fix corner case kfree of uninitialized data 2012-12-17 17:15:19 -08:00
binfmt_em86.c exec: use -ELOOP for max recursion depth 2012-12-17 17:15:23 -08:00
binfmt_flat.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
binfmt_misc.c exec: do not leave bprm->interp on stack 2012-12-20 17:40:19 -08:00
binfmt_script.c exec: do not leave bprm->interp on stack 2012-12-20 17:40:19 -08:00
binfmt_som.c get rid of pt_regs argument of ->load_binary() 2012-11-28 21:53:38 -05:00
bio-integrity.c
bio.c vfs: fix: don't increase bio_slab_max if krealloc() fails 2012-10-22 22:00:26 +02:00
block_dev.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
buffer.c fs/buffer.c: remove redundant initialization in alloc_page_buffers() 2012-12-12 17:38:35 -08:00
char_dev.c char_dev: pin parent kobject 2012-10-22 08:50:37 +03:00
compat_binfmt_elf.c coredump: extend core dump note section to contain file names of mapped files 2012-10-06 03:05:17 +09:00
compat_ioctl.c Merge 3.7-rc3 into tty-next 2012-10-29 09:00:57 -07:00
compat.c vfs: define struct filename and have getname() return it 2012-10-12 20:14:55 -04:00
coredump.c do_coredump(): get rid of pt_regs argument 2012-11-29 00:01:25 -05:00
coredump.h coredump: update coredump-related headers 2012-10-06 03:05:15 +09:00
dcache.c vfs: d_obtain_alias() needs to use "/" as default name. 2012-12-20 18:49:10 -05:00
dcookies.c
direct-io.c direct-io: don't read inode->i_blkbits multiple times 2012-11-29 12:38:44 -08:00
drop_caches.c
eventfd.c fs, eventfd: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
eventpoll.c fs, epoll: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
exec.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-20 20:00:43 -08:00
fcntl.c Fix F_DUPFD_CLOEXEC breakage 2012-10-09 15:52:31 +09:00
fhandle.c Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux 2012-12-20 14:04:11 -08:00
fifo.c
file_table.c fs: Fix imbalance in freeze protection in mark_files_ro() 2012-12-20 13:57:36 -05:00
file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2012-12-12 12:22:13 -08:00
filesystems.c vfs: define struct filename and have getname() return it 2012-10-12 20:14:55 -04:00
fs_struct.c kill daemonize() 2012-11-28 21:49:02 -05:00
fs-writeback.c writeback: fix a typo in comment 2012-12-12 17:38:34 -08:00
generic_acl.c
inode.c mm: redefine address_space.assoc_mapping 2012-12-11 17:22:26 -08:00
internal.h writeback: put unused inodes to LRU after writeback completion 2012-11-26 17:41:24 -08:00
ioctl.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
ioprio.c
Kconfig Introduce a new file system, Flash-Friendly File System (F2FS), to Linux 3.8. 2012-12-20 13:54:52 -08:00
Kconfig.binfmt coredump: make core dump functionality optional 2012-10-06 03:05:15 +09:00
libfs.c vfs: drop vmtruncate 2012-12-20 18:46:29 -05:00
locks.c UAPI Disintegration 2012-10-09 2012-10-09 18:35:22 -04:00
Makefile f2fs: update Kconfig and Makefile 2012-12-11 13:43:42 +09:00
mbcache.c
mount.h proc: Usable inode numbers for the namespace file descriptors. 2012-11-20 04:19:49 -08:00
mpage.c
namei.c vfs: fix renameat to retry on ESTALE errors 2012-12-20 18:50:05 -05:00
namespace.c vfs, freeze: use ACCESS_ONCE() to guard access to ->mnt_flags 2012-12-20 13:36:18 -05:00
no-block.c
open.c vfs: make fchownat retry once on ESTALE errors 2012-12-20 18:50:07 -05:00
pipe.c pipe(2) - race-free error recovery 2012-09-26 21:08:52 -04:00
pnode.c
pnode.h vfs: Only support slave subtrees across different user namespaces 2012-11-19 05:59:20 -08:00
posix_acl.c
proc_namespace.c
read_write.c sendfile: allows bypassing of notifier events 2012-12-20 17:40:21 -08:00
read_write.h compat: fs: Generic compat_sys_sendfile implementation 2012-10-02 21:35:55 -04:00
readdir.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
select.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
seq_file.c lseek: the "whence" argument is called "whence" 2012-12-17 17:15:12 -08:00
signalfd.c fs, epoll: add procfs fdinfo helper 2012-12-17 17:15:27 -08:00
splice.c writeback: remove nr_pages_dirtied arg from balance_dirty_pages_ratelimited_nr() 2012-12-11 17:22:21 -08:00
stack.c
stat.c vfs: fix readlinkat to retry on ESTALE 2012-12-20 18:50:01 -05:00
statfs.c vfs: fix user_statfs to retry once on ESTALE errors 2012-12-20 18:50:07 -05:00
super.c vfs: drop lock/unlock super 2012-10-09 23:33:39 -04:00
sync.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
timerfd.c switch simple cases of fget_light to fdget 2012-09-26 22:20:08 -04:00
utimes.c vfs: allow utimensat() calls to retry once on an ESTALE error 2012-12-20 18:50:08 -05:00
xattr_acl.c userns: Fix posix_acl_file_xattr_userns gid conversion 2012-10-12 13:16:48 -07:00
xattr.c vfs: make lremovexattr retry once on ESTALE error 2012-12-20 18:50:11 -05:00