linux/fs
Toshiyuki Okajima fc80c44277 jbd: positively dispose the unmapped data buffers in journal_commit_transaction()
After ext3-ordered files are truncated, there is a possibility that the
pages which cannot be estimated still remain.  Remaining pages can be
released when the system has really few memory.  So, it is not memory
leakage.  But the resource management software etc.  may not work
correctly.

It is possible that journal_unmap_buffer() cannot release the buffers, and
the pages to which they belong because they are attached to a commiting
transaction and journal_unmap_buffer() cannot release them.  To release
such the buffers and the pages later, journal_unmap_buffer() leaves it to
journal_commit_transaction().  (journal_unmap_buffer() puts the mark
'BH_Freed' to the buffers so that journal_commit_transaction() can
identify whether they can be released or not.)

In the journalled mode and the writeback mode, jbd does with only metadata
buffers.  But in the ordered mode, jbd does with metadata buffers and also
data buffers.

Actually, journal_commit_transaction() releases only the metadata buffers
of which release is demanded by journal_unmap_buffer(), and also releases
the pages to which they belong if possible.

As a result, the data buffers of which release is demanded by
journal_unmap_buffer() remain after a transaction commits.  And also the
pages to which they belong remain.

Such the remained pages don't have mapping any longer.  Due to this fact,
there is a possibility that the pages which cannot be estimated remain.

The metadata buffers marked 'BH_Freed' and the pages to which
they belong can be released at 'JBD: commit phase 7'.

Therefore, by applying the same code into 'JBD: commit phase 2' (where the
data buffers are done with), journal_commit_transaction() can also release
the data buffers marked 'BH_Freed' and the pages to which they belong.

As a result, all the buffers marked 'BH_Freed' can be released, and also
all the pages to which these buffers belong can be released at
journal_commit_transaction().  So, the page which cannot be estimated is
lost.

<<Excerpt of code at 'JBD: commit phase 7'>>
 >         spin_lock(&journal->j_list_lock);
 >         while (commit_transaction->t_forget) {
 >                 transaction_t *cp_transaction;
 >                 struct buffer_head *bh;
 >
 >                 jh = commit_transaction->t_forget;
 >...
 >                 if (buffer_freed(bh)) {
 >                 ^^^^^^^^^^^^^^^^^^^^^^^^
 >                         clear_buffer_freed(bh);
 >                        ^^^^^^^^^^^^^^^^^^^^^^^^
 >                         clear_buffer_jbddirty(bh);
 >                 }
 >
 >                 if (buffer_jbddirty(bh)) {
 >                         JBUFFER_TRACE(jh, "add to new checkpointing trans");
 >                         __journal_insert_checkpoint(jh, commit_transaction);
 >                         JBUFFER_TRACE(jh, "refile for checkpoint writeback");
 >                         __journal_refile_buffer(jh);
 >                         jbd_unlock_bh_state(bh);
 >                 } else {
 >                         J_ASSERT_BH(bh, !buffer_dirty(bh));
 > ...
 >                         JBUFFER_TRACE(jh, "refile or unfile freed buffer");
 >                         __journal_refile_buffer(jh);
 >                         if (!jh->b_transaction) {
 >                                 jbd_unlock_bh_state(bh);
 >                                  /* needs a brelse */
 >                                 journal_remove_journal_head(bh);
 >                                 release_buffer_page(bh);
 >                                 ^^^^^^^^^^^^^^^^^^^^^^^^
 >                         } else
 >                 }
****************************************************************
* Apply the code of "^^^^^^" lines into 'JBD: commit phase 2' *
****************************************************************

At journal_commit_transaction() code, there is one extra message in the
series of jbd debug messages.  ("JBD: commit phase 2") This patch fixes
it, too.

Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:32 -07:00
..
9p 9p: fix O_APPEND in legacy mode 2008-07-03 09:59:03 -05:00
adfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
affs [PATCH] fix reservation discarding in affs 2008-05-06 13:45:33 -04:00
afs Fix various old email addresses for dwmw2 2008-06-06 11:29:10 -07:00
autofs
autofs4 autofs4: remove unused ioctls 2008-07-24 10:47:33 -07:00
befs byteorder: don't directly include linux/byteorder/generic.h 2008-05-16 12:01:45 -07:00
bfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
cifs Merge commit 'v2.6.26' into bkl-removal 2008-07-14 15:29:34 -06:00
coda device create: coda: convert device_create to device_create_drvdata 2008-07-21 21:54:41 -07:00
configfs configfs: Allow ->make_item() and ->make_group() to return detailed errors. 2008-07-17 15:21:29 -07:00
cramfs fs: Remove unnecessary inclusions of asm/semaphore.h 2008-04-18 22:16:44 -04:00
debugfs debugfs: Implement debugfs_remove_recursive() 2008-07-21 21:54:59 -07:00
devpts devpts: factor out PTY index allocation 2008-04-30 08:29:48 -07:00
dlm configfs: Allow ->make_item() and ->make_group() to return detailed errors. 2008-07-17 15:21:29 -07:00
ecryptfs eCryptfs: Make all persistent file opens delayed 2008-07-24 10:47:31 -07:00
efs
exportfs fs: replace remaining __FUNCTION__ occurrences 2008-04-30 08:29:54 -07:00
ext2 ext2: remove double definitions of xattr macros 2008-07-25 10:53:31 -07:00
ext3 ext3: handle deleting corrupted indirect blocks 2008-07-25 10:53:32 -07:00
ext4 ext4: do not set extents feature from the kernel 2008-07-11 19:27:31 -04:00
fat Merge commit 'v2.6.26' into bkl-removal 2008-07-14 15:29:34 -06:00
freevxfs fs/freevxfs/: proper externs 2008-04-29 08:06:00 -07:00
fuse fuse: fix thinko in max I/O size calucation 2008-06-17 18:08:10 -07:00
gfs2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw 2008-07-15 10:38:46 -07:00
hfs hfs: fix warning with 64k PAGE_SIZE 2008-04-30 08:29:52 -07:00
hfsplus Fix hfsplus oops on image without extents 2008-05-13 08:02:24 -07:00
hostfs
hpfs
hppfs fix hppfs Makefile breakage 2008-05-21 16:55:58 -07:00
hugetlbfs hugetlbfs: per mount huge page sizes 2008-07-24 10:47:17 -07:00
isofs isofs: fix access to unallocated memory when reading corrupted filesystem 2008-04-30 08:29:33 -07:00
jbd jbd: positively dispose the unmapped data buffers in journal_commit_transaction() 2008-07-25 10:53:32 -07:00
jbd2 ext4: Add ordered mode support for delalloc 2008-07-11 19:27:31 -04:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2008-05-01 11:15:28 -07:00
jfs jfs: remove DIRENTSIZ 2008-06-10 15:12:58 -05:00
lockd Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux 2008-07-20 21:21:46 -07:00
minix minix: remove !NO_TRUNCATE code 2008-07-25 10:53:30 -07:00
msdos Replace BKL with superblock lock in fat/msdos/vfat 2008-06-20 14:05:54 -06:00
ncpfs Remove BKL from remote_llseek v2 2008-07-02 15:06:27 -06:00
nfs fix fs/nfs/nfsroot.c compilation 2008-07-24 17:32:41 -07:00
nfs_common
nfsd Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux 2008-07-20 21:21:46 -07:00
nls
ntfs ntfs: le*_add_cpu conversion 2008-05-24 09:56:08 -07:00
ocfs2 configfs: Allow ->make_item() and ->make_group() to return detailed errors. 2008-07-17 15:21:29 -07:00
openpromfs
partitions fs: ldm.[ch] use get_unaligned_* helpers 2008-07-25 10:53:26 -07:00
proc vmallocinfo: add NUMA information 2008-07-24 10:47:17 -07:00
qnx4
ramfs ramfs: enable splice write 2008-07-04 09:52:14 +02:00
reiserfs reiserfs: discard prealloc in reiserfs_delete_inode 2008-07-08 12:39:31 -07:00
romfs
smbfs Remove BKL from remote_llseek v2 2008-07-02 15:06:27 -06:00
sysfs driver core: Suppress sysfs warnings for device_rename(). 2008-07-21 21:55:01 -07:00
sysv sysv: [bl]e*_add_cpu conversion 2008-04-30 08:29:52 -07:00
ubifs UBIFS: include to compilation 2008-07-15 17:35:24 +03:00
udf udf: Fix regression in UDF anchor block detection 2008-06-24 11:38:03 +02:00
ufs UFS: add const to parser token table 2008-07-24 11:50:15 -07:00
vfat Replace BKL with superblock lock in fat/msdos/vfat 2008-06-20 14:05:54 -06:00
xfs Fix reference counting race on log buffers 2008-07-11 11:37:18 -07:00
aio.c uml: activate_mm: remove the dead PF_BORROWED_MM check 2008-06-06 11:36:22 -07:00
anon_inodes.c flag parameters: NONBLOCK in anon_inode_getfd 2008-07-24 10:47:28 -07:00
attr.c
bad_inode.c
binfmt_aout.c fs/binfmt_aout.c: use printk_ratelimit() 2008-04-29 08:06:04 -07:00
binfmt_elf_fdpic.c nommu: fix ksize() abuse 2008-06-06 11:29:13 -07:00
binfmt_elf.c execve filename: document and export via auxiliary vector 2008-07-22 09:59:40 -07:00
binfmt_em86.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_flat.c nommu: fix ksize() abuse 2008-06-06 11:29:13 -07:00
binfmt_misc.c binfmt_misc: use simple_read_from_buffer() 2008-07-24 10:47:27 -07:00
binfmt_script.c binfmt_misc.c: avoid potential kernel stack overflow 2008-04-29 08:06:04 -07:00
binfmt_som.c [PATCH] sanitize handling of shared descriptor tables in failing execve() 2008-04-25 09:23:53 -04:00
bio-integrity.c block: integrity checkpatch cleanups 2008-07-03 13:21:13 +02:00
bio.c Add bvec_merge_data to handle stacked devices and ->merge_bvec() 2008-07-03 13:21:15 +02:00
block_dev.c [PATCH] fix cgroup-inflicted breakage in block_dev.c 2008-06-23 08:30:55 -04:00
buffer.c Merge branch 'generic-ipi' into generic-ipi-for-linus 2008-07-15 21:55:59 +02:00
char_dev.c Remove the lock_kernel() call from chrdev_open() 2008-06-20 14:05:53 -06:00
compat_binfmt_elf.c
compat_ioctl.c autofs4: remove unused ioctls 2008-07-24 10:47:33 -07:00
compat.c flag parameters: signalfd 2008-07-24 10:47:27 -07:00
dcache.c fix soft lock up at NFS mount via per-SB LRU-list of unused dentries 2008-07-24 10:47:15 -07:00
dcookies.c
direct-io.c
dnotify.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
dquot.c quota: don't call sync_fs() from vfs_quota_off() when there's no quota turn off 2008-05-13 08:02:23 -07:00
drop_caches.c vfs: skip inodes without pages to free in drop_pagecache_sb() 2008-04-29 08:06:05 -07:00
eventfd.c flag parameters: check magic constants 2008-07-24 10:47:29 -07:00
eventpoll.c flag parameters add-on: remove epoll_create size param 2008-07-24 10:47:29 -07:00
exec.c exec: remove some includes 2008-07-25 10:53:28 -07:00
fcntl.c flag parameters: dup2 2008-07-24 10:47:28 -07:00
fifo.c
file_table.c [PATCH] split linux/file.h 2008-05-01 13:08:16 -04:00
file.c [PATCH] avoid multiplication overflows and signedness issues for max_fds 2008-05-16 17:22:52 -04:00
filesystems.c
fs-writeback.c VFS: export sync_sb_inodes 2008-07-14 19:10:52 +03:00
generic_acl.c
inode.c VFS: fix unused variable warning 2008-05-06 13:13:37 -07:00
inotify_user.c flag parameters: check magic constants 2008-07-24 10:47:29 -07:00
inotify.c
internal.h [PATCH] move a bunch of declarations to fs/internal.h 2008-04-21 23:11:01 -04:00
ioctl.c make vfs_ioctl() static 2008-04-29 08:06:00 -07:00
ioprio.c
Kconfig Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 2008-07-17 10:55:51 -07:00
Kconfig.binfmt frv: don't offer BINFMT_FLAT 2008-06-06 11:29:08 -07:00
libfs.c add kernel-doc for simple_read_from_buffer and memory_read_from_buffer 2008-07-04 10:40:07 -07:00
locks.c [patch 4/4] flock: remove unused fields from file_lock_operations 2008-06-23 11:52:30 -04:00
Makefile Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6 2008-07-16 15:02:57 -07:00
mbcache.c
mpage.c vfs: add hooks for ext4's delayed allocation support 2008-07-11 19:27:31 -04:00
namei.c [patch 3/4] vfs: fix ERR_PTR abuse in generic_readlink 2008-06-23 11:52:30 -04:00
namespace.c LSM/SELinux: show LSM mount options in /proc/mounts 2008-07-14 15:02:05 +10:00
nfsctl.c
no-block.c
open.c fs: check for statfs overflow 2008-07-24 10:47:19 -07:00
pipe.c flag parameters: NONBLOCK in pipe 2008-07-24 10:47:29 -07:00
pnode.c [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
pnode.h [patch 7/7] vfs: mountinfo: show dominating group id 2008-04-23 00:05:09 -04:00
posix_acl.c
quota_v1.c quota: do not allow setting of quota limits to too high values 2008-04-28 08:58:32 -07:00
quota_v2.c quota: le*_add_cpu conversion 2008-04-30 08:29:51 -07:00
quota.c quota: quota core changes for quotaon on remount 2008-04-28 08:58:33 -07:00
read_write.c Remove BKL from remote_llseek v2 2008-07-02 15:06:27 -06:00
read_write.h
readdir.c
select.c Fix performance regression on lmbench select benchmark 2008-06-22 12:23:15 -07:00
seq_file.c [patch 2/7] vfs: mountinfo: add seq_file_root() 2008-04-23 00:04:38 -04:00
signalfd.c flag parameters: check magic constants 2008-07-24 10:47:29 -07:00
splice.c splice: fix generic_file_splice_read() race with page invalidation 2008-07-04 09:52:14 +02:00
stack.c
stat.c
super.c fix soft lock up at NFS mount via per-SB LRU-list of unused dentries 2008-07-24 10:47:15 -07:00
sync.c SYNC_FILE_RANGE_WRITE may and will block. Document that. 2008-07-24 10:47:17 -07:00
timerfd.c flag parameters: check magic constants 2008-07-24 10:47:29 -07:00
utimes.c [patch for 2.6.26 4/4] vfs: utimensat(): fix write access check for futimens() 2008-06-23 08:43:52 -04:00
xattr_acl.c
xattr.c xattr: add missing consts to function arguments 2008-04-29 08:06:06 -07:00