This allows detection of blocks that have already been written in the
running transaction so they can be recowed instead of modified again.
It is step one in trusting the transid field of the block pointers.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When freeing root block of a tree, btrfs_free_extent' parameter
'ref_generation' is from root block itseft. When freeing non-root
block, 'ref_generation' is from its parent. so when converting a
non-root block to root block, we must guarantee its generation is
equal to its parent's generation.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This forces file data extents down the disk along with the metadata that
references them. The current implementation is fairly simple, and just
writes out all of the dirty pages in an inode before the commit.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
A number of workloads do not require copy on write data or checksumming.
mount -o nodatasum to disable checksums and -o nodatacow to disable
both copy on write and checksumming.
In nodatacow mode, copy on write is still performed when a given extent
is under snapshot.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The codes that fixup the right leaf and the codes that dirty the
extnet buffer use the variable 'right_nritems' , both of them expect
'right_nritems' is the number of items in right leaf after the push.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The fixes do a number of things:
1) Most btrfs_drop_extent callers will try to leave the inline extents in
place. It can truncate bytes off the beginning of the inline extent if
required.
2) writepage can now update the inline extent, allowing mmap writes to
go directly into the inline extent.
3) btrfs_truncate_in_transaction truncates inline extents
4) extent_map.c fixed to not merge inline extent mappings and hole
mappings together
Signed-off-by: Chris Mason <chris.mason@oracle.com>
1) Forced defrag wasn't working properly (btrfsctl -d) because some
cache only checks were incorrect.
2) Defrag only the leaves unless in forced defrag mode.
3) Don't use complex logic to figure out if a leaf is needs defrag
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When making room for a new item, it is ok to create an empty leaf, but
when making room to extend an item, split_leaf needs to make sure it
keeps the item we're extending in the path and make sure we don't end up
with an empty leaf.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This allows us to defrag huge directories, but skip the expensive defrag
case in more common usage, where it does not help as much.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This allows the tree walking code to defrag only the newly allocated
buffers, it seems to be a good balance between perfect defragging and the
performance hit of repeatedly reallocating blocks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This adds two types of btree defrag, a run time form that tries to
defrag recently allocated blocks in the btree when they are still in ram,
and an ioctl that forces defrag of all btree blocks.
File data blocks are not defragged yet, but this can make a huge difference
in sequential btree reads.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Before, snapshot deletion was a single atomic unit. This caused considerable
lock contention and required an unbounded amount of space. Now,
the drop_progress field in the root item is used to indicate how far along
snapshot deletion is, and to resume where it left off.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Almost none of the files including module.h need to do so,
remove them.
Include sched.h in extent-tree.c to silence a warning about cond_resched()
being undeclared.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Attaching below is some of the code cleanups that i came across while
reading the code.
a) alloc_path already calls init_path.
b) Mention that btrfs_inode is the in memory copy.Ext4 have ext4_inode_info as
the in memory copy ext4_inode as the disk copy
Signed-off-by: Chris Mason <chris.mason@oracle.com>