There is now extent_map for mapping offsets in the file to disk and
extent_io for state tracking, IO submission and extent_bufers.
The new extent_map code shifts from [start,end] pairs to [start,len], and
pushes the locking out into the caller. This allows a few performance
optimizations and is easier to use.
A number of extent_map usage bugs were fixed, mostly with failing
to remove extent_map entries when changing the file.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
There were a few places that could cause duplicate extent insertion,
this adjusts the code that creates holes to avoid it.
lookup_extent_map is changed to correctly return all of the extents in a
range, even when there are none matching at the start of the range.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
test_range_bit doesn't properly handle the case: there's a hole at the
end of the range and there's no other extent_state after the range.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
btrfs_find_free_objectid may return a used objectid due to arithmetic
underflow. This bug may happen when parameter 'root' is tree root, so
it may cause serious problems when creating snapshot or sub-volume.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Using ilookup5 during data=ordered writeback could deadlock on I_LOCK. This
saves a pointer to the inode instead.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This patch adds readonly inode flag support. A file with this flag
can't be modified, but can be deleted.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
While shrinking the FS, the allocation functions need to make sure
they don't try to allocate bytes past the end of the FS.
nodatacow needed an extra check to force cows when the existing extents are
past the end of the FS.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
It is very difficult to create a consistent snapshot of the btree when
other writers may update the btree before the commit is done.
This changes the snapshot creation to happen during the commit, while
no other updates are possible.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This forces file data extents down the disk along with the metadata that
references them. The current implementation is fairly simple, and just
writes out all of the dirty pages in an inode before the commit.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The shrinking code used btrfs_next_leaf to find the next item, but
this does not cow the blocks it touches. This fix calls search_slot after
finding the next item to do appropriate cow and balancing.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The patch fixes the overlapping extent issue in shrink_extent_tree.
It checks whether there is an overlapping extent by using
find_previous_extent. If there is an overlapping extent, it setups
key.objectid and cur_byte properly.
---
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This is intended to prevent accidentally filling the drive. A determined
user can still make things oops.
It includes some accounting of the current bytes under delayed allocation,
but this will change as things get optimized
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Yan Zheng noticed the offset into the extent was incorrectly being added to the
extent start before trying to find it in the extent allocation tree.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
A number of workloads do not require copy on write data or checksumming.
mount -o nodatasum to disable checksums and -o nodatacow to disable
both copy on write and checksumming.
In nodatacow mode, copy on write is still performed when a given extent
is under snapshot.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
One of my old patches introduces a new bug to
btrfs_drop_extents(changeset 275). Inline extents are not truncated
properly when "extent_end == end", it can trigger the BUG_ON at
file.c:600. I hope I don't introduce new bug this time.
---
Signed-off-by: Chris Mason <chris.mason@oracle.com>
--Boundary-00=_CcOWHFYK4T+JwSj
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Hello everybody,
compiling btrfs into the kernel results in section mismatch warnings. __exit
functions are called where they are not allowed to. The attached patch fixes
this for me. Not sure if it is correct though.
Signed-off-by: Christian Hesse <mail@earthworm.de>
--
Regards,
Chris
--Boundary-00=_CcOWHFYK4T+JwSj
Content-Type: text/x-diff; charset="iso-8859-1";
name="btrfs-section_mismatches.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
filename="btrfs-section_mismatches.patch"
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The codes that fixup the right leaf and the codes that dirty the
extnet buffer use the variable 'right_nritems' , both of them expect
'right_nritems' is the number of items in right leaf after the push.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
There was a slight problem with ACL's returning EINVAL when you tried to set an
ACL. This isn't correct, we should be returning EOPNOTSUPP, so I did a very
ugly thing and just commented everybody out and made them return EOPNOTSUPP.
This is only temporary, I'm going back to implement ACL's, but Chris wants to
push out a release so this will suffice for now.
Also Yan suggested setting reada to -1 in the delete case to enable backwards
readahead, and in the listxattr case I moved path->reada = 2; to after the if
(!path) check so we can avoid a possible null dereference. Thank you,
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This patch adds a new parameter 'full_scan' to 'find_search_start',
thereby 'find_search_start' can know whether 'find_free_extent' is in
full scan phrase. I feel that 'find_search_start' should skip calling
'btrfs_find_block_group' when 'find_free_extent' is in full scan
phrase. In my test on a 2GB volume, Oops occurs when space usage is
about 76%. After apply the patch, Oops occurs when space usage is
near 100%.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This patch adds a helper function 'update_pinned_extents' to
extent-tree.c. The usage of the helper function is similar to
'update_block_group', the last parameter of the function indicates
pin vs unpin.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
When 'item_end' is equal to 'inode->i_size', 'found_type' is updated
and current item is skipped. This behavior is correct for extent item,
but incorrect for csum item. For example, there is a csum item with
'offset == 0'. When deleting the inode, 'inode->i_size' is set to 0,
so the csum item isn't deleted.
Signed-off-by: Chris Mason <chris.mason@oracle.com>