Before this, there were some safety limits, but there was no real
default limit to the size of inline files other than the amount of RAM
available. On PCs, this meant that inline files were free to fill up
directory blocks to a little under the block size.
However this is very wasteful in terms of storage space. Because of the
splitting limits needed to keep the compact runtime reasonable, each
byte of an inline file effectively costs 4x its size in storage.
Fortunately we can find an optimal inline limit:

  inline file waste for n bytes = 3n
  CTZ file waste for n bytes    = B - n

where B = block size. Setting the two equal, 3n = B - n, gives n = B/4.
So the optimal inline limit is B/4. However, this assumes a perfect
inline file and no metadata. We can decrease this to B/8 to give a bit
more breathing room for directory+file metadata.
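To make this concrete, here is a quick sketch of the derived default in
C. The macro names are purely illustrative, not littlefs's actual
configuration:

    // hypothetical names for illustration; the real config differs
    #define BLOCK_SIZE 4096

    // crossover point: inline waste 3n == CTZ waste B-n  =>  n = B/4,
    // halved again to leave room for directory+file metadata
    #define INLINE_MAX (BLOCK_SIZE/8)  // 512 bytes for 4096-byte blocks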
One of the new features in LittleFS is "inline files", which is the
inlining of small files in the parent directory. Inline files have a big
limitation in that they no longer have a dedicated scratch area to write
out data before commit-time. This is fine as long as inline files are
small enough to fit in RAM.
However, this dependency on RAM creates an uncomfortable situation for
portability, with larger devices able to create larger files than
smaller devices. This problem is especially important on embedded
systems, where RAM is at a premium.
Recently, I realized this RAM requirement is necessary for _writing_
inline files, but not for _reading_ inline files. By allowing fetches of
specific slices of inline files, it's possible to read inline files
without the RAM to back them.
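As a rough sketch of the idea (the helper and its arguments are
hypothetical; only the block-device read callback in lfs_config is
real), a slice can be served directly out of the metadata block:

    // hypothetical helper: read a slice of an inline file directly from
    // its directory block, no file-sized RAM buffer required
    static int inline_read_slice(lfs_t *lfs, lfs_block_t block,
            lfs_off_t data_off, lfs_off_t off,
            void *buffer, lfs_size_t size) {
        // inline data lives inside the metadata block, so a bounded
        // read of [data_off+off, data_off+off+size) gives us the slice
        return lfs->cfg->read(lfs->cfg, block, data_off + off,
                buffer, size);
    }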
However, this creates a conflict with COW semantics. Normally, when a
file is open twice, it is referenced by a COW data structure that can be
updated independently. Inline files that fit in RAM also allow
independent updates, but the moment an inline file can't fit in
RAM, any updates to that directory block could corrupt open files
referencing the inline file. The fact that this behaviour is only
inconsistent for inline files created on a different device with more
RAM creates a potential nightmare for user experience.
Fortunately, there is a workaround for this. When we are committing to a
directory, any open files need to live in a COW structure or in RAM.
While we could move large inline files to COW structures at open time,
this would break the separation of read/write operations and could lead
to write errors at read time (i.e. ENOSPC). But since this is only an
issue for commits, we can defer the move to a COW structure to any
commits to that directory. This means when committing to a directory we
need to find any _open_ large inline files and evict them from the
directory, leaving the file with a new COW structure even if it was
opened read only.
While complicated, the end result is inline files that can use the
MAX RAM that is available, but can be read with MIN RAM, even with
multiple write operations happening to the underlying directory block.
This prevents users from needing to learn the idiosyncrasies of inline
files to use the filesystem portably.
While linked-lists do have some minor benefits, arrays are more
idiomatic in C and may provide a more intuitive API.
Initially the linked-list approach was more beneficial than it is now,
since it allowed custom attributes to be chained to internal linked
lists of attributes. However, this was dropped because exposing the
internal attribute list in this way created a rather messy user
interface that required strictly encoding the attributes with the
on-disk tag format.
Minor downside: users can no longer introduce custom attributes in
different layers (think OS vs app). Minor upside: the code size and
stack usage were reduced a bit.
Fortunately, this API can always be changed in the future without
breaking anything (except maybe API compatibility).
The main difference here is a move away from encoding the "hasorphans"
and "hasmove" bits in the tag itself. This worked with the old format,
but
in the new format the space these bits take up must be consistent for
each tag type. The tradeoff is that the new tag format allows for up to
256 different global states which may be useful in the future (for
example, a global free list).
The new format encodes this info in the data blob, using an additional
word of storage. This word is actually formatted the same as though it
was a tag, which simplified internal handling and may allow other tag
types in the future.
Format for global state:
[----         96 bits         ----]
[1|- 11 -|- 10 -|- 10 -|--- 64 ---]
 ^    ^      ^      ^       ^- move dir pair
 |    |      |      \--------- unused, must be 0s
 |    |      \---------------- move id
 |    \----------------------- type, 0xfff for move
 \---------------------------- has orphans
This also included another iteration over globals (renamed to gstate)
with some simplifications to how globals are handled.
Before, the tag format's type field was limited to 9 bits. This sounds
like a lot, but this field needed to encode up to 256 user-specified
types. This limited the flexibility of the encoded types. As time went
on, more bits in the type field were repurposed for various things,
leaving a rather fragile type field.
Here we make the jump to full 11-bit type fields. This comes at the cost
of a smaller length field, however the use of the length field was
always going to come with a RAM limitation. Rather than putting pressure
on RAM for inline files, the new type field lets us encode a chunk
number, splitting up inline files into multiple updatable units. This
actually pushes the theoretical inline max from 8KiB to 256KiB! (Note
that we only allow a single 1KiB chunk for now; chunky inline files
are just a theoretical future improvement.)
Here is the new 32-bit tag format, note that there are multiple levels
of types which break down into more info:
[----         32         ----]
[1|-- 11 --|-- 10 --|-- 10 --]
 ^    ^        ^        ^- entry length
 |    |        \---------- file id
 |    \------------------- type info (type3)
 \------------------------ valid bit

[-3-|-- 8 --]
  ^     ^- chunk info
  \------- type info (type1)
Additionally, I've split the CREATE tag into separate SPLICE and NAME
tags. This simplified the new compact logic a bit. For now, littlefs
still follows the rule that a NAME tag precedes any other tags related
to a file, but this can change in the future.
This simplifies some of the interactions between reading and writing
inside the commit logic. Unfortunately this change didn't decrease
code size as was initially hoped, but it does offer a nice runtime
improvement for the common case and should improve debuggability.
Before, the compact logic required three iterations:
1. iterate through all the ids in a directory
2. scan attrs bound to each id in the directory
3. lookup attrs in the in-progress commit
The code for this, while terse and complicated, did have some nice side
effects. The directory lookup logic could be reused for looking up in
the in-progress commit, and iterating through each id allowed us to know
exactly how many ids we could fit during a compact, giving us an O(n^3)
compact and an O(n^3) split.
However, this was complicated by a few things.
First, this compact logic doesn't handle deleted attrs. To work around this,
I added a marker for the last commit (or first, depending on your perspective)
which would indicate if a delete should be copied over. This worked but was
a bit hacky and meant deletes weren't cleaned up on the first compact.
Second, we can't actually figure out our compacted size until we
compact. This worked ok except for the fact that splits will always have
a failed compact. This means we waste an erase, which could be very
expensive. It is possible to work around this by keeping our work, but
with only a
single prog cache this was very tricky and also somewhat hacky.
Third, the interactions between reading and writing to the same block
were tricky and error-prone. They should mostly be working now, but
seeing this requirement go away does not make me sad.
The new compact logic fixes these issues by moving the complexity into a
general-purpose lfs_dir_traverse function which has far fewer side
effects on the system. We can even use it for dry-runs to precompute our
estimated size.
How does it work?
1. iterate through all attrs in the directory
2. for each attr, scan the rest of the directory to figure out the
   attr's history; this will change the attr based on dir modifications
   and may even exit early if the attr was deleted.
The end result is a traversal function that gives us the resulting state
of each attr in only O(n^2). To make this complete, we allow a bounded
recursion into mcu-side move attrs, although this ends up being O(n^3),
unlike moves in the original solution (however, moves are less common).
This gives us a nice traversal function we can use for compacts and
moves, one that handles deletes and is overall simpler to reason about.
Two minor hiccups:
1. We need to handle create attrs specially, since this algorithm
   doesn't care about id order, which can cause problems since attr
   insertions are order-sensitive. We can fix this by simply looking up
   each create (since there is only one per file) in order at the
   beginning of our traversal. This is oddly complementary to the move
   logic, which also handles create attrs separately.
2. We no longer know exactly how many ids we can write to a dir during
splits. However, since we can do a dry-run traversal, we can use that
to simply binary search for the mid-point.
This gives us an O(n^2) compact and an O(n^2 log n) split, which is a
nice minor improvement (remember that n is bounded by the block size).
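As a sketch of the split search (names are assumptions, not littlefs's
internals), the dry-run traversal acts as the predicate for a plain
binary search over the ids:

    // hypothetical sketch: binary search for the largest prefix of ids
    // whose compacted size still fits in half a block, using a dry-run
    // traversal (estimate_size here is not a real littlefs function)
    lfs_size_t lo = 0;
    lfs_size_t hi = count;
    while (lo < hi) {
        lfs_size_t mid = (lo + hi + 1) / 2;
        if (estimate_size(dir, 0, mid) <= lfs->cfg->block_size / 2) {
            lo = mid;      // ids [0, mid) fit, try a larger prefix
        } else {
            hi = mid - 1;  // too big, try a smaller prefix
        }
    }
    // lo is now the number of ids kept in the first pair of the split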
This is to prepare for future compatibility with other implementations
of the allocator's lookahead that are under consideration. The most
promising design so far is a sort of segments-list data structure that
stores pointer+size pairs, requiring 64-bits of alignment.
Changing this now takes advantage of the major version to avoid a
compatibility break in the future. If we end up not changing the
allocator or don't need 64-bit alignment we can easily drop this
requirement without breaking anyone's code.
There was an interesting subtlety with the existing layout of tags that
could become a problem in the future. Basically, littlefs avoids writing to
any region of storage it is not absolutely sure has been erased
beforehand. This is a part of limiting the number of assumptions about
storage. It's possible a storage technology can't support writes without
erases in a way that is undetectable at write time (Maybe changing a bit
without an erase decreases the longevity of the information stored on
the bit).
But the existing layout had a very tiny corner case where this wasn't
true. Consider the location of the valid bit in the tag struct:
[1|--- 31 ---]
^--- valid bit
The responsibility of this bit is to indicate if an attempt has been
made to write the following commit. If it is not set (the specific value
is dependent on a previous read and identified by the preceding commit),
the assumption is that it is safe to write to the next region because it
has been erased previously. If it is set, we check if the next commit is
valid, if it isn't (because of CRC failure, likely due to power-loss), we
discard the commit. But because an attempt has been made to write to
that storage, we must then do a compaction to move to the other block in
the metadata-pair.
This plan looks good on paper, but what does it look like on storage?
The problem is that words in littlefs are little-endian. So on
storage the tag actually looks like this:
[- 8 -|- 8 -|- 8 -|1|- 7 -]
                   ^-- valid bit
This means that we don't actually set the valid bit before writing the
tag! We write the lower bytes first. If we lose power, we may have
written 3 bytes without this fact being detectable.
We could restructure the tag to store the valid bit lower, however
because none of the fields are 7 bits, this would make the extraction
more costly, and we would then lose the ability to check the valid bit
with a sign comparison.
The simple solution is to just store the tag in big-endian. A small
benefit is that this will actually have a negative code cost on
big-endian machines.
This mixture of endiannesses is frustrating, however it is a pragmatic
solution with only a 20-byte code size cost.
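As a small sketch of what reading a tag now looks like, using the
lfs_frombe32 helper from lfs_util.h (surrounding details are
assumptions):

    // read a tag stored big-endian on disk
    uint32_t tag;
    memcpy(&tag, p, sizeof(tag));  // p points into the read cache
    tag = lfs_frombe32(tag);       // byte-swap on little-endian machines
    if ((int32_t)tag < 0) {
        // the valid bit is the sign bit, so a sign comparison tells us
        // an attempt was made to write this commit
    }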
- Fixed underflow issue caused by search id shortcuts that would result
in early termination from lfs_dir_get
- Fixed issue where entry file delete would toss out the best id during
lfs_dir_fetchmatch
- Fixed globals going out of date when canceling in same metadata-pair
- Fixed early removal of metadata-pair when attribute list contains
creates after deletes bring dir->count to zero
Interesting open-source projects that I've run into around embedded
storage. These may be interesting to others in the embedded space.
Added mklfs, SPIFFS, and Dhara.
Also a thanks to jolivepetrus for posting the mklfs tool he put
together.
On disk, littlefs uses 32-bit integers to track file size. This sets a
theoretical limit of 4GiB for files.
However, the API passes file sizes around as signed numbers, with
negative values representing error codes. This means that not all of the
APIs will work with file sizes > 2GiB.
Because of related complications over in FUSE land, I've added the LFS_FILE_MAX
constant and proper error reporting if file writes/seeks exceed the 2GiB limit.
In v2 this will join the other constants that get stored in the
superblock to help portability. Since littlefs is targeting
microcontrollers, it's likely this will be a sufficient solution.
Note that it's still possible to enable partial-support for 4GiB files
by defining LFS_FILE_MAX during compilation. This will work for most of
the APIs, except lfs_file_seek, lfs_file_tell, and lfs_file_size.
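For example, partial 4GiB support might be enabled like so (the value
here is an assumption; check lfs.h for the exact semantics):

    // at compile time, e.g. in the build flags or before including lfs.h
    #define LFS_FILE_MAX 0xffffffff  // raise the limit past 2GiB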
We could also consider improving support for 4GiB files by making seek a
bit more complicated and adding an lfs_file_stat function. I'll leave
this for a future improvement if there's interest.
Found by cgrozemuller
The issue happens when a rename causes a split in the destination pair.
If the destination pair is the same as the source pair, this triggers the
logic to keep both pairs in sync. Unfortunately, this logic didn't work,
because the source entry still resides in the old source pair, unlike
the destination pair, which is now in the new pair created by the split.
The best fix for now is to refetch the source pair after the changes to the
destination pair. This isn't the most efficient solution, but fortunately
this bug has already been fixed in the revamped move logic in littlefs v2
(currently in progress).
Found by ohoc
This was an oversight on my part when adding strict ordering to
directories. Unfortunately now we can't take advantage of the atomic
creation of tail+dir entries. Now we need to first create the tail, then
create the actual directory entry. If we lose power, the orphan is
cleaned up like orphans created during remove.
Note that we still take advantage of the atomic tail+dir entries if we
are an end block. This is actually because this corner case is
complicated to _not_ do atomically, needing to update the directory we
just committed to.
A rather humorous issue: we accidentally ended up mixing our file
namespace with our superblocks. This meant if we created a file named
"littlefs" it would reference the superblock and all sorts of things
would break.
Fixing this also highlighted another issue: the fact that the superblock
always needs to come before any file entries in the directory. I didn't
account for this in the initial B-tree design, but we need a higher
ordering for superblocks + children + files than just name. To fix this
I added ordering information in the 2 bits currently unused in the tag
type. Though note that the sizes of these fields are flexible.
9-bit type field:
[---      9      ---]
[1|- 3 -|- 2 -|- 3 -]
 ^   ^     ^     ^- type-specific info
 |   |     \-------- ordering info
 |   \-------------- subtype
 \------------------ user bit
Instead of storing files in an arbitrary order, we now store files in
ascending lexicographical order by filename.
Although a big change, this actually has little impact on how littlefs
works internally. We need to support file insertion, and compare file
names to find our position. But since we already need to scan the entire
directory block, this adds relatively little overhead.
What this does allow is the potential to add B-tree support in the
future in a backwards-compatible manner.
How could you add B-trees to littlefs?
1. Add an optional "child" tag with a pointer that allows you to skip to
a position in the metadata-pair list that composes the directory
2. When splitting a metadata-pair (sound familiar?), we either insert a
second child tag in our parent, or we create a new root containing
the child tags.
3. Each layer needs a bit stored in the tail-pointer to indicate if
we're going to the next layer. This can be created trivially when we
create a new root.
4. During lookup we keep two pointers containing the bounds of our
search. We may need to iterate through multiple metadata-pairs in our
linked-list, but this gives us a O(log n) lookup cost in a balanced
tree.
5. During deletion we also delete any children pointers. Note that
children pointers must come before the actual file entry.
This gives us a B-tree implementation that is compatible with the
current directory layout (assuming the files are ordered). This means
that B-trees could be supported by a host PC and ignored on a small
device. And during power-loss, we never end up with a broken filesystem,
just a less-than-optimal tree.
Note that we don't handle removes, so it's possible for a tree to become
unbalanced. But worst case that's the same as the current linked-list
implementation.
All we need to do now is keep directories ordered. If we decide to drop
B-tree support in the future, or the B-tree implementation turns out to
be inherently flawed, we can just drop the ordered requirement without
breaking compatibility and recover the code cost.
The fact that the lookahead buffer uses bits instead of bytes is an
internal detail. Poking this through to the user API has caused a decent
amount of confusion. Most buffers are provided as bytes and the
inconsistency here can be surprising.
The use of bytes instead of bits also makes us forward-compatible in
the case that we want to change the lookahead's internal representation
(hint: segment list).
Additionally, we change the configuration name to lookahead_size. This
matches other configurations, such as cache_size and read_size, while
also notifying the user that something important changed at compile time
(by breaking).
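As a usage sketch (the field values are arbitrary examples, and other
required configuration fields are elided):

    // lookahead_size is now in bytes, matching cache_size and read_size
    const struct lfs_config cfg = {
        // ... block device callbacks and geometry ...
        .read_size = 16,
        .prog_size = 16,
        .cache_size = 64,
        .lookahead_size = 16,  // 16 bytes tracks 128 blocks per pass
    };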
While technically both system and user attributes share the same on-disk
limitations, that's not what attr_max represents from the user's
perspective. To the user, attr_max applies only to custom attributes.
This means attr_max should not impact other configurable limitations,
such as inline files, and the ordering should be reconsidered based on
what the user finds most important.
The downside of smarter caching is that now there are more complicated
corner cases to consider. Here we weren't considering our pcaches when
aligning reads to the rcache. This meant that if things were unaligned,
we would read a cache line that overlapped the pcache and then proceed
to ignore whatever we overlapped.
The fix is to determine the limit of an rcache read not from cache
alignment but from the available caches, which we check anyways to find
cached data.
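A hedged sketch of the idea (not littlefs's exact code): when filling
the rcache, clamp the read so it never extends into a region the pcache
already owns:

    // hypothetical sketch: don't read past the start of a dirty pcache
    lfs_size_t size = lfs->cfg->cache_size;
    if (pcache->block == block
            && off < pcache->off
            && off + size > pcache->off) {
        // the pcache holds newer data for this range; stop the read at
        // the pcache boundary instead of reading a full cache line
        size = pcache->off - off;
    }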
This is a minor tweak that resulted from looking at some other use cases
for the littlefs data-structure on disk. Consider an implementation that
does not need to buffer inline-files in RAM. In this case we should have
as large a size field in the tag as possible. Unfortunately, we don't
have much space to work with in the 32-bit tag struct, so we have to
make some
compromises. These limitations could be removed with a 64-bit tag
struct, at the cost of code size.
32-bit tag structure:
[---       32       ---]
[1|- 9 -|- 9 -|-- 13 --]
 ^   ^     ^      ^- entry length
 |   |     \--------- file id
 |   \--------------- tag type
 \------------------- valid bit
Depending on your perspective, this may not be a necessary operation,
given that a nearly-full filesystem is already prone to ENOSPC errors,
especially a COW filesystem. However, splitting metadata-pairs can
happen in really unfortunate situations, such as removing files.
The solution here is to allow "overcompaction", that is, a compaction
without bounds checking to allow splitting. This unfortunately pushes
our metadata-pairs past their reasonable limit of saturation, which
means writes get exponentially costly. However it does allow littlefs to
continue working in extreme situations.
Added a separate bit for "hasmove", which means we don't need to check
the move id, and allows us to add more sync-related global states in
the future, as long as they never happen simultaneously (such as
orphans and moves).
Also refactored some of the logic and removed the union in the global
structure, which didn't really add anything of value.
Conceptually these are two separate operations. However, they are both
only needed during mount, both require iteration over the linked-list of
metadata-pairs, and both are independent from each other.
Combining these into one gives us a nice code savings.
Additionally, this greatly simplifies the lookup of the root directory.
Initially we used a flag to indicate which superblock was root, since we
didn't want to fetch more pairs than we needed to. But since we're going
to fetch all metadata-pairs anyways, we can just use the last superblock
we find as the indicator of our root directory.
The xored-globals have a very large footprint. In the worst case, the
xored-globals are stored on each metadata-pair, and twice in memory.
They must be very small, but they are also very useful, so they are at
risk of growing in the future (hint: global free-list?).
Initially we also stored a copy in each mdir structure, since this
avoided extra disk access to look up the globals when we need to modify
the global state on a metadata-pair. But we can easily just fetch the
globals when needed.
This is more costly in terms of runtime, but reduces the RAM impact of
globals, which previously cost RAM for each open dir and file.
- Callbacks for get/match. This has a code cost, but allows more code
  reuse, which almost balances out the cost while also reducing
  maintenance and increasing flexibility. Callbacks may also be able to
  be gc-ed in some cases.
- Consistent struct vs _t usage: _t for external-facing structs that
  shouldn't be messed with outside the library, plain structs for
  external and internal structs where anyone with access is allowed to
  modify them.
- Reorganized several high-level function groups
- Inlined structures that didn't need separate definitions in header
This follows from enabling tag deletion, but does require some
consideration with the APIs.
Now we can remove custom attributes, as well as determine if an attribute
exists or not.
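For example, with the lfs_getattr/lfs_removeattr signatures from lfs.h:

    // check for a custom attribute of type 't', then remove it
    uint8_t buffer[4];
    lfs_ssize_t res = lfs_getattr(&lfs, "hello", 't',
            buffer, sizeof(buffer));
    if (res == LFS_ERR_NOATTR) {
        // the attribute does not exist
    }

    // after removal, lfs_getattr reports LFS_ERR_NOATTR
    int err = lfs_removeattr(&lfs, "hello", 't');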
littlefs has a mechanism for deleting file entries, but it doesn't have
a mechanism for deleting individual tags. This _is_ sufficient for a
filesystem, but limits our flexibility. Deleting attributes would be
useful in the custom attribute API and for future improvements (hint the
child pointers in B-trees).
However, deleting attributes is tricky. We can't just omit the
attribute, since we can only add new tags. Additionally, we need a way
to track which attributes have been deleted during compaction, which
currently relies on writing out attributes to disk.
The solution here is pretty nifty. First we have to come up with a way
to represent a "deleted" attribute. Rather than adding an additional
bit to the already-squished tag structure, we use a -1 length field,
specifically 0xfff. Now we can commit a delete attribute, and this
deleted tag acts as a placeholder during compacts.
However our delete tag will never leave our metadata log. We need some
way to discard our delete tag if we know it's the only representation of
that tag on the metadata log. Ah! We know it's the only tag if it's in
the first commit on the metadata log. So we add an additional bit to the
CRC entry to indicate if we're on the first commit, and use that to
decide if we need to keep delete tags around.
Now we have working tag deletion.
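A sketch of the check, assuming the length field sits in the low bits of
the tag (the exact layout has shifted between versions, so treat this as
illustrative only):

    // hypothetical helper: a length of -1 (0xfff here) marks a deleted tag
    static inline bool tag_isdelete(uint32_t tag) {
        return (tag & 0xfff) == 0xfff;
    }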
Interestingly enough, tag deletion is actually indirectly more efficient
than entry deletion, since compacting entries requires multiple passes,
whereas tag deletion gets cleaned up lazily. However we can't adopt the
same strategy in entry deletion because of the compact ordering of
entries. Tag deletion works because tag types are unique and static.
Managing entry deletion in this manner would require static id
allocation, which would cause problems when creating files, running out
of space, and would disallow arbitrary insertions of files.
Currently unused, the insertion of new file entries in arbitrary
locations in a metadata-pair is very easy to add into the existing
metadata logging.
The only tricky things:
1. Name tags must strictly precede any tags related to a file. We can
pull this off during a compact, but must make two passes. One for the
name tag, one for the file. Though a benefit of this is that now our
scans during moves can exit early upon finding the name tag.
2. We need to handle name tags appearing out of order. This makes name
tags symmetric to deletes, although it doesn't seem like we can
leverage this fact very well. Note this also means we need to make
the superblock tag a type of name tag.
The valid bit present in tags is a requirement to properly detect the
end of commits in metadata logs. The way it works is that the CRC entry is
allowed to specify what is needed from the next tag's valid bit. If it's
incorrect, we've reached the end of the commit. We then set the valid bit to
indicate when we tried to program a new commit. If we lose power, this
commit will still be thrown out by a bad checksum.
However, the valid bit is unused outside of the CRC entry. Here we turn on the
valid bit for all tags, which means we have a decent chance of exiting early
if we hit a half-written commit. We still need to guarantee detection of
the valid bit on commits following the CRC entry, so we allow the CRC
entry to flip the expected valid bit.
The only tricky part is what valid bit we expect by default, since this
is used on the first commit on a metadata log. Here we default to a 1,
which gives us the fastest exit on blocks that erase to 0. This is
because blocks that erase to 1s will implicitly flip the valid bit of
the next tag, allowing us to exit on the next tag.
If we defaulted to 0, we could exit faster on disks that erase to 1, but
would need to scan the entire block on disks that erase to 0 before we
realize a CRC commit is never coming.
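A sketch of the early-exit check while scanning a metadata log (names
and structure are assumptions, not littlefs's internals):

    // hypothetical sketch of commit scanning
    bool expected_valid = true;  // default expected valid bit
    while (off < lfs->cfg->block_size) {
        uint32_t tag = read_tag(block, off);  // hypothetical helper
        if ((tag >> 31) != expected_valid) {
            break;  // erased or half-written commit, end of log
        }
        // ... process the tag; a CRC entry may flip expected_valid ...
    }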
The commit machine in littlefs has three stages: commit, compact, and
then split. First we try to append our commit to the metadata log, if
that fails we try to compact the metadata log to remove duplicates and make
room for the commit, if that still fails we split the metadata into two
metadata-pairs and try again. Each stage is less efficient but also less
frequent.
However, in the case that we're filling up a directory with new files,
such as the bootstrap process when setting up a new system, we must pass
through all three stages rather quickly in order to get enough
metadata-pairs to hold all of our files. This means we'll compact,
split, and then need to compact again. This creates more erases than
needed in the optimal case, which can be a big cost on disks with an
expensive erase operation.
In theory, we can actually avoid this redundant erase by reusing the
data we wrote out in the first attempt to compact. In practice, this
trick is very complicated to pull off.
1. We may need to cache a half-completed program while we write out the
new metadata-pair. We need to write out the second pair first in
order to get our new tail before we complete our first metadata-pair.
This requires two pcaches, which we don't have.
The solution here is to just drop our cache and reconstruct what it
would have been. This needs to be perfect down to the byte level
because we don't have knowledge of where our cache lines are.
2. We may have written out entries that are then moved to the new
metadata-pair.
The solution here isn't pretty, but it works: we just add a delete
tag for any entry that was moved over.
In the end the solution ends up a bit hacky, with different layers poked
through the commit logic in order to manage writes at the byte level
from where we manage splits. But it works fairly well and saves erases.
The littlefs driver has always had this really weird quirk: larger cache
sizes can significantly harm performance. This has probably been one of
the most surprising pieces of configuring and optimizing littlefs.
The reason is that littlefs's caches are kinda dumb (this is somewhat
intentional, as dumb caches take up much less code space than smart
caches). When littlefs needs to read data, it will load the entire cache
line. This means that even when we only need a small 4 byte piece of
data, we may need to read a full 512 byte cache. And since
microcontrollers may be reading from storage over relatively slow bus
protocols, the time to send data over the bus may dominate other
operations.
Now that we have separate configuration options for "cache_size" and
"read_size", we can start making littlefs's caches a bit smarter. They
aren't going to be perfect, because code size is still a priority, but
there are some small improvements we can do:
1. Program caches write in prog_size-aligned units, but eagerly cache as
much as possible. There's no downside to using the full cache in
program operations.
2. Add a hint parameter to cached reads. This internal API allows callers
to tell the cache how much data they expect to need. This avoids
excess bus traffic, and now we can even bypass the cache if the
caller provides enough of a buffer.
We can still fall back to reading full cache-lines in the cases where
we don't know how much data we need by providing the block size as
the hint. We do this for directory fetches and for file reads.
This has immediate improvements for both metadata-log traversal and CTZ
skip-list traversal, since these both only need to read 4-byte pointers
and can always bypass the cache, allowing reuse elsewhere.
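As a sketch of the internal API shape (littlefs's actual signature at
this point may differ):

    // read a 4-byte CTZ skip-list pointer; the hint tells the cache we
    // only expect to need these 4 bytes, so it won't fill a whole line
    uint32_t word;
    int err = lfs_bd_read(lfs, &lfs->pcache, &lfs->rcache,
            sizeof(word),  // hint
            block, off, &word, sizeof(word));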
While ECORRUPT is not a wrong error code, it doesn't match other
instances of hitting a corrupt block during write. During writes, if
blocks are detected as corrupt their data is evicted and moved to a new
clean block. This means that at the end of a disk's lifetime, exhaustion
errors will be reported as ENOSPC when littlefs can't find any new block
to store the data.
This has the benefit of matching behaviour when a new file is written
and no more blocks can be found, due to either a small disk or corrupted
blocks on disk. To littlefs it's like the disk shrinks in size over
time.
The initial implementation of inline files was thrown together fairly
quickly, however it has worked well so far and there hasn't been much
reason to change it.
One shortcut was to trick file writes into thinking they are writing to
imaginary blocks. This works well and reuses most of the file code
paths, as long as we don't flush the imaginary block out to disk.
Initially we did this by limiting inline_max to cache_max-1, ensuring
that the cache never fills up and gets flushed. This was a rather dirty
hack; the better solution, implemented here, is to handle the
representation of an "imaginary" block correctly all the way down into
the cache layer.
So now for files specifically, the value -1 represents a null pointer,
and the value -2 represents an "imaginary" block. This may become a
problem if the number of blocks approaches the max, however this -2
value is never written to disk and can be changed in the future without
breaking compatibility.
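A sketch of how these sentinels can be defined (names as they appear in
lfs.h, though the exact form at this point is an assumption):

    // reserved block address values described above
    #define LFS_BLOCK_NULL   ((lfs_block_t)-1)  // 0xffffffff, no block
    #define LFS_BLOCK_INLINE ((lfs_block_t)-2)  // 0xfffffffe, imaginary block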
Because a block can go bad at any time, if we're unlucky, we may end up
generating multiple orphans in a single metadata write. This is
exacerbated by the early eviction in dynamic wear-leveling.
We can't track _all_ orphans, because that would require unbounded
storage and significantly complicate things, but there are a handful of
intentional orphans we do track because they are easy to resolve without
the O(n^2) deorphan scan. These occur any time we intentionally remove a
metadata-pair.
Initially we cleaned up orphans as they occurred with whatever knowledge
we did have, and just accepted the extra O(n^2) deorphan scans in the
unlucky case. However, we can do a bit better by being lazy and leaving
deorphaning up to the next metadata write. This needs to work with the known
orphans while still setting the orphan flag on disk correctly. To
accomplish this we replace the internal flag with a small counter.
Note, this means that our internal representation of orphans differs
from what's on disk. This is annoying but not the end of the world.
This was a pretty simple oversight on my part. Conceptually, there's no
difference between lfs_fs_getattr and lfs_getattr("/"). Any operations
on directories can be applied "globally" by referring to the root
directory.
Implementation-wise, this actually fixes the "corner case" of storing
attributes on the root directory, which was broken since the root
directory doesn't have a related entry. Instead we need to use the root
superblock for this purpose.
Fewer functions means less code to document and maintain, so this is a
nice benefit. Now we just have a single lfs_getattr/setattr/removeattr set
of functions along with the ability to access attributes atomically in
lfs_file_opencfg.
This implements the second step of full dynamic wear-leveling: block
allocation randomization. This is the key part that uniformly distributes
wear across the filesystem, even through reboots.
The entropy actually comes from the filesystem itself, by xoring
together all of the CRCs in the metadata-pairs on the filesystem. While
this sounds like a ridiculous operation, it's easy to do when we already
scan the metadata-pairs at mount time.
This gives us a random number we can use for block allocation.
Unfortunately it's not a great general purpose random generator as the
output only changes every filesystem write. Fortunately that's exactly
when we need our allocator.
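A sketch of the mechanism (the field names are assumptions): during the
mount scan, every metadata-pair CRC gets folded into a seed, which then
offsets the allocator:

    // fold each metadata-pair CRC into the seed during the mount scan
    lfs->seed ^= crc;

    // later, start the allocator at a pseudorandom offset
    lfs->free.off = lfs->seed % lfs->cfg->block_count;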
---
Additionally, the randomization created a mess for the testing
framework. Fortunately, this method of randomization is deterministic,
a very useful property for reproducing bugs.
Initially, littlefs relied entirely on bad-block detection for
wear-leveling. Conceptually, at the end of a device's lifespan, all
blocks would be worn evenly, even if they weren't worn out at the same
time. However, this doesn't work for all devices: rather than causing
corruption during writes, wear reduces a device's "sticking power",
causing bits to flip over time. This means that for many devices, true
wear-leveling (dynamic or static) is required.
Fortunately, way back at the beginning, littlefs was designed to do full
dynamic wear-leveling, only dropping it when making the retrospectively
short-sighted realization that bad-block detection is theoretically
sufficient. We can enable dynamic wear-leveling with only a few tweaks
to littlefs. These can be implemented without breaking backwards
compatibility.
1. Evict metadata-pairs after a certain number of writes. Eviction in
this case is identical to a relocation to recover from a bad block.
We move our data and stick the old block back into our pool of
blocks.
For knowing when to evict, we already have a revision count for each
metadata-pair which gives us enough information. We add the
configuration option block_cycles and evict when our revision count
is a multiple of this value.
2. Now all blocks participate in COW behaviour. However we don't store
the state of our allocator, so every boot cycle we reuse the first
blocks on storage. This is very bad on a microcontroller, where we
may reboot often. We need a way to spread our usage across the disk.
To pull this off, we can simply randomize which block we start our
allocator at. But we need a random number generator that is different
on each boot. Fortunately we have a great source of entropy, our
filesystem. So we seed our block allocator with a simple hash of the
CRCs on our metadata-pairs. This can be done for free since we
already need to scan the metadata-pairs during mount.
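Configuration-wise this adds just one knob (the value here is only an
example, and other required fields are elided):

    const struct lfs_config cfg = {
        // ... block device callbacks and geometry ...
        .block_cycles = 500,  // evict metadata-pairs every 500 revisions
    };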
What we end up with is a uniform distribution of wear on storage. The
wear is not perfect: if a block is used for metadata it gets more wear,
and the randomization may not be exact. But we can never actually get
perfect wear-leveling, since we're already resigned to dynamic
wear-leveling at the file level.
With the addition of metadata logging, we end up with a really
interesting two-stage wear-leveling algorithm. At the low-level,
metadata is statically wear-leveled. At the high-level, blocks are
dynamically wear-leveled.
---
This specific commit implements the first step, eviction of metadata
pairs. Intertwining this into the already complicated compact logic was
a bit annoying, however we can combine the logic for superblock
expansion with the logic for metadata-pair eviction.
Found while testing big-endian support. Basically, if littlefs is really
really unlucky, the block allocator could kick in while committing a
file's CTZ reference. If this happens, the block allocator will need to
traverse all CTZ skip-lists in memory, including the skip-list we're
committing. This means we can't convert the CTZ's endianness in place,
and need to make a copy on big-endian systems.
We rely on dead-code elimination from the compiler to make the
conditional behaviour for big-endian vs little-endian systems a noop
determined by the lfs_tole32 intrinsic.
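A sketch of the fix (lfs_ctz_tole32 as it appears in lfs.c; the exact
surrounding code is an assumption):

    // convert a copy, not the in-RAM original: the allocator may still
    // be traversing the original skip-list during this commit
    struct lfs_ctz ctz = file->ctz;
    lfs_ctz_tole32(&ctz);  // compiles to a noop on little-endian
    // ... commit &ctz to the metadata-pair ...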
In looking at the common CRC APIs out there, this seemed the most
common. At least more common than the current modified-in-place pointer
API. It also seems to have a slightly better code footprint. I'm blaming
pointer optimization issues.
One downside is that lfs_crc can't report errors; however, it was
already assumed that lfs_crc cannot error.
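The new shape, as declared in lfs_util.h:

    // seed in, updated crc out, no in-place pointer mutation
    uint32_t lfs_crc(uint32_t crc, const void *buffer, size_t size);

    // usage
    uint32_t crc = 0xffffffff;
    crc = lfs_crc(crc, data, size);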
This was mostly tweaking test cases to accommodate variable-sized
superblock-lists. Though there were a few bugs that needed fixing:
- Changed compact to use source dir for move since the original dir
could have changed as a result of an expand.
- Created copy of current directory so we don't overwrite ourselves
during an internal commit update.
Also made sure all of the test suites provide reproducible results when
run independently (the entry tests were behaving differently based on
which tests were run before them). (Some were legitimate test failures.)
In v1, littlefs didn't trust blocks that had been previously erased and
conservatively erased any blocks before writing to them. This was a part
of the design since the beginning because of the complexity of managing
erased blocks when we can lose power at any time.
However, we theoretically could keep track of files that have been
properly erased by marking them with an "erased bit". A file marked this
way could be opened and appended to without needing to COW the last
block. The requirement would be that the "erased bit" is cleared during
a write, since a power-loss would require that littlefs no longer trust
the erased state of the file.
This commit just shuffles the struct types around to make space for an
"erased bit" in the struct type field to be added in the future. This
ordering also makes more sense, since there will likely be more file
representations than directory representations on disk.
Expanding superblocks has been on my wishlist for a while. The basic
idea is that instead of maintaining a fixed offset, blocks {0, 1}, to
the root directory (1 pointer), we maintain a dynamically sized
linked-list of superblocks that point to the actual root. If the number
of writes to the root exceeds some value, we increase the size of the
superblock linked-list.
This can leverage existing metadata-pair operations. The revision count for
metadata-pairs provides some knowledge on how much wear we've put on the
superblock, and the threaded linked-list can also be reused for this
purpose. This means superblock expansion is both optional and cheap to
implement.
Expanding superblocks helps both extremely small and extremely large
filesystems (extreme being relative, of course). On the small end, we
can actually collapse the superblock into the root directory and drop
the hard requirement of 4 blocks for the superblock. On the large end,
our superblock will
now last longer than the rest of the filesystem. Each time we expand,
the number of cycles until the superblock dies is increased by a power.
Before we were stuck with this layout:

  level  cycles  limit    layout
  1      E^2     390 MiB  s0 -> root

Now we expand every time a fixed offset is exceeded:

  level  cycles  limit    layout
  0      E       4 KiB    s0+root
  1      E^2     390 MiB  s0 -> root
  2      E^3     37 TiB   s0 -> s1 -> root
  3      E^4     3.6 EiB  s0 -> s1 -> s2 -> root
  ...

Where the cycles are the number of cycles before death, and the limit is
the worst-case size of a filesystem where early superblock death becomes
a concern (all writes to root, using the formula E^|s| = E*B, where E =
erase cycles = 100000, B = block count, assuming 4096-byte blocks).
Note we can also store copies of the superblock entry on the expanded
superblocks. This may help filesystem recovery tools in the future.
This is a downside caused by relying on an external repo for testing,
but also storing the CI configuration inside this repo. Fortunately we
can use a temporary v2-alpha branch in the FUSE repo mirroring the
v2-alpha branch for testing.
LFS_ERR_CORRUPT is unfortunately not a well defined error code. It's
very important in the context of littlefs, but missing from the standard
error codes defined in Linux.
After some discussions with other developers, I was encouraged to use
the encoding for EILSEQ over EBADE for representing on-disk corruption, as
EILSEQ implies that there is something wrong with the data.
I've changed this now to take advantage of the breaking changes in v2 to
avoid a risky change to a return value.
Because of limitations in how littlefs manages attributes on disk,
littlefs views zero-length attributes and missing attributes as the same
thing. The simplest implementation of attributes mirrors this behaviour
transparently for the user.