252 Commits

Author SHA1 Message Date
Christopher Haster
d7b0652936 Removed old move logic, now passing move tests
The introduction of xored-globals required quite a bit of work to
integrate. But now that that is working, we can strip out the old move
logic.

It's worth noting that the xored-globals integration with commits is
relatively complex and subtle.
2018-10-15 16:03:18 -05:00
Christopher Haster
2ff32d2dfb Fixed bug where globals were poisoning move commits
The issue lies in the reuse of the id field for globals. Before globals,
the only tags with a non-null (0x3ff) id field were names, structs, and
other file-specific metadata. But globals are also using this field for
the indirect delete, since otherwise the globals structure would be very
unaligned (74-bits long).

To make matters worse, the id field for globals contains the delta used
to reconstruct the globals at mount time. Which means the id field could
take on very absurd values and break the dir fetch logic if we're not
careful.

Solution is to use the scope portion of the type field where necessary,
although unfortunately this does add some code cost.
2018-10-15 15:56:04 -05:00
Christopher Haster
b46fcac585 Fixed issues with finding wrong ids after bad commits
Unfortunately, the behaviour needed of lfs_dir_fetchwith is as subtle as
it is important. When fetching from a block corrupted by power-loss,
lfs_dir_fetch must be able to rewind any state it picks up to before the
corruption. This is not limited to the directory state, but includes
find results and other side-effects.

This gets a bit complicated when trying to generalize littlefs's
fetchwith mechanics. Being able to scan a directory block during a fetch
greatly impacts the runtime of littlefs operations, but if the state is
generic how do we know what to rollback to?

The fix here is to leave the management of rolling back state to the
fetchwith match functions, and transparently pass a CRC tag to indicate
the temporary state can be saved.
2018-10-13 19:46:38 -05:00
Christopher Haster
cebf7aa0fe Switched back to simple deorphan-step on directory remove
Originally I tried to reuse the indirect delete to accomplish truly
atomic directory removes, however this fell apart when it came to
implementing directory removes as a side-effect of renames.

A single indirect-delete simply can't handle renames with removes as
a side effect. When copying an entry to its destination, we need to
atomically delete both the old entry, and the source of our copy. We
can't delete both with only a single indirect-delete. It is possible to
accomplish this with two indirect-deletes, but this is such an uncommon
case that it's really not worth supporting efficiently due to how
expensive globals are.

I also dropped indirect-deletes for normal directory removes. I may add
it back later, but at the moment it's extra code cost for a path that's
not traveled very often.

As a result, restructured the indirect delete handling to be a bit more
generic, now with a multipurpose lfs_globals_t struct instead of the
delete specific lfs_entry_t struct.

Also worked on integrating xored-globals, now with several primitive
global operations to manage fetching/updating globals on disk.
2018-10-13 19:35:45 -05:00
Christopher Haster
3ffcedb95b Restructured tags to better support xored-globals
32-bit tag structure:
[---        32       ---]
[1|- 9 -|- 10 -|-- 12 --]
 ^   ^     ^       ^- entry length
 |   |     \--------- file id
 |   \--------------- tag type
 \------------------- valid

In this tag, the type decomposes into some more information:
[---      9      ---]
[1|- 2 -|- 3 -|- 3 -]
 ^   ^     ^     ^- struct
 |   |     \------- type
 |   \------------- scope
 \----------------- user

The change in this encoding is the addition of a global scope:
LFS_SCOPE_STRUCT = 0 00 xxx xxx
LFS_SCOPE_ENTRY  = 0 01 xxx xxx
LFS_SCOPE_DIR    = 0 10 xxx xxx
LFS_SCOPE_FS     = 0 11 xxx xxx
LFS_SCOPE_USER   = 1 xx xxx xxx
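
As a rough illustration of the layout above, the fields can be packed and
unpacked with shifts and masks. This is only a sketch: the helper names are
hypothetical (not the littlefs API), and it assumes the diagrams are drawn
most-significant bit first.

```c
#include <stdint.h>

// Scope values taken from the LFS_SCOPE_* table above (the 2-bit scope field).
enum { SCOPE_STRUCT = 0, SCOPE_ENTRY = 1, SCOPE_DIR = 2, SCOPE_FS = 3 };

// Pack a tag as [valid:1 | type:9 | id:10 | length:12], assuming MSB first.
static inline uint32_t tag_pack(uint32_t valid, uint32_t type,
        uint32_t id, uint32_t length) {
    return ((valid  & 0x1u)   << 31)
         | ((type   & 0x1ffu) << 22)
         | ((id     & 0x3ffu) << 12)
         |  (length & 0xfffu);
}

static inline uint32_t tag_valid(uint32_t tag)  { return tag >> 31; }
static inline uint32_t tag_type(uint32_t tag)   { return (tag >> 22) & 0x1ffu; }
static inline uint32_t tag_id(uint32_t tag)     { return (tag >> 12) & 0x3ffu; }
static inline uint32_t tag_length(uint32_t tag) { return tag & 0xfffu; }

// Decompose the 9-bit type as [user:1 | scope:2 | type:3 | struct:3].
static inline uint32_t type_user(uint32_t type)    { return (type >> 8) & 0x1u; }
static inline uint32_t type_scope(uint32_t type)   { return (type >> 6) & 0x3u; }
static inline uint32_t type_subtype(uint32_t type) { return (type >> 3) & 0x7u; }
static inline uint32_t type_struct(uint32_t type)  { return type & 0x7u; }
```

With this layout a 10-bit id field has room for the 0x3ff "null" id
mentioned in the later globals fix.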
2018-10-13 19:12:35 -05:00
Christopher Haster
e39f7e99d1 Introduced xored-globals logic to fix fundamental problem with moves
This was a big roadblock for a while: with the new feature of inlined
files, the existing move logic was fundamentally flawed.

To pull off atomic moves between two different metadata-pairs, littlefs
uses a simple, if a bit clumsy, trick:
1. Marks entry as "moving"
2. Copies entry to new metadata-pair
3. Deletes old entry

If power is lost before the move operation is completed, we will find the
"moving" tag. This means there may or may not be an incomplete move on
the filesystem. In this case, we simply search for the moved entry, if
we find it, we remove the old entry, otherwise we just remove the
"moving" tag.

This worked perfectly, until we introduced inlined files. See, unlike
the existing directory and ctz entries, inlined files have no guarantee
they are unique. There is nothing we can search for that will allow us
to find a moved file unless we assign entries globally-unique ids. (note
that moves are fundamentally rename operations, so searching for names
does not make sense).

---

Solving this problem required completely restructuring how littlefs
handled moves and pulled out a really old idea that had been left on the
cutting room floor back when littlefs was going through many
designs: xored-globals.

The problem xored-globals solves is the need to maintain some global state
via commits to these distributed, independent metadata-pairs. The idea
is that we can use some sort of symmetric operation, such as xor, to
introduce deltas of the global state that can be committed atomically
along with any other info to these metadata-pairs.

This means that to figure out our global state, we xor together the global
delta stored in every metadata-pair.

Which means any commit can update the global state atomically, opening
up a whole new set of atomic possibilities.
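
A minimal sketch of the idea, assuming the global state fits in a single
32-bit word. The struct and function names here are hypothetical and only
model the xor bookkeeping, not the on-disk format.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical model: each metadata-pair stores one delta of the global state.
struct mpair {
    uint32_t gdelta;
};

// Reconstruct the global state (e.g. at mount) by xoring every pair's delta.
static uint32_t globals_fetch(const struct mpair *pairs, size_t count) {
    uint32_t g = 0;
    for (size_t i = 0; i < count; i++) {
        g ^= pairs[i].gdelta;
    }
    return g;
}

// Change the global state from `current` to `next` by touching a single pair.
// In a real filesystem this xor would ride along with that pair's commit.
static void globals_update(struct mpair *pair,
        uint32_t current, uint32_t next) {
    pair->gdelta ^= current ^ next;
}
```

Because xor is its own inverse, it doesn't matter which pair receives the
update; the xor over all pairs still lands on the new state.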

There are a couple of downsides. These globals may end up with deltas on
every single metadata-pair, effectively duplicating the data for each
block. Additionally, these globals need to have multiple copies in RAM.
This means globals need to be a bounded size and very small, since even
small globals will have a large footprint.

---

On top of xored-globals, it's trivial to fix our move logic. Here we've
added an indirect delete tag which allows us to atomically specify a
delete of any entry on the filesystem.

Our move operation is now:
1. Copy entry to new metadata-pair and atomically xor globals to
   indirectly delete our original entry.
2. Delete the original entry and xor globals to remove the indirect
   delete.
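
The two steps above can be modeled with a toy in-memory filesystem. Everything
in this sketch is hypothetical (the sizes, the encoding of the indirect delete,
the function names); it only aims to show why an interrupted move can be
finished at mount without searching.

```c
#include <stdbool.h>
#include <stdint.h>

#define NPAIRS 2   // hypothetical: two metadata-pairs
#define NIDS   4   // hypothetical: four entry slots per pair

struct movepair {
    bool present[NIDS];  // which entry ids exist in this pair
    uint32_t gdelta;     // this pair's xor delta of the global state
};

// Encode "entry `id` in pair `pair` is indirectly deleted".
// A global state of 0 means no delete is pending.
static uint32_t gdelete(uint32_t pair, uint32_t id) {
    return ((pair + 1) << 16) | id;
}

static uint32_t move_globals(const struct movepair *fs) {
    uint32_t g = 0;
    for (int i = 0; i < NPAIRS; i++) {
        g ^= fs[i].gdelta;
    }
    return g;
}

// Step 1: one commit to the destination pair creates the copy and
// sets the indirect delete of the source entry.
static void move_step1(struct movepair *fs,
        uint32_t sp, uint32_t sid, uint32_t dp, uint32_t did) {
    fs[dp].present[did] = true;
    fs[dp].gdelta ^= gdelete(sp, sid);
}

// Step 2: one commit to the source pair deletes the original and
// xors the indirect delete back out of the globals.
static void move_step2(struct movepair *fs, uint32_t sp, uint32_t sid) {
    fs[sp].present[sid] = false;
    fs[sp].gdelta ^= gdelete(sp, sid);
}

// At mount, a non-zero global means a move stopped after step 1. The
// globals name the stale entry directly, so no search is needed.
static void mount_fixup(struct movepair *fs) {
    uint32_t g = move_globals(fs);
    if (g != 0) {
        move_step2(fs, (g >> 16) - 1, g & 0xffff);
    }
}
```

Calling mount_fixup after a completed move is a no-op, since step 2 has
already xored the pending delete back to zero.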

Extra exciting is that this now takes our relatively clumsy move
operation into a sexy guaranteed O(1) move operation with no searching
necessary (though we do need to xor globals during mount).

Also reintroduced entry struct, now with a specific purpose to describe
the metadata-pair + id combo needed by indirect deletes to locate an
entry.
2018-10-13 18:35:33 -05:00
Christopher Haster
116c1e76de Adopted EISDIR as internal error for root path as argument
Unfortunately, it's hard to make directory lookups for root not a
special case, and we don't like special cases when we're trying to keep
code size small.

Since there are a handful of code paths where opening root should return
EISDIR (such as lfs_file_open("/")), we use EISDIR to note that the
argument is in fact a path to the root.

This is needed because we no longer look up an entry's contents in
lfs_dir_find for free, since entries are more ephemeral.
2018-10-13 18:25:08 -05:00
Christopher Haster
f458da4b7c Added the internal meta-directory structure
Similarly to the internal "meta-attributes", I was finding quite a bit
of need for an internal structure that mirrors the user-facing directory
structure for when I need to do an operation on a metadata-pair, but
don't need all of the state associated with a fully iterable directory
chain.

lfs_mdir_t - meta-directory, describes a single metadata-pair
lfs_dir_t  - directory, describes an iterable directory chain

While it may seem complex to have all these structures lying around,
they only complicate the code at compile time. To the machine, any
number of nested structures all looks the same.
2018-10-13 18:20:44 -05:00
Christopher Haster
eaa9220aad Renamed lfs_entry_t -> lfs_mattr_t
Attributes are used to describe more than just entries, so calling these
lists of attributes "entries" was inaccurate. However, the name
"attributes" would conflict with "user attributes", user-facing
attributes with a very similar purpose. "user attributes" must be kept
distinct due to differences in binary layout (internal attributes can
use a more compact tag+buffer representation, but expecting users to
jump through hoops to get their data to look like that isn't very
user-friendly).

Decided to go with "mattr" as shorthand for "meta-attributes", similar
to "metadata".
2018-10-13 18:14:38 -05:00
Christopher Haster
9278b17537 Trimmed old names and functions from the code base
I've found using temporary names to duplicate functions temporarily
(lfs_dir_commit + lfs_dir_commit_) is a great way to introduce sweeping
changes while keeping the code base functional and (mostly) passing
tests.

It does mean at some point I need to deduplicate all these functions.
2018-10-13 18:13:12 -05:00
Christopher Haster
85a9638d9f Fixed issues discovered around testing moves
lfs_dir_fetchwith did not recover from failed dir fetches correctly,
added a temporary dir variable to hold dir contents while being
populated, allowing us to fall back to a known good dir state if a
commit is corrupted.

There is a RAM cost, but the upside is that our lfs_dir_fetchwith
actually works.

Also added better handling of move ids during some get functions.
2018-10-13 18:08:28 -05:00
Christopher Haster
483d41c545 Passing all of the basic functionality tests
Integration with the new journaling metadata has now progressed to the
point where all of the basic functionality tests are passing. This
includes:
- test_format
- test_dirs
- test_files
- test_seek
- test_truncate
- test_interspersed
- test_paths

Some of the fixes:
- Modified move to correctly change entry ids
- Called lfs_commit_move directly from compact, avoiding commit parsing
  logic during a compact
- Opened up commit filters to be passed down from compact for moves
- Added correct drop logic to lfs_dir_delete
- Updated lfs_dir_seek to use ids instead of offsets
- Caught id updates manually where possible (this needs to be fixed)
2018-10-13 17:58:56 -05:00
Christopher Haster
11a3c8d062 Continued progress with reintroducing testing on the new metadata logging
Now with some tweaks to commit/compact, and committers for entrylists and
moves specifically. No longer relying on a commitwith callback, the
types of commits are now inferred from their tags.

This means we can now commit things atomically with special commits,
such as moves. Now lfs_rename can move entries to new names correctly.
2018-10-13 17:47:01 -05:00
Christopher Haster
0bdaeb7f8b More testing progress, combined dir/commit traversal
Passing more tests now with the journalling change, but still have more
work to do.

The most humorous bug was a bug where during the three step move
process, the entry move logic would dumbly copy over any tags associated
with the moving entry, including the tag used to temporarily mark the
entry as "moving".

Also combined dir and commit traversal using a "stop_at_commit" flag in
directory struct as a short-term hack to combine the code paths.
2018-10-13 17:44:37 -05:00
Christopher Haster
0405ceb171 Cleaned up enough things to pass basic file testing 2018-10-13 13:41:05 -05:00
Christopher Haster
a3c67d9697 Reorganized the internal operations to make more sense
Also refactored lfs_dir_compact a bit, adding begin and end as arguments
since they simplify a bit of the logic and can be found out much easier
earlier in the commit logic.

Also changed add -> append and drop -> delete and cleaned up some of the
logic around there.
2018-10-13 13:38:04 -05:00
Christopher Haster
0695862b38 Completed transition of files with journalling metadata
This was the simpler part of transitioning since file operations only
interact with metadata at sync time.

Also switched from array to linked-list of entries.
2018-10-13 13:33:29 -05:00
Christopher Haster
fe553e8af4 More progress integrating journaling
- Integrated into lfs_file_t_, duplicating functions where necessary
- Added lfs_dir_fetchwith_ as common parent to both lfs_dir_fetch_ and
  lfs_dir_find_
- Added similar parent with lfs_dir_commitwith_
- Made matching find/get operations with getbuffer/getentry and
  findbuffer/findentry
- lfs_dir_alloc now populates tail, since almost all directory block
  allocations need to populate tail
2018-10-13 13:31:47 -05:00
Christopher Haster
87f3e01a17 Progressed integration of journaling metadata pairs
- Integrated journaling into lfs_dir_t_ struct and operations,
  duplicating functions where necessary
- Added internal lfs_tag_t and lfs_stag_t
- Consolidated lfs_region and lfs_entry structures
2018-10-13 13:31:42 -05:00
Christopher Haster
8070abec34 Added rudimentary framework for journaling metadata pairs
This is a big change stemming from the fact that resizable entries
were surprisingly complicated to implement and came in with a sizable
code cost.

The theory is that the journalling has a comparable cost to resizable
entries. Both need to handle overflowing blocks, and managing offsets is
comparable to managing attribute IDs. But by jumping all the way to full
journaling, we can statically wear-level the metadata written to
metadata pairs.

The idea of journaling littlefs's metadata has been mentioned several times in
discussions and fits well into how littlefs works. You could even view the
existing metadata log as a log of size 2.

The downside of this approach is that changing the metadata in this way
would break compatibility with the existing layout on disk, something
that resizable entries do not do.

That being said, adopting journaling at the metadata layer offers a big
improvement to littlefs's performance and wear-leveling, with very
little cost (maybe even none or negative after resizable entries?).
2018-10-13 13:22:53 -05:00
Christopher Haster
61f454b008 Added tests for resizable entries and custom attributes
Also found some bugs. Should now have a good amount of confidence in
these features.
2018-10-09 23:02:57 -05:00
Christopher Haster
ea4ded420c Fixed big-endian support again
This is what I get for not running CI on a local development branch.
2018-10-09 23:02:57 -05:00
Christopher Haster
746b90965c Added lfs_fs_size for finding a count of used blocks
This has existed for some time in the form of the lfs_traverse
function, through which a user could provide a simple callback that
would just count the number of blocks lfs_traverse finds. However,
this approach is relatively unconventional and has proven to be confusing
for most users.
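
For reference, the counting-callback pattern described above looks roughly
like this. It is a sketch with a stand-in traversal over an array of block
addresses; demo_traverse, count_cb, and demo_fs_size are hypothetical names,
and the callback shape is an assumption rather than the littlefs signature.

```c
#include <stddef.h>
#include <stdint.h>

// Callback shape assumed for illustration: called once per block in use.
typedef int (*block_cb)(void *data, uint32_t block);

// Stand-in for a traversal over used blocks (not the littlefs implementation).
static int demo_traverse(const uint32_t *blocks, size_t count,
        block_cb cb, void *data) {
    for (size_t i = 0; i < count; i++) {
        int err = cb(data, blocks[i]);
        if (err) {
            return err;
        }
    }
    return 0;
}

// The "simple callback": ignore the block address and bump a counter.
static int count_cb(void *data, uint32_t block) {
    (void)block;
    *(uint32_t *)data += 1;
    return 0;
}

// What a size helper boils down to: traverse with the counting callback.
static int32_t demo_fs_size(const uint32_t *blocks, size_t count) {
    uint32_t size = 0;
    int err = demo_traverse(blocks, count, count_cb, &size);
    return err ? err : (int32_t)size;
}
```

Wrapping the pattern in one function spares users from writing the callback
and the void pointer plumbing themselves.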
2018-10-09 23:02:57 -05:00
Christopher Haster
93244a3734 Added file-level and fs-level custom attribute APIs
In the form of lfs_file_setattr, lfs_file_getattr, lfs_fs_setattr,
lfs_fs_getattr.

This enables atomic updates of custom attributes as described in
6c754c8, and provides a custom attribute API that allows custom attributes
to be stored on the filesystem itself.
2018-10-09 23:02:50 -05:00
Christopher Haster
636c0ed3d1 Modified commit regions to work better with custom attributes
Mostly just removed LFS_FROM_DROP and changed the DSL grammar a bit to
allow drops to occur naturally through oldsize -> newsize diff expressed
in the region struct. This prevents us from having to add a drop every
time we want to update an entry in-place.
2018-10-09 23:02:09 -05:00
Christopher Haster
6c754c8023 Added support for atomically committing custom attributes
Although it's simple and probably what most users expect, the previous
custom attributes API suffered from one problem: the inability to update
attributes atomically.

If we consider our timestamp use case, updating a file would require:
1. Update the file
2. Update the timestamp

If a power loss occurs during this sequence of updates, we could end up
with a file with an incorrect timestamp.

Is this a big deal? Probably not, but it could be a surprise only found
after a power-loss. And littlefs was developed _specifically_
to avoid surprises during power-loss.

The littlefs is perfectly capable of bundling multiple attribute updates
in a single directory commit. That's kind of what it was designed to do.
So all we need is a new committer opcode for list of attributes, and
then poking that list of attributes through the API.

We could provide the single-attribute functions, but don't, because
fewer functions make for a smaller codebase, and these are already the
more advanced functions so we can expect more from users. This also
changes semantics about what happens when we don't find an attribute,
since erroring would throw away all of the other attributes we're
processing.

To atomically commit both custom attributes and file updates, we need a
new API, lfs_file_setattr. Unfortunately the semantics are a bit more
confusing than lfs_setattr, since the attributes aren't written out
immediately.
2018-10-09 23:02:09 -05:00
Christopher Haster
6ffc8d3480 Added simple custom attributes
A much requested feature (mostly because of littlefs's notable lack of
timestamps), this commit adds support for user-specified custom
attributes.

Planned (though underestimated) since v1, custom attributes provide a
route for OSs and applications to provide their own metadata in
littlefs, without limiting portability.

However, unlike custom attributes that can be found on much more
powerful PC filesystems, these custom attributes are very limited,
intended for only a handful of bytes for very important metadata. Each
attribute has only a single byte to identify the attribute, and the
size of all attributes attached to a file is limited to 64 bytes.

Custom attributes can be accessed through the lfs_getattr, lfs_setattr,
and lfs_removeattr functions.
2018-10-09 23:02:09 -05:00
Christopher Haster
65ea6b3d0f Bumped versions, cleaned up some TODOs and missing comments 2018-10-09 23:02:09 -05:00
Christopher Haster
6774276124 Expanded inline files up to a limit of 1023 bytes
One of the big benefits of inline files is that small files no longer need to
take up a full block. This opens up an opportunity to provide much better
support for storage devices with only a handful of very large blocks, such as
the internal flash found on most microcontrollers.

After investigating some use cases for a filesystem on internal flash,
it has become apparent that the 255-byte limit is going to be too
restrictive to be useful in many cases. Most uses I found needed files
~4-64 bytes in size, but it wasn't uncommon to find files ~512 bytes in
length.

To try to remedy this, I've pushed the 255 byte limit up to 1023 bytes,
by stealing some bits from the previously-unused attributes' size.
Unfortunately this limits attributes to 63 bytes in total and has a
minor code cost, but I'm not sure even 1023 bytes will be sufficient for
a lot of cases.

The littlefs will probably never be as efficient with internal flash as
other filesystems such as SPIFFS; it just wasn't designed for this sort of
limited geometry. However, this feature has been heavily requested, even
with limitations, because of the opportunity for code reuse on
microcontrollers with both internal and external flash.
2018-10-09 23:02:09 -05:00
Christopher Haster
6362afa8d0 Added disk-backed limits on the name/attrs/inline sizes
Being a portable, microcontroller-scale embedded filesystem, littlefs is
presented with a relatively unique challenge. The amount of RAM
available is on completely different scales from machine to machine, and
what is normally a reasonable RAM assumption may break completely on an
embedded system.

A great example of this is file names. On almost every PC these days, the limit
for a file name is 255 bytes. It's a very convenient limit for a number
of reasons. However, on microcontrollers, allocating 255 bytes of RAM to
do a file search can be unreasonable.

The simplest solution (and one that has existed in littlefs for a
while), is to let this limit be redefined to a smaller value on devices
that need to save RAM. However, this presents an interesting portability
issue. If these devices are plugged into a PC with relatively infinite
RAM, nothing stops the PC from writing files with full 255-byte file
names, which can't be read on the small device.

One solution here is to store this limit on the superblock during format
time. When mounting a disk, the filesystem implementation is responsible for
checking this limit in the superblock. If it's larger than what can be
read, raise an error. If it's smaller, respect the limit on the
superblock and raise an error if the user attempts to exceed it.
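
The mount-time rule above can be sketched as two small checks. The function
names and error codes are hypothetical; build_limit stands for whatever
maximum the running implementation was compiled with.

```c
#include <stdint.h>

enum { LIMIT_OK = 0, LIMIT_ERR = -1 };  // hypothetical error codes

// At mount: compare the limit recorded on disk with what this build can
// handle. On success, *effective is the limit to enforce for this mount.
static int limit_mount(uint32_t disk_limit, uint32_t build_limit,
        uint32_t *effective) {
    if (disk_limit > build_limit) {
        // the disk may hold names/files larger than we can read
        return LIMIT_ERR;
    }
    // smaller or equal: respect the limit stored in the superblock
    *effective = disk_limit;
    return LIMIT_OK;
}

// At use: reject anything that would exceed the mounted limit.
static int limit_check(uint32_t size, uint32_t effective) {
    return (size > effective) ? LIMIT_ERR : LIMIT_OK;
}
```

For example, a disk formatted with a 255-byte name limit would be rejected
by a build that only supports 32-byte names, while the reverse mounts fine
and keeps enforcing 32 bytes.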

In this commit, this strategy is adopted for file names, inline files,
and the size of all attributes, since these could impact the memory
consumption of the filesystem. (Recording the attribute's limit is
iffy, but is the only other arbitrary limit and could be used for disabling
support of custom attributes).

Note! This change makes it very important to configure littlefs
correctly at format time. If littlefs is formatted on a PC without
changing the limits appropriately, it will be rejected by a smaller
device.
2018-10-09 23:02:09 -05:00
Christopher Haster
955545839b Added internal lfs_dir_set, an umbrella to dir append/update/remove operations
This move was surprisingly complex, but offers the ultimate opportunity for
code reuse in terms of resizable entries. Instead of needing to provide
separate functions for adding and removing entries, adding and removing
entries can just be viewed as changing an entry's size to-and-from zero.

Unfortunately, it's not _quite_ that simple, since append and remove
hide some relatively complex operations for when directory blocks
overflow or need to be cleaned up.

However, with enough shoehorning, and a new committer type that allows
specifying recursive commit lists (is this now a push-down automaton?),
it does seem to be possible to shove all of the entry update logic into
a single function.

Sidenote, I switched back to an enum-based DSL, since the addition of a
recursive region opcode breaks the consistency of what needs to be
passed to the DSL callback functions. It's much simpler to handle each
opcode explicitly inside a recursive lfs_commit_region function.
2018-10-09 23:02:09 -05:00
Christopher Haster
ad74825bcf Added internal lfs_dir_get to consolidate logic for reading dir entries
It's a relatively simple function but offers some code reuse as well as
making the dir entry operations a bit more readable.
2018-10-09 23:02:09 -05:00
Christopher Haster
d0e0453651 Changed how we write out superblock to use append
Making the superblock look like "just another entry" allows us to treat
the superblock like "just another entry" and reuse a decent amount of
logic that would otherwise only be used at format and mount time. In this
case we can use append to write out the superblock like it was creating
a new entry on the filesystem.
2018-10-09 23:02:09 -05:00
Christopher Haster
701e4fa438 Fixed a handful of bugs as result of testing 2018-10-09 23:02:09 -05:00
Christopher Haster
d8cadecba6 Better implementation of inline files, now with overflowing
Now when a file overflows the max inline file size, it will be correctly
written out to a proper block. Additionally, tweaked corner cases around
inline files; however, this still needs significant testing.

A real neat part that surprised me is that littlefs _already_ contains
the logic for writing out inline files: in lfs_file_relocate! With a bit
of tweaking, littlefs can pull off both the overflow from inline to
normal files _and_ the relocating of bad blocks in files with the same
piece of logic.
2018-10-09 23:02:09 -05:00
Christopher Haster
836e23895a Shoehorned in hacky implementation of inline files
Proof-of-concept implementation of inline files that stores the file's
content directly in its parent's directory pair.

Inline files are indicated by a different type stored in an entry's
struct field, and take advantage of resizable entries. Where a normal
file's entry would normally hold the reference to the CTZ skip-list, an
inline file's entry contains the contents of the actual file.

Unfortunately, storing the inline file on disk is the easy part. We also
need to manage inline files in the internals of littlefs and provide the
same operations that we do on normal files, all while reusing as much
code as possible to avoid a significant increase in code cost.

There is a relatively simple, though maybe a bit hacky, solution here. If a
file fits entirely in a cache line, the file logic never actually has to go to
disk. This means we can just give the file a "pretend" block (hopefully
one that would assert if ever written to), and carry out file operations
as normal, as long as we catch the file before it exceeds the cache line
and write out the file to an actual disk.
2018-10-09 23:02:09 -05:00
Christopher Haster
fb23044872 Fixed big-endian support for entry structures 2018-10-09 23:02:09 -05:00
Christopher Haster
9273ac708b Added size field to entry structure
The size field is redundant, since an entry's size can be determined
from the nlen+elen+alen+4. However, as you may have guessed from that
expression, calculating the size this way is a bit roundabout and
inefficient. Despite its redundancy, it's cheaper to store the size in the
entry, though with a minor RAM cost.

Note, extra care must now be taken to make sure these size and len fields
don't fall out of sync.
2018-10-09 23:02:09 -05:00
Christopher Haster
03b262b1e8 Separated out version of dir remove/append for non-entries
This allows updates to directories without needing to allocate an entry
struct for every call.
2018-10-09 23:02:09 -05:00
Christopher Haster
362b0bbe45 Minor improvement to from-memory commits
Tweaked the commit callback to pass the arguments for from-memory
commits explicitly, with non-from-memory commits still being able to
hijack the opaque data pointer for additional state.

The from-memory commits make up the vast majority of commits in
littlefs, so this small change has a noticeable impact.
2018-10-09 23:02:09 -05:00
Christopher Haster
e4a0cd942d Take advantage of empty space early in dir search
Before, when appending new entries to a directory, we tried to find empty space
in the last block of a directory chain. This has a nice side-effect that
the order of directory entries is maintained. However, this isn't strictly
necessary.

We're already scanning the directory chain in order, so other than changes to
directory order, there's no downside to taking advantage of any free
space we come across.
2018-10-09 23:02:09 -05:00
Christopher Haster
f30ab677a4 Traded enum-based DSL for full callback-based DSL
Now, instead of passing an enum for mem/disk commits, we pass a function
pointer that can specify any behaviour.

This has the benefit of opening up the possibility to pass any sort of
commit logic to the committers, and unused logic can be garbage-collected
by the compiler. The downside is that unfortunately compilers have
a harder time optimizing around function pointers than enums, and
fitting the state into structs for the callbacks may be costly.
2018-10-09 23:02:09 -05:00
Christopher Haster
ca3d6a52d2 Made implicit tag updates explicit
Before, tags were implicitly updated by the dir update functions, which
have a strong understanding of the entry struct. However, most of the
time the tag was already a part of the entry struct being committed.

By making tag updates explicit, this does add cost to commits that
now have to pass tag updates explicitly, but it reduces cost where that
tag and entry update can be combined into one commit region.

It also simplifies the dir update functions.
2018-10-09 23:02:09 -05:00
Christopher Haster
692f0c542e Naive implementation of resizable entries
Now, with the off, diff, and len parameters in each commit entry, we can build
up directory commits that resize entries. This adds complexity but opens
up the directory blocks to be much more flexible.

The main concern is that resizing entries can push around neighboring entries
in surprising ways, such as pushing them into new directory blocks when a
directory splits. This can break littlefs's internal logic in how it tracks
in-flight entries. The most problematic example being open files.

Fortunately, this is helped by a global linked-list of all files and
directories opened by the filesystem. As entries change size, the state
of open files/dirs may be updated as needed. Note this already needed to
exist for the ability to remove files/dirs, which has the same issue.
2018-10-09 23:02:09 -05:00
Christopher Haster
e3daee2621 Changed dir append to mirror commit DSL
Experimental implementation. This opens up the opportunity to use the same
commit description for both commits and appends, which effectively do the same
thing.

This should lead to better code reuse.
2018-10-09 23:02:09 -05:00
Christopher Haster
73d29f05b2 Adopted a tiny LISP-like DSL for some extra flexibility
Really all this means is that the internal commit function was changed
from taking an array of "commit structures" to a linked-list of "commit
structures". The benefit of a linked-list is that layers of commit
functions can pull off some minor modifications to the description of
the commit. Most notably, commit functions can add additional entries
that will be atomically written out and CRCed along with the initial
commit.

Also a minor benefit, this is one less parameter when committing a
directory with zero entries.
2018-10-09 23:02:09 -05:00
Christopher Haster
4c35c8655a Added different sources for commits, now with disk->disk moves
Previously, commits could only come from memory in RAM. This meant any
entries had to be buffered in their entirety before they could be moved
to a different directory pair. By adding parameters for specifying
commits from existing entries stored on disk, we allow any sized entries
to be moved between directory pairs with a fixed RAM cost.
2018-10-09 23:02:09 -05:00
Christopher Haster
49698e431f Separated type/struct fields in dir entries
The separation of data-structure vs entry type has been implicit for a
while now, and even taken advantage of to simplify the traverse logic.

Explicitly separating the data-struct and entry types allows us to
introduce new data structures (inlined files).
2018-10-09 23:02:01 -05:00
Vincent Dupont
28d2d96a83 Fix -Wsign-compare error 2018-09-29 11:33:19 -05:00
Christopher Haster
646b1b5a6c Added -Wjump-misses-init and fixed uninitialized warnings 2018-09-26 18:58:54 -05:00
Christopher Haster
e5a6938faf Fixed possible infinite loop in deorphan step
Normally, the linked-list of directory pairs should terminate at a null
pointer. However, it is possible, if the filesystem is corrupted, that
this linked-list forms a cycle.

This should never happen with littlefs's power resilience, but if it does
we should recover appropriately.

Modified lfs_deorphan to notice if we have a cycle and return
LFS_ERR_CORRUPT in that situation.
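
One common way to notice such a cycle is Floyd's tortoise-and-hare walk,
sketched below on an array that stands in for the chain of directory pairs.
This is not necessarily the approach taken in lfs_deorphan; DIR_NULL,
WALK_CORRUPT, and walk_dirs are hypothetical names, and the sketch assumes
every non-null index is in bounds.

```c
#include <stddef.h>
#include <stdint.h>

#define DIR_NULL 0xffffffffu              // hypothetical "null pointer" marker
enum { WALK_OK = 0, WALK_CORRUPT = -1 };  // hypothetical result codes

// tail[i] is the pair that follows pair i, or DIR_NULL at the end of the list.
// Returns WALK_CORRUPT if the chain starting at `head` loops back on itself.
static int walk_dirs(const uint32_t *tail, uint32_t head) {
    uint32_t slow = head;  // advances one pair per iteration
    uint32_t fast = head;  // advances two pairs per iteration
    while (fast != DIR_NULL && tail[fast] != DIR_NULL) {
        slow = tail[slow];
        fast = tail[tail[fast]];
        if (slow == fast) {
            // the fast walker caught up from behind: the list is a cycle
            return WALK_CORRUPT;
        }
    }
    return WALK_OK;
}
```

The walk needs only two cursors of state, which suits a filesystem that
tries to keep RAM use fixed.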

Found by kneko715
2018-09-26 18:58:11 -05:00
Christopher Haster
3419284689 Fixed issue with corruption due to different cache sizes
The lfs_cache_zero function that was recently added assumed a single cache
size, which is incorrect. This would cause a buffer overflow if
read_size != prog_size.

Since lfs_cache_zero is only used for scrubbing prog caches, the fix
here is to use lfs_cache_drop instead on read caches. Info in read
caches should never make its way to disk.

Found by nstcl
2018-09-04 13:57:22 -05:00
Freddie Chopin
0422c55b81 Fix memory leaks in lfs_mount and lfs_format
Squashed:
- Change lfs_deinit() return to void to simplify error handling
- Move lfs_deinit() before lfs_init()
- Fix memory leaks in lfs_init()
- Fix memory leaks in lfs_format()
- Fix memory leaks in lfs_mount()
2018-07-19 16:54:38 -05:00
Christopher Haster
11ad3a2414
Merge pull request #76 from ARMmbed/fix-corrupt-read
Add handling for corrupt as initial state of blocks
2018-07-17 20:32:33 -05:00
Damien George
961fab70c3 Added file config structure and lfs_file_opencfg
The optional config structure opens up the possibility of adding
file-level configuration in a backwards compatible manner.

Also adds possibility to open multiple files with LFS_NO_MALLOC
enabled thanks to dpgeorge

Also bumped minor version to v1.5
2018-07-17 18:32:18 -05:00
Christopher Haster
041e90a1ca Added handling for corrupt as initial state of blocks
Before this, littlefs incorrectly assumed corrupt blocks were only the result
of our own modification. This would be fine for most cases of freshly
erased storage, but for storage with block-level ECC this wasn't always
true.

Fortunately, it's quite easy for littlefs to handle this case correctly,
as long as corrupt storage always reports that it is corrupt, which for
most forms of ECC is the case unless we perform a write on the storage.

found by rojer
2018-07-16 15:33:52 -05:00
Freddie Chopin
7e67f9324e Use PRIu32 and PRIx32 format specifiers to fix warnings
When using "%d" or "%x" with uint32_t types, arm-none-eabi-gcc reports
warnings like below:

-- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --

In file included from lfs.c:8:
lfs_util.h:45:12: warning: format '%d' expects argument of type 'int', but argument 4 has type 'lfs_block_t' {aka 'long unsigned int'} [-Wformat=]
     printf("lfs debug:%d: " fmt "\n", __LINE__, __VA_ARGS__)
            ^~~~~~~~~~~~~~~~
lfs.c:2512:21: note: in expansion of macro 'LFS_DEBUG'
                     LFS_DEBUG("Found partial move %d %d",
                     ^~~~~~~~~
lfs.c:2512:55: note: format string is defined here
                     LFS_DEBUG("Found partial move %d %d",
                                                      ~^
                                                      %ld

-- >8 -- >8 -- >8 -- >8 -- >8 -- >8 --

Fix this by replacing "%d" and "%x" with `"%" PRIu32` and `"%" PRIx32`.
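
As a small illustration of the replacement, the sketch below formats two
uint32_t values with PRIu32 from <inttypes.h>. The helper name
demo_format_move is hypothetical; the message text is the one from the
warning above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Format two 32-bit block numbers portably. PRIu32 expands to whichever
 * conversion matches uint32_t on the target, so the same format string
 * is correct whether uint32_t is unsigned int or unsigned long. */
static int demo_format_move(char *buf, size_t len, uint32_t a, uint32_t b) {
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "Found partial move %" PRIu32 " %" PRIu32,
                    a, b);
}
```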
2018-07-11 12:32:21 +02:00
Christopher Haster
eed1eec5fd Fixed information leaks through reused caches
As a shortcut, littlefs never bothered to zero any of the buffers it used.
It didn't need to because it would always write out the entirety of the
data it needed.

Unfortunately, this, combined with the extra padding used to align
buffers to the nearest prog size, would lead to uninitialized data
getting written out to disk.

This means unrelated file data could be written to different parts of
storage, or worse, information leaked from the malloc calls could be
written out to disk unnecessarily.

found by rojer
2018-07-10 11:18:46 -05:00
Damien George
51346b8bf4 Fixed shadowed variable warnings
- Fixed shadowed variable warnings in lfs_dir_find.
- Fixed unused parameter warnings when LFS_NO_MALLOC is enabled.
- Added extra warning flags to CFLAGS.
- Updated tests so they don't shadow the "size" variable for -Wshadow
2018-07-02 10:29:19 -05:00
Christopher Haster
6beff502e9 Changed license to BSD-3-Clause
For better compatibility with GPL v2

With permissions from:
- aldot
- Sim4n6
- jrast
2018-06-21 11:41:43 -05:00
Christopher Haster
c5e2b335d6 Added error when opening multiple files with a statically allocated buffer
Opening multiple files simultaneously is not supported without dynamic memory,
but the previous behaviour would just let the files overwrite each other, which
could lead to bad errors down the line

found by husigeza
2018-04-30 03:37:10 -05:00
Christopher Haster
015b86bc51 Fixed issue with trailing dots in file paths
Paths such as the following were causing issues:
/tea/hottea/.
/tea/hottea/..

Unfortunately the existing structure for path lookup didn't make it very
easy to introduce proper handling in this case without duplicating the
entire skip logic for paths. So the lfs_dir_find function had to be
restructured a bit.

One odd side-effect of this is that now lfs_dir_find includes the
initial fetch operation. This kinda breaks the fetch -> op pattern of
the dir functions, but does come with a nice code size reduction.
2018-04-22 07:26:31 -05:00
Christopher Haster
9637b96069 Fixed lookahead overflow and removed unbounded lookahead pointers
As pointed out by davidefer, the lookahead pointer modular arithmetic
does not work around integer overflow when the pointer size is not a
multiple of the block count.

To avoid overflow problems, the easy solution is to stop trying to
work around integer overflows and keep the lookahead offset inside the
block device. To make this work, the ack was modified into a resettable
counter that is decremented every block allocation.

As a plus, quite a bit of the allocation logic ended up simplified.
2018-04-11 14:38:25 -05:00
Christopher Haster
89a7630d84 Fixed issue with lookahead trusting old lookahead blocks
One of the big simplifications in littlefs's implementation is the
complete lack of tracking free blocks, allowing operations to simply
drop blocks that are no longer in use.

However, this means the lookahead buffer can easily contain outdated
blocks that were previously deleted. This is usually fine, as littlefs
will rescan the storage if it can't find a free block in the lookahead
buffer, but after changes that caused littlefs to more conservatively
respect the alloc acks (e611cf5), any scanned blocks after an ack would
be incorrectly trusted.

The fix is to eagerly scan ahead in the lookahead when we allocate so
that alloc acks are better able to discredit old lookahead blocks. Since
usually alloc acks are tightly coupled to allocations of one or two blocks,
this allows littlefs to properly rescan every set of allocations.

This may still be a concern if there is a long series of worn out
blocks, but in the worst case littlefs will conservatively avoid using
blocks it's not sure about.

Found by davidefer
2018-04-09 14:37:35 -05:00
Christopher Haster
d9c076d909 Removed the uninitialized read for invalid superblocks 2018-03-19 00:39:40 -05:00
Christopher Haster
9ee112a7cb Fixed issue updating dir struct when extending dir chain
Like most of the lfs_dir_t functions, lfs_dir_append is responsible for
updating the lfs_dir_t struct if the underlying directory block is
moved. This property makes handling worn out blocks much easier by
reducing the amount of state that needs to be considered during a
directory update.

However, extending the dir chain is a bit of a corner case. It's not
changing the old block, but callers of lfs_dir_append do assume the
"entry" will reside in "dir" after lfs_dir_append completes.

This issue only occurs when creating files, since mkdir does not use
the entry after lfs_dir_append. Unfortunately, the tests against
extending the directory chain were all made using mkdir.

Found by schouleu
2018-02-28 23:14:41 -06:00
Christopher Haster
d9c36371e7 Fixed handling of root as target for create operations
Before this patch, when calling lfs_mkdir or lfs_file_open with root
as the target, littlefs wouldn't find the path properly and happily
run into undefined behaviour.

The fix is to populate a directory entry for root in the lfs_dir_find
function. As an added plus, this allowed several special cases around
root to be completely dropped.
2018-02-28 23:13:02 -06:00
Christopher Haster
a3fd2d4d6d Added more configurable utils
Note: It's still expected to modify lfs_util.h when porting littlefs
to a new target/system. There's just too much room for system-specific
improvements, such as taking advantage of CRC hardware.

Rather, encouraging modification of lfs_util.h and making it easy to
modify and debug should result in better integration with the consuming
systems.

This just adds a bunch of quality-of-life improvements that should help
development and integration in littlefs.

- Macros that require no side-effects are all-caps
- System includes are only brought in when needed
- Malloc/free wrappers
- LFS_NO_* checks for quickly disabling things at the command line
- At least a little-bit more docs
2018-02-19 01:40:23 -06:00
Christopher Haster
a0a55fb9e5 Added conversion to/from little-endian on disk
Required to support big-endian processors, with the most notable being
the PowerPC architecture.

On little-endian architectures, these conversions can be optimized out
and have no code impact.

Initial patch provided by gmouchard
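
A minimal sketch of a host-independent conversion, assuming hypothetical
helper names demo_tole32 and demo_fromle32 (the names and the byte-wise
approach are illustrative and may differ from the patch). Writing the
bytes out explicitly gives the same on-disk layout on any host; whether
a given compiler reduces it to a plain load or store on little-endian
targets is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as little-endian bytes, whatever the host order. */
static void demo_tole32(uint32_t v, uint8_t out[4]) {
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 0);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Rebuild a 32-bit value from little-endian bytes. */
static uint32_t demo_fromle32(const uint8_t in[4]) {
    assert(in != NULL);
    return ((uint32_t)in[0] << 0)
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```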
2018-02-19 01:39:08 -06:00
Christopher Haster
e611cf5050 Fix incorrect lookahead population before ack
Rather than tracking all in-flight blocks during a lookahead,
littlefs uses an ack scheme to mark the first allocated block that
hasn't reached the disk yet. littlefs assumes all blocks since the
last ack are bad or in-flight, and uses this to know when it's out
of storage.

However, these unacked allocations were still being populated in the
lookahead buffer. If the whole block device fits in the lookahead
buffer, _and_ littlefs managed to scan around the whole storage while
an unacked block was still in-flight, it would assume the block was
free and misallocate it.

The fix is to only fill the lookahead buffer up to the last ack.
The internal free structure was restructured to simplify the runtime
calculation of lookahead size.
2018-02-08 01:52:39 -06:00
Christopher Haster
a25743a82a Fixed some minor error code differences
- Write on read-only file to return LFS_ERR_BADF
- Renaming directory onto file to return LFS_ERR_NOTEMPTY
- Changed LFS_ERR_INVAL in lfs_file_seek to assert
2018-02-04 14:36:36 -06:00
Christopher Haster
6716b5580a Fixed error check when truncating files to larger size 2018-02-04 14:09:55 -06:00
Christopher Haster
dc513b172f Silenced more of aldot's warnings
Flags used:
-Wall -Wextra -Wshadow -Wwrite-strings -Wundef -Wstrict-prototypes
-Wunused -Wunused-parameter -Wunused-function -Wunused-value
-Wmissing-prototypes -Wmissing-declarations -Wold-style-definition
2018-02-04 13:15:30 -06:00
Bernhard Reutner-Fischer
aa50e03684 Commentary typo fix 2018-02-04 13:15:26 -06:00
Bernhard Reutner-Fischer
029361ea16 Silence shadow warnings 2018-02-04 13:15:09 -06:00
Christopher Haster
035552a858 Add version info for software library and on-disk structures
An annoying part of filesystems is that the software library can change
independently of the on-disk structures. For this reason versioning is
very important, and must be handled separately for the software and
on-disk parts.

In this patch, littlefs provides two version numbers at compile time,
with major and minor parts, in the form of 6 macros.

LFS_VERSION        // Library version, uint32_t encoded
LFS_VERSION_MAJOR  // Major - Backwards incompatible changes
LFS_VERSION_MINOR  // Minor - Feature additions

LFS_DISK_VERSION        // On-disk version, uint32_t encoded
LFS_DISK_VERSION_MAJOR  // Major - Backwards incompatible changes
LFS_DISK_VERSION_MINOR  // Minor - Feature additions

Note that littlefs will error if it finds a major version number that
is different, or a minor version number that has regressed.
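
A sketch of how such a check could look. Two things here are
assumptions: the encoding (major in the upper 16 bits, minor in the
lower 16 bits) and the reading of "regressed" as the library's minor
being lower than the minor recorded on disk. The DEMO_ names are
hypothetical and stand in for the macros listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: major in the upper 16 bits, minor in the lower 16. */
#define DEMO_VERSION(major, minor) \
    ((((uint32_t)(major)) << 16) | (((uint32_t)(minor)) & 0xffff))
#define DEMO_VERSION_MAJOR(v) (0xffff & ((uint32_t)(v) >> 16))
#define DEMO_VERSION_MINOR(v) (0xffff & ((uint32_t)(v) >> 0))

/* Accept the disk only if the major versions match and the disk does not
 * use a newer minor version than the library understands. */
static bool demo_disk_version_ok(uint32_t disk, uint32_t lib) {
    return DEMO_VERSION_MAJOR(disk) == DEMO_VERSION_MAJOR(lib)
        && DEMO_VERSION_MINOR(disk) <= DEMO_VERSION_MINOR(lib);
}
```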
2018-01-26 14:26:25 -06:00
Christopher Haster
d88f0ac02f Added lfs_file_truncate
As a copy-on-write filesystem, the truncate function is a very nice
function to have, as it can take advantage of reusing the data already
written out to disk.
2018-01-20 19:22:44 -06:00
Christopher Haster
1fb6a19520 Reduced ctz traverse runtime by 2x
Unfortunately for us, the ctz skip-list does not offer very much benefit
for full traversals. Since the information about which blocks are in
use are spread throughout the file, we can't use the fast-lanes
embedded in the skip-list without missing blocks.

However, it turns out we can at least use the 2nd level of the skip-list
without missing any blocks. From an asymptotic analysis, a constant speed
up isn't interesting, but from a pragmatic perspective, a 2x speedup is
not bad.
2018-01-12 12:07:45 -06:00
Christopher Haster
db8872781a Added error code LFS_ERR_NOTEMPTY
As noted by itayzafrir, removing a non-empty directory should
error with ENOTEMPTY, not EINVAL
2018-01-12 12:07:40 -06:00
Christopher Haster
c2fab8fabb Added asserts on geometry and updated config documentation
littlefs had an unwritten assumption that the block device's program
size would be a multiple of the read size, and the block size would
be a multiple of the program size. This has already caused confusion
for users. Added a note and assert to catch unexpected geometries
early.

Also found that the prog/erase functions indicated they must return
LFS_ERR_CORRUPT to catch bad blocks. This is no longer true as errors
are found by CRC.
2018-01-11 11:56:09 -06:00
Christopher Haster
472ccc4203 Fixed file truncation without writes
In the open call, the LFS_O_TRUNC flag was correctly zeroing the file, but
it wasn't actually writing the change out to disk. This went unnoticed because
in the cases where the truncate was followed by a file write, the
updated contents would be written out correctly.

Marking the file as dirty if the file isn't already truncated fixes the
problem with the least impact. Also added better test cases around
truncating files.
2018-01-11 10:26:33 -06:00
Christopher Haster
aea3d3db46 Fixed positive seek bounds checking
This bug was a result of an annoying corner case around intermingling
signed and unsigned offsets. The boundary check that prevents seeking
a file to a position before the file was preventing valid seeks with
positive offsets.

This corner case is a bit more complicated than it looks because the
offset is signed, while the size of the file is unsigned. Simply
casting both to signed or unsigned offsets won't handle large files.
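
One way to see the corner case, sketched under the assumption of 32-bit
positions and offsets: widen both operands to 64 bits before adding, so
neither a large unsigned position nor a negative offset is
misinterpreted. This is a different technique from whatever the patch
itself does, and demo_seek is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply a signed offset to an unsigned position. Returns 0 and stores
 * the new position, or -1 if the result would fall before the start of
 * the file or beyond what a 32-bit position can represent. */
static int demo_seek(uint32_t pos, int32_t off, uint32_t *out) {
    assert(out != NULL);
    int64_t npos = (int64_t)pos + (int64_t)off;
    if (npos < 0 || npos > (int64_t)UINT32_MAX) {
        return -1;
    }
    *out = (uint32_t)npos;
    return 0;
}
```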
2018-01-03 15:00:04 -06:00
Christopher Haster
425aa3c694 Fixed issue with immediate exhaustion and small unaligned storage
This was a small hole in the logic that handles initializing the
lookahead buffer. To imitate exhaustion (so the block allocator
will trigger a scan), the lookahead buffer is rewound a full
lookahead and set up to look like it is exhausted. However,
unlike normal allocation, this rewind was not kept aligned to
a multiple of the scan size, which is limited by both the
lookahead buffer and the total storage size.

This bug went unnoticed for so long because it only causes
problems when the block device is both:
1. Not aligned to the lookahead buffer (not a power of 2)
2. Smaller than the lookahead buffer

While this seems like a strange corner case for a block device,
this turned out to be very common for internal flash, especially
when a handful of blocks are reserved for code.
2017-12-27 12:59:32 -06:00
Christopher Haster
bf78b09d37 Added directory list for synchronizing in flight directories
As it was, if a user operated on a directory while at the same
time iterating over the directory, the directory objects could
fall out of sync. In the best case, files may be skipped while
removing everything in a file, in the worst case, a very poorly
timed directory relocate could be missed.

Simple fix is to add the same directory tracking that is currently
in use for files, at a small code+complexity cost.
2017-11-22 14:49:43 -06:00
Christopher Haster
f9f4f5ccec Fixed standard name mismatch LFS_ERR_EXISTS -> LFS_ERR_EXIST
Matches the standard EEXIST name found on most systems. Other than
this name, all other common constant names were consistent in this
manner.
2017-11-16 17:50:14 -06:00
Christopher Haster
843e3c6c75 Added sticky-bit for preventing file syncs after write errors
Short story, files are no longer committed to directories during
file sync/close if the last write did not complete successfully.
This avoids a set of interesting user-experience issues related
to the end-of-life behaviour of the filesystem.

As a filesystem approaches end-of-life, the chances of running into
LFS_ERR_NOSPC grow rather quickly. Since this condition occurs at the
end of a device's life, it's likely that operating in these
conditions hasn't been tested thoroughly.

In the specific case of file-writes, you can hit an LFS_ERR_NOSPC after
parts of the file have been written out. If the program simply continues
and closes the file, the file is written out half completed. Since
littlefs has a strong guarantee that prevents half-writes, it's unlikely
this state of the file would be expected.

To make things worse, since close is also responsible for memory
cleanup, it's actually _impossible_ to continue working as it was
without leaking memory.

By preventing the file commits, end-of-life behaviour should at least retain
a previous copy of the filesystem without any surprises.
2017-11-16 17:25:41 -06:00
Christopher Haster
2612e1b3fa Modified lfs_ctz_extend to be a little bit safer
Specifically around error handling. As is, incorrectly handled
errors could cause higher code to get uninitialized blocks,
potentially leading to writes to arbitrary blocks on storage.
2017-11-16 15:10:17 -06:00
Christopher Haster
6664723e18 Fixed issue with committing directories to bad-blocks that are stuck
This is only an issue in the weird case that a worn down block is
left in the odd state of not being able to change the data that resides
on the block. That being said, this does pop up often when simulating
wear on block devices.

Currently, directory commits checked if the write succeeded by crcing the
block to avoid the additional RAM cost for another buffer. However,
before this commit, directory commits just checked if the block crc was
valid, rather than comparing to the expected crc. This would usually
work, unless the block was stuck in a state with valid crc.

The fix is to simply compare with the expected crc to find errors.
2017-11-16 14:53:45 -06:00
Christopher Haster
3f31c8cba3 Fixed corner case with immediate exhaustion and lookahead==block_count
The previous math for determining if we scanned all of disk wasn't set
up correctly in the lfs_mount function. If lookahead == block_count
the lfs_alloc function would think we had already searched the entire
disk.

This is only an issue if we manage to exhaust a block on the first
pass after mount, since lfs_alloc_ack resets the lookahead region
into a valid state after a successful block allocation.
2017-11-10 10:53:33 -06:00
Christopher Haster
f4aeb8331a Fixed issue with aggressively rounding down lookahead configuration
The littlefs allows buffers to be passed statically in the case
that a system does not have a heap. Unfortunately, this means we
can't round up in the case of an unaligned lookahead buffer.

Double unfortunately, rounding down after clamping to the block device
size could result in a lookahead of zero for block devices < 32 blocks
large.

The assert in littlefs does catch this case, but rounding down prevents
support for < 32 block devices.

The solution is to simply require a 32-bit aligned buffer with an
assert. This avoids runtime problems while allowing a user to pass
in the correct buffer for < 32 block devices. Rounding up can be
handled at higher API levels.
2017-11-10 10:53:30 -06:00
Christopher Haster
db51a395ba Removed stray newline in LFS_ERROR for version 2017-11-09 19:44:23 -06:00
Christopher Haster
2ab150cc50 Removed toolchain specific warnings
- Comparisons with differently signed integer types
- Incorrectly signed constant
- Unreachable default returns
- Leaked uninitialized variables in relocate goto statements
2017-10-30 16:55:45 -05:00
Christopher Haster
0825d34f3d Adopted alternative implementation for lfs_ctz_index
Same runtime cost, however reduces the logic and avoids one of
the two big branches. See the DESIGN.md for more info.

Now uses these equations instead of the messy guess and correct method:
n = (N - w/8(popcount(N/(B-2w/8)) + 2)) / (B-2w/8)
off = N - (B-2w/8)n - w/8popcount(n)
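
A sketch of the closed form for w = 32, checked against a plain
block-by-block walk. It assumes block 0 holds no pointers and block
n > 0 starts with ctz(n)+1 pointers of w/8 bytes each. It also applies
popcount to the first estimate minus one; that detail is an assumption
of this sketch, chosen because it agrees with the walk, and only holds
for reasonably large blocks. All demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PTR 4u  /* w/8 with w = 32 */

static uint32_t demo_popc(uint32_t x) {
    uint32_t c = 0;
    while (x) { x &= x - 1; c++; }
    return c;
}

static uint32_t demo_ctz(uint32_t x) {
    assert(x != 0);
    uint32_t c = 0;
    while (!(x & 1)) { x >>= 1; c++; }
    return c;
}

/* Closed form: returns block index n, stores the in-block offset
 * (pointer area included) in *off. B is the block size in bytes. */
static uint32_t demo_index_closed(uint32_t B, uint32_t pos, uint32_t *off) {
    assert(B > 2*DEMO_PTR);
    uint32_t b = B - 2*DEMO_PTR;
    uint32_t n = pos / b;
    if (n == 0) {
        *off = pos;
        return 0;
    }
    n = (pos - DEMO_PTR*(demo_popc(n-1) + 2)) / b;
    *off = pos - b*n - DEMO_PTR*demo_popc(n);
    return n;
}

/* Reference: step through blocks one at a time, skipping each block's
 * pointer area as it is entered. */
static uint32_t demo_index_walk(uint32_t B, uint32_t pos, uint32_t *off) {
    uint32_t n = 0;
    while (pos >= B) {
        n += 1;
        pos -= B;
        pos += DEMO_PTR*(demo_ctz(n) + 1);
    }
    *off = pos;
    return n;
}

/* Count positions below limit where the two methods disagree. */
static uint32_t demo_count_mismatches(uint32_t B, uint32_t limit) {
    uint32_t bad = 0;
    for (uint32_t pos = 0; pos < limit; pos++) {
        uint32_t o1, o2;
        uint32_t n1 = demo_index_closed(B, pos, &o1);
        uint32_t n2 = demo_index_walk(B, pos, &o2);
        if (n1 != n2 || o1 != o2) {
            bad++;
        }
    }
    return bad;
}
```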
2017-10-30 12:03:33 -05:00
Christopher Haster
46e22b2a38 Adopted lfs_ctz_index implementation using popcount
This reduces the O(n^2logn) runtime to read a file to only O(nlogn).
The extra O(n) did not touch the disk, so it isn't a problem until the
files become very large, but this solution comes with very little cost.

Long story short, you can find the block index + offset pair for a
CTZ linked-list with this series of formulas:

n' = floor(N / (B - 2w/8))
N' = (B - 2w/8)n' + (w/8)popcount(n')
off' = N - N'
n, off =
  n'-1, off'+B                if off' <  0
  n',   off'+(w/8)(ctz(n')+1) if off' >= 0

For the long story, you will need to see the updated DESIGN.md
2017-10-18 00:44:30 -05:00
Christopher Haster
4fdca15a0d Slight name change with ctz skip-list functions
changed:
lfs_index -> lfs_ctz_index
lfs_index_find -> lfs_ctz_find
lfs_index_append -> lfs_ctz_append
lfs_index_traverse -> lfs_ctz_traverse
2017-10-18 00:41:43 -05:00
Christopher Haster
f3578e3250 Removed clamping to block size in ctz linked-list
Initially, I was concerned that the number of pointers in the ctz
linked-list could exceed the storage in a block. Long story short
this isn't really possible outside of extremely small block sizes.

Since clamping impacts the layout of files on disk, removing the
block size removed quite a bit of logic and corner cases. Replaced
with an assert on block size during initialization.

---

Long story long, the minimum block size needed to store all ctz
pointers in a filesystem can be found with this formula:

B = (w/8)*log2(2^w / (B - 2*(w/8)))

where:
B = block size in bytes
w = pointer width in bits

It's not a very pretty formula, but does give us some useful info
if we apply some math:

min block size:
32 bit ctz linked-list = 104 bytes
64 bit ctz linked-list = 448 bytes

For littlefs, 128 bytes is a perfectly reasonable minimum block size.
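
The two figures above can be reproduced with a short search, sketched
here under one assumption: the pointer count log2(2^w / (B - 2*(w/8)))
is rounded up to a whole number. With that rounding the smallest B that
satisfies the formula comes out at 104 bytes for w = 32 and 448 bytes
for w = 64. The demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(x)) for x > 0. */
static uint32_t demo_floor_log2(uint32_t x) {
    assert(x > 0);
    uint32_t r = 0;
    while (x >>= 1) {
        r++;
    }
    return r;
}

/* Smallest block size B in bytes such that
 *   B >= (w/8) * ceil(log2(2^w / (B - 2*(w/8))))
 * For integer w, ceil(w - log2(d)) equals w - floor(log2(d)), which
 * keeps the whole computation in integer arithmetic. */
static uint32_t demo_min_block_size(uint32_t w) {
    assert(w == 32 || w == 64);
    uint32_t ptr = w / 8;
    for (uint32_t B = 2*ptr + 1; ; B++) {
        uint32_t pointers = w - demo_floor_log2(B - 2*ptr);
        if (B >= ptr * pointers) {
            return B;
        }
    }
}
```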
2017-10-12 20:31:33 -05:00
Christopher Haster
83d4c614a0 Updated copyright
Due to employee contract
Per ARM license remains under Apache 2.0
2017-10-12 20:29:10 -05:00
Christopher Haster
539409e2fb Refactored deduplicate/deorphan step to single deorphan step
Deduplication and deorphan steps aren't required under identical
conditions, but they can be processed in the same iteration of the
filesystem. Since lfs_alloc (requires deorphan) occurs on most write
calls to the filesystem (requires deduplication), it was simpler to
just combine the steps into a single lfs_deorphan step.

Also traded out the places where lfs_rename/lfs_remove just defer
operations to the deorphan step. This adds a bit of code, but also
significantly speeds up directory operations.
2017-10-10 17:14:46 -05:00
Christopher Haster
2936514b5e Added atomic move using dirty tag in entry type
The "move problem" has been present in littlefs for a while, but I haven't
come across a solution worth implementing for various reasons.

The problem is simple: how do we move directory entries across
directories atomically? Since multiple directory entries are involved,
we can't rely entirely on the atomic block updates. It ends up being
a bit of a puzzle.

To make the problem more complicated, any directory block update can
fail due to wear, and cause the directory block to need to be relocated.
This happens rarely, but brings a large number of corner cases.

---

The solution in this patch is simple:
1. Mark source as "moving"
2. Copy source to destination
3. Remove source

If littlefs ever runs into a "moving" entry, that means a power loss
occurred during a move. Either the destination entry exists or it
doesn't. In this case we just search the entire filesystem for the
destination entry.

This is expensive, however the chance of a power loss during a move
is relatively low.
2017-10-10 06:18:46 -05:00
Christopher Haster
984340225b Fixed incorrect return value from lfs_file_seek
lfs_file_seek returned the _previous_ file offset on success, where
most standards return the _calculated_ offset on success.

This just falls into me not noticing a mistake, and shows why it's
always helpful to have a second set of eyes on code.
2017-09-26 19:50:39 -05:00
Christopher Haster
273cb7c9c8 Fixed problem with lookaheads larger than block device
Simply limiting the lookahead region to the size of
the block device fixes the problem.

Also added logic to limit the allocated region and
floor to nearest word, since the additional memory
couldn't really be used effectively.
2017-09-18 21:20:33 -05:00
Christopher Haster
d9367e05ce Fixed collection of multiblock directories
Mostly just a hole in testing. Dir blocks were not being
correctly collected when removing entries from very large
files due to forgetting about the tail-bit in the directory
block size. The test hole has now been filled.

Also added lfs_entry_size to avoid having to repeat that
expression since it is a bit ridiculous
2017-09-17 20:38:54 -05:00
Christopher Haster
a83b2fe463 Added checks for out-of-bound seeks
- out-of-bound read results in eof
- out-of-bound write will fill missing area with zeros

The write behaviour matches expected posix behaviour, but was
under consideration for being dropped, since littlefs does
not support holes, and support of out-of-bound seeks adds complexity.

However, it turned out filling with zeros was trivial, and only
cost an extra 74 bytes of flash (0.48%).
2017-09-17 18:07:08 -05:00
Christopher Haster
a8fa5e6571 Fixed some corner cases with paths
- Added handling for root to lfs_stat
- Corrected lfs_dir_find to update path even on failures
- Added more checks for missing directories in path
2017-09-17 16:51:07 -05:00
Christopher Haster
26dd49aa04 Fixed issue with negative modulo with unaligned lookaheads
When the lookahead buffer wraps around in an unaligned filesystem, it's
possible for blocks at the beginning of the disk to have a negative distance
from the lookahead, but still reside in the lookahead buffer.

Switching to signed modulo doesn't quite work due to how negative modulo
is implemented in C, so the simple solution is to shift the region to be
positive.
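
A small illustration of the problem and of the shift, using hypothetical
helpers rather than the littlefs code. In C99 and later the result of %
takes the sign of the dividend, so a block just behind the window start
yields a negative distance; adding the block count before reducing keeps
the operand non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Plain signed remainder: negative when block is behind start. */
static int32_t demo_signed_distance(int32_t block, int32_t start,
                                    int32_t count) {
    assert(count > 0);
    return (block - start) % count;
}

/* Shifted version: add count first so the value being reduced is never
 * negative. Requires block < count and start < count. */
static uint32_t demo_wrapped_distance(uint32_t block, uint32_t start,
                                      uint32_t count) {
    assert(count > 0 && block < count && start < count);
    return (block + count - start) % count;
}
```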
2017-09-17 16:51:07 -05:00
Christopher Haster
0982020fb3 Fixed issue with cold-write after seek to block boundary
This off-by-one error was caused by a slight difference between the
pos argument to lfs_index_find and lfs_index_extend. When pos is on
a block boundary, lfs_index_extend expects block to point before pos,
as it would when writing a file linearly. But when seeking to that
pos, the lfs_index_find to warm things up just supplies the block it
expects pos to be in.

Fixed the off-by-one error and added a test case for several of these
cold seek+writes.
2017-09-17 16:51:04 -05:00
Christopher Haster
c2283a2753 Extended entry tag to support attributes
Zero attributes are actually supported at the moment, but this change
will allow entry attribute to be added in a backwards compatible manner.

Each dir entry is now prefixed with a 32 bit tag:
4b - entry type
4b - data structure
8b - entry len
8b - attribute len
8b - name len

A full entry on disk looks a bit like this:
[-  8 -|-  8 -|-  8 -|-  8 -|-- elen --|-- alen --|-- nlen --]
[ type | elen | alen | nlen |  entry   |  attrs   |   name   ]

The actual contents of the attributes section is a bit handwavey
until the first attributes are implemented, but to put plans in place:
Each attribute will be prefixed with only a byte that indicates the type
of attribute. Attributes should be sorted based on portability, since
unknown attributes will force attribute parsing to stop.
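
The 32 bit tag described above can be sketched as four bytes. The split
of the first byte (entry type in the low nibble, data structure in the
high nibble) is an assumption of this sketch, and the demo_ names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_tag {
    uint8_t type;     /* 4 bits: entry type */
    uint8_t dstruct;  /* 4 bits: data structure */
    uint8_t elen;     /* entry length in bytes */
    uint8_t alen;     /* attribute length in bytes */
    uint8_t nlen;     /* name length in bytes */
};

/* Pack the tag into its 4-byte on-disk form. */
static void demo_tag_pack(const struct demo_tag *t, uint8_t out[4]) {
    assert(t != NULL && out != NULL);
    assert(t->type < 16 && t->dstruct < 16);
    out[0] = (uint8_t)((t->dstruct << 4) | t->type);
    out[1] = t->elen;
    out[2] = t->alen;
    out[3] = t->nlen;
}

/* Unpack a 4-byte tag. */
static struct demo_tag demo_tag_unpack(const uint8_t in[4]) {
    assert(in != NULL);
    struct demo_tag t;
    t.type    = in[0] & 0x0f;
    t.dstruct = (uint8_t)(in[0] >> 4);
    t.elen    = in[1];
    t.alen    = in[2];
    t.nlen    = in[3];
    return t;
}

/* Total size of an entry on disk: the tag plus its three sections. */
static size_t demo_entry_size(const struct demo_tag *t) {
    assert(t != NULL);
    return 4 + (size_t)t->elen + (size_t)t->alen + (size_t)t->nlen;
}
```

Because the three lengths sit at fixed positions, an iterator can skip
an entry of unknown type by advancing demo_entry_size bytes.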
2017-07-18 02:09:35 -05:00
Christopher Haster
663e953a50 Adopted the Apache 2.0 license 2017-07-08 11:49:40 -05:00
Christopher Haster
8a9b9baa12 Modified entry head to include name length
This provides a path for adding inlined files in the future, which
requires multiple lengths to distinguish between the file data and name.

As an extra bonus, the directory can now be iterated over even if the
types are unknown, since the name's representation is consistent on all
entry types.

This does come at the cost of reducing types from 16-bits to 8-bits, but
I doubt this will become a problem.
2017-06-28 21:46:34 -05:00
Christopher Haster
931442a784 Adopted redundant cache read in lfs_file_relocate
Previously had some custom logic that could be reduced
2017-06-28 21:46:02 -05:00
Christopher Haster
0e1022a86c Fixed missing erase during file relocation
This was an easy fix, but highlighted the fact that the current testing
framework doesn't detect when a block is written to without an
associated erase.

Added a quick solution that creates an empty file during an erase.
2017-06-28 21:45:45 -05:00
Christopher Haster
a1138a41ce Fixed dirty rcache during directory commit
An interesting side-effect of adding internal checks to the littlefs
for block errors, is that the littlefs starts to cover up its own
flaws. Probably out of embarrassment.

In this case, the relocation logic for directories left the littlefs
rcache dirty with invalid data. The littlefs detected the error,
treated it as a corrupted write, and just moved the "corrupted" block
to a new block, which as a side-effect flushes the rcache.

Since committing a dir will end up flushing the rcache to check for
errors anyways, we can just drop the rcache in lfs_bd_sync.
2017-06-28 21:45:28 -05:00
Christopher Haster
11664160a8 Fixed relocation bug when a file is closed with lingering caches
This bug required larger cache sizes to notice, since block errors
usually get detected in the early stages of writing to files.

Since the fix here requires both lfs_file_write and lfs_file_flush
relocate file blocks, the relocation logic was moved out into a
separate function.
2017-06-28 21:44:56 -05:00
Christopher Haster
fe28ea0f93 Added internal check of data written to disk
Before, the littlefs relied on the underlying block device
to report corruption that occurs when writing data to disk.
This requirement is easy to miss or implement incorrectly, since
the error detection is only required when a block becomes corrupted,
which is very unlikely to happen until late in the block device's
lifetime.

The littlefs can detect corruption itself by reading back written data.
This requires a bit of care to reuse the available buffers, and may rely
on checksums to avoid additional RAM requirements.

This does have a runtime penalty with the extra read operations, but
should make the littlefs much more robust to different implementations.
2017-06-28 15:50:47 -05:00
Christopher Haster
1eeb2a6811 Shrunk on-disk directory program size
Directories still consume two full erase blocks, but now only program
the exact on-disk region to store the directory contents. This results
in a decent improvement in the amount of data written and read to the
device when doing directory operations.

Calculating the checksum of dynamically sized data is surprisingly
tricky, since the size of the data could also contain errors. For the
littlefs, we can assume the data size must fit in an erase block.
If the data size is invalid, we can just treat the block as corrupted.
2017-06-28 15:50:40 -05:00
Christopher Haster
fd1da602d7 Added support for handling corrupted blocks
This provides a limited form of wear leveling. While wear is
not actually balanced across blocks, the filesystem can recover
from corrupted blocks and extend the lifetime of a device nearly
as much as dynamic wear leveling.

For use-cases where wear is important, it would be better to use
a full form of dynamic wear-leveling at the block level. (or
consider a logging filesystem).

Corrupted block handling was simply added on top of the existing
logic in place for the filesystem, so it's a bit more noodly than
it may have to be, but it gets the work done.
2017-05-15 00:40:56 -05:00
Christopher Haster
b35d761196 Removed words variable from lfs struct 2017-05-08 00:53:08 -05:00
Christopher Haster
63b52c9f2e Added proper handling for removing open files
Conveniently we previously added a linked-list of files
for things like this. This should handle most of the corner
cases where files are open during strange operations.

This also brings up the point that we aren't doing anything similar
for directories and don't even have a dir linked-list. After thinking
about it for a while, I've decided to leave out this handling for dirs.
It will likely be very complicated, with little gains as directories
are less used in embedded systems. Additionally, dirs are only open
for reading, and corruption will probably just cause the dir iteration
to terminate. If needed, correct handling of open directories can be
added later.
2017-05-08 00:48:28 -05:00
Christopher Haster
8621b61f38 Adopted 0xffffffff as null pointer
- Default value of most flash-based storage
- Avoids 0 == superblock/dir issue
- Usually causes assertions in bd driver layer
- Easier to notice in hex dumps
2017-05-08 00:46:10 -05:00
Christopher Haster
4808e9ae26 Added caching with managed caches at the file level
This adds a fully independent layer between the rest of the filesystem
and the block device. This requires some additional logic around cache
invalidation and flushing, but removes the need for any higher layer to
consider read/write sizes less than what is supported by the hardware.

Additionally, these caches can be used for possible speed improvements.
This is left up to the user to optimize for their use cases. For very
limited embedded systems with byte-level read/writes, the caches could
be omitted completely, or they could even be the size of a full block
for minimizing storage access.

(A full block may not be the best for speed, consider if only a small
portion of the read block is used, but I'll leave that evaluation as an
exercise for any consumers of this library)
2017-05-08 00:44:54 -05:00
Christopher Haster
6869b14694 Fixed memory leak for lookahead buffer 2017-05-08 00:38:20 -05:00
Christopher Haster
a30142e0e1 Fixed allocation bugs near the end of storage
Also added better testing specifically for these corner cases
around the end of storage
2017-05-08 00:37:58 -05:00
Christopher Haster
210b487325 Added file list for tracking in flight allocations
Needed primarily for tracking block allocations, unfortunately
this prevents the freedom for the user to bitwise copy files.
2017-05-08 00:32:32 -05:00
Christopher Haster
b55719bab1 Adopted more conventional buffer parameter ordering
Adopted buffer followed by size. The other order was originally
chosen due to some other functions with a more complicated
parameter list.

This convention is important, as the bd api is one of the main
apis facing porting efforts.
2017-04-23 23:58:43 -05:00
Christopher Haster
0406442253 Fixed non-standard behaviour of rdwr streams
Originally had two separate positions for reading/writing,
but this is inconsistent with the POSIX standard, which
has a single position for reading and writing.

Also added proper handling of when the file is dirty, just
added an internal flag for this state.

Also moved the entry out of the file struct, and rearranged
some members to clean things up.
2017-04-23 23:39:50 -05:00
Christopher Haster
287b54876e Standardized error values
Now matches the commonly used errno codes in name with the value
encoded as the negative errno code
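
As an illustration of this convention, a short sketch with hypothetical names (`fs_error`, `fs_read_stub`); the actual identifiers in littlefs may differ. Each value is the negated errno code, so a single `int` return can carry either a non-negative result or an error.

```c
#include <assert.h>
#include <errno.h>

// Hypothetical error names; each value is the negated errno code.
enum fs_error {
    FS_ERR_OK    = 0,
    FS_ERR_IO    = -EIO,
    FS_ERR_NOENT = -ENOENT,
    FS_ERR_EXIST = -EEXIST,
    FS_ERR_NOSPC = -ENOSPC,
};

// Hypothetical operation: returns the number of bytes "read" on success,
// or a negative error code on failure.
static int fs_read_stub(int file_exists, int nbytes) {
    if (!file_exists) {
        return FS_ERR_NOENT;
    }
    return nbytes;
}
```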
2017-04-23 22:10:16 -05:00
Christopher Haster
5790ec2ce4 Structured some of the bulk of the codebase
- Removed lfs_config.h, distributed between lfs.h and lfs_util.h
- Moved some functions that felt out of place
2017-04-23 21:40:03 -05:00
Christopher Haster
ba8afb9d92 Added support for full seek operations
A rather involved upgrade for both files and directories, seek and
related functions are now completely supported:
- lfs_file_seek
- lfs_file_tell
- lfs_file_rewind
- lfs_file_size
- lfs_dir_seek
- lfs_dir_tell
- lfs_dir_rewind

This change also highlighted the concern that lfs_off_t is unsigned,
whereas off_t is traditionally signed. Unfortunately, lfs_off_t is
already used intensively through the codebase, so in focusing on
moving forward and avoiding getting bogged down by details, I'm going to
keep it as is and use the signed type lfs_soff_t where necessary.
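
A sketch of how an unsigned position and a signed seek offset can coexist. The 32-bit widths, the `WHENCE_*` constants, and `file_seek` are assumptions for illustration, not the littlefs definitions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t lfs_off_t;    // unsigned position (width assumed)
typedef int32_t  lfs_soff_t;   // signed offset (width assumed)

// Hypothetical whence values, named to avoid clashing with <stdio.h>.
enum { WHENCE_SET, WHENCE_CUR, WHENCE_END };

struct file {
    lfs_off_t pos;
    lfs_off_t size;
};

// Returns the new position, or -1 if the target falls before the start of
// the file or cannot be represented in the signed return type.
static lfs_soff_t file_seek(struct file *f, lfs_soff_t off, int whence) {
    int64_t base;
    switch (whence) {
        case WHENCE_SET: base = 0; break;
        case WHENCE_CUR: base = f->pos; break;
        case WHENCE_END: base = f->size; break;
        default: return -1;
    }

    // Do the arithmetic in a wider signed type so that neither the
    // unsigned base nor the negative offset can wrap.
    int64_t npos = base + off;
    if (npos < 0 || npos > INT32_MAX) {
        return -1;
    }

    f->pos = (lfs_off_t)npos;
    return (lfs_soff_t)npos;
}
```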
2017-04-23 02:06:48 -05:00
Christopher Haster
a1d8a76b36 Added correct handling of file syncing around overwrites
Now all of the open flags are correctly handled

Even annoying cases where we can't trust the blocks that are already
on file, such as appending existing files and writing to the middle
of files.
2017-04-22 21:42:22 -05:00
Christopher Haster
a4e9132d7f Removed a layer of indirection for index-list lookup
Files are now stored directly in the index-list, instead of being
referenced by pointers that used to live there. This somewhat reduces
the complexity around handling files, while still keeping the O(logn)
lookup cost.
2017-04-22 19:48:31 -05:00
Christopher Haster
aa872657d2 Cleaned up block allocator
Removed scanning for stride
- Adds complexity with questionable benefit
- Can be added as an optimization later

Fixed handling around device boundaries and where lookahead may not be a
factor of the device size (consider small devices with only a few
blocks)

Added support for configuration with optional dynamic memory as found in
the caching configuration
2017-04-22 16:00:45 -05:00
Christopher Haster
7050922623 Added optional block-level caching
This adds caching of the most recent read/program blocks, allowing
support of devices that don't have byte-level read+writes, along
with reduced device access on devices that do support byte-level
read+writes.

Note: The current implementation is a bit eager to drop caches where
it simplifies the cache layer. This layer is already complex enough.

Note: It may be worthwhile to add a compile switch for caching to
reduce code size, not sure.

Note: This does add a dependency on malloc, which could have a porting
layer, but I'm just using the functions from stdlib for now. These can be
overwritten with noops if the user controls the system, and keeps things
simple for now.
2017-04-22 16:00:40 -05:00
Christopher Haster
789286a257 Simplified config
Before, the lfs had multiple paths to determine config options:
- lfs_config struct passed during initialization
- lfs_bd_info struct passed during block device initialization
- compile time options

This allowed different developers to provide their own needs
to the filesystem, such as the block device capabilities and
the higher level user's own tweaks.

However, this comes with additional complexity and action required
when the configurations are incompatible.

For now, this has been reduced to all information (including block
device function pointers) being passed through the lfs_config struct.
We just defer more complicated handling of configuration options to
the top level user.

This simplifies configuration handling and gives the top level user
the responsibility to handle configuration, which they probably would
have wanted to do anyways.
2017-04-22 15:42:05 -05:00
Christopher Haster
3b9d6630c8 Restructured directory code
After quite a bit of prototyping, settled on the following functions:
- lfs_dir_alloc  - create a new dir
- lfs_dir_fetch  - load and check a dir pair from disk
- lfs_dir_commit - save a dir pair to disk
- lfs_dir_shift  - shrink a dir pair to disk
- lfs_dir_append - add a dir entry, creating dirs if needed
- lfs_dir_remove - remove a dir entry, dropping dirs if needed

Additionally, followed through with a few other tweaks
2017-04-18 01:44:01 -05:00
Christopher Haster
bd817abb00 Added support for renaming dirs/files 2017-04-18 01:44:01 -05:00
Christopher Haster
3b1bcbe851 Removed .. and . entries
No longer need to be stored on disk, can be simulated on
the chip side. As mentioned in other commits, the parent
entries had dozens of problems with atomic updates, as
well as making everything just a bit more complex than
is needed.
2017-04-18 01:44:01 -05:00
Christopher Haster
1f13006e36 Added dir navigation without needing parent entries
This should be the last step to removing the need for
parent entries.

Parent entries cause all sorts of problems with atomic
directory updates, especially related to moving/deleting
directories.

I couldn't figure out a parser for '..' entries without
O(n^2) runtime, a stack, or modifying the path itself.
Since the goal is constant memory consumption, I went
with the O(n^2) runtime solution, but this may need to
be optimized later.
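
One way the O(n^2), constant-memory approach can be sketched: for each name, scan ahead through the rest of the path and skip the name if a later '..' cancels it. The function `path_next` and its helper are hypothetical and only illustrate the idea; they are not the code from this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_name(const char *s, size_t len, const char *lit) {
    return len == strlen(lit) && memcmp(s, lit, len) == 0;
}

// Returns the next path component that survives '.' and '..' resolution and
// stores its length in *len, or returns NULL at the end of the path.
// *path is advanced past the returned component. Uses no stack and never
// modifies the path, at the cost of rescanning the suffix for every name.
static const char *path_next(const char **path, size_t *len) {
    const char *name = *path;
    while (true) {
        name += strspn(name, "/");
        size_t namelen = strcspn(name, "/");
        if (namelen == 0) {
            *path = name;
            return NULL;
        }

        // '.' is a no-op; a '..' that reaches this point has nothing left
        // to cancel, so it is treated as staying at the root.
        if (is_name(name, namelen, ".") || is_name(name, namelen, "..")) {
            name += namelen;
            continue;
        }

        // Scan ahead: depth counts names not yet cancelled, starting with
        // the current one. If it drops to zero, this name is cancelled.
        const char *suffix = name + namelen;
        int depth = 1;
        bool cancelled = false;
        while (true) {
            suffix += strspn(suffix, "/");
            size_t sufflen = strcspn(suffix, "/");
            if (sufflen == 0) {
                break;
            }
            if (is_name(suffix, sufflen, "..")) {
                depth -= 1;
                if (depth == 0) {
                    name = suffix + sufflen;   // resume after the '..'
                    cancelled = true;
                    break;
                }
            } else if (!is_name(suffix, sufflen, ".")) {
                depth += 1;
            }
            suffix += sufflen;
        }
        if (cancelled) {
            continue;
        }

        *path = name + namelen;
        *len = namelen;
        return name;
    }
}
```

For the path "a/b/../c", successive calls yield "a" and then "c".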
2017-04-18 01:44:01 -05:00
Christopher Haster
c25c893219 Moved to brute-force deorphan without parent pointers
Removing the dependency to the parent pointer solves
many issues with non-atomic updates of children's
parent pointers with respect to any move operations.

However, this comes with an embarrassingly terrible
runtime as the only other option is to exhaustively
check every dir entry to find a child's parent.

Fortunately, deorphaning should be a relatively rare
operation.
2017-04-18 01:44:01 -05:00
Christopher Haster
96a42581be Added the lfs_stat function 2017-04-18 01:44:01 -05:00
Christopher Haster
a3734eeb34 Added proper handling of orphans
Unfortunately, threading all dir blocks in a linked-list did
not come without problems.

While it's possible to atomically add a dir to the linked list
(by adding the new dir into the linked-list position immediately
after its parent, requiring only one atomic update to the parent
block), it is not easy to make sure the linked-list is in a state
that always allows atomic removal of dirs.

The simple solution is to allow this non-atomic removal, with an
additional step to remove any orphans that could have been created
by a power-loss. This deorphan step is only run if the normal
allocator has failed.
2017-04-18 01:44:01 -05:00
Christopher Haster
8a674524fc Added full dir list and rudimentary block allocator
In writing the initial allocator, I ran into the rather
difficult problem of trying to iterate through the entire
filesystem cheaply and with only constant memory consumption
(which prohibits recursive functions).

The solution was to simply thread all directory blocks onto a
massive linked-list that spans the entire filesystem.

With the linked-list it was easy to create a traverse function
for all blocks in use on the filesystem (which has potential
for other utility), and add the rudimentary block allocator
using a bit-vector.

While the linked-list may add complexity (especially where
needing to maintain atomic operations), the linked-list helps
simplify what is currently the most expensive operation in
the filesystem, with no cost to space (the linked-list can
reuse the pointers used for chained directory blocks).
2017-04-18 01:44:01 -05:00
Christopher Haster
ca01b72a35 Added path iteration and chained directories
All path iteration goes through the lfs_dir_find function,
which manages the syntax of paths and updates the path pointer
to just the name stored in the dir entry.

Also added directory chaining, which allows more than one block
per directory. This is a simple linked list.
2017-04-18 01:44:00 -05:00
Christopher Haster
a711675607 Added dir tests, test fixes, config 2017-03-25 19:23:30 -05:00
Christopher Haster
afa4ad8254 Added a rudimentary test framework
Tests can be found in 'tests/test_blah.sh'
Tests can be run with 'make test'
2017-03-25 19:23:30 -05:00
Christopher Haster
84a57642e5 Restructured the major interfaces of the filesystem 2017-03-25 19:23:26 -05:00
Christopher Haster
f566846223 Revised free-list structure to adopt a lazy scanning allocator of sorts
The free-list structure, while efficient for allocations, had one big
issue: complexity. Storing free blocks as a simple fifo made sense
when dealing with a single file, but as soon as you have two files
open for writing, updating the free list atomically when the two files
can not necessarily even be written atomically proved problematic. It's a
solvable problem, but requires many writes to keep track of everything.

Now changing direction to pursue a more "drop it on the floor" strategy.
Since allocated blocks are tracked by the filesystem, we can simply
subtract from all available blocks the blocks we know of to allocate new
blocks. This is very expensive (O(blocks in use * blocks on device)),
but greatly simplifies any interactions that result in deallocated
blocks.

Additionally, it's impossible to corrupt the free list structure
during a power failure. Any blocks that aren't tracked are simply
"dropped on the floor", and can be allocated later.

There's still a bit of work around the actual allocator to make it
run in a somewhat reasonable frame of time while still avoiding
dynamic allocations. Currently looking at a bit-vector of free
blocks so at least strides of blocks can be skipped in a single
filesystem iteration.
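
A rough sketch of the bit-vector idea, assuming the set of in-use blocks is available as a plain array (in the filesystem it would come from a traversal). `alloc_scan`, `alloc_block`, and the sizes are hypothetical. Each pass over the in-use blocks fills a window of `LOOKAHEAD` bits, so one traversal covers a whole stride of candidate blocks instead of a single one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_COUNT 64
#define LOOKAHEAD   32   // blocks covered by one bit-vector window

struct allocator {
    uint32_t start;   // first block covered by the window
    uint32_t bits;    // bit i set means block (start + i) is in use
};

// One pass over the in-use blocks: mark those that fall in the window.
static void alloc_scan(struct allocator *a, uint32_t start,
        const uint32_t *used, size_t used_count) {
    a->start = start;
    a->bits = 0;
    for (size_t i = 0; i < used_count; i++) {
        if (used[i] >= start && used[i] - start < LOOKAHEAD) {
            a->bits |= UINT32_C(1) << (used[i] - start);
        }
    }
}

// Returns a block that is not in use, or -1 if every block is taken.
// Nothing is stored on disk: free blocks are simply "all blocks minus the
// blocks we know of", recomputed window by window.
static int32_t alloc_block(const uint32_t *used, size_t used_count) {
    struct allocator a;
    for (uint32_t start = 0; start < BLOCK_COUNT; start += LOOKAHEAD) {
        alloc_scan(&a, start, used, used_count);
        for (uint32_t i = 0; i < LOOKAHEAD && start + i < BLOCK_COUNT; i++) {
            if (!(a.bits & (UINT32_C(1) << i))) {
                return (int32_t)(start + i);
            }
        }
    }
    return -1;
}
```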
2017-03-25 19:04:21 -05:00
Christopher Haster
ed674e8414 Added support for the basic file operation
Missing seek, but these are the core filesystem operations
provided by this filesystem:
- Read a file
- Append to a file

Additional work is needed around freeing the previous file, so
right now it's limited to appending to existing files, a real
append only filesystem. Unfortunately the overhead of the free
list with multiple open files is becoming tricky.
2017-03-19 22:25:36 -05:00
Christopher Haster
53674cb3bc Added limited support for directories
This comes with a lot of scaffolding put into place around the core
of the filesystem.

Added operations:
- append an entry to a directory
- find an entry in a directory
- iterate over entries in a directory

Some to do:
- Chaining multiple directory blocks
- Recursion on directory operations
2017-03-19 22:25:36 -05:00
Christopher Haster
106b06a457 Added better handling for metadata pairs
The core algorithm that backs this filesystem's goal of fault
tolerance is the alternating of "metadata pairs". Backed by a
simple core function for reading and writing, makes heavy use
of c99 designated initializers for passing info about multiple
chunks in an erase block.
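
A simplified sketch of alternating a metadata pair, under assumptions not stated in the message: each block carries a revision count and a checksum, the valid block with the newer revision is current, and updates always go to the other block. The struct layout, the placeholder checksum, and the names `pair_fetch` and `pair_commit` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_SIZE 8

struct meta_block {
    uint32_t rev;                    // incremented on every commit
    uint8_t payload[PAYLOAD_SIZE];
    uint32_t checksum;               // covers rev and payload
};

// Placeholder checksum for illustration only; not what littlefs uses.
static uint32_t meta_checksum(const struct meta_block *b) {
    uint32_t sum = b->rev ^ 0xffffffffu;
    for (size_t i = 0; i < PAYLOAD_SIZE; i++) {
        sum = sum * 31u + b->payload[i];
    }
    return sum;
}

static bool meta_valid(const struct meta_block *b) {
    return b->checksum == meta_checksum(b);
}

// Returns the index (0 or 1) of the block holding the current state,
// or -1 if neither block is valid.
static int pair_fetch(const struct meta_block pair[2]) {
    bool v0 = meta_valid(&pair[0]);
    bool v1 = meta_valid(&pair[1]);
    if (v0 && v1) {
        // Signed difference keeps the comparison correct across wraparound.
        return ((int32_t)(pair[1].rev - pair[0].rev) > 0) ? 1 : 0;
    }
    if (v0) { return 0; }
    if (v1) { return 1; }
    return -1;
}

// Write the new state to the block that is NOT current. If power is lost
// mid-write, the old block is untouched and still passes its checksum.
static void pair_commit(struct meta_block pair[2],
        const uint8_t payload[PAYLOAD_SIZE]) {
    int cur = pair_fetch(pair);
    int next = (cur == 0) ? 1 : 0;
    struct meta_block *b = &pair[next];
    b->rev = (cur < 0) ? 1 : pair[cur].rev + 1;
    memcpy(b->payload, payload, PAYLOAD_SIZE);
    b->checksum = meta_checksum(b);
}
```

If the most recent commit is torn, `pair_fetch` falls back to the older block, which is the fault-tolerance property the pair is meant to provide.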
2017-03-19 22:25:36 -05:00
Christopher Haster
1d36fc606a Added initial superblock definition
Really started working out how the internal structure of the driver
will be organized. There are a few hazy lines between the intended
data structures with the goal of code reuse, so the function boundaries
may end up a bit weird.
2017-03-19 22:25:33 -05:00