Close: #I77XXH Signed-off-by: liuwenxin <liuwenxin11@huawei.com> Change-Id: Id783611964850c03b96eb0f2a24bdedd376650df
30 KiB
littlefs technical specification
This is the technical specification of the little filesystem. This document covers the technical details of how the littlefs is stored on disk for introspection and tooling. This document assumes you are familiar with the design of the littlefs, for more info on how littlefs works check out DESIGN.md.
| | | .---._____
.-----. | |
--|o |---| littlefs |
--| |---| |
'-----' '----------'
| | |
Some quick notes
-
littlefs is a block-based filesystem. The disk is divided into an array of evenly sized blocks that are used as the logical unit of storage.
-
Block pointers are stored in 32 bits, with the special value
0xffffffff
representing a null block address. -
In addition to the logical block size (which usually matches the erase block size), littlefs also uses a program block size and read block size. These determine the alignment of block device operations, but don't need to be consistent for portability.
-
By default, all values in littlefs are stored in little-endian byte order.
Directories / Metadata pairs
Metadata pairs form the backbone of littlefs and provide a system for distributed atomic updates. Even the superblock is stored in a metadata pair.
As their name suggests, a metadata pair is stored in two blocks, with one block providing a backup during erase cycles in case power is lost. These two blocks are not necessarily sequential and may be anywhere on disk, so a "pointer" to a metadata pair is stored as two block pointers.
On top of this, each metadata block behaves as an appendable log, containing a variable number of commits. Commits can be appended to the metadata log in order to update the metadata without requiring an erase cycles. Note that successive commits may supersede the metadata in previous commits. Only the most recent metadata should be considered valid.
The high-level layout of a metadata block is fairly simple:
.---------------------------------------.
.-| revision count | entries | \
| |-------------------+ | |
| | | |
| | | +-- 1st commit
| | | |
| | +-------------------| |
| | | CRC | /
| |-------------------+-------------------|
| | entries | \
| | | |
| | | +-- 2nd commit
| | +-------------------+--------------| |
| | | CRC | padding | /
| |----+-------------------+--------------|
| | entries | \
| | | |
| | | +-- 3rd commit
| | +-------------------+---------| |
| | | CRC | | /
| |---------+-------------------+ |
| | unwritten storage | more commits
| | | |
| | | v
| | |
| | |
| '---------------------------------------'
'---------------------------------------'
Each metadata block contains a 32-bit revision count followed by a number of commits. Each commit contains a variable number of metadata entries followed by a 32-bit CRC.
Note also that entries aren't necessarily word-aligned. This allows us to store metadata more compactly, however we can only write to addresses that are aligned to our program block size. This means each commit may have padding for alignment.
Metadata block fields:
-
Revision count (32-bits) - Incremented every erase cycle. If both blocks contain valid commits, only the block with the most recent revision count should be used. Sequence comparison must be used to avoid issues with integer overflow.
-
CRC (32-bits) - Detects corruption from power-loss or other write issues. Uses a CRC-32 with a polynomial of
0x04c11db7
initialized with0xffffffff
.
Entries themselves are stored as a 32-bit tag followed by a variable length blob of data. But exactly how these tags are stored is a little bit tricky.
Metadata blocks support both forward and backward iteration. In order to do
this without duplicating the space for each tag, neighboring entries have their
tags XORed together, starting with 0xffffffff
.
Forward iteration Backward iteration
.-------------------. 0xffffffff .-------------------.
| revision count | | | revision count |
|-------------------| v |-------------------|
| tag ~A |---> xor -> tag A | tag ~A |---> xor -> 0xffffffff
|-------------------| | |-------------------| ^
| data A | | | data A | |
| | | | | |
| | | | | |
|-------------------| v |-------------------| |
| tag AxB |---> xor -> tag B | tag AxB |---> xor -> tag A
|-------------------| | |-------------------| ^
| data B | | | data B | |
| | | | | |
| | | | | |
|-------------------| v |-------------------| |
| tag BxC |---> xor -> tag C | tag BxC |---> xor -> tag B
|-------------------| |-------------------| ^
| data C | | data C | |
| | | | tag C
| | | |
| | | |
'-------------------' '-------------------'
One last thing to note before we get into the details around tag encoding. Each tag contains a valid bit used to indicate if the tag and containing commit is valid. This valid bit is the first bit found in the tag and the commit and can be used to tell if we've attempted to write to the remaining space in the block.
Here's a more complete example of metadata block containing 4 entries:
.---------------------------------------.
.-| revision count | tag ~A | \
| |-------------------+-------------------| |
| | data A | |
| | | |
| |-------------------+-------------------| |
| | tag AxB | data B | <--. |
| |-------------------+ | | |
| | | | +-- 1st commit
| | +-------------------+---------| | |
| | | tag BxC | | <-.| |
| |---------+-------------------+ | || |
| | data C | || |
| | | || |
| |-------------------+-------------------| || |
| | tag CxCRC | CRC | || /
| |-------------------+-------------------| ||
| | tag CRCxA' | data A' | || \
| |-------------------+ | || |
| | | || |
| | +-------------------+----| || +-- 2nd commit
| | | tag CRCxA' | | || |
| |--------------+-------------------+----| || |
| | CRC | padding | || /
| |--------------+----+-------------------| ||
| | tag CRCxA'' | data A'' | <---. \
| |-------------------+ | ||| |
| | | ||| |
| | +-------------------+---------| ||| |
| | | tag A''xD | | < ||| |
| |---------+-------------------+ | |||| +-- 3rd commit
| | data D | |||| |
| | +---------| |||| |
| | | tag Dx| |||| |
| |---------+-------------------+---------| |||| |
| |CRC | CRC | | |||| /
| |---------+-------------------+ | ||||
| | unwritten storage | |||| more commits
| | | |||| |
| | | |||| v
| | | ||||
| | | ||||
| '---------------------------------------' ||||
'---------------------------------------' |||'- most recent A
||'-- most recent B
|'--- most recent C
'---- most recent D
Metadata tags
So in littlefs, 32-bit tags describe every type of metadata. And this means every type of metadata, including file entries, directory fields, and global state. Even the CRCs used to mark the end of commits get their own tag.
Because of this, the tag format contains some densely packed information. Note that there are multiple levels of types which break down into more info:
[---- 32 ----]
[1|-- 11 --|-- 10 --|-- 10 --]
^. ^ . ^ ^- length
|. | . '------------ id
|. '-----.------------------ type (type3)
'.-----------.------------------ valid bit
[-3-|-- 8 --]
^ ^- chunk
'------- type (type1)
Before we go further, there's one important thing to note. These tags are not stored in little-endian. Tags stored in commits are actually stored in big-endian (and is the only thing in littlefs stored in big-endian). This little bit of craziness comes from the fact that the valid bit must be the first bit in a commit, and when converted to little-endian, the valid bit finds itself in byte 4. We could restructure the tag to store the valid bit lower, but, because none of the fields are byte-aligned, this would be more complicated than just storing the tag in big-endian.
Another thing to note is that both the tags 0x00000000
and 0xffffffff
are
invalid and can be used for null values.
Metadata tag fields:
-
Valid bit (1-bit) - Indicates if the tag is valid.
-
Type3 (11-bits) - Type of the tag. This field is broken down further into a 3-bit abstract type and an 8-bit chunk field. Note that the value
0x000
is invalid and not assigned a type.-
Type1 (3-bits) - Abstract type of the tag. Groups the tags into 8 categories that facilitate bitmasked lookups.
-
Chunk (8-bits) - Chunk field used for various purposes by the different abstract types. type1+chunk+id form a unique identifier for each tag in the metadata block.
-
-
Id (10-bits) - File id associated with the tag. Each file in a metadata block gets a unique id which is used to associate tags with that file. The special value
0x3ff
is used for any tags that are not associated with a file, such as directory and global metadata. -
Length (10-bits) - Length of the data in bytes. The special value
0x3ff
indicates that this tag has been deleted.
Metadata types
What follows is an exhaustive list of metadata in littlefs.
0x401
LFS_TYPE_CREATE
Creates a new file with this id. Note that files in a metadata block don't necessarily need a create tag. All a create does is move over any files using this id. In this sense a create is similar to insertion into an imaginary array of files.
The create and delete tags allow littlefs to keep files in a directory ordered alphabetically by filename.
0x4ff
LFS_TYPE_DELETE
Deletes the file with this id. An inverse to create, this tag moves over any files neighboring this id similar to a deletion from an imaginary array of files.
0x0xx
LFS_TYPE_NAME
Associates the id with a file name and file type.
The data contains the file name stored as an ASCII string (may be expanded to UTF8 in the future).
The chunk field in this tag indicates an 8-bit file type which can be one of the following.
Currently, the name tag must precede any other tags associated with the id and can not be reassigned without deleting the file.
Layout of the name tag:
tag data
[-- 32 --][--- variable length ---]
[1| 3| 8 | 10 | 10 ][--- (size * 8) ---]
^ ^ ^ ^ ^- size ^- file name
| | | '------ id
| | '----------- file type
| '-------------- type1 (0x0)
'----------------- valid bit
Name fields:
-
file type (8-bits) - Type of the file.
-
file name - File name stored as an ASCII string.
0x001
LFS_TYPE_REG
Initializes the id + name as a regular file.
How each file is stored depends on its struct tag, which is described below.
0x002
LFS_TYPE_DIR
Initializes the id + name as a directory.
Directories in littlefs are stored on disk as a linked-list of metadata pairs, each pair containing any number of files in alphabetical order. A pointer to the directory is stored in the struct tag, which is described below.
0x0ff
LFS_TYPE_SUPERBLOCK
Initializes the id as a superblock entry.
The superblock entry is a special entry used to store format-time configuration and identify the filesystem.
The name is a bit of a misnomer. While the superblock entry serves the same purpose as a superblock found in other filesystems, in littlefs the superblock does not get a dedicated block. Instead, the superblock entry is duplicated across a linked-list of metadata pairs rooted on the blocks 0 and 1. The last metadata pair doubles as the root directory of the filesystem.
.--------. .--------. .--------. .--------. .--------.
.| super |->| super |->| super |->| super |->| file B |
|| block | || block | || block | || block | || file C |
|| | || | || | || file A | || file D |
|'--------' |'--------' |'--------' |'--------' |'--------'
'--------' '--------' '--------' '--------' '--------'
\----------------+----------------/ \----------+----------/
superblock pairs root directory
The filesystem starts with only the root directory. The superblock metadata pairs grow every time the root pair is compacted in order to prolong the life of the device exponentially.
The contents of the superblock entry are stored in a name tag with the superblock type and an inline-struct tag. The name tag contains the magic string "littlefs", while the inline-struct tag contains version and configuration information.
Layout of the superblock name tag and inline-struct tag:
tag data
[-- 32 --][-- 32 --|-- 32 --]
[1|- 11 -| 10 | 10 ][--- 64 ---]
^ ^ ^ ^- size (8) ^- magic string ("littlefs")
| | '------ id (0)
| '------------ type (0x0ff)
'----------------- valid bit
tag data
[-- 32 --][-- 32 --|-- 32 --|-- 32 --]
[1|- 11 -| 10 | 10 ][-- 32 --|-- 32 --|-- 32 --]
^ ^ ^ ^ ^- version ^- block size ^- block count
| | | | [-- 32 --|-- 32 --|-- 32 --]
| | | | [-- 32 --|-- 32 --|-- 32 --]
| | | | ^- name max ^- file max ^- attr max
| | | '- size (24)
| | '------ id (0)
| '------------ type (0x201)
'----------------- valid bit
Superblock fields:
-
Magic string (8-bytes) - Magic string indicating the presence of littlefs on the device. Must be the string "littlefs".
-
Version (32-bits) - The version of littlefs at format time. The version is encoded in a 32-bit value with the upper 16-bits containing the major version, and the lower 16-bits containing the minor version.
This specification describes version 2.0 (
0x00020000
). -
Block size (32-bits) - Size of the logical block size used by the filesystem in bytes.
-
Block count (32-bits) - Number of blocks in the filesystem.
-
Name max (32-bits) - Maximum size of file names in bytes.
-
File max (32-bits) - Maximum size of files in bytes.
-
Attr max (32-bits) - Maximum size of file attributes in bytes.
The superblock must always be the first entry (id 0) in a metadata pair as well as be the first entry written to the block. This means that the superblock entry can be read from a device using offsets alone.
0x2xx
LFS_TYPE_STRUCT
Associates the id with an on-disk data structure.
The exact layout of the data depends on the data structure type stored in the chunk field and can be one of the following.
Any type of struct supersedes all other structs associated with the id. For example, appending a ctz-struct replaces an inline-struct on the same file.
0x200
LFS_TYPE_DIRSTRUCT
Gives the id a directory data structure.
Directories in littlefs are stored on disk as a linked-list of metadata pairs, each pair containing any number of files in alphabetical order.
|
v
.--------. .--------. .--------. .--------. .--------. .--------.
.| file A |->| file D |->| file G |->| file I |->| file J |->| file M |
|| file B | || file E | || file H | || | || file K | || file N |
|| file C | || file F | || | || | || file L | || |
|'--------' |'--------' |'--------' |'--------' |'--------' |'--------'
'--------' '--------' '--------' '--------' '--------' '--------'
The dir-struct tag contains only the pointer to the first metadata-pair in the directory. The directory size is not known without traversing the directory.
The pointer to the next metadata-pair in the directory is stored in a tail tag, which is described below.
Layout of the dir-struct tag:
tag data
[-- 32 --][-- 32 --|-- 32 --]
[1|- 11 -| 10 | 10 ][--- 64 ---]
^ ^ ^ ^- size (8) ^- metadata pair
| | '------ id
| '------------ type (0x200)
'----------------- valid bit
Dir-struct fields:
- Metadata pair (8-bytes) - Pointer to the first metadata-pair in the directory.
0x201
LFS_TYPE_INLINESTRUCT
Gives the id an inline data structure.
Inline structs store small files that can fit in the metadata pair. In this case, the file data is stored directly in the tag's data area.
Layout of the inline-struct tag:
tag data
[-- 32 --][--- variable length ---]
[1|- 11 -| 10 | 10 ][--- (size * 8) ---]
^ ^ ^ ^- size ^- inline data
| | '------ id
| '------------ type (0x201)
'----------------- valid bit
Inline-struct fields:
- Inline data - File data stored directly in the metadata-pair.
0x202
LFS_TYPE_CTZSTRUCT
Gives the id a CTZ skip-list data structure.
CTZ skip-lists store files that can not fit in the metadata pair. These files are stored in a skip-list in reverse, with a pointer to the head of the skip-list. Note that the head of the skip-list and the file size is enough information to read the file.
How exactly CTZ skip-lists work is a bit complicated. A full explanation can be found in the DESIGN.md.
A quick summary: For every nth block where n is divisible by 2ˣ, that block contains a pointer to block n-2ˣ. These pointers are stored in increasing order of x in each block of the file before the actual data.
|
v
.--------. .--------. .--------. .--------. .--------. .--------.
| A |<-| D |<-| G |<-| J |<-| M |<-| P |
| B |<-| E |--| H |<-| K |--| N | | Q |
| C |<-| F |--| I |--| L |--| O | | |
'--------' '--------' '--------' '--------' '--------' '--------'
block 0 block 1 block 2 block 3 block 4 block 5
1 skip 2 skips 1 skip 3 skips 1 skip
Note that the maximum number of pointers in a block is bounded by the maximum file size divided by the block size. With 32 bits for file size, this results in a minimum block size of 104 bytes.
Layout of the CTZ-struct tag:
tag data
[-- 32 --][-- 32 --|-- 32 --]
[1|- 11 -| 10 | 10 ][-- 32 --|-- 32 --]
^ ^ ^ ^ ^ ^- file size
| | | | '-------------------- file head
| | | '- size (8)
| | '------ id
| '------------ type (0x202)
'----------------- valid bit
CTZ-struct fields:
-
File head (32-bits) - Pointer to the block that is the head of the file's CTZ skip-list.
-
File size (32-bits) - Size of the file in bytes.
0x3xx
LFS_TYPE_USERATTR
Attaches a user attribute to an id.
littlefs has a concept of "user attributes". These are small user-provided attributes that can be used to store things like timestamps, hashes, permissions, etc.
Each user attribute is uniquely identified by an 8-bit type which is stored in the chunk field, and the user attribute itself can be found in the tag's data.
There are currently no standard user attributes and a portable littlefs implementation should work with any user attributes missing.
Layout of the user-attr tag:
tag data
[-- 32 --][--- variable length ---]
[1| 3| 8 | 10 | 10 ][--- (size * 8) ---]
^ ^ ^ ^ ^- size ^- attr data
| | | '------ id
| | '----------- attr type
| '-------------- type1 (0x3)
'----------------- valid bit
User-attr fields:
-
Attr type (8-bits) - Type of the user attributes.
-
Attr data - The data associated with the user attribute.
0x6xx
LFS_TYPE_TAIL
Provides the tail pointer for the metadata pair itself.
The metadata pair's tail pointer is used in littlefs for a linked-list containing all metadata pairs. The chunk field contains the type of the tail, which indicates if the following metadata pair is a part of the directory (hard-tail) or only used to traverse the filesystem (soft-tail).
.--------.
.| dir A |-.
||softtail| |
.--------| |-'
| |'--------'
| '---|--|-'
| .-' '-------------.
| v v
| .--------. .--------. .--------.
'->| dir B |->| dir B |->| dir C |
||hardtail| ||softtail| || |
|| | || | || |
|'--------' |'--------' |'--------'
'--------' '--------' '--------'
Currently any type supersedes any other preceding tails in the metadata pair, but this may change if additional metadata pair state is added.
A note about the metadata pair linked-list: Normally, this linked-list contains every metadata pair in the filesystem. However, there are some operations that can cause this linked-list to become out of sync if a power-loss were to occur. When this happens, littlefs sets the "sync" flag in the global state. How exactly this flag is stored is described below.
When the sync flag is set:
- The linked-list may contain an orphaned directory that has been removed in the filesystem.
- The linked-list may contain a metadata pair with a bad block that has been replaced in the filesystem.
If the sync flag is set, the threaded linked-list must be checked for these errors before it can be used reliably. Note that the threaded linked-list can be ignored if littlefs is mounted read-only.
Layout of the tail tag:
tag data
[-- 32 --][-- 32 --|-- 32 --]
[1| 3| 8 | 10 | 10 ][--- 64 ---]
^ ^ ^ ^ ^- size (8) ^- metadata pair
| | | '------ id
| | '---------- tail type
| '------------- type1 (0x6)
'---------------- valid bit
Tail fields:
-
Tail type (8-bits) - Type of the tail pointer.
-
Metadata pair (8-bytes) - Pointer to the next metadata-pair.
0x600
LFS_TYPE_SOFTTAIL
Provides a tail pointer that points to the next metadata pair in the filesystem.
In this case, the next metadata pair is not a part of our current directory and should only be followed when traversing the entire filesystem.
0x601
LFS_TYPE_HARDTAIL
Provides a tail pointer that points to the next metadata pair in the directory.
In this case, the next metadata pair belongs to the current directory. Note that because directories in littlefs are sorted alphabetically, the next metadata pair should only contain filenames greater than any filename in the current pair.
0x7xx
LFS_TYPE_GSTATE
Provides delta bits for global state entries.
littlefs has a concept of "global state". This is a small set of state that can be updated by a commit to any metadata pair in the filesystem.
The way this works is that the global state is stored as a set of deltas distributed across the filesystem such that the global state can be found by the xor-sum of these deltas.
.--------. .--------. .--------. .--------. .--------.
.| |->| gdelta |->| |->| gdelta |->| gdelta |
|| | || 0x23 | || | || 0xff | || 0xce |
|| | || | || | || | || |
|'--------' |'--------' |'--------' |'--------' |'--------'
'--------' '----|---' '--------' '----|---' '----|---'
v v v
0x00 --> xor ------------------> xor ------> xor --> gstate = 0x12
Note that storing globals this way is very expensive in terms of storage usage, so any global state should be kept very small.
The size and format of each piece of global state depends on the type, which is stored in the chunk field. Currently, the only global state is move state, which is outlined below.
0x7ff
LFS_TYPE_MOVESTATE
Provides delta bits for the global move state.
The move state in littlefs is used to store info about operations that could cause to filesystem to go out of sync if the power is lost. The operations where this could occur is moves of files between metadata pairs and any operation that changes metadata pairs on the threaded linked-list.
In the case of moves, the move state contains a tag + metadata pair describing the source of the ongoing move. If this tag is non-zero, that means that power was lost during a move, and the file exists in two different locations. If this happens, the source of the move should be considered deleted, and the move should be completed (the source should be deleted) before any other write operations to the filesystem.
In the case of operations to the threaded linked-list, a single "sync" bit is used to indicate that a modification is ongoing. If this sync flag is set, the threaded linked-list will need to be checked for errors before it can be used reliably. The exact cases to check for are described above in the tail tag.
Layout of the move state:
tag data
[-- 32 --][-- 32 --|-- 32 --|-- 32 --]
[1|- 11 -| 10 | 10 ][1|- 11 -| 10 | 10 |--- 64 ---]
^ ^ ^ ^ ^ ^ ^ ^- padding (0) ^- metadata pair
| | | | | | '------ move id
| | | | | '------------ move type
| | | | '----------------- sync bit
| | | |
| | | '- size (12)
| | '------ id (0x3ff)
| '------------ type (0x7ff)
'----------------- valid bit
Move state fields:
-
Sync bit (1-bit) - Indicates if the metadata pair threaded linked-list is in-sync. If set, the threaded linked-list should be checked for errors.
-
Move type (11-bits) - Type of move being performed. Must be either
0x000
, indicating no move, or0x4ff
indicating the source file should be deleted. -
Move id (10-bits) - The file id being moved.
-
Metadata pair (8-bytes) - Pointer to the metadata-pair containing the move.
0x5xx
LFS_TYPE_CRC
Last but not least, the CRC tag marks the end of a commit and provides a checksum for any commits to the metadata block.
The first 32-bits of the data contain a CRC-32 with a polynomial of
0x04c11db7
initialized with 0xffffffff
. This CRC provides a checksum for
all metadata since the previous CRC tag, including the CRC tag itself. For
the first commit, this includes the revision count for the metadata block.
However, the size of the data is not limited to 32-bits. The data field may larger to pad the commit to the next program-aligned boundary.
In addition, the CRC tag's chunk field contains a set of flags which can change the behaviour of commits. Currently the only flag in use is the lowest bit, which determines the expected state of the valid bit for any following tags. This is used to guarantee that unwritten storage in a metadata block will be detected as invalid.
Layout of the CRC tag:
tag data
[-- 32 --][-- 32 --|--- variable length ---]
[1| 3| 8 | 10 | 10 ][-- 32 --|--- (size * 8 - 32) ---]
^ ^ ^ ^ ^ ^- crc ^- padding
| | | | '- size
| | | '------ id (0x3ff)
| | '----------- valid state
| '-------------- type1 (0x5)
'----------------- valid bit
CRC fields:
-
Valid state (1-bit) - Indicates the expected value of the valid bit for any tags in the next commit.
-
CRC (32-bits) - CRC-32 with a polynomial of
0x04c11db7
initialized with0xffffffff
. -
Padding - Padding to the next program-aligned boundary. No guarantees are made about the contents.