Internal Assert Triggered When Removing File After Filesystem is Full #1080

Open
pasvensson opened this issue Mar 7, 2025 · 4 comments
Comments

@pasvensson

Hi,
I'm encountering an issue with LittleFS where an internal assert is triggered when attempting to remove a file after the filesystem has been reported as full. This problem occurs consistently and prevents further file operations. Perhaps I'm missing a step. Is there any way to prevent this from happening?

Steps to Reproduce:

  1. Fill the filesystem until it reports being full.
  2. Attempt to remove a file from the filesystem.
  3. Observe the internal assert being triggered.

Expected Behavior:

The file should be removed without triggering an internal assert.

Actual Behavior:

An internal assert is triggered, preventing the file from being removed.

Environment:

Running a test in the littlefs test environment

Log

Here are the last few lines of output from the test I'm running:

tests/test_exhaustion_repeat.toml:99:warn: Wrote  79 bytes to file 0000000000000612
lfs.c:704:error: No more free space 0x2
lfs.c:2191:warn: Unable to split {0x2, 0x3}
tests/test_exhaustion_repeat.toml:102:warn: file_close: NOSPC
tests/test_exhaustion_repeat.toml:104:warn: Filesystem full, during file_close, removing some files
tests/test_exhaustion_repeat.toml:34:warn: Try to remove 0000000000000579
lfs.c:262:assert: assert failed with 3, expected eq 4294967295
Aborted

References:

The littlefs fork & branch: https://github.com/pasvensson/littlefs/tree/exhaustion_repeat
The test case: tests/test_exhaustion_repeat.toml

@geky
Member

geky commented Mar 13, 2025

Hi @pasvensson, thanks for creating an issue and test case.

This is unfortunately expected. If LittleFS fills up completely, it can become "stuck", where it is impossible for the filesystem to allocate a block to make forward progress.

This is a fundamental problem for copy-on-write filesystems.

Eventually it would be nice to solve this by reserving some number of blocks for "emergency" operations, but this is tricky because 1. LittleFS is used on devices with a large variety of block_counts, and 2. knowing exactly how many blocks are free is surprisingly tricky internally.

In the meantime, it's up to users to avoid completely filling the filesystem, either via application-specific knowledge or by checking lfs_fs_size during file creation.
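
A minimal sketch of the lfs_fs_size check mentioned above, assuming the application picks its own reserve. The helper name fs_has_headroom and the reserve_blocks parameter are hypothetical and not part of littlefs; only lfs_fs_size (which returns the number of allocated blocks, or a negative error code) and the block_count field of struct lfs_config are littlefs names. The arithmetic is kept free of littlefs headers so it compiles on its own:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of littlefs.
 *
 * used_blocks    : intended to be the result of lfs_fs_size(&lfs)
 * block_count    : intended to be cfg.block_count
 * reserve_blocks : application-chosen number of blocks to keep free
 *
 * Returns true when at least reserve_blocks blocks are still unallocated.
 *
 * Intended use, assuming a mounted lfs_t `lfs` and its config `cfg`:
 *
 *     if (!fs_has_headroom(lfs_fs_size(&lfs), cfg.block_count, 4)) {
 *         // remove old files or refuse the write before the
 *         // filesystem fills up completely
 *     }
 */
bool fs_has_headroom(int32_t used_blocks, uint32_t block_count,
                     uint32_t reserve_blocks) {
    if (used_blocks < 0) {
        return false; /* size query failed; be conservative */
    }
    if ((uint32_t)used_blocks > block_count) {
        return false; /* inconsistent input; be conservative */
    }
    return block_count - (uint32_t)used_blocks >= reserve_blocks;
}
```

How large the reserve has to be is not settled in this thread, so the value 4 in the usage comment is only a placeholder. As noted above, the free-block count is hard to know exactly, so this check is a heuristic rather than a guarantee.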

@bmcdonnell-fb

> This is unfortunately expected. If LittleFS fills up completely, it can become "stuck", where it is impossible for the filesystem to allocate a block to make forward progress.

Fortunately for me (not OP), this is not likely in my current use case. But this seems unfortunate in general.

> In the meantime, ...

Is an assert the right way to fail on this? I think assert is more like "this shouldn't happen unless there's something unexpectedly wrong with this (library) code". I think returning an error code would be more appropriate here, since you're currently placing the responsibility on the application to handle it.

> ... it's up to users to avoid completely filling the filesystem. Either via application specific knowledge, or by checking lfs_fs_size during file creation.

This may be worth documenting?

@geky
Member

geky commented Mar 18, 2025

> Is an assert the right way to fail on this? I think assert is more like "this shouldn't happen unless there's something unexpectedly wrong with this (library) code". I think returning an error code would be more appropriate here, since you're currently placing the responsibility on the application to handle it.

So, it's technically not an assert that's being hit first. In the above test case, lfs_file_close returns LFS_ERR_NOSPC, and it's only after attempting additional operations that an assert is triggered. If the ENOSPC condition had occurred during lfs_remove, it would have returned LFS_ERR_NOSPC, which I believe is correct.

This is a bit of a separate problem, but LFS_ERR_NOSPC is one of the "anywhere" errors that littlefs currently can't fully recover from (the others being LFS_ERR_CORRUPT, LFS_ERR_IO, etc). The result is broken in-RAM state that ends up asserting.
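
One way an application might act on this today is to stop issuing further calls once one of these errors has been seen. The sketch below is an assumption about application-side policy, not something littlefs provides: struct fs_guard, fs_guard_check and fs_guard_usable are hypothetical names, and the SKETCH_ERR_* constants only mirror the values of LFS_ERR_IO, LFS_ERR_NOSPC and LFS_ERR_CORRUPT so that the block compiles without lfs.h.

```c
#include <stdbool.h>

/* Values mirror LFS_ERR_IO, LFS_ERR_NOSPC and LFS_ERR_CORRUPT in lfs.h.
 * They are redefined here only so the sketch is self-contained. */
enum {
    SKETCH_ERR_IO      = -5,
    SKETCH_ERR_NOSPC   = -28,
    SKETCH_ERR_CORRUPT = -84,
};

/* Hypothetical application-side guard, not a littlefs type. */
struct fs_guard {
    bool poisoned; /* set once an "anywhere" error has been observed */
};

/* Pass every littlefs return code through this function.
 * The code is returned unchanged; the guard only records whether an
 * error that may leave broken in-RAM state has occurred. */
int fs_guard_check(struct fs_guard *g, int err) {
    if (err == SKETCH_ERR_IO ||
        err == SKETCH_ERR_NOSPC ||
        err == SKETCH_ERR_CORRUPT) {
        g->poisoned = true;
    }
    return err;
}

/* Ask before each new operation whether the instance should be used. */
bool fs_guard_usable(const struct fs_guard *g) {
    return !g->poisoned;
}
```

What to do once the guard is poisoned is left to the application. Whether unmounting and remounting is enough to get back to a usable state is not confirmed anywhere in this thread.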

I've been looking into rewriting littlefs to handle these "anywhere" errors robustly, but it will probably add a ~1.5-2x stack cost, since you need to keep track of both the before/after states until the metadata commits succeed (or fail). Still, I think ultimately this RAM cost is necessary for correct/safe behavior by default. I'm currently rolling it up into the other big obnoxious snowball of changes (rbyd), since it requires more-or-less a full rewrite.
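
To illustrate the general before/after idea in isolation, here is a toy sketch. It is not littlefs code and not the planned rewrite: struct toy_state, toy_create and toy_create_atomic are made-up names, and the "filesystem" is just two counters. It shows how an operation that fails halfway leaves state inconsistent, and how running it on a copy avoids that at the price of holding two copies of the state at once.

```c
/* Toy stand-in for in-RAM filesystem state; not a littlefs structure. */
struct toy_state {
    unsigned next_free;  /* next unallocated block */
    unsigned file_count; /* number of files recorded */
};

/* Toy operation that updates the state in two steps and can fail
 * between them, leaving *s half-updated on failure. */
int toy_create(struct toy_state *s, unsigned capacity) {
    s->file_count += 1;          /* first half of the update */
    if (s->next_free >= capacity) {
        return -28;              /* same value as LFS_ERR_NOSPC */
    }
    s->next_free += 1;           /* second half of the update */
    return 0;
}

/* Same operation, but applied to a copy and published only on success.
 * The local copy is the extra RAM cost: both the "before" state (*s)
 * and the "after" state (after) exist until the outcome is known. */
int toy_create_atomic(struct toy_state *s, unsigned capacity) {
    struct toy_state after = *s;
    int err = toy_create(&after, capacity);
    if (err == 0) {
        *s = after;              /* commit */
    }
    return err;                  /* on failure *s is untouched */
}
```

With a capacity of 1, the second call to toy_create_atomic fails and leaves both counters unchanged, while calling toy_create directly would leave file_count incremented without a matching block.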

Maybe in the future we can also add a sort of "LFS_GLASS" mode for users that want the reduced code/RAM cost with the tradeoff of unrecoverable errors, but it's low priority.

> Fortunately for me (not OP), this is not likely in my current use case. But this seems unfortunate in general.

It's interesting how rarely this has actually been a problem for most embedded use cases. I guess most embedded applications simply try to avoid the ENOSPC condition, since it usually indicates a throughput issue.

But I agree it's unfortunate. It's extra frustrating in that I don't think a rigorous solution is possible; instead, this will probably always rely on a heuristic of some sort (such as a preconfigured number of reserved blocks).

> This may be worth documenting?

Agree, but one question is where? Maybe in lfs_remove's documentation? I suspect this sort of thing will probably go mostly unnoticed if put in SPEC.md/DESIGN.md...

@bmcdonnell-fb

> ... it's up to users to avoid completely filling the filesystem. Either via application specific knowledge, or by checking lfs_fs_size during file creation.

> This may be worth documenting?

> Agree, but one question is where? Maybe in lfs_remove's documentation? I suspect this sort of thing will probably go mostly unnoticed if put in SPEC.md/DESIGN.md...

If the way (a way) to avoid it is "by checking lfs_fs_size during file creation", maybe it makes sense to describe what the user can/must do in block comments with lfs_file_open/lfs_file_opencfg? Possibly citing e.g. DESIGN.md for some fuller discussion, if you add that.
