lfs_alloc() blocks on full filesystem. #1063

ddomnik · 2025-01-11T18:48:27Z

This issue is derived from an issue in the Espressif port, joltwallet/esp_littlefs#213, but I think it fits better here.

Usually, on a full filesystem, littlefs returns from fwrite() within a few ms and prints:
E (2818207) esp_littlefs: ./managed_components/joltwallet__littlefs/src/littlefs/lfs.c:689:error: No more free space 0x48d

However, the library can get "stuck" for a very long time (multiple seconds or even minutes, probably depending on partition size). This happens in the second while loop in lfs_alloc().

littlefs/lfs.c, line 656 in d01280e:

while (lfs->lookahead.next < lfs->lookahead.size) {

The problem is that almost every program uses watchdogs. Espressif's default watchdog has a maximum timeout of 60 seconds, which seems to be insufficient in rare cases, and the whole device is then reset.

Maybe some experts know how to handle this issue.

Based on my current understanding of the issue, a very simple idea would be to give littlefs its own configurable watchdog. This would allow the issue to be caught at the application level rather than the system level.

geky · 2025-02-03T21:30:07Z

Hi @ddomnik, thanks for creating an issue.

One thing to note: while LittleFS does have some long-running operations, these should always involve IO operations of some sort. It should never get stuck in a CPU-only function; if it does, that's a bug and something a watchdog should catch.

One option is to reset the watchdog in the low-level bd read/prog/erase functions. This would avoid false positives with long-running filesystem work. This is also a good place for yield calls for multithreaded/coroutine systems.
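A minimal sketch of that idea, under stated assumptions: `bd_cfg`, `raw_read`, `wdt_read`, and `feed_watchdog` are illustrative stand-ins, not littlefs or ESP-IDF APIs. In a real port the wrapper would match the `read` callback in `struct lfs_config`, and `feed_watchdog` would call the platform's reset function (for example `esp_task_wdt_reset()` on ESP-IDF) or a yield.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the block device config;
 * a real port would use struct lfs_config and its context pointer. */
struct bd_cfg {
    uint8_t *storage;      /* backing memory that simulates flash */
    uint32_t block_size;   /* bytes per block */
    uint32_t block_count;  /* number of blocks */
    unsigned wdt_feeds;    /* counts watchdog feeds, for illustration only */
};

/* Placeholder for the platform watchdog reset or a scheduler yield. */
static void feed_watchdog(struct bd_cfg *c) {
    c->wdt_feeds++;
}

/* Raw read with no watchdog handling. Returns 0 on success, -1 on bad range. */
static int raw_read(struct bd_cfg *c, uint32_t block, uint32_t off,
                    void *buffer, uint32_t size) {
    if (block >= c->block_count || off > c->block_size ||
        size > c->block_size - off) {
        return -1;
    }
    memcpy(buffer, c->storage + (size_t)block * c->block_size + off, size);
    return 0;
}

/* Wrapped read: every IO operation feeds the watchdog first, so a long
 * scan made of many reads does not trip it, while a CPU-only hang
 * (which performs no IO) still does. */
static int wdt_read(struct bd_cfg *c, uint32_t block, uint32_t off,
                    void *buffer, uint32_t size) {
    feed_watchdog(c);
    return raw_read(c, block, off, buffer, size);
}
```

The prog and erase callbacks could be wrapped the same way.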

Having a second, much longer, watchdog at the application-level is still possible, as you note. You may be able to emulate it in software (via a counter that increments when you reset the low-level watchdog?) to avoid using an additional hardware resource.
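One way the software emulation could look, as a rough sketch: `soft_wdt`, `soft_wdt_start`, and `soft_wdt_tick` are hypothetical names, and the budget is counted in low-level feeds rather than in time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application-level watchdog layered on the low-level feed. */
struct soft_wdt {
    uint32_t feeds;  /* low-level feeds since the current operation started */
    uint32_t limit;  /* maximum feeds allowed per filesystem operation */
};

/* Call when a filesystem operation starts; resets the budget. */
static void soft_wdt_start(struct soft_wdt *w, uint32_t limit) {
    w->feeds = 0;
    w->limit = limit;
}

/* Call from the bd read/prog/erase path before feeding the hardware
 * watchdog. Returns true while the budget lasts; once it returns false
 * the caller stops feeding and lets the hardware watchdog expire,
 * or reports the overrun to the application instead. */
static bool soft_wdt_tick(struct soft_wdt *w) {
    if (w->feeds >= w->limit) {
        return false;
    }
    w->feeds++;
    return true;
}
```

The application would call `soft_wdt_start` before each filesystem call and pick a limit large enough for a legitimate full-filesystem scan on its partition size.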

> However, it could happen that the library gets "stuck" for a very long time (multiple seconds or even minutes - prob. depending on partition size). This happens exactly in this while loop. (The second while in lfs_alloc())

This is most likely the allocator trying to scan the filesystem one last time to find any free blocks. This can be reduced at a RAM cost by increasing cfg.lookahead_size, but the scan scales at $O\left(n^2/L\right)$, so it can still end up very expensive for large partitions.
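To get a feel for that scaling, here is a back-of-the-envelope model. It rests on two assumptions: each byte of `lookahead_size` tracks 8 blocks, and in the worst case every lookahead window costs one traversal that visits all $n$ blocks. The function names are made up for this sketch, and real costs depend on how much metadata the traversal actually reads.

```c
#include <stdint.h>

/* Number of lookahead windows needed to cover the whole disk,
 * assuming each lookahead byte tracks 8 blocks. */
static uint64_t lookahead_windows(uint64_t block_count, uint64_t lookahead_size) {
    uint64_t window = 8 * lookahead_size;  /* blocks covered per window */
    if (window == 0) {
        return 0;  /* invalid configuration; nothing to estimate */
    }
    return (block_count + window - 1) / window;  /* ceiling division */
}

/* Rough worst-case block visits for one failed allocation on a full
 * filesystem: one full traversal (n blocks) per window, i.e. about n^2/(8L). */
static uint64_t worst_case_block_visits(uint64_t block_count, uint64_t lookahead_size) {
    return lookahead_windows(block_count, lookahead_size) * block_count;
}
```

Under this model, 4096 blocks with a 16-byte lookahead gives 32 windows and about 131072 block visits, while a 128-byte lookahead gives 4 windows and about 16384.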

There is some ongoing work to introduce an optional block-map that will hopefully help with this.

BrianPugh · 2025-02-12T14:17:47Z

> One option is to reset the watchdog in the low-level bd read/prog/erase functions.

That's an idea. We could add an additional layer of callbacks in esp_littlefs by adding four function handles to esp_vfs_littlefs_conf_t, corresponding to read/write/erase/sync. We could make them conditional on macros so that the structure is only larger for users who want to use the callbacks.

Additionally, we could just make a configuration that enables/disables automatic calling of esp_task_wdt_reset in the read/write/erase/sync functions.
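A rough sketch of what macro-conditional hooks might look like. Everything here is hypothetical: `EXAMPLE_LFS_IO_HOOKS`, `example_conf`, and the hook fields are invented for illustration and are not existing esp_littlefs options or members of `esp_vfs_littlefs_conf_t`.

```c
#include <stddef.h>

/* Hypothetical build-time switch; not an existing esp_littlefs option. */
#ifndef EXAMPLE_LFS_IO_HOOKS
#define EXAMPLE_LFS_IO_HOOKS 1
#endif

typedef void (*example_io_hook_t)(void *user);

/* Hypothetical stand-in for the mount configuration struct. */
struct example_conf {
    const char *label;
#if EXAMPLE_LFS_IO_HOOKS
    /* Only present when hooks are enabled, so the struct does not grow
     * for users who do not need them. */
    example_io_hook_t on_read;
    example_io_hook_t on_write;
    example_io_hook_t on_erase;
    example_io_hook_t on_sync;
    void *hook_user;
#endif
};

/* Called by the port at the start of its read path;
 * compiles to nothing when hooks are disabled. */
static void example_before_read(const struct example_conf *conf) {
#if EXAMPLE_LFS_IO_HOOKS
    if (conf->on_read != NULL) {
        conf->on_read(conf->hook_user);
    }
#else
    (void)conf;
#endif
}

/* Example hook that counts calls; a real one might call
 * esp_task_wdt_reset() or yield to the scheduler instead. */
static void example_count_hook(void *user) {
    (*(unsigned *)user)++;
}
```

The simpler configuration-flag variant would drop the function pointers and call the watchdog reset directly inside the `#if` branch.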

If either of these is something we want to pursue, please follow up in joltwallet/esp_littlefs#213

geky · 2025-02-13T21:43:50Z

It probably won't help you now, but eventually I'd like to redesign the bd API to take ctx directly. It's an unfortunate missed opportunity that the current API makes composability difficult...

But the bd redesign is currently mid-priority in a sea of high-priority todos...

geky added the question label Feb 3, 2025
