
Async background persistence #3905


Open
joostjager wants to merge 3 commits into main from the async-persister branch

Conversation

@joostjager (Contributor) commented Jul 2, 2025:

Stripped-down version of #3778. It allows background persistence to be async, while channel monitor persistence remains sync. This means that, for the time being, users who want async background persistence must implement both the sync and the async KVStore traits. This model is available through process_events_full_async.

process_events_async still takes a synchronous kv store to remain backwards compatible.

Usage in ldk-node: lightningdevkit/ldk-node@main...joostjager:ldk-node:upgrade-to-async-kvstore
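For context, the async store discussed in this PR hands back boxed futures from its read/write methods. The rough shape sketched below is an assumption based on the snippets quoted later in this thread; the trait and parameter names are illustrative, not the PR's exact definitions:

    use std::future::Future;
    use std::io;
    use std::pin::Pin;

    /// Illustrative async key-value store shape (assumed, not the PR's exact trait).
    pub trait KVStoreAsync {
        /// Reads the value stored under the given namespaces and key.
        fn read(
            &self, primary_namespace: &str, secondary_namespace: &str, key: &str,
        ) -> Pin<Box<dyn Future<Output = Result<Vec<u8>, io::Error>> + 'static + Send>>;

        /// Persists the given data under the given key. The order of write calls
        /// must be retained (see the discussion further down in this thread).
        fn write(
            &self, primary_namespace: &str, secondary_namespace: &str, key: &str, buf: &[u8],
        ) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'static + Send>>;
    }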

@ldk-reviews-bot commented Jul 2, 2025:

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@joostjager joostjager force-pushed the async-persister branch 7 times, most recently from 3fb7d6b to 1847e8d on July 3, 2025 09:57
@joostjager joostjager self-assigned this Jul 3, 2025
@joostjager joostjager force-pushed the async-persister branch 2 times, most recently from 1f59bbe to 723a5a6 on July 3, 2025 11:52
@joostjager joostjager mentioned this pull request May 12, 2025
@TheBlueMatt TheBlueMatt linked an issue Jul 7, 2025 that may be closed by this pull request
@joostjager joostjager force-pushed the async-persister branch 10 times, most recently from bc9c29a to 90ab1ba on July 9, 2025 09:52
@joostjager joostjager marked this pull request as ready for review July 9, 2025 09:52
@joostjager joostjager requested a review from tnull July 9, 2025 09:52
Comment on lines 631 to 693
    fn persist_state<'a>(
        &self, sweeper_state: &SweeperState,
    ) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'a + Send>> {
        let encoded = &sweeper_state.encode();

        self.kv_store.write(
            OUTPUT_SWEEPER_PERSISTENCE_PRIMARY_NAMESPACE,
            OUTPUT_SWEEPER_PERSISTENCE_SECONDARY_NAMESPACE,
            OUTPUT_SWEEPER_PERSISTENCE_KEY,
            encoded,
        )
The encoded variable is captured by reference in the returned future, but it's a local variable that will be dropped when the function returns. This creates a potential use-after-free issue. Consider moving ownership of encoded into the future instead:

fn persist_state<'a>(
    &self, sweeper_state: &SweeperState,
) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'a + Send>> {
    let encoded = sweeper_state.encode();

    self.kv_store.write(
        OUTPUT_SWEEPER_PERSISTENCE_PRIMARY_NAMESPACE,
        OUTPUT_SWEEPER_PERSISTENCE_SECONDARY_NAMESPACE,
        OUTPUT_SWEEPER_PERSISTENCE_KEY,
        &encoded,
    )
}

This ensures the data remains valid for the lifetime of the future.

Suggested change:

    -fn persist_state<'a>(
    -    &self, sweeper_state: &SweeperState,
    -) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'a + Send>> {
    -    let encoded = &sweeper_state.encode();
    -    self.kv_store.write(
    -        OUTPUT_SWEEPER_PERSISTENCE_PRIMARY_NAMESPACE,
    -        OUTPUT_SWEEPER_PERSISTENCE_SECONDARY_NAMESPACE,
    -        OUTPUT_SWEEPER_PERSISTENCE_KEY,
    -        encoded,
    -    )
    +fn persist_state<'a>(
    +    &self, sweeper_state: &SweeperState,
    +) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'a + Send>> {
    +    let encoded = sweeper_state.encode();
    +    self.kv_store.write(
    +        OUTPUT_SWEEPER_PERSISTENCE_PRIMARY_NAMESPACE,
    +        OUTPUT_SWEEPER_PERSISTENCE_SECONDARY_NAMESPACE,
    +        OUTPUT_SWEEPER_PERSISTENCE_KEY,
    +        &encoded,
    +    )
Spotted by Diamond


@joostjager (Contributor, Author):
Is this real?

Contributor:
I don't think so, as the compiler would likely optimize that away, given that encoded will be an owned value (a Vec returned by encode()). Still, the change it suggests looks cleaner.

In general it will be super confusing that we encode at the time of creating the future, but only actually persist once we've dropped the lock. From now on we'll need to be super cautious about the side effects of interleaving persist calls.

@joostjager (Contributor, Author):
The idea is that an async kv store encodes the data and stores the write action in a queue at the moment the future is created. Things should still happen in the original order.

Can you show a specific scenario where we have to be super cautious even if we have that queue?
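To illustrate the queueing idea: a store could snapshot the encoded data and enqueue the write synchronously when the future is created, and have every returned future flush the queue in FIFO order when awaited. A minimal sketch, assuming a simplified single-namespace backend; QueueingStore and SyncStore are hypothetical names, not LDK types:

    use std::collections::VecDeque;
    use std::future::Future;
    use std::io;
    use std::pin::Pin;
    use std::sync::{Arc, Mutex};

    /// Hypothetical synchronous backend, used only for this sketch.
    pub trait SyncStore: Send + Sync + 'static {
        fn write(&self, key: &str, buf: &[u8]) -> Result<(), io::Error>;
    }

    /// Snapshots the data and enqueues the write at future-creation time, so the
    /// on-disk order matches the order of the write calls even if the futures are
    /// awaited out of order.
    pub struct QueueingStore<S: SyncStore> {
        inner: Arc<S>,
        queue: Arc<Mutex<VecDeque<(String, Vec<u8>)>>>,
    }

    impl<S: SyncStore> QueueingStore<S> {
        pub fn write(
            &self, key: &str, buf: &[u8],
        ) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'static + Send>> {
            // Enqueue the already-encoded data now, before returning the future.
            self.queue.lock().unwrap().push_back((key.to_string(), buf.to_vec()));

            let inner = Arc::clone(&self.inner);
            let queue = Arc::clone(&self.queue);
            Box::pin(async move {
                // Whichever write future is awaited first flushes the queue in FIFO
                // order, so earlier writes always hit the backend before later ones.
                loop {
                    let next = queue.lock().unwrap().pop_front();
                    match next {
                        Some((key, buf)) => inner.write(&key, &buf)?,
                        None => return Ok(()),
                    }
                }
            })
        }
    }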

@joostjager (Contributor, Author) commented Jul 14, 2025:
Moved &

Contributor:
> The idea is that an async kv store encodes the data and stores the write action in a queue at the moment the future is created. Things should still happen in the original order.

If that is the idea we start assuming in this PR, we should probably also document these assumptions on KVStore in this PR already.

@joostjager (Contributor, Author):
Added this requirement to the async KVStore trait doc

@joostjager (Contributor, Author) commented:
@tnull changes made (diff) and replied to open threads.

@joostjager joostjager requested a review from tnull July 14, 2025 09:03
@tnull (Contributor) left a comment:
As mentioned above, changes basically look good to me, although I'd prefer to avoid process_events_full_async by using the builder introduced in #3688.

But, generally this should be good for a second reviewer, so pinging @TheBlueMatt.

@tnull tnull requested a review from TheBlueMatt July 15, 2025 09:16
/// Trait that handles persisting a [`ChannelManager`], [`NetworkGraph`], and [`WriteableScore`] to disk.
///
/// [`ChannelManager`]: crate::ln::channelmanager::ChannelManager
pub trait Persister<'a, CM: Deref, L: Deref, S: Deref>
Collaborator:
Given Persister is only used in lightning-background-processor and we've been migrating to just using KVStore everywhere (e.g., it's now required to use Sweeper), maybe we just kill off Persister entirely? We had Persister before we had KVStore, as a way to persist the objects that the BP wanted to persist. To simplify the interface, we added the KVStore as a pseudo-wrapper around Persister. But since Sweeper now requires a KVStore explicitly, users can no longer only implement Persister, making it basically useless.

The only reason to keep it would be to avoid building the encoded Vec<u8> of the network graph and scorer for users who are avoiding persisting those objects, but I'm not entirely sure avoiding the ~50MiB memory spike during write is worth it.

@joostjager (Contributor, Author):
Added a first commit that removes Persister. Rebased the rest. Lots of simplification.

@@ -922,6 +930,173 @@ where
}
}

/// A wrapper around [`OutputSweeper`] to be used with a sync kv store.
pub struct OutputSweeperSyncKVStore<
Collaborator:
I'm also not clear on why this needs a separate wrapper. We can have a second constructor on OutputSweeper that does the KVStore sync wrapping before returning a fully-async OutputSweeper. The only difference between this and that is that track_spendable_outputs would go from async to sync, but in this case it's a user who already has support for calling most things async, so I'm not sure why we really care.
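For illustration, the second-constructor approach could lean on a small adapter that performs the synchronous write eagerly and hands back an already-resolved future. A sketch under that assumption; KVStoreSync here is a simplified stand-in and SyncWrapper is a hypothetical name, not the wrapper in this PR:

    use std::future::Future;
    use std::io;
    use std::pin::Pin;
    use std::sync::Arc;

    /// Simplified stand-in for a synchronous KVStore, used only for this sketch.
    pub trait KVStoreSync: Send + Sync + 'static {
        fn write(
            &self, primary_namespace: &str, secondary_namespace: &str, key: &str, buf: &[u8],
        ) -> Result<(), io::Error>;
    }

    /// Exposes a sync store through the async interface by doing the blocking
    /// write up front and returning a future that is already complete.
    pub struct SyncWrapper<S: KVStoreSync>(pub Arc<S>);

    impl<S: KVStoreSync> SyncWrapper<S> {
        pub fn write(
            &self, primary_namespace: &str, secondary_namespace: &str, key: &str, buf: &[u8],
        ) -> Pin<Box<dyn Future<Output = Result<(), io::Error>> + 'static + Send>> {
            // The blocking write happens here, before the future is returned.
            let res = self.0.write(primary_namespace, secondary_namespace, key, buf);
            Box::pin(async move { res })
        }
    }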

@joostjager joostjager force-pushed the async-persister branch 5 times, most recently from 5988195 to cc5703e on July 18, 2025 06:06
@joostjager joostjager changed the title from "Async Persister trait and async OutputSweeper persistence" to "Async background persistence" on Jul 18, 2025
In preparation for the addition of an async KVStore, we here remove the
Persister pseudo-wrapper. The wrapper is thin, would need to be
duplicated for async, and KVStore isn't fully abstracted away anymore
anyway because the sweeper takes it directly.
@joostjager joostjager force-pushed the async-persister branch 3 times, most recently from 98cdc61 to fda8a58 on July 18, 2025 09:20
@TheBlueMatt (Collaborator) left a comment:
Almost LGTM, just one real comment and a doc nit.

}

output_info.status.broadcast(cur_hash, cur_height, spending_tx.clone());
self.broadcaster.broadcast_transactions(&[&spending_tx]);
Collaborator:
Hmm, it used to be the case that we'd first persist, wait for that to finish, then broadcast. I don't think it's critical, but it does seem like we should retain that behavior.

@joostjager (Contributor, Author):
Changed to first await the persist future, and then broadcast.

@ldk-reviews-bot:
🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@TheBlueMatt (Collaborator) left a comment:
One question about the requirements we want, but figuring out the answer doesn't have to block landing this PR as-is.

) -> Result<Vec<u8>, io::Error>;
/// Persists the given data under the given `key`.
) -> Pin<Box<dyn Future<Output = Result<Vec<u8>, io::Error>> + 'static + Send>>;
/// Persists the given data under the given `key`. Note that the order of multiple write calls needs to be retained
Collaborator:
Oh actually, do we want this to be the restriction, or do we want "the order of multiple writes to the same key needs to be retained"? I imagine the second; we don't currently have a need inside LDK to require a strict total order, and it could definitely substantially slow down async persist. cc @tnull

@joostjager (Contributor, Author):
One related thing I've been thinking about is whether it is okay to skip a stale write. If two consecutive same-key writes are executed out of order, is it fine to simply drop the first write? Or could it be that we do need to read that first written data at some point?

Collaborator:
I don't see how it could not be okay - writes overwrite, so if there are two writes to the same key we're required to eventually end up with the second one on disk. The only question, I guess, is whether we're allowed to complete the second future first, then the first future later, and still end up with the second future's write. I think that's something we should accept (and document?), but that's the only caller-observable question, I think.
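To make the "drop the stale write" option concrete, a backend could stamp each write with a per-key version at future-creation time and skip any completion older than the newest version already applied. A minimal sketch, assuming the later value must always win; all names are hypothetical:

    use std::collections::HashMap;
    use std::sync::Mutex;

    /// Tracks, per key, the last version handed out and the last version applied.
    /// A write whose version is older than the last applied one is stale and can
    /// be dropped, since a newer value is already on its way to disk.
    pub struct WriteVersions {
        versions: Mutex<HashMap<String, (u64, u64)>>, // (last_issued, last_applied)
    }

    impl WriteVersions {
        pub fn new() -> Self {
            Self { versions: Mutex::new(HashMap::new()) }
        }

        /// Called synchronously when the write future is created, so versions
        /// follow the order of the write calls themselves.
        pub fn next_version(&self, key: &str) -> u64 {
            let mut map = self.versions.lock().unwrap();
            let entry = map.entry(key.to_string()).or_insert((0, 0));
            entry.0 += 1;
            entry.0
        }

        /// Called when the backend is about to perform the (possibly reordered)
        /// write. Returns false if a newer write for this key was already applied,
        /// in which case this stale write is skipped.
        pub fn should_apply(&self, key: &str, version: u64) -> bool {
            let mut map = self.versions.lock().unwrap();
            let entry = map.entry(key.to_string()).or_insert((0, 0));
            if version > entry.1 {
                entry.1 = version;
                true
            } else {
                false
            }
        }
    }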

@joostjager (Contributor, Author):
I was thinking of a write -> read -> write pattern, but I believe we already established that that isn't happening in LDK. We weren't going to do ordering for reads anyway.

@tnull (Contributor) commented Jul 21, 2025:
> I was thinking of a write -> read -> write pattern,

Hmm, that's indeed a good question, i.e., whether we'd also need to deal with interleaving reads; otherwise we may end up reading data that was actually written later?

> but I believe we already established that that isn't happening in LDK.

I'm not sure where we established that, but for LDK that definitely won't be the case for much longer, as we'll want to migrate to stores that are not completely held in memory, and we'll read data on demand on cache misses.

@TheBlueMatt (Collaborator) commented Jul 21, 2025:
> I was thinking of a write -> read -> write pattern, but I believe we already established that that isn't happening in LDK. We weren't going to do ordering for reads anyway.
>
> Hmm, that's indeed a good question, i.e., whether we'd also need to deal with interleaving reads; otherwise we may end up reading data that was actually written later?

I don't see an issue here - after the storer calls write, the data may be in place (i.e., returned by a call to read), and after write's future completes it will be in place. That is implicit in the API, and is in fact required by any similar-looking API - you cannot know what is happening after you start the write call, so relying on anything other than the above would obviously be race-y. The same holds for multiple calls to write to the same key.

Successfully merging this pull request may close these issues.

Async KV Store Persister
4 participants