[ft] SMT Replenish with > 1000 lastrevealed index #1960

portlandhodl · 2025-05-18T21:59:12Z

Currently in BDK there is a pretty big problem when loading a wallet where the last revealed index is into the value of 1000+ for even decent hardware because of the need to utilize the CPU to generate SPKs for each index and add them to the graph. The process will only use a single CPU core currently. Large wallets will cause timeouts and/or become a DoS vector.

This will if greater than 1000 last revealed use all available CPU cores to generate the SPKs,

Previous, ~3 minutes for 100000 SPKs

Notice the CPU usage is only a single core at a time.

New, 28 Seconds for 100000 SPKs

All cores are being used!

This speed up is linear to the number of CPU cores.

notmandatory · 2025-05-19T00:10:11Z

Concept ACK, but need to handle cfg when std feature not enabled.

evanlinjin · 2025-05-19T01:38:17Z

Thanks for working on this! My main concern is the introduction of an internal Mutex in a library such as BDK. Mutexes should only belong in the application level and is rather problematic in low-level data structures/types. ChatGPT has some nice arguments as to why

We discussed this in the team meetings and the rough consensus for a solution is to cache spks in the changesets. We can structure these changesets in a way so that you can introduce spks for a keychain well in the future.

For example:

pub struct ChangeSet {
    /// Contains for each descriptor_id the last revealed index of derivation
    pub last_revealed: BTreeMap<DescriptorId, u32>,

    /// Cache of script pubkeys.
    ///
    /// We can choose to populate this or leave it empty. The caller can choose to populate this in
    /// any manner they intend; multi-threaded, downloaded asynchronously, etc.
    pub spk_cache: BTreeMap<(DescriptorId, u32), ScriptBuf>,
}

I'm assuming we need an associated cache either in SpkTxOutIndex or KeychainTxOutIndex. The example code above needs some more thought as to where and how we should structure everything. I.e. should we introduce a changeset to SpkTxOutIndex?

To understand your requirements better, is there a special way you are loading the wallet? Or is it loaded from a changeset (the typical way)?

notmandatory · 2025-05-19T02:00:42Z

I suspect there's a way to do this without need of the Mutex by collecting the yielded results of each thread at the end. And still the multi-threaded approach would only be done if the std feature is enabled.

I also support the caching approach, but I don't see why we can't do both with this mult-ithreaded solution going in first since it's a smaller change. Caching will still be useful for both std and no-std users.

portlandhodl · 2025-05-19T02:20:55Z

I suspect there's a way to do this without need of the Mutex by collecting the yielded results of each thread at the end. And still the multi-threaded approach would only be done if the std feature is enabled.

I also support the caching approach, but I don't see why we can't do both with this mult-ithreaded solution going in first since it's a smaller change. Caching will still be useful for both std and no-std users.

I currently have a version of this but the number of libraries that are modified and touched is high at 2 or 3, between bdk-wallet, bdk, and bdk-sqlx. Getting SMT working seems to be an overall much lighter solution that would also speed up the initial generation of the cache.

evanlinjin · 2025-05-19T04:02:14Z

@notmandatory Okay I see what you are saying. To avoid having more complex changes, what do you think about creating a new constructor method that takes in the following:

pub fn with_cache(lookahead: u32, cache: impl IntoIterator<Item = KeychainIndexed<K, ScriptBuf>>) -> Self { todo!() }

This way, we don't need to be opinionated about how we construct the cache and can leave it up to the caller. To implement this cleaner, I suggest creating a CachedSpkIterator that wraps a SpkIterator with a cache.

However, I do wonder if it really is that much more difficult to cache spks in the changeset as, it seems like that more complete solution since you don't need to re-derive the spks.

[CI] Cargo fmt applied [Req] Remove used of mutex and instead use returned results. [std/non-std] variations [fmt] Cargo fmt [Clean] dedup'd code [Clean] Cargo Fmt repair

LLFourn · 2025-05-22T04:13:41Z

@portlandhodl how many public keys are there and how long are the derivation paths for the descriptor in your benchmark. It must be quite a big one to actually hit those numbers. On my computer doing 100,000 derivations for a bip86 style derivation takes around 5 seconds!

portlandhodl · 2025-05-22T19:08:02Z

@portlandhodl how many public keys are there and how long are the derivation paths for the descriptor in your benchmark. It must be quite a big one to actually hit those numbers. On my computer doing 100,000 derivations for a bip86 style derivation takes around 5 seconds!

Currently using bip48 descriptors (depth = 6) along with miniscript with multiple spend conditions.

portlandhodl · 2025-05-26T04:08:23Z

Want me to close in favor of #1963 ?

62767f0 fix(rusqlite_impl): Fix derived spks create table statement (valued mammal) 603f133 docs(chain): Adds docs for persisting spks in `KeychainTxOutIndex` (志宇) 3126cd2 feat(chain)!: Clean up ergonomics of `IndexedTxGraph` (志宇) a055581 feat(chain): `KeychainTxOutIndex`: Make spk cache optional (志宇) 19e3b5d feat(chain): Add `KeychainTxOutIndex::from_changeset` constructor (志宇) d761265 feat(chain): `KeychainTxOutIndex`: Debug build checks (志宇) 76875e7 fix(chain)!: API and logical issues in `KeychainTxOutIndex` (志宇) d299dae feat(chain)!: Persist spks derived from `KeychainTxOutIndex` (志宇) Pull request description: Replaces #1960 Fixes #1964 ### Description Users with large wallet and/or complex descriptors may experience slow startup of `KeychainTxOutIndex`. This PR addresses this problem by providing the option to persist derived spks so that they no longer need to be re-derived on startup. The `IndexedTxGraph` API has been changed for better ergonomics. Compared to #1960, this is a more longterm solution that does not depend on multi-threading logic. ### Changelog notice ```md Changed - `KeychainTxOutIndex::new` to take in an additional parameter `persist_spks` to control whether derived spks are cached and persisted across restarts. The default of `persist_spks` is false. - `KeychainTxOutIndex` methods (`lookahead_to_target, `next_unused_spk`, `reveal_next_spk`) now return changesets as they may derive spks to be persisted. - The `InsertDescriptorError` type now wraps descriptors in `Box` to reduce enum size. Added - `spk_cache` field to `indexer::keychain_txout::ChangeSet` which persists derived spks. - `IndexedTxGraph::from_changeset` - allows constructing from `indexed_tx_graph::ChangeSet` and only indexing when ready. - `IndexedTxGraph::reindex` method. Fixed - `KeychainTxOutIndex::reveal_to_target` so that it actually returns `None` if the `keychain` does not exist. ``` ### Checklists #### All Submissions: * [x] I've signed all my commits * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md) * [x] I ran `cargo +nightly fmt` and `cargo clippy` before committing #### New Features: * [ ] I've added tests for the new feature * [x] I've added docs for the new feature ACKs for top commit: ValuedMammal: ACK 62767f0 notmandatory: ACK 62767f0 Tree-SHA512: dc1aa4308ffcc6d121e0d7a1ca4ff9f641ed5db63204747fde47ac02e45dae9b65da95554541705a98b69e59f741c043485a26db736966417061a4c9d220ba29

portlandhodl · 2025-05-27T16:58:25Z

Closed as superseded by #1963

notmandatory assigned portlandhodl May 19, 2025

notmandatory added this to BDK Chain May 19, 2025

notmandatory moved this to Needs Review in BDK Chain May 19, 2025

notmandatory added the new feature New feature or request label May 19, 2025

portlandhodl force-pushed the 05-18-2025-smt-replenish branch from 7a334cf to 3bc325b Compare May 19, 2025 02:39

[ft] Replenish > 1000 lastrevealed with SMT

22573f6

[CI] Cargo fmt applied [Req] Remove used of mutex and instead use returned results. [std/non-std] variations [fmt] Cargo fmt [Clean] dedup'd code [Clean] Cargo Fmt repair

portlandhodl force-pushed the 05-18-2025-smt-replenish branch from 568744e to 22573f6 Compare May 19, 2025 04:44

evanlinjin mentioned this pull request May 22, 2025

Persist spks derived from KeychainTxOutIndex #1963

Merged

5 tasks

portlandhodl closed this May 27, 2025

github-project-automation bot moved this from Needs Review to Done in BDK Chain May 27, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[ft] SMT Replenish with > 1000 lastrevealed index #1960

[ft] SMT Replenish with > 1000 lastrevealed index #1960

Uh oh!

portlandhodl commented May 18, 2025 •

edited

Loading

Uh oh!

notmandatory commented May 19, 2025

Uh oh!

evanlinjin commented May 19, 2025 •

edited

Loading

Uh oh!

notmandatory commented May 19, 2025

Uh oh!

portlandhodl commented May 19, 2025

Uh oh!

evanlinjin commented May 19, 2025

Uh oh!

LLFourn commented May 22, 2025 •

edited

Loading

Uh oh!

portlandhodl commented May 22, 2025

Uh oh!

portlandhodl commented May 26, 2025

Uh oh!

portlandhodl commented May 27, 2025

Uh oh!

Uh oh!

[ft] SMT Replenish with > 1000 lastrevealed index #1960

[ft] SMT Replenish with > 1000 lastrevealed index #1960

Uh oh!

Conversation

portlandhodl commented May 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

notmandatory commented May 19, 2025

Uh oh!

evanlinjin commented May 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

notmandatory commented May 19, 2025

Uh oh!

portlandhodl commented May 19, 2025

Uh oh!

evanlinjin commented May 19, 2025

Uh oh!

LLFourn commented May 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

portlandhodl commented May 22, 2025

Uh oh!

portlandhodl commented May 26, 2025

Uh oh!

portlandhodl commented May 27, 2025

Uh oh!

Uh oh!

portlandhodl commented May 18, 2025 •

edited

Loading

evanlinjin commented May 19, 2025 •

edited

Loading

LLFourn commented May 22, 2025 •

edited

Loading