memoize worktree→family-key resolution off the trace-ingestion path #1621
Open
svarlamov wants to merge 1 commit into
Deriving a family key from a worktree runs common_dir_for_worktree (stat walk up the tree for .git, plus a .git-file read for linked worktrees) followed by canonicalize (stat/readlink per path component). This ran uncached on the trace-ingestion path: maybe_append_pending_root_from_trace_payload fires per mutating command, and finalize_root re-derives it again, so the same handful of worktrees re-paid these syscalls on every frame that reached them.

Add canonical_family_key_for_worktree in repo_state: a process-global memoized map (worktree path → canonical family-key string) behind a Mutex, and route all four call sites (the two daemon coordinator sites and the two trace normalizer sites) through it.

The mapping is a stable function of the on-disk repo layout for a live worktree, so caching is safe. Only successful resolutions are cached: a path that is not yet a repository (before clone/init completes) returns None and is re-resolved next call, so a repo that appears later is still picked up. A coarse size cap (clear-and-rebuild at 4096 entries) bounds growth in a very long-lived daemon; real sessions touch far fewer.

Behavior is unchanged: the value equals the previous uncached derivation. New unit tests cover equality-with-uncached, stability across calls, and the no-cache-on-miss invariant. Existing daemon_mode (55), cross_repo (24), worktree (1035), and clone (12) suites pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Removes repeated filesystem syscalls from the daemon's trace-ingestion path by memoizing the worktree → family-key derivation. Independent of (and not stacked on) the pull-rebase PRs; branches off main.
Background
Deriving a family key from a worktree does real I/O:
- common_dir_for_worktree: walks parent dirs, stat-ing for .git, plus a read_to_string of the .git file for linked worktrees/submodules.
- canonicalize(): stat/readlink per path component.
This ran uncached on the ingestion path:
- maybe_append_pending_root_from_trace_payload fires per mutating (family-sequencer-participating) command, doing the full derivation before its dedup guard.
- finalize_root re-derives it again when the family key isn't already set.
A daemon session touches only a handful of worktrees, but each recurring one re-paid these syscalls on every qualifying frame.
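To make the per-call cost concrete, here is a minimal sketch of what such an uncached derivation can look like. It is not the code from this PR: `sketch_common_dir` and `sketch_family_key` are hypothetical names, and the handling of the `.git` file and `commondir` follows Git's documented on-disk layout rather than the actual common_dir_for_worktree implementation.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for common_dir_for_worktree: walk up from `start`
/// until a `.git` entry is found, then resolve the shared (common) git dir.
fn sketch_common_dir(start: &Path) -> Option<PathBuf> {
    for dir in start.ancestors() {
        let dot_git = dir.join(".git");
        // One stat per ancestor directory until `.git` is found.
        let Ok(meta) = fs::metadata(&dot_git) else { continue };
        if meta.is_dir() {
            // Main worktree: `.git` is the common dir itself.
            return Some(dot_git);
        }
        // Linked worktree or submodule: `.git` is a file holding "gitdir: <path>".
        let text = fs::read_to_string(&dot_git).ok()?;
        let target = text.lines().next()?.strip_prefix("gitdir:")?.trim();
        // `join` keeps an absolute `target` unchanged and resolves a relative one.
        let git_dir = dir.join(target);
        // A linked worktree's git dir has a `commondir` file pointing at the
        // shared dir; a submodule's git dir does not, so it is used directly.
        return match fs::read_to_string(git_dir.join("commondir")) {
            Ok(rel) => Some(git_dir.join(rel.trim())),
            Err(_) => Some(git_dir),
        };
    }
    None
}

/// Hypothetical uncached derivation: common dir, then canonicalize
/// (which costs stat/readlink work per path component).
fn sketch_family_key(worktree: &Path) -> Option<String> {
    let common = sketch_common_dir(worktree)?;
    let canonical = fs::canonicalize(common).ok()?;
    Some(canonical.to_string_lossy().into_owned())
}
```

Under these assumptions, a main worktree and its linked worktrees resolve to the same key, and every call repeats the full walk and canonicalization.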
Change
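- canonical_family_key_for_worktree in repo_state.rs: a process-global memoized map (worktree path → canonical family-key string) behind a Mutex.
- All four call sites (the two daemon coordinator sites and the two trace normalizer sites) are routed through it.
- Only successful resolutions are cached: a path that is not yet a repository (before clone/init completes) returns None and is re-resolved next call, so a repo that appears later is still picked up. This is the key correctness invariant.
Correctness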
The memoized value is byte-identical to the previous uncached derivation. The mapping is a stable function of the on-disk repo layout for a live worktree.
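The memoization described in this PR can be sketched roughly as follows. This is an illustration, not the actual canonical_family_key_for_worktree: `memoized_family_key` and `cache` are hypothetical names, the `resolve` closure stands in for the uncached derivation, and the 4096-entry clear-and-rebuild cap is taken from the commit message while its exact placement here is an assumption.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Mutex, OnceLock};

/// Size cap quoted in the commit message; when reached, the map is cleared
/// and rebuilt from subsequent lookups.
const MAX_ENTRIES: usize = 4096;

/// Process-global map: worktree path -> canonical family-key string.
fn cache() -> &'static Mutex<HashMap<PathBuf, String>> {
    static CACHE: OnceLock<Mutex<HashMap<PathBuf, String>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Hypothetical memoizing wrapper around an uncached resolver.
fn memoized_family_key<F>(worktree: &Path, resolve: F) -> Option<String>
where
    F: FnOnce(&Path) -> Option<String>,
{
    // Fast path: the lock guard is dropped at the end of this statement.
    // (`unwrap` panics if the mutex is poisoned; acceptable for a sketch.)
    let cached = cache().lock().unwrap().get(worktree).cloned();
    if let Some(hit) = cached {
        return Some(hit);
    }
    // Resolve outside the lock so filesystem I/O does not block other threads.
    // A miss (None) returns here and is never cached, so a repository that
    // appears later is picked up on the next call.
    let key = resolve(worktree)?;
    let mut map = cache().lock().unwrap();
    if map.len() >= MAX_ENTRIES {
        map.clear();
    }
    map.insert(worktree.to_path_buf(), key.clone());
    Some(key)
}
```

In this sketch two threads that miss at the same time may both run the resolver. That is harmless as long as the mapping is a stable function of the on-disk layout, because both insert the same value.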
Tests
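- New unit tests cover equality-with-uncached, stability across calls, and the no-cache-on-miss invariant.
- Existing suites pass: daemon_mode (55), cross_repo (24), worktree (1035), clone (12).
- cargo fmt + clippy -D warnings clean.
Note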
Pre-existing optimization surfaced during perf review of the pull-rebase work; orthogonal to those PRs, hence a standalone branch off main.
🤖 Generated with Claude Code