Preserve attribution across git restore --source#1601
Conversation
| fn enrich_restore( | ||
| &mut self, | ||
| cmd: &mut NormalizedCommand, | ||
| state: &FamilyState, | ||
| ) -> Result<(), GitAiError> { | ||
| let summary = summarize_restore_args(&command_args(cmd)); | ||
| let Some(source) = summary.source else { | ||
| return Ok(()); | ||
| }; | ||
| let Some(worktree) = cmd.worktree.clone() else { | ||
| return Ok(()); | ||
| }; | ||
| let repo = find_repository_in_path(&worktree.to_string_lossy())?; | ||
| let resolved = | ||
| resolve_cherry_pick_sources_with_cat_file(&repo, &[source.as_str()], &state.refs)?; | ||
| cmd.restore_source_oid = resolved.into_iter().next(); | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
🚩 Non-zero exit code silently drops attribution for partially-successful restores
When git restore --source <commit> -- file1.ts typo.ts is invoked and only file1.ts exists (typo in second path), git restores file1.ts successfully but exits non-zero. Because command_can_move_refs_on_nonzero at src/daemon/ref_cursor.rs:3215-3219 returns false for "restore", enrich_restore never runs, restore_source_oid stays None, and no RestorePaths event is emitted. Additionally, the daemon handler at src/daemon.rs:4388 returns early for non-zero exit. This means attribution for the successfully-restored file is silently lost. This mirrors how other commands (checkout with paths) behave, so it may be intentional, but it's worth noting since stash pop/apply explicitly opted into non-zero handling for the same reason (partial application with conflicts).
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Intentional — this matches the conservative fail-closed behavior for path-scoped worktree mutations (checkout-with-paths) where a non-zero exit means the operation did not complete cleanly. Attribution recovery for partially-successful restores would require trusting an error-exit command's partial effects, which the pipeline deliberately avoids. Stash pop/apply opt into non-zero handling only because conflicts are their normal success-with-conflicts path; a non-zero git restore has no such well-defined partial-success contract, so we fail closed here.
| let repo = find_repository_in_path(&worktree.to_string_lossy())?; | ||
| let resolved = | ||
| resolve_cherry_pick_sources_with_cat_file(&repo, &[source.as_str()], &state.refs)?; | ||
| cmd.restore_source_oid = resolved.into_iter().next(); |
There was a problem hiding this comment.
🚩 enrich_restore resolves source OID via cat-file against live repo, not sequenced state
At src/daemon/ref_cursor.rs:787-789, enrich_restore opens the repo via find_repository_in_path and resolves the --source argument using resolve_cherry_pick_sources_with_cat_file. This resolution runs against the live repo state rather than the sequenced ref snapshot. The code comment justifies this: "restore leaves HEAD untouched" so the live state is consistent. However, if a later command in the family queue has already moved branches before this restore is processed, a branch-name source (e.g., --source feature) could resolve to a different commit than what was live at restore time. In practice this is unlikely since the ref cursor processes commands in order and restore doesn't move refs, but it's a subtle correctness consideration for branch-name sources.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Branch-name sources are already resolved against the sequenced snapshot: resolve_cherry_pick_sources_with_cat_file runs concretize_revision_expr(source, &state.refs) first, which resolves refs/branch names from the passed state.refs (the sequenced ref state), before the final cat-file peel to ^{commit}. That peel only touches the object DB (append-only, HEAD-independent), so it can't be skewed by a later ref move. Combined with restore not moving refs, a branch-name --source resolves to the commit that was live in sequenced order.
| let takes_value = matches!(arg, "--conflict" | "--pathspec-from-file"); | ||
| if takes_value && !arg.contains('=') { | ||
| i += 2; | ||
| continue; |
There was a problem hiding this comment.
🚩 --pathspec-from-file restores will silently skip attribution preservation
When a user runs git restore --source <commit> --pathspec-from-file=paths.txt, the actual paths are read from the file at runtime by git — they never appear in the command's argv. summarize_restore_args would return an empty pathspecs vec, and the workspace analyzer's !summary.pathspecs.is_empty() guard at src/daemon/analyzers/workspace.rs:54 prevents emission of a RestorePaths event. This means attribution is silently lost for file-based pathspecs. This is a reasonable limitation for a first implementation (the stash handler has a similar gap), but worth documenting or tracking.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Agreed — this is an accepted limitation matching the stash handler (pathspec-from-file paths live in the file, not argv). Documented on summarize_restore_args in 4caa6a7 so it's tracked. Out of scope for this PR.
| ); | ||
|
|
||
| // Restrict to the files actually restored. | ||
| file_attributions.retain(|path, _| path_matches_any(path, pathspecs)); |
There was a problem hiding this comment.
🚩 Subdirectory-relative pathspecs may not match authorship log paths
If a user runs git restore --source abc -- feature.ts from a subdirectory (e.g., src/), git restores src/feature.ts but the raw argv pathspec is feature.ts. The path_matches_any comparison in rewrite_restore.rs:80 checks the raw pathspec against repo-root-relative paths in the authorship log. This could fail to match when invoked from a non-root CWD. The same pattern exists in rewrite_stash.rs — the daemon's trace2 ingestion captures the raw argv without CWD normalization. The integration tests all run from the repo root so this edge case isn't covered.
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Correct, and it's exact parity with rewrite_stash — both rely on trace2 raw-argv capture without CWD normalization. Added a doc note on path_matches_any in 4caa6a7. The common root-CWD case (and all tests) work; subdir-relative normalization would be a cross-cutting change to the trace2 ingestion shared by stash, so it's out of scope here.
git restore --source <commit> -- <files> rewrites worktree/index content without firing a checkpoint and was invisible to the attribution pipeline, so a subsequent commit attributed the restored lines as untracked even though the source commit already attributed them. Recognize git restore as a mutating, family-sequenced command, resolve its --source revision in the ref cursor, emit a RestorePaths semantic event, and seed an INITIAL working log at HEAD from the source commit's authorship note (no line shifting -- restored bytes equal the source 1:1). The existing post-commit reconciliation then preserves the attribution. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Parse the combined short --source form (-sHEAD~1) in summarize_restore_args so attribution is not silently lost. - Merge restored files directly into the existing INITIAL, persisting blobs only for restored files and leaving every other file's entry and stored blob snapshot untouched, so a restore can never drop another file's attribution. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
git restore supports --conflict <style> and --pathspec-from-file <file> with space-separated values. Skip those values so they are not collected as pathspecs (which would otherwise cause attribution to be lost), mirroring the value-taking-flag handling in summarize_rebase_args. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Daemon NormalizedCommand serialization is versioned atomically, so the new field needs no backward-compat default; match the sibling fields (stash_target_oid, cherry_pick_source_oids) which carry no serde attribute. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
4f361ec to
cb1a6ec
Compare
Note the two edge cases Devin flagged, both shared with the stash handler and inherent to trace2 raw-argv capture: --pathspec-from-file paths live in a file (not argv) so attribution carry is skipped, and subdirectory-relative pathspecs are not CWD-normalized against root-relative authorship-log paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
git restore --source <commit> -- <files>rewrites worktree/index content without firing a checkpoint, and was not recognized anywhere in the attribution pipeline. As a result, a subsequentgit commitattributed the restored lines as untracked even though the source commit's authorship note already attributed them (AI or human).This change recognizes
git restoreend-to-end:git restoreas a mutating, family-sequenced command (command_classification.rs).--sourcerevision to a full commit OID in the ref cursor (ref_cursor.rs), reusing the existing cat-file resolution.cli_parser.rs::summarize_restore_args) and emits aRestorePathssemantic event from the workspace analyzer for source-bearing restores only.rewrite_restore.rs, which reads the source commit's authorship note, extracts per-file attributions (no line shifting — restored bytes equal the source 1:1), and seeds an INITIAL working log keyed to HEAD. The existing post-commit reconciliation then preserves the attribution.A plain
git restore <file>(no--source) restores from the index/HEAD and is not a cross-commit move, so it is treated as a no-op for attribution.Tests
git_restore_source_attribution.rs): AI-source preserved, human-source stays human, partial file subset,--sourcerelative ref, and no-source no-op.🤖 Generated with Claude Code