Skip to content

Preserve attribution across git restore --source#1601

Open
svarlamov wants to merge 5 commits into
mainfrom
fix/git-restore-source-attribution
Open

Preserve attribution across git restore --source#1601
svarlamov wants to merge 5 commits into
mainfrom
fix/git-restore-source-attribution

Conversation

@svarlamov

@svarlamov svarlamov commented Jun 23, 2026

Copy link
Copy Markdown
Member

Summary

git restore --source <commit> -- <files> rewrites worktree/index content without firing a checkpoint, and was not recognized anywhere in the attribution pipeline. As a result, a subsequent git commit attributed the restored lines as untracked even though the source commit's authorship note already attributed them (AI or human).

This change recognizes git restore end-to-end:

  • Classifies git restore as a mutating, family-sequenced command (command_classification.rs).
  • Resolves the --source revision to a full commit OID in the ref cursor (ref_cursor.rs), reusing the existing cat-file resolution.
  • Parses restore args (cli_parser.rs::summarize_restore_args) and emits a RestorePaths semantic event from the workspace analyzer for source-bearing restores only.
  • Adds rewrite_restore.rs, which reads the source commit's authorship note, extracts per-file attributions (no line shifting — restored bytes equal the source 1:1), and seeds an INITIAL working log keyed to HEAD. The existing post-commit reconciliation then preserves the attribution.

A plain git restore <file> (no --source) restores from the index/HEAD and is not a cross-commit move, so it is treated as a no-op for attribution.

Tests

  • Integration (git_restore_source_attribution.rs): AI-source preserved, human-source stays human, partial file subset, --source relative ref, and no-source no-op.
  • Unit: restore arg parsing, workspace analyzer emission, command classification, and pathspec matching.

🤖 Generated with Claude Code


Open in Devin Review

devin-ai-integration[bot]

This comment was marked as resolved.

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 new potential issues.

Open in Devin Review

Comment thread src/git/cli_parser.rs
Comment thread src/daemon/ref_cursor.rs
Comment on lines +775 to +792
fn enrich_restore(
&mut self,
cmd: &mut NormalizedCommand,
state: &FamilyState,
) -> Result<(), GitAiError> {
let summary = summarize_restore_args(&command_args(cmd));
let Some(source) = summary.source else {
return Ok(());
};
let Some(worktree) = cmd.worktree.clone() else {
return Ok(());
};
let repo = find_repository_in_path(&worktree.to_string_lossy())?;
let resolved =
resolve_cherry_pick_sources_with_cat_file(&repo, &[source.as_str()], &state.refs)?;
cmd.restore_source_oid = resolved.into_iter().next();
Ok(())
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Non-zero exit code silently drops attribution for partially-successful restores

When git restore --source <commit> -- file1.ts typo.ts is invoked and only file1.ts exists (typo in second path), git restores file1.ts successfully but exits non-zero. Because command_can_move_refs_on_nonzero at src/daemon/ref_cursor.rs:3215-3219 returns false for "restore", enrich_restore never runs, restore_source_oid stays None, and no RestorePaths event is emitted. Additionally, the daemon handler at src/daemon.rs:4388 returns early for non-zero exit. This means attribution for the successfully-restored file is silently lost. This mirrors how other commands (checkout with paths) behave, so it may be intentional, but it's worth noting since stash pop/apply explicitly opted into non-zero handling for the same reason (partial application with conflicts).

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional — this matches the conservative fail-closed behavior for path-scoped worktree mutations (checkout-with-paths) where a non-zero exit means the operation did not complete cleanly. Attribution recovery for partially-successful restores would require trusting an error-exit command's partial effects, which the pipeline deliberately avoids. Stash pop/apply opt into non-zero handling only because conflicts are their normal success-with-conflicts path; a non-zero git restore has no such well-defined partial-success contract, so we fail closed here.

Comment thread src/daemon/ref_cursor.rs
Comment on lines +787 to +790
let repo = find_repository_in_path(&worktree.to_string_lossy())?;
let resolved =
resolve_cherry_pick_sources_with_cat_file(&repo, &[source.as_str()], &state.refs)?;
cmd.restore_source_oid = resolved.into_iter().next();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 enrich_restore resolves source OID via cat-file against live repo, not sequenced state

At src/daemon/ref_cursor.rs:787-789, enrich_restore opens the repo via find_repository_in_path and resolves the --source argument using resolve_cherry_pick_sources_with_cat_file. This resolution runs against the live repo state rather than the sequenced ref snapshot. The code comment justifies this: "restore leaves HEAD untouched" so the live state is consistent. However, if a later command in the family queue has already moved branches before this restore is processed, a branch-name source (e.g., --source feature) could resolve to a different commit than what was live at restore time. In practice this is unlikely since the ref cursor processes commands in order and restore doesn't move refs, but it's a subtle correctness consideration for branch-name sources.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Branch-name sources are already resolved against the sequenced snapshot: resolve_cherry_pick_sources_with_cat_file runs concretize_revision_expr(source, &state.refs) first, which resolves refs/branch names from the passed state.refs (the sequenced ref state), before the final cat-file peel to ^{commit}. That peel only touches the object DB (append-only, HEAD-independent), so it can't be skewed by a later ref move. Combined with restore not moving refs, a branch-name --source resolves to the commit that was live in sequenced order.

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 new potential issues.

Open in Devin Review

Comment thread src/git/cli_parser.rs
Comment on lines +202 to +205
let takes_value = matches!(arg, "--conflict" | "--pathspec-from-file");
if takes_value && !arg.contains('=') {
i += 2;
continue;

@devin-ai-integration devin-ai-integration Bot Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 --pathspec-from-file restores will silently skip attribution preservation

When a user runs git restore --source <commit> --pathspec-from-file=paths.txt, the actual paths are read from the file at runtime by git — they never appear in the command's argv. summarize_restore_args would return an empty pathspecs vec, and the workspace analyzer's !summary.pathspecs.is_empty() guard at src/daemon/analyzers/workspace.rs:54 prevents emission of a RestorePaths event. This means attribution is silently lost for file-based pathspecs. This is a reasonable limitation for a first implementation (the stash handler has a similar gap), but worth documenting or tracking.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — this is an accepted limitation matching the stash handler (pathspec-from-file paths live in the file, not argv). Documented on summarize_restore_args in 4caa6a7 so it's tracked. Out of scope for this PR.

);

// Restrict to the files actually restored.
file_attributions.retain(|path, _| path_matches_any(path, pathspecs));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Subdirectory-relative pathspecs may not match authorship log paths

If a user runs git restore --source abc -- feature.ts from a subdirectory (e.g., src/), git restores src/feature.ts but the raw argv pathspec is feature.ts. The path_matches_any comparison in rewrite_restore.rs:80 checks the raw pathspec against repo-root-relative paths in the authorship log. This could fail to match when invoked from a non-root CWD. The same pattern exists in rewrite_stash.rs — the daemon's trace2 ingestion captures the raw argv without CWD normalization. The integration tests all run from the repo root so this edge case isn't covered.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, and it's exact parity with rewrite_stash — both rely on trace2 raw-argv capture without CWD normalization. Added a doc note on path_matches_any in 4caa6a7. The common root-CWD case (and all tests) work; subdir-relative normalization would be a cross-cutting change to the trace2 ingestion shared by stash, so it's out of scope here.

svarlamov and others added 4 commits June 23, 2026 10:09
git restore --source <commit> -- <files> rewrites worktree/index content
without firing a checkpoint and was invisible to the attribution pipeline,
so a subsequent commit attributed the restored lines as untracked even
though the source commit already attributed them.

Recognize git restore as a mutating, family-sequenced command, resolve its
--source revision in the ref cursor, emit a RestorePaths semantic event, and
seed an INITIAL working log at HEAD from the source commit's authorship note
(no line shifting -- restored bytes equal the source 1:1). The existing
post-commit reconciliation then preserves the attribution.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Parse the combined short --source form (-sHEAD~1) in summarize_restore_args
  so attribution is not silently lost.
- Merge restored files directly into the existing INITIAL, persisting blobs
  only for restored files and leaving every other file's entry and stored
  blob snapshot untouched, so a restore can never drop another file's
  attribution.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
git restore supports --conflict <style> and --pathspec-from-file <file> with
space-separated values. Skip those values so they are not collected as
pathspecs (which would otherwise cause attribution to be lost), mirroring the
value-taking-flag handling in summarize_rebase_args.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Daemon NormalizedCommand serialization is versioned atomically, so the new
field needs no backward-compat default; match the sibling fields
(stash_target_oid, cherry_pick_source_oids) which carry no serde attribute.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@svarlamov svarlamov force-pushed the fix/git-restore-source-attribution branch from 4f361ec to cb1a6ec Compare June 23, 2026 14:11
Note the two edge cases Devin flagged, both shared with the stash handler and
inherent to trace2 raw-argv capture: --pathspec-from-file paths live in a file
(not argv) so attribution carry is skipped, and subdirectory-relative pathspecs
are not CWD-normalized against root-relative authorship-log paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant