136 changes: 109 additions & 27 deletions src/commands/blame.rs

🟡 Model name missing from blame output when --show-prompt flag is used

The rendering site at src/commands/blame.rs:1832 still builds the author label as format!("{} [{}]", prompt.agent_id.tool, short_hash), which omits the model name, even though the width calculation already includes it. As a result, --show-prompt output shows only the tool name, followed by excess padding, instead of the intended "tool model [hash]" format.

Impact: Users running git-ai blame --show-prompt see the agent name without the model, contradicting the PR's goal of displaying both.

Width calculation was updated but the parallel rendering site was not

The output_default_format function has two nearly-identical show_prompt branches:

  1. Width calculation at src/commands/blame.rs:1776-1780 (correctly updated):
format!(
    "{} [{}]",
    format_agent_author(&prompt.agent_id.tool, &prompt.agent_id.model),
    short_hash
)
  2. Actual rendering at src/commands/blame.rs:1830-1832 (not updated):
format!("{} [{}]", prompt.agent_id.tool, short_hash)

The width loop at line 1778 computes max_author_width using the full format_agent_author output (e.g. "cursor sonnet-4-6 [abc1234]"), but the rendering loop at line 1832 produces only "cursor [abc1234]". This results in (a) model info missing from --show-prompt output and (b) extra whitespace padding in the author column.

(Refers to line 1832)
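
One way to keep the two sites from drifting apart again is to route both the width pass and the render pass through a single label builder. The sketch below is illustrative only: `prompt_author_label` and `max_label_width` are hypothetical names that do not appear in this PR, and `format_agent_author` is reproduced from the diff so the block stands alone.

```rust
/// Reproduced from the PR diff so this sketch is self-contained.
fn format_agent_author(tool: &str, model: &str) -> String {
    let model = model.trim();
    if model.is_empty() || model.eq_ignore_ascii_case("unknown") {
        tool.to_string()
    } else {
        let model = model.strip_prefix("claude-").unwrap_or(model);
        format!("{} {}", tool, model)
    }
}

/// Hypothetical helper: the single source of truth for the `--show-prompt` label.
fn prompt_author_label(tool: &str, model: &str, prompt_hash: &str) -> String {
    // Prompt hashes are assumed to be ASCII hex, so byte slicing is safe here.
    let short_hash = &prompt_hash[..7.min(prompt_hash.len())];
    format!("{} [{}]", format_agent_author(tool, model), short_hash)
}

/// Hypothetical width pass: measures exactly what the render pass will print.
fn max_label_width(prompts: &[(&str, &str, &str)]) -> usize {
    prompts
        .iter()
        .map(|(tool, model, hash)| prompt_author_label(tool, model, hash).chars().count())
        .max()
        .unwrap_or(0)
}
```

With a shared helper, a future change to the label format only needs to be made in one place, so the padding and the printed text cannot disagree.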


@@ -1,9 +1,8 @@
use crate::auth::CredentialStore;
use crate::authorship::authorship_log::{HumanRecord, PromptRecord, SessionRecord};
use crate::authorship::authorship_log_serialization::AuthorshipLog;
use crate::authorship::authorship_log_serialization::{AUTHORSHIP_LOG_VERSION, AuthorshipLog};
use crate::authorship::working_log::CheckpointKind;
use crate::error::GitAiError;
use crate::git::notes_api::read_authorship_v3 as get_reference_as_authorship_log_v3;
use crate::git::repository::Repository;
use crate::git::repository::{exec_git, exec_git_stdin};
#[cfg(windows)]
@@ -935,6 +934,43 @@ impl Repository {
Ok(hunks)
}

/// Batch-load v3 authorship logs for a set of commits in a handful of git
/// invocations (one `ls-tree` + batched `cat-file`) instead of one
/// `git notes show` subprocess per commit.
///
/// Commits without a valid note are recorded as `None` so callers can cache
/// negative lookups too. Behavior mirrors `read_authorship_v3`: the schema
/// version is enforced and each log's `base_commit_sha` is aligned to the
/// commit its note is attached to.
fn load_authorship_logs_batched(
&self,
commit_shas: impl IntoIterator<Item = String>,
) -> HashMap<String, Option<AuthorshipLog>> {
let unique: Vec<String> = commit_shas
.into_iter()
.collect::<HashSet<_>>()
.into_iter()
.collect();
let mut cache: HashMap<String, Option<AuthorshipLog>> = HashMap::new();
if unique.is_empty() {
return cache;
}

let notes = crate::git::notes_api::read_notes_batch(self, &unique).unwrap_or_default();

🚩 Batched note loading adds an HTTP fetch-and-cache step that the old per-commit path did not have

The old code called read_authorship_v3 once per commit; for the HTTP backend, that tried the local cache and then fell back directly to local git notes. The new read_notes_batch (src/git/notes_api.rs:57-94) adds an intermediate step, http_fetch_and_cache_notes, that fetches from the remote HTTP backend before falling back to local git notes. This is a behavioral change, but likely an improvement rather than a regression: notes for commits that exist on the remote but not in the local cache will now be found. The cost is that blame operations can now make network calls in scenarios where the old code never touched the network.
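
The lookup order described above can be sketched for a single SHA as follows. This is a simplified model under stated assumptions, not the code in src/git/notes_api.rs: `NoteSources`, its fields, and the `allow_network` opt-out are all hypothetical, and in-memory maps stand in for the cache, the remote HTTP backend, and local git notes.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the three note sources described above.
struct NoteSources {
    local_cache: HashMap<String, String>,
    remote: HashMap<String, String>,
    local_git_notes: HashMap<String, String>,
    /// Counts simulated network round trips so callers can observe them.
    remote_calls: usize,
}

impl NoteSources {
    /// Resolve one note: cache first, then (optionally) the remote, then local git notes.
    /// `allow_network` is an assumed opt-out, not a flag that exists in the PR.
    fn lookup(&mut self, sha: &str, allow_network: bool) -> Option<String> {
        if let Some(hit) = self.local_cache.get(sha) {
            return Some(hit.clone());
        }
        if allow_network {
            self.remote_calls += 1;
            if let Some(fetched) = self.remote.get(sha).cloned() {
                // Cache the fetched note so later lookups stay local.
                self.local_cache.insert(sha.to_string(), fetched.clone());
                return Some(fetched);
            }
        }
        self.local_git_notes.get(sha).cloned()
    }
}
```

Passing `allow_network = false` reproduces the old order (cache, then local notes), which is one way a caller could keep blame fully offline if the extra fetch turns out to be unwelcome.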


for sha in unique {
let log = notes.get(&sha).and_then(|content| {
let mut log = AuthorshipLog::deserialize_from_string(content).ok()?;
if log.metadata.schema_version != AUTHORSHIP_LOG_VERSION {
return None;
}
log.metadata.base_commit_sha = sha.clone();
Some(log)
});
cache.insert(sha, log);
}
cache
}

/// Post-process blame hunks to populate ai_human_author from authorship logs.
/// For each hunk, looks up the authorship log for its commit and finds the human_author
/// from the prompt record that covers lines in the hunk.
@@ -946,23 +982,21 @@ impl Repository {
file_path: &str,
options: &GitAiBlameOptions,
) -> Result<Vec<BlameHunk>, GitAiError> {
// Cache authorship logs by commit SHA to avoid repeated lookups
let mut commit_authorship_cache: HashMap<String, Option<AuthorshipLog>> = HashMap::new();
// Batch-load authorship logs for every commit in one shot to avoid one
// `git notes show` subprocess per commit.
let commit_authorship_cache =
self.load_authorship_logs_batched(hunks.iter().map(|h| h.commit_sha.clone()));
// Cache for foreign prompts to avoid repeated grepping
let mut foreign_prompts_cache: HashMap<String, Option<PromptRecord>> = HashMap::new();

let mut result_hunks: Vec<BlameHunk> = Vec::new();

for hunk in hunks {
// Get or fetch the authorship log for this commit
let authorship_log = if let Some(cached) = commit_authorship_cache.get(&hunk.commit_sha)
{
cached.clone()
} else {
let authorship = get_reference_as_authorship_log_v3(self, &hunk.commit_sha).ok();
commit_authorship_cache.insert(hunk.commit_sha.clone(), authorship.clone());
authorship
};
// Look up the pre-loaded authorship log for this commit.
let authorship_log = commit_authorship_cache
.get(&hunk.commit_sha)
.cloned()
.flatten();

// If we have an authorship log, look up human_author for each line
if let Some(ref authorship_log) = authorship_log {
@@ -1040,6 +1074,18 @@ impl Repository {
}
}

fn format_agent_author(tool: &str, model: &str) -> String {
let model = model.trim();
if model.is_empty() || model.eq_ignore_ascii_case("unknown") {
tool.to_string()
} else {
// Strip a redundant "claude-" prefix (e.g. "claude-sonnet-4-6" -> "sonnet-4-6")
// to keep the label compact.
let model = model.strip_prefix("claude-").unwrap_or(model);

🚩 format_agent_author strips 'claude-' prefix unconditionally from model strings

The format_agent_author function at src/commands/blame.rs:1084 strips a leading claude- prefix from any model string, regardless of the tool name. If a tool named 'windsurf' reports a model like 'claude-3.5-sonnet', blame displays 'windsurf 3.5-sonnet'. The test at line 2308 (format_agent_author("amp", "anthropic/claude-opus")) confirms that only a leading claude- is stripped. Because Anthropic model IDs typically start with claude-, the stripping will commonly trigger for non-Claude tools that use Claude models. This appears intentional per the tests, but it is worth a design review: a truncated model name on a non-Claude-branded tool could confuse users.
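
If the design review favors keeping full model IDs for non-Claude tools, one possible alternative is to scope the stripping to the tool name. The sketch below is hypothetical: `format_agent_author_scoped` is not part of this PR, and the rule "a tool name starting with claude counts as Claude-branded" is an assumption made purely for illustration.

```rust
/// Hypothetical variant of `format_agent_author`: strips the "claude-" prefix
/// only when the tool itself is Claude-branded, so other tools keep the full model ID.
fn format_agent_author_scoped(tool: &str, model: &str) -> String {
    let model = model.trim();
    if model.is_empty() || model.eq_ignore_ascii_case("unknown") {
        return tool.to_string();
    }
    // Assumption: a tool name starting with "claude" marks a Claude-branded tool.
    let model = if tool.to_ascii_lowercase().starts_with("claude") {
        model.strip_prefix("claude-").unwrap_or(model)
    } else {
        model
    };
    format!("{} {}", tool, model)
}
```

Note that adopting this would change the expectation in test_blame_shows_agent_and_model, which currently asserts that mock_ai with model claude-sonnet-4-6 renders as "mock_ai sonnet-4-6".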


format!("{} {}", tool, model)
}
}

#[allow(clippy::type_complexity)]
fn overlay_ai_authorship(
repo: &Repository,
@@ -1068,24 +1114,22 @@ fn overlay_ai_authorship(
let mut commits_with_notes: std::collections::HashSet<String> =
std::collections::HashSet::new();

// Group hunks by commit SHA to avoid repeated lookups
let mut commit_authorship_cache: HashMap<String, Option<AuthorshipLog>> = HashMap::new();
// Batch-load authorship logs for every commit in one shot to avoid one
// `git notes show` subprocess per commit.
let commit_authorship_cache =
repo.load_authorship_logs_batched(blame_hunks.iter().map(|h| h.commit_sha.clone()));
// Simulated authorship logs for agent commits without notes. We keep these separate
// from commit_authorship_cache so a single agent commit can be handled across multiple
// blame hunks without being limited to the first hunk's line range.
let mut simulated_authorship_logs: HashMap<String, AuthorshipLog> = HashMap::new();
// Cache for foreign prompts to avoid repeated grepping
let mut foreign_prompts_cache: HashMap<String, Option<PromptRecord>> = HashMap::new();
for hunk in blame_hunks {
// Check if we've already looked up this commit's authorship
let authorship_log = if let Some(cached) = commit_authorship_cache.get(&hunk.commit_sha) {
cached.clone()
} else {
// Try to get authorship log for this commit
let authorship = get_reference_as_authorship_log_v3(repo, &hunk.commit_sha).ok();
commit_authorship_cache.insert(hunk.commit_sha.clone(), authorship.clone());
authorship
};
// Look up the pre-loaded authorship log for this commit.
let authorship_log = commit_authorship_cache
.get(&hunk.commit_sha)
.cloned()
.flatten();

// If we have AI authorship data, look up the author for lines in this hunk
if let Some(ref authorship_log) = authorship_log {
@@ -1131,8 +1175,13 @@ fn overlay_ai_authorship(
if options.use_prompt_hashes_as_names {
line_authors.insert(current_line_num, prompt_hash.clone());
} else {
line_authors
.insert(current_line_num, prompt_record.agent_id.tool.clone());
line_authors.insert(
current_line_num,
format_agent_author(
&prompt_record.agent_id.tool,
&prompt_record.agent_id.model,
),
);
}

prompt_records.insert(prompt_hash, prompt_record.clone());
@@ -1724,7 +1773,11 @@ fn output_default_format(
} else if options.show_prompt && prompt_records.contains_key(author) {
let prompt = &prompt_records[author];
let short_hash = &author[..7.min(author.len())];
format!("{} [{}]", prompt.agent_id.tool, short_hash)
format!(
"{} [{}]",
format_agent_author(&prompt.agent_id.tool, &prompt.agent_id.model),
short_hash
)
} else if options.show_email {
format!("{} <{}>", author, &hunk.author_email)
} else {
@@ -2236,3 +2289,32 @@ fn parse_line_range(range_str: &str) -> Option<(u32, u32)> {

None
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_format_agent_author_with_model() {
// A leading "claude-" is stripped to keep the label compact.
assert_eq!(
format_agent_author("claude", "claude-sonnet-4-6"),
"claude sonnet-4-6"
);
// Non-claude models are shown verbatim.
assert_eq!(format_agent_author("cursor", "gpt-4"), "cursor gpt-4");
// Only a leading "claude-" is stripped, not occurrences elsewhere.
assert_eq!(
format_agent_author("amp", "anthropic/claude-opus"),
"amp anthropic/claude-opus"
);
}

#[test]
fn test_format_agent_author_omits_unknown_or_empty_model() {
assert_eq!(format_agent_author("mock_ai", "unknown"), "mock_ai");
assert_eq!(format_agent_author("mock_ai", "UNKNOWN"), "mock_ai");
assert_eq!(format_agent_author("claude", ""), "claude");
assert_eq!(format_agent_author("claude", " "), "claude");
}
}
15 changes: 12 additions & 3 deletions src/commands/checkpoint_agent/presets/mock_ai.rs
@@ -17,10 +17,11 @@ impl AgentPreset for MockAiPreset {
.unwrap_or(0)
);

let (file_paths, cwd) = if hook_input.is_empty() {
let (file_paths, cwd, model) = if hook_input.is_empty() {
(
vec![],
std::env::current_dir().unwrap_or_else(|_| PathBuf::from(".")),
"unknown".to_string(),
)
} else {
let data: serde_json::Value = serde_json::from_str(hook_input)
@@ -42,14 +43,22 @@ impl AgentPreset for MockAiPreset {
.map(PathBuf::from)
.unwrap_or_else(|| std::env::current_dir().unwrap_or_else(|_| PathBuf::from(".")));

(paths, cwd)
// Allow tests to specify a model so model-dependent behavior (e.g. blame
// displaying "<agent> <model>") can be exercised end-to-end.
let model = data
.get("model")
.and_then(|v| v.as_str())
.unwrap_or("unknown")
.to_string();

(paths, cwd, model)
};

let context = PresetContext {
agent_id: AgentId {
tool: "mock_ai".to_string(),
id: mock_agent_id,
model: "unknown".to_string(),
model,
},
external_session_id: "mock_ai_session".to_string(),
trace_id: trace_id.to_string(),
35 changes: 29 additions & 6 deletions src/commands/git_ai_handlers.rs
@@ -1050,13 +1050,33 @@ fn synthesize_hook_input_from_cli_args(preset_name: &str, remaining_args: &[Stri
match preset_name {
"human" | "mock_ai" | "mock_known_human" => {
let cwd = std::env::current_dir().unwrap_or_else(|_| std::path::PathBuf::from("."));
let mut paths: Vec<String> = remaining_args
// Optional `--model <name>` lets the mock_ai preset attach a real model
// (e.g. for exercising blame's "<agent> <model>" display); ignored by the
// other presets in this branch.
let mut model: Option<String> = None;
let mut path_args: Vec<&String> = Vec::new();
let mut i = 0usize;
while i < remaining_args.len() {
match remaining_args[i].as_str() {
"--model" if i + 1 < remaining_args.len() => {
model = Some(remaining_args[i + 1].clone());
i += 2;
}
arg if !arg.starts_with("--") => {
path_args.push(&remaining_args[i]);
i += 1;
}
_ => {
i += 1;
}
}
}
let mut paths: Vec<String> = path_args
.iter()
.filter(|a| !a.starts_with("--"))
.map(|s| {
let p = std::path::Path::new(s.as_str());
if p.is_absolute() {
s.clone()
(*s).clone()
} else {
cwd.join(p).to_string_lossy().to_string()
}
@@ -1065,11 +1085,14 @@ fn synthesize_hook_input_from_cli_args(preset_name: &str, remaining_args: &[Stri
if paths.is_empty() {
paths = discover_dirty_files_from_status(&cwd);
}
serde_json::json!({
let mut payload = serde_json::json!({
"file_paths": paths,
"cwd": cwd.to_string_lossy(),
})
.to_string()
});
if let Some(model) = model {
payload["model"] = serde_json::Value::String(model);
}
payload.to_string()
}
"known_human" => {
let cwd = std::env::current_dir().unwrap_or_else(|_| std::path::PathBuf::from("."));
68 changes: 68 additions & 0 deletions tests/integration/blame_flags.rs
@@ -1472,6 +1472,72 @@ fn test_blame_ai_human_author() {
);
}

#[test]
fn test_blame_shows_agent_and_model() {
// Users repeatedly assumed git-ai wasn't capturing the model because blame only
// showed the agent name. Blame should now render AI lines as "<agent> <model>".
let repo = TestRepo::new();
let file_path = repo.path().join("test.txt");

// Human baseline line, committed without AI involvement.
std::fs::write(&file_path, "Human line\n").unwrap();
repo.stage_all_and_commit("Initial commit").unwrap();

// Add an AI line via mock_ai with an explicit model.
std::fs::write(&file_path, "Human line\nAI line\n").unwrap();
repo.git_ai(&[
"checkpoint",
"mock_ai",
"--model",
"claude-sonnet-4-6",
"test.txt",
])
.unwrap();
repo.stage_all_and_commit("AI commit").unwrap();

let output = repo.git_ai(&["blame", "test.txt"]).unwrap();
println!("\n[DEBUG] git-ai blame:\n{}", output);
let lines: Vec<&str> = output.lines().collect();

// The "claude-" prefix is stripped, so "claude-sonnet-4-6" renders as "sonnet-4-6".
assert!(
lines[1].contains("mock_ai sonnet-4-6"),
"AI line should show agent and (claude-stripped) model: {}",
lines[1]
);
// The human line must not pick up a spurious model.
assert!(
!lines[0].contains("sonnet-4-6"),
"Human line should not show a model: {}",
lines[0]
);
}

#[test]
fn test_blame_omits_unknown_model() {
// When the model wasn't captured (sentinel "unknown"), blame should fall back to
// just the agent name rather than printing "mock_ai unknown".
let repo = TestRepo::new();
let file_path = repo.path().join("test.txt");

std::fs::write(&file_path, "Human line\n").unwrap();
repo.stage_all_and_commit("Initial commit").unwrap();

std::fs::write(&file_path, "Human line\nAI line\n").unwrap();
// No --model => mock_ai defaults to the "unknown" sentinel.
repo.git_ai(&["checkpoint", "mock_ai", "test.txt"]).unwrap();
repo.stage_all_and_commit("AI commit").unwrap();

let output = repo.git_ai(&["blame", "test.txt"]).unwrap();
let lines: Vec<&str> = output.lines().collect();

assert!(
lines[1].contains("mock_ai") && !lines[1].contains("unknown"),
"AI line should show agent without an 'unknown' model: {}",
lines[1]
);
}

crate::reuse_tests_in_worktree!(
test_blame_basic_format,
test_blame_line_range,
@@ -1501,4 +1567,6 @@ crate::reuse_tests_in_worktree!(
test_blame_without_ignore_revs_file_works_normally,
test_blame_ignore_revs_with_multiple_commits,
test_blame_ai_human_author,
test_blame_shows_agent_and_model,
test_blame_omits_unknown_model,
);
@@ -1,5 +1,6 @@
---
source: tests/initial_attributions.rs
source: tests/integration/initial_attributions.rs
assertion_line: 295
expression: normalized
---
"COMMIT_SHA (tool1 TIMESTAMP 1) line 1\nCOMMIT_SHA (tool1 TIMESTAMP 2) line 2\nCOMMIT_SHA (tool1 TIMESTAMP 3) line 3\nCOMMIT_SHA (mock_ai TIMESTAMP 4) line 4\nCOMMIT_SHA (tool2 TIMESTAMP 5) line 5\nCOMMIT_SHA (mock_ai TIMESTAMP 6) line 6\nCOMMIT_SHA (mock_ai TIMESTAMP 7) line 7\n"
"COMMIT_SHA (tool1 model1 TIMESTAMP 1) line 1\nCOMMIT_SHA (tool1 model1 TIMESTAMP 2) line 2\nCOMMIT_SHA (tool1 model1 TIMESTAMP 3) line 3\nCOMMIT_SHA (mock_ai TIMESTAMP 4) line 4\nCOMMIT_SHA (tool2 model2 TIMESTAMP 5) line 5\nCOMMIT_SHA (mock_ai TIMESTAMP 6) line 6\nCOMMIT_SHA (mock_ai TIMESTAMP 7) line 7\n"
@@ -1,6 +1,6 @@
---
source: tests/initial_attributions.rs
assertion_line: 274
source: tests/integration/initial_attributions.rs
assertion_line: 295
expression: normalized
---
"COMMIT_SHA (tool1 TIMESTAMP 1) line 1\nCOMMIT_SHA (tool1 TIMESTAMP 2) line 2\nCOMMIT_SHA (tool1 TIMESTAMP 3) line 3\nCOMMIT_SHA (mock_ai TIMESTAMP 4) line 4\nCOMMIT_SHA (tool2 TIMESTAMP 5) line 5\nCOMMIT_SHA (mock_ai TIMESTAMP 6) line 6\nCOMMIT_SHA (mock_ai TIMESTAMP 7) line 7\n"
"COMMIT_SHA (tool1 model1 TIMESTAMP 1) line 1\nCOMMIT_SHA (tool1 model1 TIMESTAMP 2) line 2\nCOMMIT_SHA (tool1 model1 TIMESTAMP 3) line 3\nCOMMIT_SHA (mock_ai TIMESTAMP 4) line 4\nCOMMIT_SHA (tool2 model2 TIMESTAMP 5) line 5\nCOMMIT_SHA (mock_ai TIMESTAMP 6) line 6\nCOMMIT_SHA (mock_ai TIMESTAMP 7) line 7\n"