fix(providers): dedup tool specs at wire boundary to prevent 400 "Tool names must be unique" by YellowSnnowmann · Pull Request #2846 · tinyhumansai/openhuman

YellowSnnowmann · 2026-05-28T12:07:13Z

Summary

Add first-wins dedup by function.name inside OpenAiCompatibleProvider::convert_tool_specs so duplicate ToolSpec entries never reach the provider's tools array on the wire.
Emit a single log::warn! per request listing the dropped names — no per-call spam, but visibility into where the duplicates originated.
Add 5 unit tests covering None, empty input, unique passthrough, first-wins dedup (verifying the surviving entry's description / parameters are the first occurrence), and many-duplicates.

Problem

Sentry issue TAURI-RUST-2E — cloud API error (400 Bad Request): {"error":{"message":"Tool names must be unique.","type":"invalid_request_error","param":null,"code":"invalid_request_error"}} — 164 events / 14d on the tauri-rust project.
OpenAI's chat-completions schema requires every entry in the tools array to have a unique function.name. Sending two entries with the same name fails the entire request with 400 — the chat turn dies and the user sees a hard error.
dedup_visible_tool_specs already runs at the session builder layer (src/openhuman/agent/harness/session/builder.rs:44) and at every visible-tool-set materialisation point (initial build, post-Composio refresh, scope-filter change). However, several call paths reach the Provider trait without going through that layer:

Sub-agent spawn paths assemble their own tool sets and call Provider::chat directly.
Triage / escalation flows can splice tools from multiple sources before the request is constructed.
Future callers that bypass the session pipeline (no compiler guard prevents this).

Any of these can hand OpenAiCompatibleProvider::chat a tools: &[ToolSpec] slice with duplicate name values, and the pre-fix convert_tool_specs blindly serialised all of them — producing the 400.
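
To make the failure mode concrete, here is a minimal std-only sketch of what a plain map-and-collect conversion does with a repeated name. The `ToolSpec` struct is a simplified stand-in for the project's type (the real one also carries a description and JSON parameters), and both helpers are hypothetical names introduced for illustration, not code from the repository.

```rust
/// Simplified stand-in for the project's `ToolSpec` (assumption: only the
/// `name` field matters for this illustration).
struct ToolSpec {
    name: String,
}

/// Hypothetical model of the pre-fix pipeline: every spec is forwarded,
/// so a repeated name appears more than once in the outgoing list.
fn wire_names_without_dedup(tools: &[ToolSpec]) -> Vec<&str> {
    tools.iter().map(|t| t.name.as_str()).collect()
}

/// Returns true when at least one name occurs more than once, which is the
/// condition the provider rejects with "Tool names must be unique."
fn has_duplicate_names(names: &[&str]) -> bool {
    let mut seen = std::collections::HashSet::new();
    // `insert` returns false for a value that is already in the set.
    names.iter().any(|n| !seen.insert(*n))
}
```

With specs named `search`, `fetch`, `search`, the first helper returns all three names and the second reports a duplicate, which is the shape of request that triggered the 400.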

Solution

src/openhuman/inference/provider/compatible.rs — replace the .map(...).collect() pipeline in convert_tool_specs with a single pass that tracks seen names:

fn convert_tool_specs(
    tools: Option<&[crate::openhuman::tools::ToolSpec]>,
) -> Option<Vec<serde_json::Value>> {
    tools.map(|items| {
        let mut seen: std::collections::HashSet<&str> =
            std::collections::HashSet::with_capacity(items.len());
        let mut dropped: Vec<&str> = Vec::new();
        let mut out: Vec<serde_json::Value> = Vec::with_capacity(items.len());
        for tool in items {
            if !seen.insert(tool.name.as_str()) {
                dropped.push(tool.name.as_str());
                continue;
            }
            out.push(serde_json::json!({
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.parameters,
                }
            }));
        }
        if !dropped.is_empty() {
            log::warn!(
                "[providers][compatible] dropped {} duplicate tool spec(s) at wire \
                 boundary (TAURI-RUST-2E): {:?}",
                dropped.len(),
                dropped
            );
        }
        out
    })
}
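
The same first-wins pass can be exercised without serde_json or the log crate. The sketch below is a std-only approximation under stated assumptions: `ToolSpec` is a simplified stand-in whose `parameters` field is a plain string rather than JSON, and `dedup_first_wins` is a hypothetical helper that returns the surviving specs and the dropped names instead of serialising and logging them.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the project's `ToolSpec` (assumption: the real
/// `parameters` field is JSON, modelled here as a plain string).
#[derive(Debug, Clone, PartialEq)]
struct ToolSpec {
    name: String,
    description: String,
    parameters: String,
}

/// Hypothetical helper mirroring the loop in `convert_tool_specs`:
/// keeps the first spec seen for each name and records later repeats.
fn dedup_first_wins(tools: &[ToolSpec]) -> (Vec<&ToolSpec>, Vec<&str>) {
    let mut seen: HashSet<&str> = HashSet::with_capacity(tools.len());
    let mut kept = Vec::with_capacity(tools.len());
    let mut dropped = Vec::new();
    for tool in tools {
        // `insert` returns false when the name is already present.
        if seen.insert(tool.name.as_str()) {
            kept.push(tool);
        } else {
            dropped.push(tool.name.as_str());
        }
    }
    (kept, dropped)
}
```

Returning the dropped names to the caller keeps the helper free of side effects, so a test can assert on both halves of the result; the real function logs them instead.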

Design choices

First-wins — matches the upstream dedup_visible_tool_specs convention at builder.rs:48-50. Same semantics across both layers, no surprise reorderings.
Wire boundary, not just session — the bug exists because not every caller goes through the session dedup. Fixing at the single chokepoint every provider call funnels through (convert_tool_specs) makes future callers safe by construction.
log::warn! (not debug!) — when this branch fires, an upstream code path produced duplicates and bypassed the session dedup. That's a real bug worth investigating, not log noise. Fire rate is bounded by the upstream Sentry frequency (≤164/14d).
HashSet<&str>, not HashSet<String>: the set borrows names from the input slice, so the dedup clones no strings. The set and out are each sized once up front, and dropped allocates only when a duplicate is actually pushed.
No change to non-duplicate behaviour — unique inputs produce byte-identical output to pre-fix. Verified by convert_tool_specs_passes_through_unique_names.
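
The borrowing choice can be illustrated in isolation. In this hedged sketch, `unique_borrowed` and `unique_owned` are hypothetical helpers (not repository code) that compute the same first-wins name list; the owned variant clones names, while the borrowed variant only stores `&str` slices tied to the input's lifetime.

```rust
use std::collections::HashSet;

/// Borrowed variant: the set and the result hold `&str` slices into `names`,
/// so no `String` is cloned.
fn unique_borrowed(names: &[String]) -> Vec<&str> {
    let mut seen: HashSet<&str> = HashSet::with_capacity(names.len());
    names
        .iter()
        .map(String::as_str)
        // Keep a name only the first time it is inserted.
        .filter(|n| seen.insert(*n))
        .collect()
}

/// Owned variant: each first-seen name is cloned into the set and again
/// into the output vector.
fn unique_owned(names: &[String]) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::with_capacity(names.len());
    let mut out = Vec::new();
    for n in names {
        if !seen.contains(n) {
            seen.insert(n.clone());
            out.push(n.clone());
        }
    }
    out
}
```

The borrowed form works here because the set never outlives the input slice, which is also true inside `convert_tool_specs`.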

Submission Checklist

Tests added or updated (happy path + at least one failure / edge case)
Diff coverage ≥ 80% — pending local pnpm test:rust run
Coverage matrix updated — N/A: bug-fix behaviour-only change, no new feature row
No new external network dependencies introduced
Manual smoke checklist updated — N/A: no release-cut surface touched
Linked issue closed via Closes #NNN — N/A: Sentry-tracked issue, no GitHub issue yet

Impact

Runtime: desktop (Rust core). No mobile / web / CLI surface change.
Performance: one HashSet<&str> per request, sized to items.len(). O(n), the same complexity as the previous iter().map().collect(). On the unique-only happy path the set is the only additional allocation, because the empty dropped vector does not allocate until something is pushed.
Security: none — no new network surface, no new inputs trusted, no auth path touched.
Migration / compatibility: none. Trait surface untouched. Output for unique inputs is byte-identical to pre-fix. Previously-failing 400 turns now succeed with the first-occurrence definition of any duplicated tool. Upstream session dedup still runs first — this PR is defence-in-depth, not a replacement.

Summary by CodeRabbit

Bug Fixes
- Duplicate tool definitions are now automatically removed when sending tool specifications to compatible providers; discarded duplicates are reported in logs.

Tests
- Added comprehensive tests to ensure tool specification deduplication preserves the first occurrence and collapses duplicates across many entries.

coderabbitai · 2026-05-28T12:07:48Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1c2d935c-3e17-4078-a38f-e584ca0be187

📥 Commits

Reviewing files that changed from the base of the PR and between af40687 and 00a36d7.

📒 Files selected for processing (2)

src/openhuman/inference/provider/compatible.rs
src/openhuman/inference/provider/compatible_tests.rs

🚧 Files skipped from review as they are similar to previous changes (1)

src/openhuman/inference/provider/compatible.rs

📝 Walkthrough

The PR adds deduplication logic to convert_tool_specs in the OpenAI-compatible provider to drop duplicate tool specs by name before serialization, logs warnings when duplicates are discarded, and validates the behavior with unit tests using a first-occurrence-wins strategy.

Changes

Tool Spec Deduplication at Provider Boundary

Layer / File(s)	Summary
Tool spec deduplication with HashSet and logging `src/openhuman/inference/provider/compatible.rs`, `src/openhuman/inference/provider/compatible_tests.rs`	`convert_tool_specs` deduplicates input tool specs by `name` using a `HashSet`, preserves first occurrence, drops later duplicates with a warning log, and returns the deduplicated serialized output. Tests verify None/empty handling, unique name passthrough, and deduplication across single and multiple duplicate scenarios.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

tinyhumansai/openhuman#2665: Related deduplication of tool specs by tool.name—that PR restores a test asserting deduplication at the run_tool_call_loop boundary.
tinyhumansai/openhuman#2485: Implements the same deduplication (first occurrence wins) in the sub-agent runner spec assembly path.

Suggested labels

rust-core, bug

Suggested reviewers

M3gA-Mind
oxoxDev

Poem

🐰 I hopped through specs one night,
Saw names repeated, blocking light,
A HashSet peeked — one kept, rest gone,
Warnings whispered at the dawn —
Now calls are tidy, names aligned, all right.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: deduplicating tool specs at the provider wire boundary to prevent OpenAI 400 errors from duplicate tool names.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/openhuman/inference/provider/compatible_tests.rs (1)

1593-1618: 💤 Low value

Consider verifying parameters field in first-wins assertion.

The test correctly validates that the first occurrence's description survives deduplication. For completeness, you could also assert that parameters from the first occurrence is retained (not the duplicate's {"different": true}).

♻️ Optional enhancement

     assert_eq!(
         out[0]["function"]["description"].as_str().unwrap(),
         "alpha desc",
         "first occurrence's description must survive (first-wins)"
     );
+    assert_eq!(
+        out[0]["function"]["parameters"]["type"].as_str().unwrap(),
+        "object",
+        "first occurrence's parameters must survive (first-wins)"
+    );

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/inference/provider/compatible_tests.rs` around lines 1593-1618,
update the test convert_tool_specs_dedups_duplicate_names_first_wins to also
assert that the first occurrence's `parameters` survive deduplication: after
building `specs` and calling
OpenAiCompatibleProvider::convert_tool_specs(Some(&specs)) to obtain `out`, add
an assertion that out[0]["function"]["parameters"] equals the parameters from
the original spec("alpha") (and/or assert it does not equal
second_alpha.parameters, i.e. the serde_json::json!({"different": true}) value).
This ensures `parameters` follows the same first-wins behavior as `description`.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 63e714ec-9b72-4efe-9866-256226f7ca3b

📥 Commits

Reviewing files that changed from the base of the PR and between 8365e0c and af40687.

📒 Files selected for processing (2)

src/openhuman/inference/provider/compatible.rs
src/openhuman/inference/provider/compatible_tests.rs

CodeGhost21

Good fix for a real production issue. The implementation is correct and the test coverage is thorough.

Walkthrough: convert_tool_specs previously passed all tool specs to the wire verbatim, causing 400 errors when duplicate names were present. This PR adds a single-pass dedup using HashSet<&str> (borrows from input, no extra allocations on the happy path) with a log::warn! when duplicates are actually dropped. First-wins semantics match the upstream dedup_visible_tool_specs convention at session/builder.rs:44. The 5 unit tests cover None input, empty slice, unique passthrough, first-wins dedup verifying the surviving entry's description, and many-duplicates collapse.

One minor note: the log message embeds the Sentry issue ID TAURI-RUST-2E directly in the string. That's good for current traceability, but it will silently go stale if the issue is re-keyed or the project migrates. A plain-language description plus the ID as a secondary annotation (or a code comment on the warn! call) would age better. Not a blocker.
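
One way to act on that note, sketched with std only: keep the tracking reference in a single constant and put the plain-language description first. `TRACKING_REF` and `duplicate_warning` are hypothetical names introduced here, the message wording is an assumption, and the caller would hand the returned string to whatever logger it uses.

```rust
/// Hypothetical tracking reference kept in one place (assumption: the ID
/// quoted in this PR) so it can be updated or removed without touching
/// the message wording.
const TRACKING_REF: &str = "TAURI-RUST-2E";

/// Builds the warning text, or `None` when nothing was dropped, so a
/// request without duplicates produces no log line at all.
fn duplicate_warning(dropped: &[&str]) -> Option<String> {
    if dropped.is_empty() {
        return None;
    }
    Some(format!(
        "dropped {} duplicate tool spec(s) before sending: {:?} (ref: {})",
        dropped.len(),
        dropped,
        TRACKING_REF
    ))
}
```

If the issue is ever re-keyed, only the constant changes; the description still explains the warning on its own.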

Overall the fix is well-reasoned, well-tested, and the performance analysis in the PR description is accurate. No issues with the implementation.

CodeGhost21

Looks good, nice work!

graycyrus

Solid fix for a real production problem. The dedup logic is exactly right: single-pass, first-wins, HashSet<&str> borrows the input slice so no extra allocations on the happy path — cleaner than the HashSet<String> in dedup_visible_tool_specs upstream since ownership isn't needed here.

The log::warn! on the dropped-names branch is the right call. When this fires it means a caller bypassed the session-layer dedup — that's a real upstream bug worth investigating, not log noise. Rate is bounded by whatever's producing the duplicates.

Five tests cover all the meaningful cases. The first-wins assertion (checking description of the surviving entry) is exactly the right thing to verify.

Two small observations, not blocking:

The Sentry issue ID TAURI-RUST-2E is hardcoded in the log message. That's fine for now, but if the issue is closed or migrated the string becomes stale. A comment linking to the issue is usually more durable than embedding the ID in a log string — but this is a style preference, not a bug.

The dropped Vec<&str> is allocated on every call even when nothing is dropped. Since Vec::new() doesn't heap-allocate until the first push, this is effectively free — but Vec::new() (no with_capacity) would make the zero-drop path even more explicit. Again, not a real concern at this call frequency.
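
The allocation point is easy to check with the standard library alone. The Rust documentation states that `Vec::new()` does not allocate until elements are pushed, while `Vec::with_capacity(n)` reserves space for at least `n` elements. `capacities` below is a hypothetical helper written only to show that difference.

```rust
/// Returns the capacities of an empty `Vec::new()` and of a vector created
/// with `Vec::with_capacity(n)`, to show which constructor reserves memory.
fn capacities(n: usize) -> (usize, usize) {
    let lazy: Vec<&str> = Vec::new();
    let eager: Vec<&str> = Vec::with_capacity(n);
    (lazy.capacity(), eager.capacity())
}
```

For `n = 8` the first value is 0 and the second is at least 8, which is why the `dropped` vector in the PR costs nothing on requests without duplicates.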

CI is green across the board. Good work tracking down the bypass paths and closing them at the chokepoint.

coderabbitai · 2026-05-28T22:21:30Z

Actionable comments posted: 0

M3gA-Mind

LGTM. Reviewed the diff thoroughly:

Implementation (convert_tool_specs): correct first-wins dedup using HashSet<&str>, mirrors the existing dedup_visible_tool_specs convention in session/builder.rs exactly. Single-pass O(n) with pre-allocated capacity. log::warn! on drop is correctly scoped (no per-call spam when empty).
Tests: 5 cases covering None, empty slice, unique passthrough, first-wins semantics (verifies description from first occurrence survives), and many-duplicates. All pass locally.
Conflict resolution: rebased onto upstream/main, kept both upstream's reasoning_content round-trip tests and this PR's convert_tool_specs dedup tests.
CI: all checks pass (coverage gate ≥ 80%, all E2E, Rust + TS quality).

YellowSnnowmann marked this pull request as ready for review May 28, 2026 12:34

YellowSnnowmann requested a review from a team May 28, 2026 12:34

coderabbitai Bot added the rust-core and bug labels May 28, 2026

coderabbitai Bot reviewed May 28, 2026

coderabbitai Bot previously approved these changes May 28, 2026

CodeGhost21 reviewed May 28, 2026

CodeGhost21 previously approved these changes May 28, 2026

graycyrus previously approved these changes May 28, 2026

YellowSnnowmann added 2 commits May 29, 2026 03:41

fix(inference): handle duplicate tool specifications in OpenAiCompatibleProvider and log warnings for dropped entries

f8c7e18

test(inference): add unit tests for convert_tool_specs in OpenAiCompatibleProvider to ensure proper handling of input and deduplication

00a36d7

M3gA-Mind dismissed stale reviews from graycyrus, CodeGhost21, and coderabbitai[bot] via 00a36d7 May 28, 2026 22:19

M3gA-Mind force-pushed the fix/compatible-dedup-tool-specs-at-wire branch from af40687 to 00a36d7 Compare May 28, 2026 22:19

coderabbitai Bot approved these changes May 28, 2026

M3gA-Mind approved these changes May 28, 2026

M3gA-Mind merged commit a211cac into tinyhumansai:main May 28, 2026
49 of 74 checks passed
