This is the most-starred MCP in the evalops org (105★ at time of filing). Its core value is model-escalation: Claude Code hands off to Gemini for huge-context sweeps. Today nobody sees the cost side of that escalation — every user pays Anthropic + Google directly with no central visibility.
Ask
Wire two optional platform integrations (env-gated, no behavior change when unset):
-
meter: fire `meter.v1.MeterService/RecordUsage` on every completed Claude call AND every completed Gemini call with:
- `model` = resolved model id (e.g. `claude-opus-4-7`, `gemini-2.5-pro`)
- `provider` = "anthropic" or "google"
- `surface` = "deep-code-reasoning-mcp"
- `event_type` = "analysis" / "escalation" / "conversation"
- `input_tokens`, `output_tokens`, `total_cost_usd`, `metadata.escalation_reason`
-
traces: record one trace per analysis request with spans for each Claude ↔ Gemini turn. Emit `AnnotateTraceQuality` with `quality_per_dollar` so the cost-of-escalation is queryable.
Payoff
- First read surface for this MCP's usage — no other repo has visibility into it today
- Feeds `evalops/console#15` attribution panel (per-agent revenue via `agent_id = "deep-code-reasoner"`)
- Token budgets from `evalops/platform#611` (llm-gateway budgets) become enforceable for MCP surfaces
- The structural story "evalops's 105-star wedge produces data flywheel signal" becomes true
Non-goals
- Not required by default — MIT users who never set the env vars see no change
- Not gating the Claude/Gemini calls; purely telemetry
Related
Scope
~3-4 days.
This is the most-starred MCP in the evalops org (105★ at time of filing). Its core value is model-escalation: Claude Code hands off to Gemini for huge-context sweeps. Today nobody sees the cost side of that escalation — every user pays Anthropic + Google directly with no central visibility.
Ask
Wire two optional platform integrations (env-gated, no behavior change when unset):
meter: fire `meter.v1.MeterService/RecordUsage` on every completed Claude call AND every completed Gemini call with:
traces: record one trace per analysis request with spans for each Claude ↔ Gemini turn. Emit `AnnotateTraceQuality` with `quality_per_dollar` so the cost-of-escalation is queryable.
Payoff
Non-goals
Related
Scope
~3-4 days.