emit platform meter + traces telemetry on Gemini escalation paths #31

@haasonsaas

This is the most-starred MCP in the evalops org (105★ at time of filing). Its core value is model escalation: Claude Code hands off to Gemini for huge-context sweeps. Today nobody sees the cost side of that escalation: every user pays Anthropic and Google directly, with no central visibility.

Ask

Wire two optional platform integrations (env-gated, no behavior change when unset):

  1. meter: fire `meter.v1.MeterService/RecordUsage` on every completed Claude call AND every completed Gemini call with:

    • `model` = resolved model id (e.g. `claude-opus-4-7`, `gemini-2.5-pro`)
    • `provider` = "anthropic" or "google"
    • `surface` = "deep-code-reasoning-mcp"
    • `event_type` = "analysis" / "escalation" / "conversation"
    • `input_tokens`, `output_tokens`, `total_cost_usd`, `metadata.escalation_reason`
  2. traces: record one trace per analysis request with spans for each Claude ↔ Gemini turn. Emit `AnnotateTraceQuality` with `quality_per_dollar` so the cost-of-escalation is queryable.
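
A minimal sketch of what the two payloads could look like, under stated assumptions. The field names (`model`, `provider`, `surface`, `event_type`, token counts, `total_cost_usd`, `metadata.escalation_reason`, `quality_per_dollar`) come from the list above. Everything else is hypothetical: the `EVALOPS_METER_ENDPOINT` env var name, the `CompletedCall` shape, and the definition of `quality_per_dollar` as a quality score divided by total spend. The actual `RecordUsage` and `AnnotateTraceQuality` transports are left out.

```typescript
type Provider = "anthropic" | "google";
type EventType = "analysis" | "escalation" | "conversation";
type Env = Record<string, string | undefined>;

// Hypothetical shape of a finished Claude or Gemini call inside the MCP.
interface CompletedCall {
  model: string;
  provider: Provider;
  eventType: EventType;
  inputTokens: number;
  outputTokens: number;
  totalCostUsd: number;
  escalationReason?: string;
}

// Fields requested in the issue for meter.v1.MeterService/RecordUsage.
interface UsageEvent {
  model: string;
  provider: Provider;
  surface: "deep-code-reasoning-mcp";
  event_type: EventType;
  input_tokens: number;
  output_tokens: number;
  total_cost_usd: number;
  metadata: Record<string, string>;
}

// Assumed env var name; the issue only says the integration is env-gated.
const METER_ENDPOINT_VAR = "EVALOPS_METER_ENDPOINT";

// Returns null when the gate is unset, so callers skip telemetry entirely.
function buildUsageEvent(env: Env, call: CompletedCall): UsageEvent | null {
  if (!env[METER_ENDPOINT_VAR]) return null;
  const metadata: Record<string, string> = {};
  if (call.escalationReason) {
    metadata["escalation_reason"] = call.escalationReason;
  }
  return {
    model: call.model,
    provider: call.provider,
    surface: "deep-code-reasoning-mcp",
    event_type: call.eventType,
    input_tokens: call.inputTokens,
    output_tokens: call.outputTokens,
    total_cost_usd: call.totalCostUsd,
    metadata,
  };
}

// Assumed definition: quality score divided by the summed cost of all spans.
// Returns null when nothing was spent, to avoid dividing by zero.
function qualityPerDollar(qualityScore: number, spanCostsUsd: number[]): number | null {
  const total = spanCostsUsd.reduce((sum, cost) => sum + cost, 0);
  return total > 0 ? qualityScore / total : null;
}
```

Passing the environment in as a plain record, rather than reading it globally, keeps the gate easy to test: an empty record yields `null` and no event is built.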

Payoff

  • First read surface for this MCP's usage — no other repo has visibility into it today
  • Feeds `evalops/console#15` attribution panel (per-agent revenue via `agent_id = "deep-code-reasoner"`)
  • Token budgets from `evalops/platform#611` (llm-gateway budgets) become enforceable for MCP surfaces
  • The structural story "evalops's 105-star wedge produces data flywheel signal" becomes true

Non-goals

  • Not required by default — MIT users who never set the env vars see no change
  • Not gating the Claude/Gemini calls; purely telemetry
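
One way to keep telemetry from gating the model calls is a fire-and-forget wrapper. The sketch below is illustrative only: `withTelemetry` and its `record` callback are hypothetical names, not part of the existing codebase. The recorder is `null` when the env vars are unset, and any recorder failure is swallowed so the Claude/Gemini result is returned unchanged.

```typescript
// Hypothetical helper: run the model call, then attempt telemetry without
// letting a telemetry failure or delay affect the caller.
async function withTelemetry<T>(
  modelCall: () => Promise<T>,
  record: ((result: T) => Promise<void>) | null,
): Promise<T> {
  const result = await modelCall();
  if (record !== null) {
    // Not awaited: both synchronous throws and rejected promises from the
    // recorder are caught and discarded.
    void Promise.resolve()
      .then(() => record(result))
      .catch(() => undefined);
  }
  return result;
}
```

Errors from the model call itself still propagate as before; only the telemetry path is isolated.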

Related

Scope

~3-4 days.

Labels

enhancement