[Telemetry] Inputs/outputs not exported to Genkit Monitoring for run spans #4889

@rossano-montori-elp

Description

Is your feature request related to a problem? Please describe.

Yes. When custom logic is wrapped in an ai.run() block, the input/output payloads are not visible in the production Firebase Genkit UI. Instead, the spans appear empty.
ai.generate() preserves its input/output in the logged traces, but ai.run() doesn't, so the Firebase Genkit UI can't stitch together the custom step.

Core Genkit primitives (such as flows/generate/index/embed paths) appear to retain richer payload visibility via additional logging/telemetry integration, but generic .run() steps do not consistently get equivalent treatment.

This creates a local-vs-production observability gap: debugging custom multi-step logic is straightforward in local Genkit UI, but significantly harder once deployed.

Describe the solution you'd like

Please add support for .run() step payload observability in production using a structured logging mechanism similar to existing telemetry handling for other Genkit primitives.

Specifically, when a .run() step executes, emit structured log entries containing step input/output metadata in a format that allows the Firebase Genkit UI to reconstruct or surface those values in its Input/Output panels.
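To make the request more concrete, here is a minimal sketch of what such a structured log entry could look like. Everything in it is an assumption for illustration: the `RunStepLogEntry` type, its field names, and the `buildRunStepLogEntry` helper are hypothetical and are not part of Genkit. The idea is only that each entry carries the trace and span identifiers, so a log-based UI could join the payload back to the matching span.

```typescript
// Hypothetical shape of a structured log entry for a .run() step.
// None of these field names come from Genkit; they are illustrative only.
interface RunStepLogEntry {
  kind: "run-step";
  stepName: string;
  traceId: string;
  spanId: string;
  input: unknown;
  output: unknown;
}

// Build one JSON log line that a UI could match to a span via traceId/spanId.
function buildRunStepLogEntry(
  stepName: string,
  ids: { traceId: string; spanId: string },
  input: unknown,
  output: unknown
): string {
  const entry: RunStepLogEntry = {
    kind: "run-step",
    stepName,
    traceId: ids.traceId,
    spanId: ids.spanId,
    input,
    output,
  };
  return JSON.stringify(entry);
}
```

Emitting the payload as a log entry rather than as span attributes follows the approach described above for other primitives; whether the UI would key on these exact fields is not something this sketch can confirm.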

Describe alternatives you've considered

  1. Convert .run() blocks into sub-flows
    This can restore better telemetry visibility, but introduces significant boilerplate (defineFlow, schemas, extra indirection) for simple internal steps.

  2. Manual logging inside .run()
    Logging with logger.info(...) helps in Cloud Logging / Logs tab, but does not populate native Genkit Input/Output panels, fragmenting debugging across multiple views.
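For reference, alternative 2 can be sketched as a small wrapper. This is an illustrative workaround only, not a Genkit API: `runWithPayloadLog`, `StepRunner`, and `LogFn` are hypothetical names, and the step runner and logger are injected so the sketch does not depend on any particular Genkit version.

```typescript
// Signature of a step runner such as ai.run(name, fn); injected to keep the sketch self-contained.
type StepRunner = <T>(name: string, fn: () => Promise<T>) => Promise<T>;

// Signature of a structured logger; assumed to accept a message plus a payload object.
type LogFn = (message: string, payload: Record<string, unknown>) => void;

// Run a named step and log its input and output after it completes.
async function runWithPayloadLog<I, O>(
  run: StepRunner,
  log: LogFn,
  name: string,
  input: I,
  fn: (input: I) => Promise<O>
): Promise<O> {
  const output = await run(name, () => fn(input));
  log(`run step ${name}`, { step: name, input, output });
  return output;
}
```

In a flow, `run` would presumably be `(name, fn) => ai.run(name, fn)` and `log` a thin adapter over `logger.info(...)`. As noted above, the payload then appears only in Cloud Logging / the Logs tab, not in the Input/Output panels.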

Expected behavior

For deployed flows, .run() custom steps should provide usable input/output visibility in Firebase Genkit UI, comparable to other first-class Genkit execution units.

Actual behavior

For deployed flows, .run() custom step spans show missing input/output details in Firebase Genkit UI.

Steps to reproduce

  1. Define a flow with a .run() step that returns an object larger than a small span-attribute budget:
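import { genkit } from "genkit";

// Minimal Genkit instance; plugins omitted for brevity.
const ai = genkit({});

export const myFlow = ai.defineFlow(
  {
    name: "test-flow",
  },
  async () => {
    // Custom step whose input/output is missing in the deployed UI.
    const data = await ai.run("fetch-large-data", async () => {
      // Return an object
      return { somePayload: "123" };
    });

    return data;
  }
);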
  2. Run locally via genkit start and execute the flow.
  3. Observe that local tooling can show step payloads.
  4. Deploy to Firebase and execute the same flow.
  5. Open Firebase Genkit UI for the execution.
  6. Observe that the .run() step shows empty content.

Thanks for considering this. Improving .run() observability in production would make custom steps much more practical for complex AI pipelines without forcing developers to over-model internal logic as sub-flows.
