How to stream sub-agent responses in real-time to reduce workflow latency #3326
Response from ADK Answering Agent (experimental, answer may be inaccurate)

TLDR: While ADK's native streaming is primarily for user-to-agent interaction, you can achieve a similar real-time effect between agents by re-architecting your workflow. Instead of generating the entire plan at once, use a

Hi @parthiaitech,

That's an excellent question about optimizing agent workflows. Your goal to reduce latency by streaming the Planner's output to the Executor is a common challenge in complex agentic systems.

Currently, ADK's powerful bidirectional streaming capabilities, often initiated with

For communication between sub-agents running in the same process, the standard mechanism is passing data via a shared

Recommended Approach: Step-by-Step Generation in a Loop

Instead of waiting for the full plan, you can implement a "generative loop" pattern that processes the plan one step at a time. This approach aligns well with ADK's existing orchestration tools. Here’s how you could structure it:
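A minimal, framework-independent sketch of such a generative loop follows. It assumes two hypothetical coroutines, `plan_next_step` and `execute_step`, that stand in for the Planner and Executor agents; they are placeholders for illustration, not ADK APIs, and the hard-coded step list only simulates what an LLM call would return.

```python
import asyncio
from typing import Optional


# Hypothetical stand-in for the Planner agent (not an ADK API).
async def plan_next_step(goal: str, history: list[str]) -> Optional[str]:
    """Return only the next step, given results so far, or None when done."""
    simulated_plan = ["fetch data", "clean data", "summarize"]
    await asyncio.sleep(0)  # placeholder for a real model call
    if len(history) < len(simulated_plan):
        return simulated_plan[len(history)]
    return None


# Hypothetical stand-in for the Executor agent (not an ADK API).
async def execute_step(step: str, history: list[str]) -> str:
    """Execute one step; earlier results are available through history."""
    await asyncio.sleep(0)  # placeholder for real tool or model work
    return f"done: {step}"


async def run_generative_loop(goal: str, max_steps: int = 10) -> list[str]:
    """Alternate plan-one-step and execute-one-step until the planner stops."""
    history: list[str] = []
    for _ in range(max_steps):  # hard cap so a confused planner cannot loop forever
        step = await plan_next_step(goal, history)
        if step is None:
            break
        # Execution starts as soon as a single step exists,
        # instead of waiting for the complete plan.
        history.append(await execute_step(step, history))
    return history


if __name__ == "__main__":
    print(asyncio.run(run_generative_loop("build a report")))
```

Because each planning call sees the results of the steps already executed, the dependency between consecutive steps is preserved while the Executor no longer waits for the full plan. In ADK, the same shape could plausibly be mapped onto a loop-style workflow agent with the two sub-agents exchanging data through shared session state, though that mapping is an assumption here rather than something the thread confirms.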
This architecture effectively creates a real-time, step-by-step flow between your agents, achieving your goal of overlapping planning and execution to reduce overall latency. While not a direct token-level stream between the LLM's output and the next agent, it provides the same functional benefit for this kind of sequential task execution.

I hope this implementation guidance is helpful! Let us know if you have more questions.

[1] Custom Asynchronous Web App with ADK Streaming and FastAPI
Hi team,
I’m currently building a multi-agent workflow using Google ADK. It consists of one Supervisor Agent and two sub-agents: a Planner Agent and an Executor Agent. The structure looks like this:
SupervisorAgent = LlmAgent(
    name="supervisor_agent",
    model=LiteLlm(model=llm_model),
    description="Coordinates and monitors tasks across planner_agent, and executor_agent sub-agents.",
    instruction=instruction,
    sub_agents=[planner_agent, executor_agent],
    before_model_callback=simple_before_model_modifier,
    after_model_callback=simple_after_model_modifier,
    generate_content_config=types.GenerateContentConfig(temperature=0.1),
)
Current flow:
Since each step’s execution depends on the output of the previous step, full parallelization isn’t ideal. However, the current approach waits for the entire plan before the Executor Agent starts, which increases latency.
Goal:
I’d like to stream the Planner Agent’s response to the Executor Agent in real time, step by step as the plan is being generated, instead of waiting for the entire plan to finish.
Question:
Is there any recommended way or built-in support in ADK for agent-to-agent text streaming (or partial output forwarding) between sub-agents?
If not, is there a best practice or workaround to implement this efficiently (e.g., through async streaming callbacks or SSE integration)?
Any insights, examples, or implementation pointers would be greatly appreciated!
Thank you!