
Task tool blocks parent session indefinitely when sub-agent LLM call fails #29952

@Cothek

Description

I'm running opencode 1.15.10 on Windows 11. I have a delegator agent that dispatches tasks to @general (a subagent using the opencode free tier, deepseek-v4-flash-free). When the free tier API call fails, the subagent session never gets cleaned up and the task tool blocks the parent session indefinitely.

What happens

  1. Dispatcher calls task() with subagent_type="general"
  2. Subagent session is created
  3. LLM call to the free provider fails in ~110ms with AI_APICallError
  4. The error is logged but the subagent session just sits there
  5. The task tool never returns control to the parent
  6. The parent session is now permanently stuck — can't make any more tool calls
  7. Only way out is to kill opencode and restart

I let it sit for over 6 minutes to confirm. It does not recover.

Logs

From ~/.local/share/opencode/log/2026-05-29T215503.log:

# Session created
INFO  23:52:35  session id=ses_189d892a... parentID=ses_18b755b34ffe... created
# LLM stream starts
INFO  23:52:36  llm providerID=opencode modelID=deepseek-v4-flash-free stream
# Error at +110ms
ERROR 23:52:36 +110ms  llm providerID=opencode modelID=deepseek-v4-flash-free
       error={"error":{"name":"AI_APICallError",
              "url":"https://opencode.ai/zen/v1/chat/completions"}}
# Nothing happens for 6+ minutes. No cleanup. No retry.
# The parent session is blocked the entire time.
# Finally forced cancel at +6m14s
INFO  23:58:50  session.prompt session.id=ses_18b755b34ffe... cancel
INFO  23:58:50  session.prompt session.id=ses_189d892a... cancel
ERROR 23:58:50  session.processor session.id=ses_189d892a... error=Aborted process

The error happens in 110ms. The subagent session lives for 6+ minutes with no timeout.

The problem

Two things going wrong:

  1. The subagent session is never cleaned up after a fatal LLM error. The AI_APICallError is logged but nothing terminates the session. It just sits there in zombie mode forever.
  2. The task tool has no timeout and no cancel mechanism. Once you dispatch a task, there's no way to set a maximum wait time or abort it. If the subagent never finishes, the parent session is stuck permanently.

This makes a failure from any unreliable provider unrecoverable. It doesn't matter whether the error is transient: the first API error locks everything up.
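
The first point can be illustrated with a minimal TypeScript sketch. This is not opencode's actual code: `SubagentSession`, `TaskResult`, and both handler names are hypothetical, and the model assumes the parent's task tool waits for some kind of "settled" notification from the subagent. It only contrasts logging an LLM error with logging it and also settling the session so the waiting parent is released.

```typescript
// Hypothetical model of a subagent session; not opencode's real types.
type TaskResult =
  | { status: "completed"; output: string }
  | { status: "failed"; error: string };

class SubagentSession {
  private result: TaskResult | undefined;
  private waiters: Array<(r: TaskResult) => void> = [];

  // The parent's task tool registers here and is called once the session settles.
  onSettled(cb: (r: TaskResult) => void): void {
    if (this.result) cb(this.result);
    else this.waiters.push(cb);
  }

  // Settle at most once and notify everyone who is waiting.
  private settle(r: TaskResult): void {
    if (this.result) return;
    this.result = r;
    const pending = this.waiters;
    this.waiters = [];
    for (const cb of pending) cb(r);
  }

  // Behaviour consistent with the logs: the error is recorded, nobody is notified.
  handleLlmErrorLogOnly(err: Error, log: string[]): void {
    log.push(`ERROR llm ${err.name}`);
  }

  // Expected behaviour: record the error, then settle so the parent unblocks.
  handleLlmErrorAndSettle(err: Error, log: string[]): void {
    log.push(`ERROR llm ${err.name}`);
    this.settle({ status: "failed", error: err.name });
  }
}
```

With the log-only handler, a callback registered through `onSettled` is never invoked, which would be consistent with the 6+ minute hang in the logs above. With the settling handler, the parent receives a `failed` result as soon as the error occurs and can decide what to do with it.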

What I'd expect

Any of these would fix it:

  • When a subagent's LLM call fails with a non-recoverable error, terminate the session and return the error to the parent so it can handle it
  • Add a timeout parameter to the task tool so the parent can say "give up after N seconds"
  • Some server-side garbage collection for zombie subagent sessions
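
The timeout option could look roughly like the TypeScript sketch below. It is an illustration under assumptions, not a proposal for opencode's real API: `runWithTimeout`, `TaskTimeoutError`, and the `start` callback are hypothetical names, and the sketch assumes the subagent run can be expressed as a promise that accepts an `AbortSignal`. It uses only standard `Promise.race`, `setTimeout`, and `AbortController`.

```typescript
// Hypothetical error type for a task that exceeded its deadline.
class TaskTimeoutError extends Error {
  timeoutMs: number;
  constructor(timeoutMs: number) {
    super(`task did not finish within ${timeoutMs}ms`);
    this.name = "TaskTimeoutError";
    this.timeoutMs = timeoutMs;
  }
}

// Runs `start` with an AbortSignal and rejects if it has not settled after
// `timeoutMs`. On timeout the signal is aborted so the subagent can clean up.
function runWithTimeout<T>(
  start: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Rejects once the deadline passes; never resolves.
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort();
      reject(new TaskTimeoutError(timeoutMs));
    }, timeoutMs);
  });

  // Wrapping the call turns a synchronous throw from `start` into a rejection.
  const work = new Promise<T>((resolve) => resolve(start(controller.signal)));

  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([work, deadline]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Under this sketch, a subagent that never settles would surface to the parent as a `TaskTimeoutError` after the chosen limit instead of blocking it indefinitely. A timeout only bounds the wait, so it complements the first option (returning the error immediately) rather than replacing it.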

