Soniox TTS throws 408 timeout and breaks the stream #6225

### Bug Description

The Soniox TTS plugin occasionally times out (408 errors). In version 1.5.17 this didn't break the stream; the agent utterance simply continued.

Since upgrading from version 1.5.17 to 1.6.x, the 408 breaks the stream entirely:


```
2026-06-24 15:06:30,110 - ERROR asyncio - Task exception was never retrieved
future: <Task finished name='Task-1631' coro=<_tts_inference_task() done, defined at /Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/utils/log.py:14> exception=APIStatusError('Request timeout', status_code=408, request_id=None, body='stream_id=160983c9d949 {"stream_id":"160983c9d949","error_code":408,"error_message":"Request timeout","error_type":"request_timeout","more_info":"https://soniox.com/docs/api-reference/errors#request-timeout","request_id":"fc0939a1-15e5-4d72-bbd9-ed5b111392c2"}', retryable=True)>
Traceback (most recent call last):
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 690, in __anext__
    val = await self._event_aiter.__anext__()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopAsyncIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/utils/log.py", line 17, in async_fn_logs
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/opentelemetry/util/_decorator.py", line 71, in async_wrapper
    return await func(*args, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/voice/generation.py", line 289, in _tts_inference_task
    async for audio_frame in tts_node:
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/voice/agent.py", line 523, in tts_node
    async for ev in stream:
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 693, in __anext__
    raise exc  # noqa: B904
    ^^^^^^^^^
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 433, in _traceable_main_task
    await self._main_task()
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 479, in _main_task
    await self._run(output_emitter)
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/plugins/soniox/tts.py", line 296, in _run
    await waiter
livekit.agents._exceptions.APIStatusError: message='Request timeout', status_code=408, retryable=True, body=stream_id=160983c9d949 {"stream_id":"160983c9d949","error_code":408,"error_message":"Request timeout","error_type":"request_timeout","more_info":"https://soniox.com/docs/api-reference/errors#request-timeout","request_id":"fc0939a1-15e5-4d72-bbd9-ed5b111392c2"} {"room": "test-agent-686eba8c5b697f2471ab13eb-1782306293287", "session_id": "RM_sKoGDi8TTeLm", "user_id": "+31613397955", "sip_phone_number": "test-686eba8c5b697f2471ab13eb", "agent_identification": "686eba8c5b697f2471ab13eb", "agent_name": "Vebego HR Agent", "organization_identification": "686ea79b5b697f2471ab13e8", "version_id": "9f6e3955-c8cb-4226-aa84-8a91a5102a50", "organization": "Vebégo Facility Solutions", "agent_id": "686eba8c5b697f2471ab13eb", "organization_id": "686ea79b5b697f2471ab13e8", "channel": "voice", "language": "nl", "pid": 42458, "job_id": "AJ_PkJaqhpfy4qy"}
Traceback (most recent call last):
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 690, in __anext__
    val = await self._event_aiter.__anext__()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopAsyncIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/utils/log.py", line 17, in async_fn_logs
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/opentelemetry/util/_decorator.py", line 71, in async_wrapper
    return await func(*args, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/voice/generation.py", line 289, in _tts_inference_task
    async for audio_frame in tts_node:
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/voice/agent.py", line 523, in tts_node
    async for ev in stream:
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 693, in __anext__
    raise exc  # noqa: B904
    ^^^^^^^^^
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 433, in _traceable_main_task
    await self._main_task()
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 479, in _main_task
    await self._run(output_emitter)
  File "/Users/xxxx/xxx/xxx.xx.xxx/.venv/lib/python3.12/site-packages/livekit/plugins/soniox/tts.py", line 296, in _run
    await waiter
livekit.agents._exceptions.APIStatusError: message='Request timeout', status_code=408, retryable=True, body=stream_id=160983c9d949 {"stream_id":"160983c9d949","error_code":408,"error_message":"Request timeout","error_type":"request_timeout","more_info":"https://soniox.com/docs/api-reference/errors#request-timeout","request_id":"fc0939a1-15e5-4d72-bbd9-ed5b111392c2"}

```

What I think:

In 1.5.17 the agent core called the TTS once per sentence/segment: each Soniox stream got its text, was finalized, and closed promptly.

In 1.6.x the core changed to one single TTS stream per whole turn. The LLM tokens are fed incrementally into that one long-lived stream, so between tokens (and especially while waiting for the next LLM chunk or a tool call) the Soniox stream sits open with no text and no text_end. Soniox's per-stream idle timeout fires on that gap → 408 Request timeout, which tears down the stream.

The change is probably in livekit-agents/livekit/agents/voice/generation.py, in the TTS inference path.
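To illustrate the per-sentence behavior described above, here is a minimal sketch that buffers incremental LLM tokens into complete sentences, so each segment could be sent to its own short-lived TTS stream. It is plain asyncio and does not use the livekit or Soniox APIs; `sentences_from_tokens` and the demo helpers are hypothetical names, and the regex is a deliberately naive sentence splitter.

```python
import asyncio
import re
from typing import AsyncIterator

# Naive sentence boundary: terminal punctuation followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


async def sentences_from_tokens(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Buffer incremental LLM tokens and yield complete sentences (hypothetical helper)."""
    buffer = ""
    async for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    # Flush whatever is left when the LLM output ends.
    if buffer.strip():
        yield buffer.strip()


async def _demo_tokens() -> AsyncIterator[str]:
    # Stand-in for an LLM token stream.
    for token in ["Hello", " there.", " How", " are", " you?", " Fine"]:
        yield token
        await asyncio.sleep(0)


async def collect_sentences() -> list[str]:
    return [s async for s in sentences_from_tokens(_demo_tokens())]
```

With this shape, a TTS stream would only be opened once a full sentence is available, so it would not sit idle while the LLM or a tool call is still producing text.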

We fixed this on our end by writing our own custom Soniox TTS plugin, but it would be good to have this resolved in the packages as well.

### Expected Behavior

The Soniox TTS stream needs to be handled correctly, as it seems Soniox TTS works inherently differently from other TTS providers.
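Since the log above flags the error as `retryable=True`, one possible shape for the expected behavior is to retry the affected segment rather than fail the whole turn. The following is only a sketch under that assumption, not the custom plugin mentioned above and not the livekit implementation; `RetryableTTSError`, `synthesize_with_retry`, and `make_flaky_synth` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable


class RetryableTTSError(Exception):
    """Hypothetical stand-in for an error flagged retryable=True."""


async def synthesize_with_retry(
    synthesize: Callable[[str], Awaitable[bytes]],
    segment: str,
    max_attempts: int = 3,
    backoff_s: float = 0.1,
) -> bytes:
    """Retry a single text segment instead of tearing down the whole turn."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await synthesize(segment)
        except RetryableTTSError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            await asyncio.sleep(backoff_s * attempt)  # Linear backoff.
    raise RuntimeError("unreachable")


def make_flaky_synth(failures: int):
    """Build a fake synthesizer that times out `failures` times, then succeeds."""
    calls = {"n": 0}

    async def synth(text: str) -> bytes:
        calls["n"] += 1
        if calls["n"] <= failures:
            raise RetryableTTSError("Request timeout")
        return text.encode()

    return synth, calls
```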

### Reproduction Steps

```bash
See the bug description above.
Preemptive generation is enabled as well, but shouldn't be the cause.
```
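To check the idle-gap hypothesis while reproducing, it may help to measure the longest pause between text chunks that reach the TTS stream. This is a minimal diagnostic sketch in plain asyncio; `GapRecorder` and the demo helpers are hypothetical names, and wiring it around the real text stream is left open.

```python
import asyncio
import time
from typing import AsyncIterator


class GapRecorder:
    """Track the longest pause between successive chunks (hypothetical helper)."""

    def __init__(self) -> None:
        self.max_gap_s = 0.0

    async def wrap(self, chunks: AsyncIterator[str]) -> AsyncIterator[str]:
        # The first gap is measured from the moment iteration starts.
        last = time.monotonic()
        async for chunk in chunks:
            now = time.monotonic()
            self.max_gap_s = max(self.max_gap_s, now - last)
            last = now
            yield chunk


async def _slow_tokens(delay_s: float) -> AsyncIterator[str]:
    yield "Een"
    await asyncio.sleep(delay_s)  # Simulates waiting on the LLM or a tool call.
    yield " moment."


async def measure(delay_s: float) -> float:
    recorder = GapRecorder()
    async for _ in recorder.wrap(_slow_tokens(delay_s)):
        pass
    return recorder.max_gap_s
```

If the recorded maximum gap lines up with the moments the 408 appears, that would support the idle-timeout explanation.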

### Operating System

macOS, Linux

### Models Used

Soniox STT, TTS, OpenAI GPT 4.1

### Package Versions

```bash
1.5.17 for all vs 1.6.2 for all
```

### Session/Room/Call IDs

_No response_

### Proposed Solution

_No response_

### Additional Context

_No response_

### Screenshots and Recordings

_No response_
