Streaming with OpenAIChatCompletionClient raises an 'empty_chunk' exception when usage is requested, fixed using max_consecutive_empty_chunk_tolerance
#5078 · Open · auto-d opened this issue on Jan 16, 2025 · 5 comments
The model client documentation suggests this fix for missing tokens in an OpenAIChatCompletionClient streaming response:
set extra_create_args={"stream_options": {"include_usage": True}},
However, the final message from the server, which carries the requested usage information, raises an exception because the streaming processor implementation (openai/_openai_client.py) treats it as an 'empty chunk'.
Code to reproduce in autogen 0.4.1:
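import asyncio
from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main() -> None:
    # api_key is assumed to be defined elsewhere
    model_client = OpenAIChatCompletionClient(model="gpt-4o-mini", api_key=api_key)
    stream = model_client.create_stream(
        messages=[UserMessage(content="Tell me a story about pirates", source="user")],
        extra_create_args={"stream_options": {"include_usage": True}},
    )
    async for response in stream:
        print(response)
    # ValueError raised by logic in _openai_client.py's `create_stream`.

asyncio.run(main())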
The workaround for me was setting max_consecutive_empty_chunk_tolerance to 2. Per the code comments, that parameter was added to fix a problem with the Azure endpoint.
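To illustrate why the usage chunk trips the check, here is a minimal, self-contained sketch of an empty-chunk guard. It is not the code in _openai_client.py: `FakeChunk`, `consume`, and `run` are hypothetical names, and the exact threshold comparison is an assumption. The one fact it relies on is that, with `include_usage` enabled, the OpenAI API sends a final chunk whose `choices` list is empty and whose `usage` field is populated.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in for one streamed chat completion chunk."""
    choices: List[str] = field(default_factory=list)
    usage: Optional[dict] = None


def consume(chunks: Iterable[FakeChunk], tolerance: int = 0) -> Iterator[str]:
    """Yield text pieces; raise once more than `tolerance` consecutive
    chunks arrive with no choices (assumed semantics, for illustration)."""
    empty_run = 0
    for chunk in chunks:
        if not chunk.choices:
            empty_run += 1
            if empty_run > tolerance:
                raise ValueError("too many consecutive empty chunks")
            continue
        empty_run = 0  # a non-empty chunk resets the counter
        yield from chunk.choices


# Two content chunks followed by the usage-only chunk (empty choices).
stream = [
    FakeChunk(choices=["Once "]),
    FakeChunk(choices=["upon a time"]),
    FakeChunk(usage={"prompt_tokens": 12, "completion_tokens": 3}),
]


def run(tolerance: int) -> str:
    """Join the streamed text, or report the error raised by the guard."""
    try:
        return "".join(consume(stream, tolerance))
    except ValueError as exc:
        return f"error: {exc}"


print(run(0))
print(run(2))
```

Under these assumptions, a tolerance of 0 rejects the usage-only chunk, while a tolerance of 2 lets it pass, which matches the behavior described above.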
Could you submit a PR to update the API docs of the create_stream methods of both OpenAIChatCompletionClient and AzureOpenAIChatCompletionClient on your error resolution? cc @MohMaz
ekzhu changed the title from "Streaming with OpenAIChatCompletionClient raises an exception when usage is requested" to "Streaming with OpenAIChatCompletionClient raises an 'empty_chunk' exception when usage is requested, fixed using max_consecutive_empty_chunk_tolerance" on Jan 16, 2025
@ekzhu both of those classes inherit their create_stream implementation from BaseOpenAIChatCompletionClient. Do you want the documentation updated there or in the respective class docs?