
Conversation


@berkcaputcu commented May 17, 2025

Addresses one of the issues raised in #686

Problem

The Ollama LLM returns the final chunk in several parts when the chunk is too long. I have only noticed this behaviour on the final chunk; I'm not sure whether it happens on other chunks as well.

Solution

Improve the `json_responses_chunk_handler` to gracefully handle cases where a JSON chunk is split across buffer boundaries. If a chunk does not end with '}', it is considered incomplete and buffered until the next chunk arrives. This prevents JSON parsing errors and ensures all responses are processed correctly.

I took part of the solution from this diff: https://github.com/patterns-ai-core/langchainrb/pull/644/files#diff-746ba2cd57580e32b0f013cbe3c8eaf8f1621e112c89f3af07983321dd6846dbL143-L148
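For context, here is a minimal sketch of the buffering idea described above. The method name `buffering_chunk_handler` and its signature are hypothetical, chosen for illustration; the gem's actual `json_responses_chunk_handler` may be structured differently.

```ruby
require "json"

# Sketch only: accumulate raw chunks and defer parsing until the buffered
# text looks like a complete JSON object (ends with '}').
def buffering_chunk_handler(&block)
  buffer = +""

  proc do |raw_chunk|
    buffer << raw_chunk

    # A chunk that does not end with '}' is treated as incomplete and kept
    # in the buffer until the next chunk arrives.
    next unless buffer.rstrip.end_with?("}")

    # Ollama streams newline-delimited JSON, so parse each complete line.
    buffer.each_line do |line|
      next if line.strip.empty?

      block.call(JSON.parse(line))
    end
    buffer = +""
  end
end

# Usage sketch: a JSON object split across two chunks.
handler = buffering_chunk_handler { |parsed| p parsed }
handler.call('{"model":"llama3","respo') # incomplete, buffered
handler.call('nse":"Hi","done":true}')   # prints {"model"=>"llama3", "response"=>"Hi", "done"=>true}
```

Keying completeness off a trailing '}' is a heuristic that fits Ollama's newline-delimited JSON stream; a stricter variant would attempt `JSON.parse` and keep buffering on failure.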

berkcaputcu and others added 3 commits May 17, 2025 13:32
Improve the `json_responses_chunk_handler` to gracefully handle cases
where a JSON chunk is split across buffer boundaries. If a chunk does
not end with '}', it is considered incomplete and buffered until the
next chunk arrives. This prevents JSON parsing errors and ensures all
responses are processed correctly.
@sergiobayona
Collaborator

@berkcaputcu I made some additional changes against your repo in berkcaputcu#1. Please review, and if it looks okay, please merge it so we can update this PR with those changes.

@berkcaputcu
Author

@sergiobayona I noticed that you closed that PR. Do you want me to re-open it, merge, and update this branch?

@luizcarvalho

I'm eager to see this PR implemented!!! 👯

@jinshen-cn
Contributor

I'm eager to see this PR implemented too!!! 👯
