Optimizing Chat Memory with ConversationSummaryBufferMemory for Reduced Latency and Context Size Using GroqLLM #89
Description
The previous implementation used "ChatMessageHistory", which stores every role-tagged message in full. The context sent to the model therefore grows with each turn, which can increase latency, exceed token limits, and raise the risk of hallucination. This PR replaces "ChatMessageHistory" with "ConversationSummaryBufferMemory". It summarizes older messages, retaining the important parts of the conversation, and keeps the most recent messages in a buffer with their roles intact, capped at the configured maximum token limit. Sending a short summary plus the recent turns, instead of the full raw history, should reduce both latency and hallucination risk. The "ConversationSummaryBufferMemory" instance is backed by a Groq LLM because of its fast response times, which should further reduce latency and help maintain user engagement.
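To make the mechanism concrete, here is a minimal standard-library sketch of the summary-buffer idea described above. It is not the code from this PR and not LangChain's implementation: `SummaryBuffer`, `count_tokens`, and `stub_summarize` are hypothetical names, the whitespace token count is a rough stand-in for a real tokenizer, and the stub summarizer stands in for the Groq LLM call. In this sketch the token limit applies to the recent-message buffer only.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting approximates the model's tokenizer.
    return len(text.split())


def stub_summarize(previous_summary: str, messages: List[Message]) -> str:
    # Hypothetical stand-in for the LLM summarization call.
    # It only truncates each message; a real summarizer would condense meaning.
    notes = "; ".join(f"{role}: {content[:20]}" for role, content in messages)
    return f"{previous_summary} | {notes}" if previous_summary else notes


@dataclass
class SummaryBuffer:
    max_token_limit: int
    summarize: Callable[[str, List[Message]], str] = stub_summarize
    summary: str = ""
    buffer: List[Message] = field(default_factory=list)

    def buffer_tokens(self) -> int:
        return sum(count_tokens(content) for _, content in self.buffer)

    def add(self, role: str, content: str) -> None:
        # Store the new message verbatim, then fold overflow into the summary.
        self.buffer.append((role, content))
        self._prune()

    def _prune(self) -> None:
        # Move the oldest messages out of the buffer until it fits the limit.
        pruned: List[Message] = []
        while self.buffer and self.buffer_tokens() > self.max_token_limit:
            pruned.append(self.buffer.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)

    def context(self) -> str:
        # Context for the next model call: summary first, then recent turns.
        lines = []
        if self.summary:
            lines.append(f"system: Summary of earlier conversation: {self.summary}")
        lines.extend(f"{role}: {content}" for role, content in self.buffer)
        return "\n".join(lines)


memory = SummaryBuffer(max_token_limit=12)
memory.add("human", "My order arrived damaged and I would like a refund")
memory.add("ai", "Sorry to hear that, could you share the order number")
memory.add("human", "It is order A-1001")
print(memory.context())
```

The trade-off is one extra summarization call whenever the buffer overflows, in exchange for a context whose verbatim part stays bounded regardless of how long the session runs.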
session_history.py
test_session_memory.py
output
Test Conclusion
The test ran without any errors.