
[Bug]: Length exceeded error after repeated file search in one conversation #5312

Open
dvejsada opened this issue Jan 14, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@dvejsada

What happened?

We have an agent with file search enabled (a company assistant with internal regulations), using GPT-4o-mini. If a conversation contains multiple questions that each require a file search, the API returns an error (around the 4th or 5th question, right after the file search runs) stating that the maximum context length was exceeded.

Steps to Reproduce

  1. Create an Agent with file search and upload documents.
  2. Use GPT-4o-mini for the Agent.
  3. Ask several questions, each triggering a file search.
  4. See the error pop up.

What browsers are you seeing the problem on?

No response

Relevant log output

{"attemptNumber":1,"code":"context_length_exceeded","error":{"code":"context_length_exceeded","message":"This model's maximum context length is 128000 tokens. However, your messages resulted in 154763 tokens (154642 in the messages, 121 in the functions). Please reduce the length of the messages or functions.","param":"messages","type":"invalid_request_error"},"headers":{"apim-request-id":"55ad76df-3f36-4208-bbca-1f936902ef19","azureml-model-session":"d063-20241218161552","content-length":"344","content-type":"application/json","date":"Tue, 14 Jan 2025 10:28:42 GMT","ms-azureml-model-error-reason":"model_error","ms-azureml-model-error-statuscode":"400","strict-transport-security":"max-age=31536000; includeSubDomains; preload","x-content-type-options":"nosniff","x-ms-client-request-id":"55ad76df-3f36-4208-bbca-1f936902ef19","x-ms-rai-invoked":"true","x-ms-region":"Sweden Central","x-ratelimit-remaining-requests":"19998","x-ratelimit-remaining-tokens":"1797932","x-request-id":"aa47bddd-c3a9-4465-8719-3e4a4ed66f34"},"level":"error","message":"[handleAbortError] AI response error; aborting request: 400 This model's maximum context length is 128000 tokens. However, your messages resulted in 154763 tokens (154642 in the messages, 121 in the functions). Please reduce the length of the messages or functions.","param":"messages","pregelTaskId":"6cc9921c-5db4-560d-9dfa-182afbe52450","request_id":"aa47bddd-c3a9-4465-8719-3e4a4ed66f34","retriesLeft":6,"stack":"Error: 400 This model's maximum context length is 128000 tokens. However, your messages resulted in 154763 tokens (154642 in the messages, 121 in the functions). Please reduce the length of the messages or functions.\n    at APIError.generate (/app/node_modules/@langchain/openai/node_modules/openai/error.js:45:20)\n    at OpenAI.makeStatusError (/app/node_modules/@langchain/openai/node_modules/openai/core.js:293:33)\n    at OpenAI.makeRequest (/app/node_modules/@langchain/openai/node_modules/openai/core.js:337:30)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /app/node_modules/@langchain/openai/dist/chat_models.cjs:1548:29\n    at async RetryOperation._fn (/app/node_modules/p-retry/index.js:50:12)","status":400,"type":"invalid_request_error"}

Screenshots

No response

@dvejsada dvejsada added the bug Something isn't working label Jan 14, 2025
@danny-avila
Owner

So the only way to prevent this is to discard earlier messages from the chat history, due to the increasing context from tool outputs. Is that acceptable?

@danny-avila danny-avila changed the title [Bug]: Lenght exceeded error after repeated file search in one conversation [Bug]: Length exceeded error after repeated file search in one conversation Jan 14, 2025
@dvejsada
Author

> So the only way to prevent this is to discard earlier messages from the chat history, due to the increasing context from tool outputs. Is that acceptable?

Is there a way to implement this so that only previous file search results (tool outputs) are dropped, but not the actual conversation? (A sketch of this idea follows after this comment.)

Further, the problem is aggravated by the fact that, with larger knowledge bases, file search currently returns about 25 pages of text per call, so the context window fills up very quickly and the API costs are also quite high.
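One way to realize the selective pruning asked about above (dropping only earlier tool outputs while keeping the user and assistant turns) is sketched below; the message shape and helper name are hypothetical, not LibreChat's implementation:

```typescript
// Sketch: blank out the content of older tool results while keeping the
// conversation itself intact. Keeps the last N tool outputs untouched.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

function pruneOldToolOutputs(messages: ChatMessage[], keepLastN = 1): ChatMessage[] {
  // Indices of tool messages, oldest first.
  const toolIndices = messages
    .map((m, i) => (m.role === 'tool' ? i : -1))
    .filter((i) => i !== -1);
  // Everything except the most recent keepLastN tool results gets truncated.
  const stale = new Set(toolIndices.slice(0, Math.max(0, toolIndices.length - keepLastN)));
  return messages.map((m, i) =>
    stale.has(i) ? { ...m, content: '[earlier file search results truncated]' } : m
  );
}
```

Replacing the content with a short placeholder, rather than deleting the message, keeps the tool-call/tool-result pairing intact, which chat APIs such as OpenAI's generally require.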

@danny-avila
Owner

@dvejsada thanks, I agree about limiting tool outputs first, and also about better relevance-score handling for file search. Will keep this in mind!
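A rough sketch of what limiting tool outputs via relevance scores could look like follows; the chunk shape (the text and score fields), the threshold, and the character budget are all assumptions here, since the RAG API's actual response format isn't shown in this thread:

```typescript
// Sketch: keep only the highest-scoring retrieved chunks until a size budget
// is hit, so file search can't dump 25 pages of text into the context.
interface RetrievedChunk {
  text: string;
  score: number; // higher = more relevant (assumed convention)
}

function capFileSearchOutput(
  chunks: RetrievedChunk[],
  minScore = 0.5,
  maxChars = 8_000
): string {
  const kept: string[] = [];
  let used = 0;
  // Best chunks first, so the budget is spent on the most relevant text.
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    if (chunk.score < minScore) break;
    if (used + chunk.text.length > maxChars) break;
    kept.push(chunk.text);
    used += chunk.text.length;
  }
  return kept.join('\n---\n');
}
```

Sorting by score before applying the budget means the cap discards the least relevant material first, rather than whatever happens to come last in the retrieval response.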

@lxDaniel

This is the same problem I wrote about in discussion #5107, so I'm looking forward to that.

@danny-avila
Owner

@lxDaniel this update should help: #5349

But this issue mainly concerns the file_search tool that uses the RAG API

@lxDaniel

> @lxDaniel this update should help: #5349
>
> But this issue mainly concerns the file_search tool that uses the RAG API

Hi @danny-avila, thanks for the update.
It improved conversations with agents + Azure AI Search as well. The context_length_exceeded error still happens, but, at least on my end, it takes longer to occur.
I haven't had the opportunity to test it on files uploaded via the RAG API, though.

@ppatayane

ppatayane commented Jan 24, 2025

On a similar note, there is also a discussion I would like to link here: #4819 (comment). I wanted to highlight that the same use case works fine using plugins, as mentioned there, but fails via an Agent.
I am looking forward to checking the fixes already merged with #5349 in the upcoming release.
Thank you, Danny and team!
