
[Bug]: Length exceeded error after repeated file search in one conversation #5312

Open
dvejsada opened this issue Jan 14, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@dvejsada

What happened?

We have an agent with file search enabled (a company assistant with internal regulations), using GPT-4o-mini. If a conversation contains multiple questions that each require a file search, the API returns an error (around the 4th or 5th question, right after the file search runs) stating that the maximum context length was exceeded.

Steps to Reproduce

  1. Create an Agent with file search and upload documents.
  2. Use GPT-4o-mini for the Agent.
  3. Ask several questions, each triggering a file search.
  4. See the error pop up.

What browsers are you seeing the problem on?

No response

Relevant log output

{"attemptNumber":1,"code":"context_length_exceeded","error":{"code":"context_length_exceeded","message":"This model's maximum context length is 128000 tokens. However, your messages resulted in 154763 tokens (154642 in the messages, 121 in the functions). Please reduce the length of the messages or functions.","param":"messages","type":"invalid_request_error"},"headers":{"apim-request-id":"55ad76df-3f36-4208-bbca-1f936902ef19","azureml-model-session":"d063-20241218161552","content-length":"344","content-type":"application/json","date":"Tue, 14 Jan 2025 10:28:42 GMT","ms-azureml-model-error-reason":"model_error","ms-azureml-model-error-statuscode":"400","strict-transport-security":"max-age=31536000; includeSubDomains; preload","x-content-type-options":"nosniff","x-ms-client-request-id":"55ad76df-3f36-4208-bbca-1f936902ef19","x-ms-rai-invoked":"true","x-ms-region":"Sweden Central","x-ratelimit-remaining-requests":"19998","x-ratelimit-remaining-tokens":"1797932","x-request-id":"aa47bddd-c3a9-4465-8719-3e4a4ed66f34"},"level":"error","message":"[handleAbortError] AI response error; aborting request: 400 This model's maximum context length is 128000 tokens. However, your messages resulted in 154763 tokens (154642 in the messages, 121 in the functions). Please reduce the length of the messages or functions.","param":"messages","pregelTaskId":"6cc9921c-5db4-560d-9dfa-182afbe52450","request_id":"aa47bddd-c3a9-4465-8719-3e4a4ed66f34","retriesLeft":6,"stack":"Error: 400 This model's maximum context length is 128000 tokens. However, your messages resulted in 154763 tokens (154642 in the messages, 121 in the functions). Please reduce the length of the messages or functions.\n    at APIError.generate (/app/node_modules/@langchain/openai/node_modules/openai/error.js:45:20)\n    at OpenAI.makeStatusError (/app/node_modules/@langchain/openai/node_modules/openai/core.js:293:33)\n    at OpenAI.makeRequest (/app/node_modules/@langchain/openai/node_modules/openai/core.js:337:30)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /app/node_modules/@langchain/openai/dist/chat_models.cjs:1548:29\n    at async RetryOperation._fn (/app/node_modules/p-retry/index.js:50:12)","status":400,"type":"invalid_request_error"}

Screenshots

No response

@dvejsada dvejsada added the bug Something isn't working label Jan 14, 2025
@danny-avila
Owner

So the only way to prevent this is to discard earlier messages from the chat history, due to the increasing context from tool outputs. Is that acceptable?

@danny-avila danny-avila changed the title [Bug]: Lenght exceeded error after repeated file search in one conversation [Bug]: Length exceeded error after repeated file search in one conversation Jan 14, 2025
@dvejsada
Author

> So the only way to prevent this is to discard earlier messages from the chat history, due to the increasing context from tool outputs. Is that acceptable?

Is there a way to implement this so that only previous file search results (tool outputs) are dropped, but not the actual conversation? (A sketch of this idea follows after this comment.)

Further, the problem is aggravated by the fact that, with larger knowledge bases, file search currently returns about 25 pages of text per call, so the context window fills up very quickly and the API costs are also quite high.
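One way to realize the selective pruning asked about above (dropping only earlier tool outputs while keeping the user and assistant turns) is sketched below; the message shape and helper name are hypothetical, not LibreChat's implementation:

```typescript
// Sketch: blank out the content of older tool results while keeping the
// conversation itself intact. Keeps the last N tool outputs untouched.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
}

function pruneOldToolOutputs(messages: ChatMessage[], keepLastN = 1): ChatMessage[] {
  // Indices of tool messages, oldest first.
  const toolIndices = messages
    .map((m, i) => (m.role === 'tool' ? i : -1))
    .filter((i) => i !== -1);
  // Everything except the most recent keepLastN tool results gets truncated.
  const stale = new Set(toolIndices.slice(0, Math.max(0, toolIndices.length - keepLastN)));
  return messages.map((m, i) =>
    stale.has(i) ? { ...m, content: '[earlier file search results truncated]' } : m
  );
}
```

Replacing the content with a short placeholder, rather than deleting the message, keeps the tool-call/tool-result pairing intact, which chat APIs such as OpenAI's generally require.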

@danny-avila
Owner

@dvejsada thanks, I agree about limiting tool outputs first, and also about better relevance-score handling for file search. Will keep this in mind!
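A rough sketch of what limiting tool outputs via relevance scores could look like follows; the chunk shape (the text and score fields), the threshold, and the character budget are all assumptions here, since the RAG API's actual response format isn't shown in this thread:

```typescript
// Sketch: keep only the highest-scoring retrieved chunks until a size budget
// is hit, so file search can't dump 25 pages of text into the context.
interface RetrievedChunk {
  text: string;
  score: number; // higher = more relevant (assumed convention)
}

function capFileSearchOutput(
  chunks: RetrievedChunk[],
  minScore = 0.5,
  maxChars = 8_000
): string {
  const kept: string[] = [];
  let used = 0;
  // Best chunks first, so the budget is spent on the most relevant text.
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    if (chunk.score < minScore) break;
    if (used + chunk.text.length > maxChars) break;
    kept.push(chunk.text);
    used += chunk.text.length;
  }
  return kept.join('\n---\n');
}
```

Sorting by score before applying the budget means the cap discards the least relevant material first, rather than whatever happens to come last in the retrieval response.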

@lxDaniel

This is the same problem I wrote about in discussion #5107, so I'm looking forward to that.

@danny-avila
Owner

@lxDaniel this update should help: #5349

But this issue mainly concerns the file_search tool that uses the RAG API

@lxDaniel

> @lxDaniel this update should help: #5349
>
> But this issue mainly concerns the file_search tool that uses the RAG API

Hi @danny-avila, thanks for the update.
It improved conversations with agents + Azure AI Search as well. The context_length_exceeded error still happens, but, at least on my end, it takes longer to occur.
I haven't had the opportunity to test it on files uploaded via the RAG API, though.

@ppatayane

ppatayane commented Jan 24, 2025

On a similar note, there is also a discussion I would like to link here: #4819 (comment). I wanted to highlight that the same use case works fine using plugins, as mentioned there, but fails via an Agent.
I am looking forward to checking the fixes already merged with #5349 in the upcoming release.
Thank you, Danny and team!
