fix(openai): Respect 300k token limit for embeddings API requests #33668
Description
Fixes #31227 - Resolves the issue where `OpenAIEmbeddings` exceeds OpenAI's 300,000 token per request limit, causing 400 BadRequest errors.

Problem
When embedding large document sets, LangChain would send batches containing more than 300,000 tokens in a single API request, causing this error:
The issue occurred because:
- `embedding_ctx_length` (8191 tokens per chunk)
- `chunk_size` (default 1000 chunks per request)
- 1000 chunks × 8191 tokens = 8,191,000 tokens → exceeds the limit
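As a quick sanity check on that arithmetic, here is a minimal sketch (the variable names mirror the `OpenAIEmbeddings` parameters; the fixed-size batching it describes is a simplification of the old behavior):

```python
# Why fixed-size batching can blow past the per-request cap.
chunk_size = 1000            # chunks sent per API request (old default behavior)
embedding_ctx_length = 8191  # max tokens per chunk

worst_case_tokens = chunk_size * embedding_ctx_length
print(worst_case_tokens)     # 8_191_000 -> far above the 300,000-token limit
```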
Solution
This PR implements dynamic batching that respects the 300k token limit:
- Added a `MAX_TOKENS_PER_REQUEST = 300000` constant
- Rather than fixed `chunk_size` batches, accumulate chunks until approaching the 300k limit
- Applied in both `_get_len_safe_embeddings` and `_aget_len_safe_embeddings`
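A minimal sketch of the token-aware batching idea (an illustration only, not the actual code in `_get_len_safe_embeddings`; `batch_by_token_count` is a hypothetical helper):

```python
from typing import Iterator

MAX_TOKENS_PER_REQUEST = 300_000  # OpenAI's per-request token cap for embeddings


def batch_by_token_count(
    token_chunks: list[list[int]],
    max_tokens: int = MAX_TOKENS_PER_REQUEST,
) -> Iterator[list[list[int]]]:
    """Group token chunks so each group stays at or under max_tokens."""
    batch: list[list[int]] = []
    batch_tokens = 0
    for chunk in token_chunks:
        # Start a new request batch if adding this chunk would cross the limit.
        if batch and batch_tokens + len(chunk) > max_tokens:
            yield batch
            batch, batch_tokens = [], 0
        batch.append(chunk)
        batch_tokens += len(chunk)
    if batch:
        yield batch
```

Each yielded group maps to one embeddings API request, so no single request exceeds 300k tokens while chunks are still sent in as few requests as possible.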
Changes
- `langchain_openai/embeddings/base.py`: added the `MAX_TOKENS_PER_REQUEST` constant and the batching described above
- `tests/unit_tests/embeddings/test_base.py`: added `test_embeddings_respects_token_limit()`, which verifies large document sets are properly batched
Testing
All existing tests pass (280 passed, 4 xfailed, 1 xpassed).
The new test verifies that large document sets are split into API requests that each stay under the 300,000-token limit.
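The shape of that check, sketched against the hypothetical `batch_by_token_count` helper above (the real test exercises `OpenAIEmbeddings` directly):

```python
def test_batching_respects_token_limit() -> None:
    # 1000 chunks x 8191 tokens: the worst case from the Problem section.
    chunks = [[0] * 8191 for _ in range(1000)]
    batches = list(batch_by_token_count(chunks))

    # No single request batch exceeds the 300k token cap...
    for batch in batches:
        assert sum(len(chunk) for chunk in batch) <= MAX_TOKENS_PER_REQUEST

    # ...and every chunk is still embedded, just across more requests.
    assert sum(len(batch) for batch in batches) == len(chunks)
```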
Usage
After this fix, users can embed large document sets without errors:
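For example (the model name and document texts below are illustrative):

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# A large document set that previously triggered a 400 BadRequest error
# because a single request exceeded 300,000 tokens.
texts = ["some long document text ..."] * 5000

# Requests are now split automatically to stay under the per-request limit.
vectors = embeddings.embed_documents(texts)
```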
Resolves #31227