Description
TL;DR: Add support for batch requests to optimize costs for non-time-sensitive AI tasks.
Background
When working with AI APIs like OpenAI and Anthropic, batch requests are often substantially cheaper than individual calls. For example, OpenAI's Batch API prices requests at roughly 50% of the synchronous rate, and Anthropic's Message Batches API offers a similar discount.
Many applications don't need immediate responses. These non-time-sensitive workloads would benefit enormously from batching.
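For context, here is a minimal sketch of what the underlying workflow looks like today when calling OpenAI's Batch API directly with the official Python SDK. The JSONL payload, model name, and polling interval are illustrative; this is the boilerplate the library could absorb:

```python
import json
import time
from openai import OpenAI

client = OpenAI()

# Each line of the JSONL file is one request; custom_id ties results back to inputs.
requests = [
    {
        "custom_id": f"task-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",  # illustrative model
            "messages": [{"role": "user", "content": f"Summarize document {i}"}],
        },
    }
    for i in range(3)
]
with open("requests.jsonl", "w") as f:
    f.write("\n".join(json.dumps(r) for r in requests))

# Upload the file and create the batch (processed within a 24h window at reduced cost).
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Poll until the batch reaches a terminal state, then download the results file.
while batch.status not in ("completed", "failed", "expired", "cancelled"):
    time.sleep(60)
    batch = client.batches.retrieve(batch.id)

if batch.status == "completed":
    results = client.files.content(batch.output_file_id).text
    print(results)
```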
Benefits
- Cost savings: roughly 50% discount on the batch endpoints of major providers (OpenAI, Anthropic)
- Simplified workflow: no need to manage batching logic (file formatting, upload, polling) in application code; see the sketch after this list
- Resource optimization: less strain on provider rate limits and on client applications
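As a starting point for discussion, one possible shape for such an interface is sketched below. Everything here is hypothetical: the `submit_batch` / `get_batch_results` names, parameters, and result type are placeholders, not an existing API in this library.

```python
from dataclasses import dataclass


@dataclass
class BatchResult:
    """Hypothetical result record, keyed by the caller-supplied custom_id."""
    custom_id: str
    response: dict | None
    error: str | None


def submit_batch(requests: list[dict], *, provider: str, completion_window: str = "24h") -> str:
    """Hypothetical: package requests into the provider's batch format,
    upload them, and return a batch id for later retrieval."""
    raise NotImplementedError


def get_batch_results(batch_id: str) -> list[BatchResult] | None:
    """Hypothetical: return results once the batch completes, or None while pending."""
    raise NotImplementedError


# Intended usage: fire-and-forget submission, then collect results later.
# batch_id = submit_batch(my_requests, provider="openai")
# results = get_batch_results(batch_id)
```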
Thoughts?