Batch Request Support for Cost Optimization #1

@crmne

Description

TL;DR: Add support for batch requests to optimize costs for non-time-sensitive AI tasks.

Background

When working with AI APIs like OpenAI and Anthropic, batch requests are often substantially cheaper than individual calls. For example, OpenAI's Batch API prices requests at roughly 50% of the synchronous rate in exchange for results within a 24-hour window, and Anthropic's Message Batches API offers a similar discount.
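
For context, here is a rough sketch of the raw flow a library wrapper would hide, using OpenAI's Batch API via the official Python SDK. The file name, model, and request contents are placeholders.

```python
import json

from openai import OpenAI

client = OpenAI()

# One JSONL line per request; custom_id lets us match results back to tasks later.
requests = [
    {
        "custom_id": f"task-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",  # placeholder model
            "messages": [{"role": "user", "content": f"Summarize document {i}"}],
        },
    }
    for i in range(3)
]

with open("batch_input.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")

# Upload the file, then create the batch job; results arrive within 24h at batch pricing.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```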

Many applications don't need immediate responses. These non-time-sensitive workloads would benefit enormously from batching.

Benefits

  • Cost savings: Roughly 50% off batch pricing from both OpenAI and Anthropic
  • Simplified workflow: No need to hand-roll batch submission, polling, and result handling in application code (see the sketch after this list)
  • Resource optimization: Less strain on provider rate limits and on client applications
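
To make the "simplified workflow" point concrete, below is a rough sketch of the polling and result handling that application code currently has to implement itself (OpenAI Python SDK again; batch_abc123 is a placeholder id from the submission step above).

```python
import json
import time

from openai import OpenAI

client = OpenAI()
batch_id = "batch_abc123"  # placeholder: the id returned when the batch was created

# Poll until the batch reaches a terminal state; real code would back off and
# persist state between runs instead of blocking.
while True:
    batch = client.batches.retrieve(batch_id)
    if batch.status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(60)

if batch.status == "completed" and batch.output_file_id:
    # Results come back as JSONL, keyed by the custom_id set at submission time.
    output = client.files.content(batch.output_file_id)
    for line in output.text.splitlines():
        result = json.loads(line)
        print(result["custom_id"], result["response"]["status_code"])
```

This is exactly the kind of boilerplate the library could take care of behind a single call.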

Thoughts?
