feat: Async Journey: LiteLLM Removal from Async Engine by eric-tramel · Pull Request #310 · NVIDIA-NeMo/DataDesigner

eric-tramel · 2026-02-07T03:02:33Z

Summary

Adds a comprehensive analysis of removing the litellm dependency from Data Designer. This is a planning document — no code changes.

Key findings

LiteLLM is well-contained (12 production files, all in engine/models/ and engine/models_v2/)
DD underuses it — each ModelFacade creates a Router with a single deployment (no load balancing, no failover)
OpenAI and Anthropic SDKs handle retry/backoff natively; Bedrock does not (manual retry needed for throttling)
Anthropic adapter is HIGH risk due to structurally different response format (content blocks vs strings)

Implementation plan (4 phases)

Replace Router with ModelClient in models_v2/ — OpenAI SDK adapter, keep OpenAI response format as canonical type. models/ untouched as fallback.
Validate — Benchmark, test suite, real inference with env var enabled.
Additional provider adapters — Anthropic + Bedrock. models/ fallback still available.
Consolidate and drop dependency — Delete models/, remove litellm from deps. Only after all adapters are proven.

Reviewed by

10 independent code reviewers examined the report against the actual codebase. Corrections incorporated: expanded test blast radius (4 files, ~56 functions), upgraded Anthropic risk to HIGH, added MCP facade cross-layer caveat, corrected dependency impact analysis.

Adds an opt-in async execution path (DATA_DESIGNER_ASYNC_ENGINE=1) for the cell-by-cell generation pipeline. Replaces thread-pool concurrency with native asyncio TaskGroup + Semaphore for bounded concurrent LLM calls, while keeping the sync path as the default. Key changes: - ModelFacade: acompletion(), agenerate_text_embeddings(), agenerate() - acatch_llm_exceptions decorator (async mirror of catch_llm_exceptions) - AsyncConcurrentExecutor with persistent background event loop - ColumnWiseBuilder branches on env var to fan out via async or threads - Benchmark updated with async mock support Co-Authored-By: Remi <noreply@anthropic.com>

Comprehensive analysis of removing the litellm dependency from Data Designer. Covers blast radius (per-phase), provider SDK research (OpenAI, Anthropic, Bedrock), risk assessment, and a 4-phase implementation plan using the models_v2/ parallel stack approach. Co-Authored-By: Remi <noreply@anthropic.com>

greptile-apps · 2026-02-07T03:04:59Z

Greptile Overview

Greptile Summary

Adds comprehensive planning document for removing LiteLLM dependency from Data Designer. The document is thorough, well-structured, and demonstrates deep understanding of the codebase through 10 independent reviewer validations.

Key strengths:

Well-contained blast radius (12 production files, all in engine/models/ and engine/models_v2/)
Pragmatic 4-phase approach with parallel implementation in models_v2/ to maintain fallback
Accurate technical analysis confirmed against actual codebase (dependency version, file counts, test impact)
Honest risk assessment (Anthropic and Bedrock adapters marked HIGH risk due to response format incompatibilities)
Comprehensive test migration plan (~56 test functions identified)
Clear decision on response format (keep OpenAI structure as canonical to minimize cross-layer changes)

Implementation strategy:

Phase 1: OpenAI adapter in models_v2/ (low risk)
Phase 2: Validation via benchmarks and tests
Phase 3: Anthropic + Bedrock adapters (high risk, structural differences)
Phase 4: Remove models/ and drop dependency (only after validation)

The document correctly identifies that Data Designer underuses LiteLLM (single-deployment Router instead of load balancing), making removal feasible. The parallel stack approach via DATA_DESIGNER_ASYNC_ENGINE env var provides safe rollback mechanism.

Confidence Score: 5/5

This PR is safe to merge with minimal risk - it only adds planning documentation with no code changes
Documentation-only change with comprehensive technical analysis that has been validated by 10 independent reviewers. No production code modified, no breaking changes, no runtime impact. The planning document demonstrates thorough understanding of the codebase and provides clear implementation roadmap with appropriate risk assessment.
No files require special attention

Important Files Changed

Filename	Overview
LITELLM_REMOVAL_ANALYSIS.md	New comprehensive planning document analyzing LiteLLM removal strategy with 4-phase implementation plan

Sequence Diagram

sequenceDiagram
    participant Config as Config Layer
    participant Factory as models_v2/factory.py
    participant Facade as models_v2/facade.py
    participant Client as ModelClient (OpenAI/Anthropic/Bedrock)
    participant SDK as Provider SDK
    participant API as Provider API

    Note over Config,Factory: Phase 1: OpenAI Adapter
    Config->>Factory: create_model_registry(model_configs)
    Factory->>Factory: Construct OpenAIModelClient
    Factory->>Client: Initialize with api_key, base_url
    Factory->>Facade: ModelFacade(client=OpenAIModelClient)
    
    Note over Facade,API: Inference Request Flow
    Facade->>Facade: completion(messages, **params)
    Facade->>Client: client.completion(messages, **kwargs)
    Client->>Client: Translate DD params → SDK params
    Client->>SDK: await sdk.chat.completions.create(...)
    SDK->>SDK: Built-in retry/backoff
    SDK->>API: HTTPS POST /v1/chat/completions
    API-->>SDK: 200 OK with response
    SDK-->>Client: OpenAI response object
    Client->>Client: Extract content, tool_calls, usage
    Client-->>Facade: CompletionResponse
    Facade-->>Config: Generated text

    Note over Factory,Client: Phase 3: Multi-Provider
    Factory->>Factory: match provider_type
    alt provider_type == "openai"
        Factory->>Client: OpenAIModelClient
    else provider_type == "anthropic"
        Factory->>Client: AnthropicModelClient
        Note over Client: Translates content blocks → string
    else provider_type == "bedrock"
        Factory->>Client: BedrockModelClient
        Note over Client: Manual retry for throttling
    end

    Note over Facade,SDK: Error Handling
    SDK-->>Client: SDK-specific exception (e.g., RateLimitError)
    Client->>Client: Map to DD error types
    Client-->>Facade: ModelRateLimitError
    Facade-->>Config: Propagate with FormattedLLMErrorMessage

nabinchha · 2026-02-10T17:54:45Z

LITELLM_REMOVAL_ANALYSIS.md

+
+### Config migration is clean
+
+`provider_type` is only consumed in one place (`_get_litellm_deployment()` → `f"{provider_type}/{model_name}"`). Pydantic handles string → enum coercion automatically, so existing YAML/JSON configs and programmatic construction continue to work. The CLI text field for provider_type would become a select/choice field. All predefined providers and examples use `"openai"` — no existing users need migration.


This might be the only breaking change ux wise? I don't see a need for this as long as we require users to provide ModelConfig.model with the full model path as understood by the inference stack they are using.

johnnygreco · 2026-02-13T15:21:53Z

LITELLM_REMOVAL_ANALYSIS.md

can we put this in plans/{gh-issue}/?

eric-tramel and others added 6 commits February 2, 2026 14:04

Initialize alternate module path

7679741

Fix tests

1c461c1

Add a benchmark to track progress.

1129ed6

fix test patching

4dbbef1

eric-tramel requested a review from a team as a code owner February 7, 2026 03:02

eric-tramel changed the title ~~docs: LiteLLM removal impact analysis and implementation plan~~ feat: Async Journey: LiteLLM Removal from Async Engine Feb 7, 2026

nabinchha reviewed Feb 10, 2026

View reviewed changes

eric-tramel marked this pull request as draft February 11, 2026 23:17

eric-tramel force-pushed the async/async-facade branch from c789ca8 to a87f98b Compare February 13, 2026 01:57

johnnygreco reviewed Feb 13, 2026

View reviewed changes

LITELLM_REMOVAL_ANALYSIS.md

Copy link

Contributor

johnnygreco Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we put this in plans/{gh-issue}/?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Async Journey: LiteLLM Removal from Async Engine#310

feat: Async Journey: LiteLLM Removal from Async Engine#310
eric-tramel wants to merge 6 commits intoasync/async-facadefrom
async/litellm-removal

eric-tramel commented Feb 7, 2026

Uh oh!

greptile-apps bot commented Feb 7, 2026

Confidence Score: 5/5

Sequence Diagram

Uh oh!

nabinchha Feb 10, 2026

Uh oh!

johnnygreco Feb 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments


		### Config migration is clean

		`provider_type` is only consumed in one place (`_get_litellm_deployment()` → `f"{provider_type}/{model_name}"`). Pydantic handles string → enum coercion automatically, so existing YAML/JSON configs and programmatic construction continue to work. The CLI text field for provider_type would become a select/choice field. All predefined providers and examples use `"openai"` — no existing users need migration.

Conversation

eric-tramel commented Feb 7, 2026

Summary

Key findings

Implementation plan (4 phases)

Reviewed by

Uh oh!

greptile-apps bot commented Feb 7, 2026

Greptile Overview

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

nabinchha Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

johnnygreco Feb 13, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments