Feature/unified v1 #413
base: main
Conversation
Important: Review skipped (draft detected). Please check the settings in the CodeRabbit UI.

Walkthrough
This PR introduces a complete LLM API feature with async job processing. Changes include a database migration adding the LLM_API job type, a FastAPI /llm/call endpoint, a provider abstraction with an OpenAI implementation, and Celery-based background job execution.
Sequence Diagram

sequenceDiagram
participant User
participant API as FastAPI /llm/call
participant JobService as start_job
participant DB
participant Celery as execute_job (Celery)
participant Orchestrator
participant Provider as OpenAIProvider
participant OpenAI
User->>API: POST /llm/call (LLMCallRequest)
API->>JobService: start_job(db, request, project_id, org_id)
JobService->>DB: Create Job (status=CREATED)
JobService->>Celery: Enqueue execute_job task (high priority)
JobService-->>API: Job UUID response
API-->>User: {status: "processing"}
rect rgba(100, 150, 200, 0.1)
Note over Celery: Background Processing
Celery->>DB: Update Job to PROCESSING
Celery->>Orchestrator: execute_llm_call(request, client)
Orchestrator->>Provider: execute(request)
Provider->>Provider: Convert request → OpenAISpec
Provider->>OpenAI: API call (create response)
OpenAI-->>Provider: response with tokens
Provider->>Provider: Extract message + build LLMCallResponse
Provider-->>Orchestrator: (LLMCallResponse | None, error | None)
Orchestrator-->>Celery: response/error tuple
alt Success
Celery->>DB: Update Job (status=SUCCESS, response_id)
else Failure
Celery->>DB: Update Job (status=FAILED, error_message)
end
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Rationale: New provider abstraction pattern with factory, OpenAI-specific spec validation/conversion logic, integration with the existing async job framework, multiple interdependent model definitions, and state management flow across Celery tasks warrant careful review of logic correctness and API consistency.
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 5
🧹 Nitpick comments (17)
backend/app/services/llm/providers/base.py (3)
27-33: Prefer Generic or Protocol over Any for the client type.

Using Any for the client parameter loses type safety benefits. Consider using a TypeVar with Generic or a Protocol to maintain type information across provider implementations. Apply this diff to add a TypeVar:

+from typing import Any, Generic, TypeVar
+
+ClientT = TypeVar("ClientT")
+
-class BaseProvider(ABC):
+class BaseProvider(ABC, Generic[ClientT]):
     """Abstract base class for LLM providers.
     ...
     """
-    def __init__(self, client: Any):
+    def __init__(self, client: ClientT):
         """Initialize the provider with client.

         Args:
-            client: Provider-specific client (e.g., OpenAI, Anthropic client)
+            client: Provider-specific client
         """
         self.client = client
35-60: Consider using exception-based error handling instead of tuple returns.

The tuple[LLMCallResponse | None, str | None] pattern is error-prone and requires callers to manually check both return values. Consider raising exceptions for errors and returning LLMCallResponse directly, or use a Result type. Example alternative:

@abstractmethod
def execute(self, request: LLMCallRequest) -> LLMCallResponse:
    """Execute an LLM call using the provider.

    Raises:
        ProviderError: If the API call fails
        ValidationError: If request validation fails
    """
61-67: Provider name derivation relies on a fragile naming convention.

The get_provider_name() method assumes all provider classes end with a "Provider" suffix. If a class doesn't follow this convention, the result will be unexpected. Consider making this an abstract property or using explicit configuration. Alternative approach:

@property
@abstractmethod
def provider_name(self) -> str:
    """Get the name of the provider."""
    ...

backend/app/models/llm/response.py (2)
22-28: Consider adding validation constraints to numeric fields.

Token count fields should have non-negative constraints to catch invalid API responses early. Apply this diff:

+from sqlmodel import SQLModel, Field
+
 class LLMCallResponse(SQLModel):
     ...
-    input_tokens: int
-    output_tokens: int
-    total_tokens: int
+    input_tokens: int = Field(ge=0)
+    output_tokens: int = Field(ge=0)
+    total_tokens: int = Field(ge=0)
22-22: Consider using Literal or Enum for the status field.

The status field is currently a plain str, which allows any value. Using a Literal type or an Enum would provide better type safety and validation. Example:

from typing import Literal

status: Literal["success", "error", "pending"]

backend/app/services/llm/orchestrator.py (1)
61-64: Sanitize unexpected-error returns to avoid leaking internals.

Returning str(e) can expose internal details. Log the full exception with stack trace, but return a generic message.

-    except Exception as e:
-        error_message = f"Unexpected error in LLM service: {str(e)}"
-        logger.error(f"[execute_llm_call] {error_message}", exc_info=True)
-        return None, error_message
+    except Exception as e:
+        logger.error("[execute_llm_call] Unexpected error in LLM service", exc_info=True)
+        return None, "Unexpected error in LLM service"

backend/app/services/llm/providers/factory.py (1)
35-36: Normalize provider names and enable dynamic registration.

Make provider matching case-insensitive and allow runtime registration for extensions.

 class ProviderFactory:
@@
-    def create_provider(cls, provider_type: str, client: Any) -> BaseProvider:
+    def create_provider(cls, provider_type: str, client: Any) -> BaseProvider:
         """Create a provider instance based on the provider type.
@@
-        provider_class = cls._PROVIDERS.get(provider_type)
+        normalized = (provider_type or "").strip().lower()
+        provider_class = cls._PROVIDERS.get(normalized)
@@
-        logger.info(f"[ProviderFactory] Creating {provider_type} provider instance")
+        logger.info(f"[ProviderFactory] Creating {normalized} provider instance")
         return provider_class(client=client)
@@
     def get_supported_providers(cls) -> list[str]:
         """Get list of supported provider types.
@@
         return list(cls._PROVIDERS.keys())
+
+    @classmethod
+    def register_provider(cls, name: str, provider_cls: type[BaseProvider]) -> None:
+        """Register a provider at runtime (useful for plugins/tests)."""
+        cls._PROVIDERS[name.strip().lower()] = provider_cls

Also applies to: 48-59, 60-67
backend/app/services/llm/__init__.py (1)

12-13: Avoid side-effect imports; re-export explicitly if needed.

Importing app.services.llm.specs for side effects is surprising. If you intend to expose OpenAISpec (or others), explicitly import it and add it to __all__, or drop this import if unused.

-# Initialize model specs on module import
-import app.services.llm.specs  # noqa: F401
+# from app.services.llm.specs import OpenAISpec
+# __all__.append("OpenAISpec")

backend/app/models/llm/config.py (2)
28-33: Constrain the provider type to prevent typos.

Use an Enum or Literal for provider (e.g., "openai") to fail fast on invalid values and align with ProviderFactory.get_supported_providers().

-from sqlmodel import SQLModel
+from enum import Enum
+from sqlmodel import SQLModel
+
+class Provider(str, Enum):
+    openai = "openai"

 class LLMModelSpec(SQLModel):
-    provider: str = "openai"
+    provider: Provider = Provider.openai
37-51: Optional: add basic bounds at this layer.

You already enforce ranges in OpenAISpec; adding min/max for max_tokens/top_p here provides earlier feedback when the provider is swapped (see the sketch below).
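A minimal sketch of what such bounds could look like, assuming SQLModel Field constraints; only the numeric fields are shown, and the defaults and limits are illustrative placeholders to be kept in sync with what OpenAISpec actually enforces:

from sqlmodel import Field, SQLModel


class LLMConfig(SQLModel):
    # Other fields (e.g. the nested model spec) omitted; bounds below are illustrative.
    max_tokens: int | None = Field(default=None, ge=1)
    top_p: float = Field(default=1.0, ge=0.0, le=1.0)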
backend/app/services/llm/providers/openai.py (2)
69-87: Make message extraction resilient and collect all text parts.

The current logic returns only the first text block; if the output contains multiple text items or different shapes, the result may be empty.

-        # Find the first ResponseOutputMessage in the output
-        for item in output:
+        texts = []
+        for item in output:
             # Check if it's a message type (has 'role' and 'content' attributes)
             if hasattr(item, "type") and item.type == "message":
                 if hasattr(item, "content"):
                     # Content is a list of content items
                     if isinstance(item.content, list) and len(item.content) > 0:
-                        # Get the first text content
-                        first_content = item.content[0]
-                        if hasattr(first_content, "text"):
-                            return first_content.text
-                        elif hasattr(first_content, "type") and first_content.type == "text":
-                            return getattr(first_content, "text", "")
-        return ""
+                        for c in item.content:
+                            if hasattr(c, "text"):
+                                texts.append(c.text)
+                            elif hasattr(c, "type") and c.type == "text":
+                                t = getattr(c, "text", "")
+                                if t:
+                                    texts.append(t)
+        if texts:
+            return "\n".join(texts)
         logger.warning(
             f"[OpenAIProvider] No message found in output array with {len(output)} items"
         )
         return ""
127-129: Guard against missing usage fields.

response.usage can be None; default to 0 to avoid an AttributeError.

-            input_tokens=response.usage.input_tokens,
-            output_tokens=response.usage.output_tokens,
-            total_tokens=response.usage.total_tokens,
+            input_tokens=getattr(getattr(response, "usage", None), "input_tokens", 0),
+            output_tokens=getattr(getattr(response, "usage", None), "output_tokens", 0),
+            total_tokens=getattr(getattr(response, "usage", None), "total_tokens", 0),

backend/app/services/llm/jobs.py (5)
28-37: Persist the Celery task_id to the job record.

Store task_id to correlate job ↔ task and aid ops.

     try:
         task_id = start_high_priority_job(
@@
         )
     except Exception as e:
@@
         )
-    logger.info(
+    # Persist task_id for observability
+    job_crud.update(job_id=job.id, job_update=JobUpdate(task_id=task_id))
+
+    logger.info(
         f"[start_job] Job scheduled for LLM call | job_id={job.id}, project_id={project_id}, task_id={task_id}"
     )

Also applies to: 42-46
62-70: Wrap request parsing/logging in try to ensure FAILED status on early errors.

If model parsing or logging fails, the job never moves to FAILED.

-def execute_job(
+def execute_job(
     request_data: dict,
@@
 ) -> LLMCallResponse | None:
-    """Celery task to process an LLM request asynchronously."""
-    request = LLMCallRequest(**request_data)
-    job_id_uuid = UUID(job_id)
-
-    logger.info(
-        f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
-        f"provider={request.llm.provider}, model={request.llm.llm_model_spec.model}"
-    )
-
-    try:
+    """Celery task to process an LLM request asynchronously."""
+    try:
+        request = LLMCallRequest(**request_data)
+        job_id_uuid = UUID(job_id)
+        logger.info(
+            f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
+            f"provider={request.llm.llm_model_spec.provider}, model={request.llm.llm_model_spec.model}"
+        )

Also applies to: 71-79
60-61: Silence Ruff ARG001 for the unused Celery task argument.

Prefix it with an underscore, or remove it if not required by the signature.

-    task_instance,
+    _task_instance,
86-93: Minor: avoid redundant JobCrud re-instantiation.

You already have job_crud in the same context; reuse it.

-        job_crud = JobCrud(session=session)
-        job_crud.update(
+        job_crud.update(
             job_id=job_id_uuid,
             job_update=JobUpdate(
                 status=JobStatus.FAILED, error_message=error_msg
             ),
         )
79-83: Define client before the conditional to satisfy analyzers.

Initialize client to None before the provider branch; this keeps the scope clear.

-        provider_type = request.llm.llm_model_spec.provider
+        provider_type = request.llm.llm_model_spec.provider
+        client = None
         if provider_type == "openai":
             client = get_openai_client(session, organization_id, project_id)
         else:
@@
-        response, error = execute_llm_call(
+        response, error = execute_llm_call(
             request=request,
             client=client,
         )

Also applies to: 95-98
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
backend/app/alembic/versions/219033c644de_add_llm_im_jobs_table.py (1 hunks)
backend/app/api/main.py (2 hunks)
backend/app/api/routes/llm.py (1 hunks)
backend/app/models/__init__.py (1 hunks)
backend/app/models/job.py (1 hunks)
backend/app/models/llm/__init__.py (1 hunks)
backend/app/models/llm/config.py (1 hunks)
backend/app/models/llm/request.py (1 hunks)
backend/app/models/llm/response.py (1 hunks)
backend/app/services/llm/__init__.py (1 hunks)
backend/app/services/llm/jobs.py (1 hunks)
backend/app/services/llm/orchestrator.py (1 hunks)
backend/app/services/llm/providers/__init__.py (1 hunks)
backend/app/services/llm/providers/base.py (1 hunks)
backend/app/services/llm/providers/factory.py (1 hunks)
backend/app/services/llm/providers/openai.py (1 hunks)
backend/app/services/llm/specs/__init__.py (1 hunks)
backend/app/services/llm/specs/openai.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use type hints in Python code (Python 3.11+ project)
Files:
backend/app/models/job.py
backend/app/services/llm/orchestrator.py
backend/app/services/llm/providers/__init__.py
backend/app/services/llm/providers/factory.py
backend/app/services/llm/__init__.py
backend/app/api/main.py
backend/app/models/__init__.py
backend/app/models/llm/response.py
backend/app/services/llm/providers/base.py
backend/app/services/llm/specs/__init__.py
backend/app/api/routes/llm.py
backend/app/services/llm/providers/openai.py
backend/app/models/llm/__init__.py
backend/app/services/llm/jobs.py
backend/app/models/llm/config.py
backend/app/services/llm/specs/openai.py
backend/app/alembic/versions/219033c644de_add_llm_im_jobs_table.py
backend/app/models/llm/request.py
backend/app/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define SQLModel entities (database tables and domain objects) in backend/app/models/
Files:
backend/app/models/job.py
backend/app/models/__init__.py
backend/app/models/llm/response.py
backend/app/models/llm/__init__.py
backend/app/models/llm/config.py
backend/app/models/llm/request.py
backend/app/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Implement business logic services under backend/app/services/
Files:
backend/app/services/llm/orchestrator.py
backend/app/services/llm/providers/__init__.py
backend/app/services/llm/providers/factory.py
backend/app/services/llm/__init__.py
backend/app/services/llm/providers/base.py
backend/app/services/llm/specs/__init__.py
backend/app/services/llm/providers/openai.py
backend/app/services/llm/jobs.py
backend/app/services/llm/specs/openai.py
backend/app/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Expose FastAPI REST endpoints under backend/app/api/ organized by domain
Files:
backend/app/api/main.py
backend/app/api/routes/llm.py
🧬 Code graph analysis (13)
backend/app/services/llm/orchestrator.py (4)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/models/llm/response.py (1): LLMCallResponse (8-28)
  backend/app/services/llm/providers/factory.py (2): ProviderFactory (16-67), create_provider (35-58)
  backend/app/services/llm/providers/base.py (1): execute (36-59)

backend/app/services/llm/providers/__init__.py (3)
  backend/app/services/llm/providers/base.py (1): BaseProvider (14-67)
  backend/app/services/llm/providers/factory.py (1): ProviderFactory (16-67)
  backend/app/services/llm/providers/openai.py (1): OpenAIProvider (25-157)

backend/app/services/llm/providers/factory.py (2)
  backend/app/services/llm/providers/base.py (1): BaseProvider (14-67)
  backend/app/services/llm/providers/openai.py (1): OpenAIProvider (25-157)

backend/app/services/llm/__init__.py (4)
  backend/app/services/llm/orchestrator.py (1): execute_llm_call (17-64)
  backend/app/services/llm/providers/base.py (1): BaseProvider (14-67)
  backend/app/services/llm/providers/factory.py (1): ProviderFactory (16-67)
  backend/app/services/llm/providers/openai.py (1): OpenAIProvider (25-157)

backend/app/models/__init__.py (3)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/models/llm/response.py (1): LLMCallResponse (8-28)
  backend/app/models/llm/config.py (2): LLMConfig (37-51), LLMModelSpec (12-34)

backend/app/services/llm/providers/base.py (3)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/models/llm/response.py (1): LLMCallResponse (8-28)
  backend/app/services/llm/providers/openai.py (1): execute (89-157)

backend/app/services/llm/specs/__init__.py (1)
  backend/app/services/llm/specs/openai.py (1): OpenAISpec (15-197)

backend/app/api/routes/llm.py (4)
  backend/app/models/auth.py (1): AuthContext (18-21)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/services/llm/jobs.py (1): start_job (20-51)
  backend/app/utils.py (2): APIResponse (29-53), success_response (36-39)

backend/app/services/llm/providers/openai.py (4)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/models/llm/response.py (1): LLMCallResponse (8-28)
  backend/app/services/llm/providers/base.py (2): BaseProvider (14-67), execute (36-59)
  backend/app/services/llm/specs/openai.py (3): OpenAISpec (15-197), from_llm_request (173-197), to_api_params (101-170)

backend/app/models/llm/__init__.py (3)
  backend/app/models/llm/config.py (2): LLMConfig (37-51), LLMModelSpec (12-34)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/models/llm/response.py (1): LLMCallResponse (8-28)

backend/app/services/llm/jobs.py (7)
  backend/app/celery/utils.py (1): start_high_priority_job (18-43)
  backend/app/crud/jobs.py (1): JobCrud (11-42)
  backend/app/models/job.py (3): JobType (16-18), JobStatus (9-13), JobUpdate (47-50)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)
  backend/app/models/llm/response.py (1): LLMCallResponse (8-28)
  backend/app/services/llm/orchestrator.py (1): execute_llm_call (17-64)
  backend/app/utils.py (1): get_openai_client (175-205)

backend/app/services/llm/specs/openai.py (1)
  backend/app/models/llm/request.py (1): LLMCallRequest (11-23)

backend/app/models/llm/request.py (1)
  backend/app/models/llm/config.py (1): LLMConfig (37-51)
🪛 Ruff (0.14.1)
backend/app/api/routes/llm.py
25-25: Local variable job_id is assigned to but never used. Remove assignment to unused variable job_id. (F841)
backend/app/services/llm/jobs.py
60-60: Unused function argument: task_instance (ARG001)
🔇 Additional comments (11)
backend/app/alembic/versions/219033c644de_add_llm_im_jobs_table.py (2)
19-21: LGTM! Migration safely adds the enum value.

The migration correctly uses IF NOT EXISTS to make the enum addition idempotent, which is the recommended approach for Postgres enum extensions.
23-24: Empty downgrade is acceptable for enum additions.

Removing enum values from Postgres is risky if any rows reference them, so leaving downgrade as a no-op is a reasonable choice (see the sketch below). If you need to support rollback, consider checking for usage before removal.
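For reference, a minimal sketch of this style of migration; the enum type name and down_revision are placeholders, not the actual values in this PR:

"""Add LLM_API to the job type enum (illustrative sketch)."""

from alembic import op

# Revision identifiers; down_revision is a placeholder here.
revision = "219033c644de"
down_revision = "previous_revision_id"


def upgrade() -> None:
    # ADD VALUE IF NOT EXISTS makes the change idempotent on Postgres,
    # so re-running the migration does not fail if the value already exists.
    op.execute("ALTER TYPE jobtype ADD VALUE IF NOT EXISTS 'LLM_API'")


def downgrade() -> None:
    # Removing an enum value in Postgres is unsafe if rows still reference it,
    # so the downgrade is intentionally a no-op.
    pass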
backend/app/models/job.py (1)
16-18: LGTM! JobType enum extended correctly.

The new LLM_API enum member aligns with the Alembic migration and follows the existing pattern.

backend/app/api/main.py (1)
10-10: LGTM! LLM router wired correctly.

The new LLM router follows the established pattern for including routers in the API (see the sketch below).
Also applies to: 35-35
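For context, that wiring pattern presumably looks roughly like the following; the import path and tags are illustrative assumptions rather than the repository's exact code:

# backend/app/api/main.py (illustrative sketch)
from fastapi import APIRouter

from app.api.routes import llm  # new LLM routes module

api_router = APIRouter()
# The new router is included the same way as the existing domain routers.
api_router.include_router(llm.router, tags=["llm"])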
backend/app/services/llm/specs/__init__.py (1)
1-3: LGTM! Clean module exports.

The re-export follows standard Python packaging conventions (see the sketch below).
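A conventional re-export module of this kind typically reads like this sketch (assumed, not the file's verbatim contents):

# backend/app/services/llm/specs/__init__.py (illustrative sketch)
from app.services.llm.specs.openai import OpenAISpec

__all__ = ["OpenAISpec"]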
backend/app/services/llm/providers/__init__.py (1)
1-15: LGTM! Well-documented module exports.

The provider module exports follow best practices, with clear documentation and an appropriate public API surface.
backend/app/models/__init__.py (1)
51-56: Public API re-export looks good.

Re-exporting the LLM models here improves discoverability and import ergonomics. LGTM.
backend/app/models/llm/__init__.py (1)
1-21: Clean aggregation of LLM models.

Well-scoped __all__ with explicit exports; clear and maintainable.

backend/app/services/llm/__init__.py (1)
15-22: Public surface looks coherent.

execute_llm_call and the provider types are neatly exported. No blockers.
backend/app/services/llm/specs/openai.py (2)
172-197: from_llm_request mapping is clean.

The max_tokens → max_output_tokens mapping and the vector store fields are correctly bridged.
152-161: Code implementation is correct per the OpenAI Responses API specification.

The schema matches exactly: a "tools" array with the file_search type, vector_store_ids as a list, an optional max_num_results, and include=["file_search_call.results"], which is a valid include value that returns file_search tool call results. No changes needed. A sketch of the resulting request parameters follows.
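To illustrate the shape being validated here, a rough sketch of the resulting Responses API parameters; the model name, input text, and vector store ID are placeholders, and the real construction lives in OpenAISpec.to_api_params:

# Illustrative parameter dict for an OpenAI Responses API call with file_search.
params = {
    "model": "gpt-4o-mini",                       # placeholder model
    "input": "Summarize the attached documents",  # placeholder prompt
    "tools": [
        {
            "type": "file_search",
            "vector_store_ids": ["vs_example_123"],  # placeholder vector store ID
            "max_num_results": 20,
        }
    ],
    # Returns the file_search tool call results alongside the response.
    "include": ["file_search_call.results"],
}

# The call itself would then be something like:
# response = client.responses.create(**params)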
@router.post("/llm/call")
async def llm_call(
    request: LLMCallRequest, _session: SessionDep, _current_user: AuthContext
):
    """
    Endpoint to initiate an LLM call as a background job.
    """
    project_id = _current_user.project.id
    organization_id = _current_user.organization.id

    # Start background job
    job_id = start_job(
        db=_session,
        request=request,
        project_id=project_id,
        organization_id=organization_id,
    )

    return APIResponse.success_response(
        data={"status": "processing", "message": "LLM call job scheduled"},
    )
Guard against missing context; return 202 and include job_id (fixes Ruff F841).

_current_user.project / _current_user.organization are optional; direct .id access can raise. Also, return 202 and expose job_id for polling.
-import logging
-
-from fastapi import APIRouter
+import logging
+
+from fastapi import APIRouter, HTTPException
@@
-@router.post("/llm/call")
-async def llm_call(
- request: LLMCallRequest, _session: SessionDep, _current_user: AuthContext
-):
+@router.post("/llm/call", status_code=202)
+async def llm_call(
+ request: LLMCallRequest, _session: SessionDep, _current_user: AuthContext
+) -> APIResponse[dict[str, str]]:
@@
- project_id = _current_user.project.id
- organization_id = _current_user.organization.id
+ if _current_user.project is None or _current_user.organization is None:
+ raise HTTPException(status_code=403, detail="Project and organization context required")
+ project_id = _current_user.project.id
+ organization_id = _current_user.organization.id
@@
- job_id = start_job(
+ job_id = start_job(
db=_session,
request=request,
project_id=project_id,
organization_id=organization_id,
)
+ logger.info(f"[llm_call] Scheduled LLM job | job_id={job_id} project_id={project_id} org_id={organization_id}")
@@
- return APIResponse.success_response(
- data={"status": "processing", "message": "LLM call job scheduled"},
- )
+ return APIResponse.success_response(
+ data={
+ "status": "processing",
+ "message": "LLM call job scheduled",
+ "job_id": str(job_id),
+ },
+ )
Note: This also resolves Ruff F841 (unused job_id). As per the coding guidelines (Python 3.11+), an explicit return type is added.
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.14.1)
25-25: Local variable job_id is assigned to but never used. Remove assignment to unused variable job_id. (F841)
backend/app/models/llm/request.py
Outdated
llm: LLMConfig
max_num_results: int = 20  # For vector store file search
🛠️ Refactor suggestion | 🟠 Major
Use Field descriptor for max_num_results with proper constraints.
The field should use SQLModel's Field descriptor with validation constraints and a description, rather than an inline comment. Based on the OpenAISpec (1-50 range), add appropriate bounds.
Apply this diff:
+from sqlmodel import SQLModel, Field
+
class LLMCallRequest(SQLModel):
...
llm: LLMConfig
- max_num_results: int = 20 # For vector store file search
+ max_num_results: int = Field(
+ default=20,
+ ge=1,
+ le=50,
+ description="Number of results to return from vector store file search"
+ )
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
llm: LLMConfig
max_num_results: int = Field(
    default=20,
    ge=1,
    le=50,
    description="Number of results to return from vector store file search"
)
🤖 Prompt for AI Agents
In backend/app/models/llm/request.py around lines 22 to 23, replace the inline
comment on max_num_results with SQLModel's Field descriptor: change the
declaration to use Field(default=20, description="Max number of vector search
results", ge=1, le=50) so the model enforces the 1–50 bounds per the OpenAISpec
and preserves the default of 20.
class LLMCallResponse(SQLModel):
    """Response model for /v1/llm/call endpoint.

    Attributes:
        status: Response status (success, error, etc.)
        response_id: Unique identifier for this response
        message: The generated text response
        model: Model identifier that was used
        input_tokens: Number of input tokens consumed
        output_tokens: Number of output tokens generated
        total_tokens: Total tokens consumed (input + output)
        file_search_results: Optional list of file search results from RAG
    """

    status: str
    response_id: str
    message: str
    model: str
    input_tokens: int
    output_tokens: int
    total_tokens: int
Docstring mentions a missing field: file_search_results.

Line 19 in the docstring mentions file_search_results: Optional list of file search results from RAG, but this field is not defined in the model. Either add the field or remove it from the docstring.
Apply this diff to add the missing field:
input_tokens: int
output_tokens: int
total_tokens: int
+ file_search_results: list[dict] | None = None
Or remove it from the docstring:
input_tokens: Number of input tokens consumed
output_tokens: Number of output tokens generated
total_tokens: Total tokens consumed (input + output)
- file_search_results: Optional list of file search results from RAG
"""
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
class LLMCallResponse(SQLModel):
    """Response model for /v1/llm/call endpoint.

    Attributes:
        status: Response status (success, error, etc.)
        response_id: Unique identifier for this response
        message: The generated text response
        model: Model identifier that was used
        input_tokens: Number of input tokens consumed
        output_tokens: Number of output tokens generated
        total_tokens: Total tokens consumed (input + output)
        file_search_results: Optional list of file search results from RAG
    """

    status: str
    response_id: str
    message: str
    model: str
    input_tokens: int
    output_tokens: int
    total_tokens: int
    file_search_results: list[dict] | None = None
🤖 Prompt for AI Agents
In backend/app/models/llm/response.py around lines 8–28, the docstring documents
a file_search_results field that is missing from the model; add the field
instead of removing the docstring entry: import Optional and List from typing
(and import the FileSearchResult model type), then add a class attribute like
file_search_results: Optional[List[FileSearchResult]] = None to the
LLMCallResponse model so the docstring and model match.
logger.info(
    f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
    f"provider={request.llm.provider}, model={request.llm.llm_model_spec.model}"
)
🧩 Analysis chain
Bug: wrong attribute access; crashes before try.
LLMConfig has no provider; use request.llm.llm_model_spec.provider. Currently this can raise AttributeError before status moves to PROCESSING.
- logger.info(
- f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
- f"provider={request.llm.provider}, model={request.llm.llm_model_spec.model}"
- )
+ logger.info(
+ f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
+ f"provider={request.llm.llm_model_spec.provider}, model={request.llm.llm_model_spec.model}"
+ )
🏁 Script executed:
#!/bin/bash
# Confirm there are no other bad accesses to `request.llm.provider`
rg -n -C2 -g '!**/venv/**' -P '\brequest\.llm\.provider\b'
Length of output: 379
---
🏁 Script executed:
#!/bin/bash
# Find LLMConfig definition and its structure
fd -e py | xargs rg -l 'class LLMConfig' | head -5
Length of output: 32
🏁 Script executed:
#!/bin/bash
# Search for llm_model_spec definition or usage patterns
rg -n 'llm_model_spec' backend/app/services/llm/jobs.py -A 2 -B 2
Length of output: 354
🏁 Script executed:
#!/bin/bash
# Look for the request model definition
rg -n 'class.*Request|@dataclass' backend/app/services/llm/jobs.py -A 5 | head -30
Length of output: 0
---
🏁 Script executed:
#!/bin/bash
# Examine LLMConfig structure
cat backend/app/models/llm/config.py
Length of output: 1728
🏁 Script executed:
#!/bin/bash
# Check function signature to understand request parameter type
rg -n 'def execute_job' backend/app/services/llm/jobs.py -A 10
Length of output: 343
Bug confirmed: AttributeError on line 68 before try block.
LLMConfig has no provider attribute; the correct path is request.llm.llm_model_spec.provider. Line 79 in the same function already uses the correct pattern. This error will crash the job before status moves to PROCESSING.
logger.info(
f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
- f"provider={request.llm.provider}, model={request.llm.llm_model_spec.model}"
+ f"provider={request.llm.llm_model_spec.provider}, model={request.llm.llm_model_spec.model}"
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
logger.info(
    f"[execute_job] Starting LLM job execution | job_id={job_id}, task_id={task_id}, "
    f"provider={request.llm.llm_model_spec.provider}, model={request.llm.llm_model_spec.model}"
)
🤖 Prompt for AI Agents
In backend/app/services/llm/jobs.py around lines 66 to 70, the logger references
request.llm.provider which does not exist; replace it with
request.llm.llm_model_spec.provider (matching line 79) so the log uses the
correct attribute path and avoid the AttributeError that crashes before the job
status is updated.
except openai.OpenAIError as e:
    error_message = handle_openai_error(e)
    logger.error(
        f"[OpenAIProvider] OpenAI API error: {error_message}", exc_info=True
    )
    return None, error_message
🧩 Analysis chain
Use correct OpenAI SDK v2 exception classes.
openai.OpenAIError may not exist in v2; prefer APIError family (APIConnectionError, RateLimitError, APIStatusError) and handle generically via APIError.
-import openai
+from openai import APIError, APIConnectionError, RateLimitError, APIStatusError
@@
- except openai.OpenAIError as e:
+ except (RateLimitError, APIConnectionError, APIStatusError, APIError) as e:
error_message = handle_openai_error(e)
logger.error(
f"[OpenAIProvider] OpenAI API error: {error_message}", exc_info=True
)
return None, error_message
🌐 Web query:
What is the correct base exception class (and common subclasses) in the OpenAI Python SDK v2 for handling API errors?
💡 Result:
Base exception: openai.APIError. Common subclasses you’ll see in the v2 Python SDK:
- openai.APIConnectionError
- openai.APITimeoutError
- openai.APIStatusError (and HTTP-code-specific subclasses: BadRequestError, AuthenticationError, PermissionDeniedError, NotFoundError, UnprocessableEntityError)
- openai.RateLimitError
- openai.InternalServerError
- openai.APIResponseValidationError
Sources: [1][2].
Sources
[1] openai PyPI package docs (error handling).
[2] openai-python client README / examples (exception hierarchy).
I'll verify the current state of the code and check the imports to confirm whether the review comment's suggestion is accurate.
🏁 Script executed:
#!/bin/bash
# Check current imports and exception handling in the OpenAI provider file
echo "=== Imports section ==="
head -50 backend/app/services/llm/providers/openai.py | grep -E "^import|^from"
echo -e "\n=== Exception handling around lines 147-152 ==="
sed -n '140,160p' backend/app/services/llm/providers/openai.py
echo -e "\n=== Search for openai.OpenAIError usage ==="
rg "openai\.OpenAIError" backend/app/services/llm/providers/openai.py
Length of output: 1229
Update exception handling to use OpenAI SDK v2 exception classes.
The code currently catches openai.OpenAIError at line 147, which does not exist in OpenAI Python SDK v2. The correct base exception class is openai.APIError, with subclasses including openai.APIConnectionError, openai.APITimeoutError, openai.APIStatusError, openai.RateLimitError, openai.InternalServerError, and openai.APIResponseValidationError.

The current code will fail to catch API errors at runtime, allowing exceptions to fall through to the generic Exception handler. Update the import statement and exception handler:
-import openai
+from openai import APIError, APIConnectionError, RateLimitError, APIStatusError
@@
- except openai.OpenAIError as e:
+ except (RateLimitError, APIConnectionError, APIStatusError, APIError) as e:
error_message = handle_openai_error(e)
logger.error(
f"[OpenAIProvider] OpenAI API error: {error_message}", exc_info=True
)
return None, error_message
Alternatively, catching just APIError alone would suffice since all subclasses inherit from it, but the explicit approach is valid for granular error handling if needed.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/app/services/llm/providers/openai.py around lines 147 to 152, the
exception handler currently catches openai.OpenAIError which doesn't exist in
OpenAI Python SDK v2; update the imports to use openai.APIError (or import the
specific subclasses like openai.APIConnectionError, openai.APITimeoutError,
openai.APIStatusError, openai.RateLimitError, openai.InternalServerError,
openai.APIResponseValidationError) and replace the except clause to catch
openai.APIError (or the specific subclasses) so API errors from the SDK are
properly handled and routed through handle_openai_error before logging and
returning the error_message.
Codecov Report
❌ Patch coverage is
📢 Thoughts on this report? Let us know!
… registry for provider instantiation, and update OpenAI provider execution logic.
Summary
Target issue is #PLEASE_TYPE_ISSUE_NUMBER
Explain the motivation for making this change. What existing problem does the pull request solve?
Checklist
Before submitting a pull request, please ensure that you mark these tasks.
Run fastapi run --reload app/main.py or docker compose up in the repository root and test.

Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit
Release Notes
/llm/call API endpoint for language model requests with configurable parameters, including model selection, temperature, and token limits.