WIP refactor(model): unify config-driven retry across VLM and embedding #926

Open
qin-ctx wants to merge 3 commits into main from refactor/unify-model-retry

Conversation

qin-ctx (Collaborator) commented Mar 24, 2026

Description

Unify retry handling for model calls by moving transient-error retry into shared utilities and config-driven defaults for both VLM and embedding providers.

Related Issue

Fixes #922

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

  • add a shared model retry helper and reuse its error classification in both VLM backends and the circuit breaker path
  • make vlm.max_retries default to 3, add top-level embedding.max_retries, and wire embedding factories/providers to honor config-driven retry
  • remove per-call async VLM retry overrides and update tests for new defaults, signatures, and retry behavior
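
For reviewers who want a concrete picture of the shape described above, here is a minimal sketch of a shared retry helper whose error classification can be reused by other paths (for example a circuit breaker). It is not the code in this PR: the names `is_transient_error`, `RetryConfig`, `call_with_retry`, the marker strings, and the backoff parameters are all hypothetical. Only the `max_retries` default of 3 comes from the description.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical substrings treated as transient; the PR's real rules may differ.
TRANSIENT_MARKERS = ("timeout", "timed out", "rate limit", "429", "502", "503", "504")


def is_transient_error(exc: BaseException) -> bool:
    """Classify an error as retryable. A single shared classifier lets the
    retry loop and a circuit breaker agree on what counts as transient."""
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return True
    message = str(exc).lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


@dataclass
class RetryConfig:
    """Config-driven retry settings (hypothetical shape)."""
    max_retries: int = 3      # retries after the first attempt; 0 disables retry
    base_delay: float = 0.5   # seconds before the first retry (assumed value)
    max_delay: float = 8.0    # upper bound on a single backoff (assumed value)


def call_with_retry(
    fn: Callable[[], T],
    config: RetryConfig,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with jittered exponential backoff."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception as exc:
            # Give up on non-transient errors or once the retry budget is spent.
            if attempt >= config.max_retries or not is_transient_error(exc):
                raise
            delay = min(config.max_delay, config.base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
            attempt += 1
```

In this sketch a provider would build one `RetryConfig` from its config section (for example `vlm.max_retries` or `embedding.max_retries`) and pass it to every call, which is why per-call overrides are not needed.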

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

Additional Notes

Targeted verification completed with:

  • python -m pytest --override-ini addopts='' tests/models/test_vlm_strip_think_tags.py tests/unit/test_vlm_response_formats.py tests/unit/test_extra_headers_vlm.py tests/unit/test_stream_config_vlm.py tests/unit/test_extra_headers_embedding.py tests/unit/test_jina_embedder.py tests/unit/test_voyage_embedder.py tests/unit/test_minimax_embedder_simple.py tests/unit/test_model_retry.py -q
  • python -m compileall openviking openviking_cli tests/unit

I did not run the server test setup because local imports there require argon2, which is not installed in this environment. There is also an existing unrelated failure in tests/unit/test_openai_embedder.py that was not introduced by this change.

Move retry behavior into shared model-call utilities and config defaults so VLM and embedding providers handle transient failures consistently.

Co-Authored-By: Claude Opus 4.6
@github-actions

Failed to generate code suggestions for PR

@qin-ctx changed the title from "refactor(model): unify config-driven retry across VLM and embedding" to "WIP refactor(model): unify config-driven retry across VLM and embedding" on Mar 24, 2026
Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

[Feature]: Unify config-driven retry across VLM and embedding

1 participant