
Conversation

@vblagoje
Member

@vblagoje vblagoje commented Oct 2, 2025

Why:

Introduces a robust fallback mechanism for chat generators that automatically switches between multiple generators when primary services fail, ensuring continuous service availability during API outages or rate limiting.

What:

  • Added new FallbackChatGenerator component with sequential fallback logic
  • Implemented per-generator timeout handling and comprehensive error handling (429, 401, 400, 408, 500+ errors)
  • Added both sync/async execution support with streaming callback handling
  • Added test suite and three practical usage examples
  • Set up full integrations structure with proper licensing and documentation

How can it be used:

# Import paths for the generators and ChatMessage assume standard Haystack packages
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack_integrations.components.generators.fallback_chat import FallbackChatGenerator

primary = OpenAIChatGenerator(model="gpt-4o-mini")
backup = AnthropicChatGenerator(model="claude-3-5-sonnet-20241022")

# Generators are tried in order; the backup runs only if the primary fails or times out
fallback = FallbackChatGenerator(generators=[primary, backup], timeout=10.0)
result = fallback.run([ChatMessage.from_user("Hello!")])
print(result["replies"][0].text)
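Internally, the sequential fallback logic amounts to something like the following sketch. The stand-in generator classes and the `run_with_fallback` helper are illustrative only, not the component's actual code:

```python
from typing import Any


class FailingGen:
    """Stand-in generator whose run() always raises (simulates a 429 or outage)."""

    def run(self, messages: list) -> dict[str, Any]:
        raise RuntimeError("rate limited")


class WorkingGen:
    """Stand-in generator whose run() succeeds."""

    def run(self, messages: list) -> dict[str, Any]:
        return {"replies": [f"echo: {messages[-1]}"]}


def run_with_fallback(generators: list, messages: list) -> dict[str, Any]:
    """Try each generator in order; return the first successful result."""
    last_error: Exception | None = None
    for gen in generators:
        try:
            return gen.run(messages)
        except Exception as e:  # any provider failure triggers fallback to the next one
            last_error = e
    raise RuntimeError("all generators failed") from last_error


result = run_with_fallback([FailingGen(), WorkingGen()], ["Hello!"])
print(result["replies"][0])  # echo: Hello!
```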

How did you test it:

  • Added comprehensive unit tests covering success/failure scenarios and timeout behavior
  • Tested error handling for all specified HTTP status codes and streaming functionality
  • Created integration examples demonstrating real-world usage patterns
  • Validated serialization/deserialization and async/sync compatibility

Notes for the reviewer:

Focus on the timeout logic in _get_effective_timeout() and error handling in _run_generator_with_timeout(). The streaming callback forwarding maintains proper async/sync compatibility.

@github-actions github-actions bot added the type:documentation Improvements or additions to documentation label Oct 2, 2025
@vblagoje
Member Author

vblagoje commented Oct 6, 2025

@sjrl please have a quick look - perhaps the examples and chat_generator.py itself are good candidates for quickly grasping the implementation from both the user's and our perspective. Let me know if you like this direction

@vblagoje vblagoje marked this pull request as ready for review October 7, 2025 09:56
@vblagoje vblagoje requested a review from a team as a code owner October 7, 2025 09:56
@vblagoje vblagoje requested review from davidsbatista and removed request for a team October 7, 2025 09:56
@vblagoje
Member Author

vblagoje commented Oct 7, 2025

Review from anyone else interested in this area is welcome cc @julian-risch @sjrl

Comment on lines +70 to +73
for gen in generators:
    if not hasattr(gen, "run") or not callable(gen.run):
        msg = "All items in 'generators' must expose a callable 'run' method (duck-typed ChatGenerator)"
        raise TypeError(msg)
Contributor

I'm not sure this check is needed. At least in our other components that take components in their init, we don't strictly double-check that they are a Haystack component.

Comment on lines 135 to 140
return gen.run(
    messages=messages,
    generation_kwargs=generation_kwargs,
    tools=tools,
    streaming_callback=streaming_callback,
)
Contributor

@sjrl sjrl Oct 7, 2025

It's possible we should only forward params to the generator's run method if it accepts them. E.g. I don't think all generators support tools. Or, at the very least, we should enforce in the init method what the run signature of each chat generator should be.
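One way to forward only the parameters a generator's run method actually accepts is to inspect its signature. This is a sketch of the idea, not the PR's implementation:

```python
import inspect
from typing import Any


def filter_kwargs_for(func, **kwargs) -> dict[str, Any]:
    """Keep only the kwargs named in func's signature (unless it takes **kwargs)."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return kwargs  # func accepts arbitrary kwargs, forward everything
    return {k: v for k, v in kwargs.items() if k in params}


def run_without_tools(messages, generation_kwargs=None):
    """Hypothetical run() that does not accept a 'tools' parameter."""
    return {"messages": messages, "generation_kwargs": generation_kwargs}


# 'tools' is dropped because run_without_tools does not accept it
forwarded = filter_kwargs_for(
    run_without_tools, messages=["hi"], tools=[object()], generation_kwargs={}
)
print(sorted(forwarded))  # ['generation_kwargs', 'messages']
```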

Comment on lines +120 to +127
def _run_single_sync(
    self,
    gen: Any,
    messages: list[ChatMessage],
    generation_kwargs: dict[str, Any] | None,
    tools: (list[Tool] | Toolset) | None,
    streaming_callback: StreamingCallbackT | None,
) -> dict[str, Any]:
Contributor

@vblagoje I'm pretty confused as to what this function is doing. Why are we running the calls to the generator within a ThreadPoolExecutor with a single worker?
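(For context: one common reason to wrap a blocking call in a single-worker ThreadPoolExecutor is to bound it with future.result(timeout=...), since a synchronous call cannot otherwise be given a deadline. A sketch of that pattern, which may or may not be the PR's rationale:)

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def blocking_call() -> str:
    # Simulates a synchronous provider call that takes too long
    time.sleep(0.3)
    return "done"


def run_with_deadline(func, timeout: float):
    """Run a blocking function, giving up after `timeout` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(func)
        try:
            return future.result(timeout=timeout)
        except FutureTimeout:
            # The worker thread keeps running; only our wait is abandoned
            return None
    finally:
        pool.shutdown(wait=False)


print(run_with_deadline(blocking_call, 0.05))  # None
```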

Comment on lines 342 to 345
except asyncio.TimeoutError as e:
    logger.warning("Generator %s timed out after %.2fs", gen_name, effective_timeout)
    failed.append(gen_name)
    last_error = e
Contributor

A general comment: I'm not sure I like how we've implemented our own additional timeout management here. I think we should ask users to set timeouts on each individual ChatGenerator, since a timeout param is normally specifiable when constructing one. That is clearer and more understandable than building our own mechanism on top.

Contributor

We can rewrite your example like this

from haystack_integrations.components.generators.fallback_chat import FallbackChatGenerator

primary = OpenAIChatGenerator(model="gpt-4o-mini", timeout=10.0)  # <-- Added timeout here
backup = AnthropicChatGenerator(model="claude-3-5-sonnet-20241022", timeout=10.0)  # <-- Added timeout here

fallback = FallbackChatGenerator(generators=[primary, backup])
result = fallback.run([ChatMessage.from_user("Hello!")])
print(result["replies"][0].text)

Member Author

Yes, ok, we can do that. I'll drop the timeout on the FallbackChatGenerator and the complexities around what takes precedence: the per-generator timeouts, the outer one, etc.

@sjrl
Contributor

sjrl commented Oct 7, 2025

@vblagoje I also wanted to ask, what was your reasoning for making this a core integration?

@vblagoje
Member Author

vblagoje commented Oct 7, 2025

@vblagoje I also wanted to ask, what was your reasoning for making this a core integration?

My reasoning was that this is an optional sidecar rather than a core feature, as we tend to reserve core for truly necessary building blocks. I'm not hard-pressed for an integration - we can make it a core feature as well. Perhaps via experimental?

@vblagoje vblagoje marked this pull request as draft October 7, 2025 11:52
@vblagoje
Member Author

vblagoje commented Oct 7, 2025

Converting to draft until we decide where this PR belongs and remove the additional timeout management that preempts the individual chat generators' own timeouts

@davidsbatista davidsbatista changed the title Add new integration for FallbackChatGenerator feat: add new integration for FallbackChatGenerator Oct 7, 2025
@sjrl
Contributor

sjrl commented Oct 7, 2025

@vblagoje I also wanted to ask, what was your reasoning for making this a core integration?

My reasoning was that this is an optional sidecar rather than a core feature, as we tend to reserve core for truly necessary building blocks. I'm not hard-pressed for an integration - we can make it a core feature as well. Perhaps via experimental?

I think this would be suitable as a core feature, but I'm unsure whether it should go through experimental first. What do you think @julian-risch

@vblagoje
Member Author

Moved to core via deepset-ai/haystack#9859

@vblagoje vblagoje closed this Oct 14, 2025


Development

Successfully merging this pull request may close these issues.

New ChatGenerator fallback component for quota limits/API errors
