feat: add new integration for FallbackChatGenerator
#2358
Conversation
@sjrl please have a quick look - perhaps the examples and chat_generator.py itself are good candidates for quickly grasping the implementation from both the user's and our perspective. LMK if you like this direction
Review from anyone else interested in this area is welcome cc @julian-risch @sjrl
```python
for gen in generators:
    if not hasattr(gen, "run") or not callable(gen.run):
        msg = "All items in 'generators' must expose a callable 'run' method (duck-typed ChatGenerator)"
        raise TypeError(msg)
```
I'm not sure this check is needed. At least in our other components that accept components in their init, we don't strictly double-check that they are Haystack components.
```python
return gen.run(
    messages=messages,
    generation_kwargs=generation_kwargs,
    tools=tools,
    streaming_callback=streaming_callback,
)
```
It's possible we should only forward params to the generator's run method if it accepts them. E.g. I don't think all generators support tools. Or, at the very least, we should enforce in the init method what the run signature of each chat generator must be.
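For illustration, one way to forward only supported params is to inspect each generator's run signature and drop anything it doesn't accept. This is a sketch, not code from the PR; `filter_supported_kwargs` and `NoToolsGenerator` are hypothetical names:

```python
import inspect


def filter_supported_kwargs(run_method, kwargs):
    """Keep only the kwargs that this run method actually accepts."""
    params = inspect.signature(run_method).parameters
    # If run accepts **kwargs, forward everything unchanged
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


class NoToolsGenerator:
    """A stand-in for a chat generator whose run method takes no tools."""

    def run(self, messages, generation_kwargs=None):
        return {"replies": messages}


gen = NoToolsGenerator()
supported = filter_supported_kwargs(
    gen.run,
    {"generation_kwargs": None, "tools": [], "streaming_callback": None},
)
# 'tools' and 'streaming_callback' are dropped; only 'generation_kwargs' survives
```

The signature inspection happens once per call here; in a real component it could be cached per generator at init time.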
```python
def _run_single_sync(
    self,
    gen: Any,
    messages: list[ChatMessage],
    generation_kwargs: dict[str, Any] | None,
    tools: (list[Tool] | Toolset) | None,
    streaming_callback: StreamingCallbackT | None,
) -> dict[str, Any]:
```
@vblagoje I'm pretty confused as to what this function is doing. Why are we running the calls to the generator within a ThreadPoolExecutor with a single worker?
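For context, a common reason to wrap a synchronous call in a single-worker ThreadPoolExecutor is to impose a wall-clock timeout on a blocking call. This sketch assumes that was the intent; `slow_call` and `run_with_timeout` are illustrative names, not the PR's actual implementation:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FuturesTimeout


def slow_call():
    time.sleep(0.2)
    return "done"


def run_with_timeout(fn, timeout_s):
    # A single worker suffices: we submit exactly one task and wait on it.
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(fn)
        try:
            return future.result(timeout=timeout_s)
        except FuturesTimeout:
            return None  # caller treats None as "timed out"


print(run_with_timeout(slow_call, 1.0))  # completes within the timeout
```

One caveat with this pattern: Python cannot cancel a running thread, so after a timeout the worker keeps executing in the background (and `with` blocks on shutdown until it finishes), which may be part of the concern here.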
```python
except asyncio.TimeoutError as e:
    logger.warning("Generator %s timed out after %.2fs", gen_name, effective_timeout)
    failed.append(gen_name)
    last_error = e
```
A general comment: I'm not sure I like that we've implemented our own additional timeout management here. I think we should ask users to set timeouts on each individual ChatGenerator, since a timeout param is normally specifiable when constructing one. That seems clearer and more understandable than building our own mechanism on top.
We can rewrite your example like this:

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack_integrations.components.generators.fallback_chat import FallbackChatGenerator

primary = OpenAIChatGenerator(model="gpt-4o-mini", timeout=10.0)  # <-- Added timeout here
backup = AnthropicChatGenerator(model="claude-3-5-sonnet-20241022", timeout=10.0)  # <-- Added timeout here

fallback = FallbackChatGenerator(generators=[primary, backup])
result = fallback.run([ChatMessage.from_user("Hello!")])
print(result["replies"][0].text)
```
Yes, ok we can do that. I'll drop the timeout on the FallbackChatGenerator and the complexities around what takes precedence: list of generators, outer timeout, etc.
@vblagoje I also wanted to ask, what was your reasoning for making this a core integration?
My reasoning was that this is an optional sidecar rather than a core feature, since we tend to reserve core for truly necessary building blocks. I'm not hard pressed on it being an integration - we could make it a core feature as well, perhaps via experimental?
Converting to draft until we decide where this PR belongs and remove the additional timeout management that preempts the individual chat generators' own timeouts
I think FallbackChatGenerator would be suitable as a core feature, but I'm unsure whether it should go through experimental first. What do you think @julian-risch
Moved to core via deepset-ai/haystack#9859
Why:
Introduces a robust fallback mechanism for chat generators that automatically switches between multiple generators when primary services fail, ensuring continuous service availability during API outages or rate limiting.
What:
`FallbackChatGenerator` component with sequential fallback logic

How can it be used:
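The sequential fallback idea described above can be sketched roughly like this. The generator classes and the exception handling here are illustrative stand-ins, not the PR's actual implementation:

```python
class FailingGenerator:
    """Stand-in for a chat generator whose provider is unavailable."""

    def run(self, messages):
        raise RuntimeError("primary provider is down")


class WorkingGenerator:
    """Stand-in for a healthy backup chat generator."""

    def run(self, messages):
        return {"replies": [f"echo: {messages[-1]}"]}


def run_with_fallback(generators, messages):
    # Try each generator in order; return the first successful result.
    last_error = None
    for gen in generators:
        try:
            return gen.run(messages)
        except Exception as e:  # the real component may narrow what it catches
            last_error = e
    raise RuntimeError("All generators failed") from last_error


result = run_with_fallback([FailingGenerator(), WorkingGenerator()], ["Hello!"])
print(result["replies"][0])  # falls through to the backup: "echo: Hello!"
```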
How did you test it:
Notes for the reviewer:
Focus on the timeout logic in `_get_effective_timeout()` and the error handling in `_run_generator_with_timeout()`. The streaming callback forwarding maintains proper async/sync compatibility.