-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
demo: conversational pipeline (+generic pipleline) - implementation 2 #330
base: main
Are you sure you want to change the base?
Conversation
Trivy scanning results. |
Code Coverage Summary
Diff against main
Results for commit: cbdb0d2 Minimum allowed coverage is ♻️ This comment has been updated with latest results |
""" | ||
|
||
|
||
class Pipeline(Step[InputT, OutputT]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Every Pipeline
is also a Step
. Which means:
- You can nest pipelines (add pipelines as steps of other pipelines)
- Most likely, typing an argument as
Pipeline[input type, output type]
will never make sense - it's always better to accept anyStep[input type, output type]
. This way bothPipeline
andStep
objects can be given as the value for the argument, so users can freely choose whether they want to involve an entire pipeline or do everything in one step.
(2) also probably suggest that "Step" is not the best name. Because really Step
can be just anything that can take some (typed) input, run some logic on it, and return some (typed) output. And Pipeline
is just a way to chain multiple of such objects. Maybe Processor
and ProcessorChain
would be better names?
|
||
def __init__(self, *steps: Step) -> None: | ||
self.steps = steps | ||
self._validate_intrastep_types() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: during initialization we validate that individual steps have types that are compatible with each other (i.e, that the return type of each step is within bounds of the input type of its following step), but due to specifics of the Python generics system, we can validate that the steps have types compatible with the pipeline itself only when the pipeline is run.
needed_type = Union[tuple(needed_type.__args__)] # noqa: UP007 | ||
if isinstance(actual_type, UnionType): | ||
actual_type = Union[tuple(actual_type.__args__)] # noqa: UP007 | ||
return issubtype(needed_type, actual_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issubtype
comes from an external library (typing_utils
) and unfortunately it is on the library's author to implement all the logic behind different typing hints in Python.
Which means that it usually works, but in some cases (new additions to the typing system, less know types, edge cases, generics) may not. For example, the library don't support new pipe syntax for unions - that's why I had to convert from new to old syntax above.
I see this as a problem, because if we go with this implementation, it will look like Steps and Pipelines can be typed with any typing hints available, and usually it indeed work. But for some more advanced hints it will just not.
return ConversationPipelineResult(plugin_metadata=state.plugin_metadata, output_stream=stream) | ||
|
||
|
||
class ConversationPiepline(Step): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: ConversationPiepline
is meant to be easier to use (than a generic Pipeline
) for conversation as it requires specific arguments and types, showing the user exactly what they need to provide to create a conversational pipeline.
But it is compatible with Pipeline
in many different ways:
- You can pass
ConversationPiepline
object as a step to any otherPipeline
- You can pass any
Pipeline
object as aConversationPiepline
step (e.g., preprocessor, postprocessor, state_to_chat) - Places that need a conversation pipeline should accept anything typed as
Step[ConversationPipelineState | str, ConversationPipelineResult]
, which means that they will accept any of those:- A
ConversationPiepline
- A
Pipeline
- A custom
Step
implementation
- A
- Step implementations are compatible between
ConversationPiepline
andPipeline
ConversationPiepline
uses aPipeline
behind the scenes
(Every time the word "any" is used within the list above it's a shortcut, it should really be "any as long as the types match")
def __init__( | ||
self, | ||
llm: LLM, | ||
preprocessors: Sequence[Step[ConversationPipelineState, ConversationPipelineState]] = [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: in this implementation you don't have plugins that can simulanously act on different stages of the pipeline (like you do with implementation 1). Here, if supporting a functionality requires both pre- and post- processing, you will need to have two different implementations and pass both as objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a conversational pipeline along with a generic pipeline to process and transform input data through a series of steps while integrating with LLMs and document search capabilities. Key changes include:
- The introduction of a generic Pipeline class and its associated Step interface.
- The addition of multiple example pipelines and conversation plugin steps.
- Updates to integration components and dependency specifications.
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
Show a summary per file
File | Description |
---|---|
packages/ragbits-core/src/ragbits/core/pipelines/pipeline.py | Implements the core Pipeline and Step classes with type validation and run logic. |
packages/ragbits-core/src/ragbits/core/pipelines/example.py | Provides an example usage of the generic pipeline with multiple steps. |
packages/ragbits-core/src/ragbits/core/llms/base.py | Introduces a temporary SimplePrompt class for prompt handling. |
packages/ragbits-core/pyproject.toml | Updates dependency versions to include new utilities. |
packages/ragbits-conversations/src/ragbits/conversations/piepline/steps.py | Adds various conversation pipeline plugin steps, including document search and history compression. |
packages/ragbits-conversations/src/ragbits/conversations/piepline/state_to_chat.py | Implements a default prompt to convert conversation state to chat format. |
packages/ragbits-conversations/src/ragbits/conversations/piepline/state.py | Defines conversation pipeline state and result dataclasses. |
packages/ragbits-conversations/src/ragbits/conversations/piepline/pipeline.py | Implements the conversation pipeline orchestration with preprocessing and postprocessing steps. |
packages/ragbits-conversations/src/ragbits/conversations/piepline/examples/ | Provides multiple examples demonstrating the conversation pipeline in various configurations. |
Comments suppressed due to low confidence (1)
packages/ragbits-conversations/src/ragbits/conversations/piepline/pipeline.py:29
- The class name 'ConversationPiepline' seems to be misspelled; consider renaming it to 'ConversationPipeline' for clarity and consistency.
class ConversationPiepline(Step):
packages/ragbits-conversations/src/ragbits/conversations/piepline/examples/example1.py
Outdated
Show resolved
Hide resolved
packages/ragbits-conversations/src/ragbits/conversations/piepline/steps.py
Outdated
Show resolved
Hide resolved
packages/ragbits-conversations/src/ragbits/conversations/piepline/state.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot <[email protected]>
No description provided.