
Adds continuous batching #850


Open · NathanHB wants to merge 15 commits into main

Conversation

@NathanHB (Member) commented Jul 3, 2025

No description provided.

@HuggingFaceDocBuilderDev (Collaborator) commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@NathanHB NathanHB requested a review from Copilot July 8, 2025 11:59
@Copilot Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces continuous batching support for Transformer-based models, enabling split-wise streaming generation.

  • Adds continuous_batching flag throughout configuration, model initialization, and generation functions.
  • Implements a new _continuous_greedy_until path and refactors _generate to dispatch based on the flag.
  • Updates GenerationParameters and example configs to include num_blocks and block_size, and adjusts tests accordingly.
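The dispatch described above can be sketched as follows. The names `_generate`, `_continuous_greedy_until`, and the `continuous_batching` flag appear in this PR; the fallback method `_padded_generate`, the class name, and the simplified signatures are hypothetical stand-ins, not lighteval's actual API.

```python
# Minimal sketch of dispatching generation on the continuous_batching flag.
# All behavior here is illustrative; only the flag and method names
# `_generate` / `_continuous_greedy_until` come from the PR itself.
class TransformersModelSketch:
    def __init__(self, continuous_batching: bool = False):
        self.continuous_batching = continuous_batching

    def _continuous_greedy_until(self, docs):
        # Continuous batching path: requests are scheduled into paged
        # KV-cache blocks and stream out as each one finishes.
        return [f"continuous:{d}" for d in docs]

    def _padded_generate(self, docs):
        # Classic path: pad the whole batch and generate in lockstep.
        return [f"padded:{d}" for d in docs]

    def _generate(self, docs):
        # Dispatch on the config flag added in this PR.
        if self.continuous_batching:
            return self._continuous_greedy_until(docs)
        return self._padded_generate(docs)
```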

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `tests/models/endpoints/test_tgi_model.py` | Inserts `block_size` and `num_blocks` into generation parameters |
| `tests/models/endpoints/test_endpoint_model.py` | Inserts `num_blocks` and `block_size` into generation parameters |
| `src/lighteval/models/transformers/transformers_model.py` | Propagates `continuous_batching` through init, `from_model`, and generate paths |
| `src/lighteval/models/model_input.py` | Extends `GenerationParameters` with `num_blocks` and `block_size` |
| `examples/model_configs/transformers_model.yaml` | Adds `continuous_batching` and example `num_blocks`/`block_size` |
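The example config changes might look roughly like this. The field names `continuous_batching`, `num_blocks`, and `block_size` come from the PR; the surrounding structure, model name, and values are illustrative, not the exact contents of `examples/model_configs/transformers_model.yaml`.

```yaml
# Illustrative sketch of a transformers model config with the new fields.
model_parameters:
  model_name: "HuggingFaceTB/SmolLM2-1.7B-Instruct"  # hypothetical model
  continuous_batching: true          # new flag added in this PR
  generation_parameters:
    num_blocks: 1024                 # number of paged KV-cache blocks (illustrative value)
    block_size: 128                  # tokens per cache block (illustrative value)
```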
Comments suppressed due to low confidence (2)

src/lighteval/models/model_input.py:28

  • [nitpick] New fields num_blocks and block_size in GenerationParameters lack descriptions in the class docstring. Consider documenting their purpose and effects.
    num_blocks: NonNegativeInt | None = None  # transformers

src/lighteval/models/transformers/transformers_model.py:114

  • There are no existing tests covering the new continuous_batching logic path. Consider adding unit tests to verify both True and False behaviors.
        continuous_batching (bool):
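A hedged sketch of the unit tests this comment asks for. The real lighteval model class is expensive to construct, so this uses a tiny stub with the same dispatch shape; `DummyModel` and everything in it except the `continuous_batching` flag and the `_generate` name are hypothetical.

```python
# Sketch of tests covering both values of the continuous_batching flag,
# using a stand-in class rather than the real TransformersModel.
class DummyModel:
    def __init__(self, continuous_batching: bool):
        self.continuous_batching = continuous_batching

    def _generate(self, docs):
        # Tag each result with the path taken so tests can observe the dispatch.
        mode = "continuous" if self.continuous_batching else "padded"
        return [(mode, d) for d in docs]

def test_generate_dispatches_on_flag():
    assert DummyModel(True)._generate(["x"]) == [("continuous", "x")]
    assert DummyModel(False)._generate(["x"]) == [("padded", "x")]
```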

    self.transformers_config = model.config
    self.generation_config_dict = config.generation_parameters.to_transformers_dict()
    self.config = config if config is not None else TransformersModelConfig(model_name=model.config.name_or_path)
Copilot AI Jul 8, 2025


The from_model constructor does not set self.continuous_batching, so models loaded via from_model will always default to False. Add self.continuous_batching = self.config.continuous_batching after setting self.config.

Suggested change:

      self.config = config if config is not None else TransformersModelConfig(model_name=model.config.name_or_path)
    + self.continuous_batching = self.config.continuous_batching if config and hasattr(self.config, 'continuous_batching') else False


Comment on lines +41 to +42
    "block_size": None,
    "num_blocks": None,

Copilot AI Jul 8, 2025


[nitpick] Test insertion order of block_size then num_blocks differs from the other endpoint test. Consider keeping parameter order consistent across tests to avoid confusion.

Suggested change:

    - "block_size": None,
    - "num_blocks": None,
    + "num_blocks": None,
    + "block_size": None,


Comment on lines +55 to +56
    "num_blocks": None,
    "block_size": None,

Copilot AI Jul 8, 2025


[nitpick] Here the parameters are added as num_blocks then block_size, which is the reverse of the other test. Align ordering to maintain consistency.

Suggested change:

    - "num_blocks": None,
    - "block_size": None,
    + "block_size": None,
    + "num_blocks": None,

