Adds continuous batching #850
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Pull Request Overview
This PR introduces continuous batching support for Transformer-based models, enabling split-wise streaming generation.
- Adds a `continuous_batching` flag throughout configuration, model initialization, and generation functions.
- Implements a new `_continuous_greedy_until` path and refactors `_generate` to dispatch based on the flag (see the sketch after this list).
- Updates `GenerationParameters` and example configs to include `num_blocks` and `block_size`, and adjusts tests accordingly.
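A minimal sketch of that dispatch, for orientation only: `_generate` and `_continuous_greedy_until` are the names used in this PR, while the class wrapper and `_padded_generate` are assumptions for illustration.

```python
class TransformersModelSketch:
    """Hypothetical stand-in for TransformersModel; illustration only."""

    def __init__(self, continuous_batching: bool = False):
        self.continuous_batching = continuous_batching

    def _generate(self, requests, **kwargs):
        # Dispatch on the new flag: continuous batching schedules new
        # requests as KV-cache blocks free up instead of padding a
        # fixed-size batch.
        if self.continuous_batching:
            return self._continuous_greedy_until(requests, **kwargs)
        return self._padded_generate(requests, **kwargs)

    def _continuous_greedy_until(self, requests, **kwargs):
        ...  # continuous-batching generation loop added by this PR

    def _padded_generate(self, requests, **kwargs):
        ...  # existing padded-batch path (name assumed)
```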
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
File | Description
---|---
`tests/models/endpoints/test_tgi_model.py` | Inserts `block_size` and `num_blocks` into generation parameters
`tests/models/endpoints/test_endpoint_model.py` | Inserts `num_blocks` and `block_size` into generation parameters
`src/lighteval/models/transformers/transformers_model.py` | Propagates `continuous_batching` through init, `from_model`, and generate paths
`src/lighteval/models/model_input.py` | Extends `GenerationParameters` with `num_blocks` and `block_size`
`examples/model_configs/transformers_model.yaml` | Adds `continuous_batching` and example `num_blocks`/`block_size` (sketched below)
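The last row shows the user-facing shape of the feature. A hedged sketch of the equivalent programmatic configuration, assuming `TransformersModelConfig` accepts a `generation_parameters` object as other lighteval configs do; the model name and numeric values are placeholders:

```python
from lighteval.models.model_input import GenerationParameters
from lighteval.models.transformers.transformers_model import TransformersModelConfig

# Placeholder values; only the field names come from this PR.
config = TransformersModelConfig(
    model_name="gpt2",
    continuous_batching=True,
    generation_parameters=GenerationParameters(
        num_blocks=2048,  # KV-cache blocks to pre-allocate
        block_size=128,   # tokens per cache block
    ),
)
```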
Comments suppressed due to low confidence (2)
src/lighteval/models/model_input.py:28
- [nitpick] New fields `num_blocks` and `block_size` in `GenerationParameters` lack descriptions in the class docstring. Consider documenting their purpose and effects.

```python
num_blocks: NonNegativeInt | None = None  # transformers
```
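One way to address this, assuming `GenerationParameters` is a pydantic model (consistent with the `NonNegativeInt` annotation above); the attribute descriptions are inferred from the continuous-batching context, not taken from the PR:

```python
# Sketch only: the attribute descriptions below are assumptions inferred
# from this PR's continuous-batching context.
from pydantic import BaseModel, NonNegativeInt

class GenerationParameters(BaseModel):
    """Parameters controlling text generation.

    Attributes:
        num_blocks: Number of KV-cache blocks to pre-allocate for
            continuous batching (transformers backend only).
        block_size: Number of tokens held per KV-cache block
            (transformers backend only).
    """

    num_blocks: NonNegativeInt | None = None  # transformers
    block_size: NonNegativeInt | None = None  # transformers
```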
src/lighteval/models/transformers/transformers_model.py:114
- There are no existing tests covering the new `continuous_batching` logic path. Consider adding unit tests to verify both `True` and `False` behaviors.

```python
continuous_batching (bool):
```
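A hypothetical starting point for that coverage, checking only that the flag survives config construction (exercising the full generation path would require a loaded model); the model name is a placeholder:

```python
import pytest

from lighteval.models.transformers.transformers_model import TransformersModelConfig

@pytest.mark.parametrize("flag", [True, False])
def test_continuous_batching_flag_propagates(flag):
    # Sketch only: asserts flag propagation, not generation behavior.
    config = TransformersModelConfig(model_name="gpt2", continuous_batching=flag)
    assert config.continuous_batching is flag
```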
```python
self.transformers_config = model.config
self.generation_config_dict = config.generation_parameters.to_transformers_dict()
self.config = config if config is not None else TransformersModelConfig(model_name=model.config.name_or_path)
```
The `from_model` constructor does not set `self.continuous_batching`, so models loaded via `from_model` will always default to `False`. Add `self.continuous_batching = self.config.continuous_batching` after setting `self.config`.
```suggestion
self.config = config if config is not None else TransformersModelConfig(model_name=model.config.name_or_path)
self.continuous_batching = self.config.continuous_batching if config and hasattr(self.config, 'continuous_batching') else False
```
"block_size": None, | ||
"num_blocks": None, |
[nitpick] Test insertion order of `block_size` then `num_blocks` differs from the other endpoint test. Consider keeping parameter order consistent across tests to avoid confusion.
"block_size": None, | |
"num_blocks": None, | |
"num_blocks": None, | |
"block_size": None, |
"num_blocks": None, | ||
"block_size": None, |
[nitpick] Here the parameters are added as `num_blocks` then `block_size`, which is the reverse of the other test. Align ordering to maintain consistency.
"num_blocks": None, | |
"block_size": None, | |
"block_size": None, | |
"num_blocks": None, |
No description provided.