
Conversation

@mgazz mgazz (Contributor) commented Jul 24, 2025

This PR builds on top of #20072 and enables running Prithvi in online serving mode.

This is achieved by extending support for models that skip tokenizer initialisation.

A longer description of what we are trying to achieve is available in #20234.

This supersedes #20307
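
Because the server runs without a tokenizer in this mode, the client has to supply token IDs itself via additional_data. A minimal sketch of the extra request fields (the single placeholder ID mirrors the full example script further down):

# Hedged sketch: with --skip-tokenizer-init the request must carry prompt
# token IDs directly, since the server cannot tokenise text.
extra_fields = {
    "additional_data": {
        "prompt_token_ids": [1],  # placeholder ID, as in the example script below
    },
}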

Test Plan

The PR can be tested as follows:

pytest tests/entrypoints/openai/test_skip_tokenizer.py

Additional information

Prithvi in serving mode can be started with the following command:

vllm serve --model='christian-pinto/Prithvi-EO-2.0-300M-TL-VLLM' --task embed --trust-remote-code --dtype float16 --skip-tokenizer-init --enforce-eager
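
Before sending requests it can help to wait for the server to come up. A minimal readiness probe (a sketch, assuming the default port 8000 and the server's /health endpoint):

import requests

# One-shot readiness probe (adjust host/port if needed); raises if the
# server is not yet healthy.
requests.get("http://localhost:8000/health", timeout=5).raise_for_status()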

The following script provides an example prompt that can be used to perform inference (the same prompt is used in the test linked above):

import base64
import requests
import torch
import io
import numpy as np

torch.set_default_dtype(torch.float16)


def post_http_request(prompt: dict, api_url: str) -> requests.Response:
    headers = {"User-Agent": "Test Client","Content-Type": "application/json"}
    response = requests.post(api_url, headers=headers, json=prompt)
    return response


def decompress(output):
    # The server returns a base64-encoded float32 buffer; decode it and
    # reshape it to the model's output grid.
    np_result = np.frombuffer(
        base64.b64decode(output), dtype=np.float32)
    return np_result.reshape(1, 2, 512, 512)


def main():
    api_url = f"http://localhost:8000/pooling"
    model_name = 'christian-pinto/Prithvi-EO-2.0-300M-TL-VLLM'

    pixel_values = torch.full((6, 512, 512), 1.0, dtype=torch.float16)
    location_coords = torch.full((1, 2), 1.0, dtype=torch.float16)
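    # Serialise each tensor with torch.save into an in-memory buffer and
    # base64-encode the bytes so they can be embedded in the JSON request.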

    buffer_tiff = io.BytesIO()
    torch.save(pixel_values, buffer_tiff)
    buffer_tiff.seek(0)
    binary_data = buffer_tiff.read()
    base64_tensor_embedding = base64.b64encode(binary_data).decode('utf-8')

    buffer_coord = io.BytesIO()
    torch.save(location_coords, buffer_coord)
    buffer_coord.seek(0)
    binary_data = buffer_coord.read()
    base64_coord_embedding = base64.b64encode(binary_data).decode('utf-8')

    prompt = {
        "model": model_name,
        "additional_data": {
            "prompt_token_ids": [1]
        },
        "encoding_format": "base64",
        "messages": [{
            "role": "user",
            "content": [{
                "type": "image_embeds",
                "image_embeds": {
                    "pixel_values": base64_tensor_embedding,
                    "location_coords": base64_coord_embedding,
                },
            }],
        }],
    }

    pooling_response = post_http_request(prompt=prompt, api_url=api_url)
    numpy_data = decompress(pooling_response.json()["data"][0]["data"])
    print(f"Returned result: {numpy_data}")


if __name__ == "__main__":
    main()
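
To exercise the endpoint end to end, start the server with the command above, wait for it to report ready, and then run this script with python; the decoded result is a float32 array of shape (1, 2, 512, 512), matching the reshape in decompress.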

@DarkLight1337 @maxdebayser @njhill @christian-pinto


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the frontend label Jul 24, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This PR introduces support for Prithvi in online serving mode by allowing models to skip tokenizer initialization. The changes include modifications to the client, serving engine, serving pooling, and the Prithvi model itself. It's important to ensure that the tokenizer is checked for None before being used and that exceptions provide sufficient context for debugging.

            model_config=self.model_config,
            scheduler_config=engine_config.scheduler_config,
            lora_config=engine_config.lora_config)

        self.input_preprocessor = InputPreprocessor(self.model_config,
                                                    self.tokenizer)

critical

If self.tokenizer is None, this will raise an AttributeError. It's critical to ensure self.tokenizer is checked for None before being used here to prevent a crash. Consider adding a condition to skip this line if self.tokenizer is None.

        self.input_preprocessor = InputPreprocessor(self.model_config,
                                                    self.tokenizer if self.tokenizer else None)

Comment on lines 917 to 920
if "prompt_token_ids" not in request.additional_data:
raise Exception("Request must contain "
"additional_data['prompt_token_ids'] "
"when the tokenizer is not initialised")

high

This exception lacks context about why the tokenizer is not initialized. Add more details to the exception message to aid debugging. For example, include the model name or a hint to check the --skip-tokenizer-init flag.

                raise Exception("Request must contain "
                                "additional_data['prompt_token_ids'] "
                                "when the tokenizer is not initialised. Check if '--skip-tokenizer-init' flag was used correctly for model {}".format(request.model))

@DarkLight1337 DarkLight1337 left a comment

Please also fix pre-commit

mgazz and others added 3 commits July 24, 2025 13:59
Improve multimodal input handling logic

Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Michele Gazzetti <[email protected]>
@mgazz mgazz force-pushed the online_prithvi_no_tokenizer branch from e5dec1e to c2339d1 on July 24, 2025 14:08
mgazz added 3 commits July 24, 2025 14:22
Signed-off-by: Michele Gazzetti <[email protected]>
Signed-off-by: Michele Gazzetti <[email protected]>
Signed-off-by: Michele Gazzetti <[email protected]>
@DarkLight1337 DarkLight1337 left a comment

LGTM, thanks for adding support!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) July 24, 2025 15:17
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jul 24, 2025
@mgazz mgazz (Contributor, Author) commented Jul 24, 2025

@DarkLight1337 thank you for the support during the review process.

@DarkLight1337 DarkLight1337 (Member)

PTAL at the failing entrypoints test

auto-merge was automatically disabled July 25, 2025 08:05

Head branch was pushed to by a user without write access

@mgazz mgazz (Contributor, Author) commented Jul 25, 2025

Apologies about the test. I will keep an eye on it.

Signed-off-by: Michele Gazzetti <[email protected]>
@vllm-bot vllm-bot merged commit e189b50 into vllm-project:main Jul 25, 2025
66 of 68 checks passed
liuyumoye pushed a commit to liuyumoye/vllm that referenced this pull request Jul 31, 2025
wenscarl pushed a commit to wenscarl/vllm that referenced this pull request Aug 4, 2025
Signed-off-by: Michele Gazzetti <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: shuw <[email protected]>
x22x22 pushed a commit to x22x22/vllm that referenced this pull request Aug 5, 2025
Signed-off-by: Michele Gazzetti <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: x22x22 <[email protected]>
Pradyun92 pushed a commit to Pradyun92/vllm that referenced this pull request Aug 6, 2025
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
Signed-off-by: Michele Gazzetti <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Jinzhen Lin <[email protected]>
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
Signed-off-by: Michele Gazzetti <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Paul Pak <[email protected]>
taneem-ibrahim pushed a commit to taneem-ibrahim/vllm that referenced this pull request Aug 14, 2025
BoyuanFeng pushed a commit to BoyuanFeng/vllm that referenced this pull request Aug 14, 2025
Signed-off-by: Michele Gazzetti <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Boyuan Feng <[email protected]>
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
Signed-off-by: Michele Gazzetti <[email protected]>
Co-authored-by: Cyrus Leung <[email protected]>
Signed-off-by: Diego-Castan <[email protected]>
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
googlercolin pushed a commit to googlercolin/vllm that referenced this pull request Aug 29, 2025
Labels: frontend, ready (ONLY add when PR is ready to merge/full CI is needed)
4 participants