Conversation

@dougbtv (Contributor) commented Jul 17, 2025

Main goal is in the context of CI, in order to not build wheels when unnecessary, and speed up CI builds overall.

  • added VLLM_DOCKER_BUILD_CONTEXT to keep precompiled wheel logic in setup.py but add parameterization for use during a docker build.
  • normalized VLLM_USE_PRECOMPILED so that only "1" or "true" is treated as true (previously it was more awkward to force it off in a CI context); see the sketch after this list
  • setup.py now copies the contextually-named precompiled wheel into dist/ during docker builds.
  • overall, a smoother precompiled wheel flow in docker
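
For illustration, a minimal sketch of the truthiness normalization described above (the helper name is illustrative, not the exact code in vllm's envs.py or setup.py):

import os

def env_flag_is_true(name: str) -> bool:
    # Treat only "1" or "true" (case-insensitive) as true; anything else is false.
    return os.environ.get(name, "").strip().lower() in ("1", "true")

# VLLM_USE_PRECOMPILED=0 (or unset) now cleanly disables the precompiled-wheel
# path, which is what a CI job needs in order to force a full wheel build.
VLLM_USE_PRECOMPILED = env_flag_is_true("VLLM_USE_PRECOMPILED")
VLLM_DOCKER_BUILD_CONTEXT = env_flag_is_true("VLLM_DOCKER_BUILD_CONTEXT")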

See also: vllm-project/ci-infra#125
In follow up of: #20943

Notably: setup.py would automatically fetch upstream main and rebase your work on top of it -- in the docker context, it always takes the remote main commitish and uses that. This does require that your work be rebased if it depends on upstream changes currently in main.
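
For illustration, one way the remote main commitish could be resolved without relying on the local checkout's state (a sketch of the approach, not the exact code in setup.py):

import subprocess

def resolve_upstream_main_commit(
        repo_url: str = "https://github.com/vllm-project/vllm.git") -> str:
    # `git ls-remote <url> refs/heads/main` prints "<sha>\trefs/heads/main",
    # so this works even when .git is missing or immutable in the build context.
    output = subprocess.check_output(
        ["git", "ls-remote", repo_url, "refs/heads/main"], text=True)
    return output.split()[0]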

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

Allows pre-built wheels to be used in a docker build context, especially for CI build improvements where building wheels isn't necessary (currently, we're building wheels on every CI run).

Test Plan

I'd build like this:

time docker build --no-cache=true --progress plain --file docker/Dockerfile --build-arg max_jobs=16 --build-arg USE_SCCACHE=0 --build-arg USE_FLASHINFER_PREBUILT_WHEEL=true --build-arg VLLM_USE_PRECOMPILED=1 --tag dougbtv/vllm:precomp-nocache . > /tmp/doug.docker.precomp.log 2>&1

This gave build times of around 3m5.478s on my test system -- the bottleneck is now downloads from external repositories, such as apt and pip installs.

Back-of-the-napkin math (just from looking at a few of my own runs): a full build in Buildkite CI currently takes about 40 minutes.

Test Result

I'd then validate that it would run using:

docker run -it --rm --gpus device=4 -e VLLM_LOGGING_LEVEL=DEBUG -v /home/dougtest/network-share/vllm/tests:/workdir dougbtv/vllm:precomp-nocache-rebase

Currently runs.

As it stands, without the corresponding implementation in the ci-infra repo, builds should happen the same way they do today (i.e. with VLLM_USE_PRECOMPILED falsy).

(Optional) Documentation Update


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the ci/build label Jul 17, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces support for using precompiled wheels within Docker builds to accelerate CI. It adds a VLLM_DOCKER_BUILD_CONTEXT flag to alter setup.py behavior, normalizes the VLLM_USE_PRECOMPILED environment variable, and adds logic to copy the correct wheel into the dist/ directory.

The changes are logical and well-implemented to achieve the goal. However, I've identified a significant issue with hardcoded architecture tags (x86_64) in both setup.py and docker/Dockerfile. This will cause build failures on other architectures like arm64, which the Dockerfile appears to support. I've left comments with suggestions on how to make this logic platform-aware.

RUN if [ "$VLLM_USE_PRECOMPILED" = "1" ]; then \
echo "Cleaning up extra wheels in dist/..." && \
# Identify the most recent manylinux1_x86_64 wheel
KEEP_WHEEL=$(ls -t dist/*manylinux1_x86_64.whl 2>/dev/null | head -n1) && \
Contributor

Severity: high

The wheel filename pattern *manylinux1_x86_64.whl is hardcoded. This will not work for other architectures like arm64, for which there is build logic in this Dockerfile (using TARGETPLATFORM).

When using precompiled wheels on arm64, this step will fail to find the correct wheel to keep. If there are multiple wheels in dist/, it might not clean up correctly, potentially leading to the wrong wheel being installed in the final image.

This should be parameterized. You could use a shell variable set based on TARGETPLATFORM to specify the wheel pattern.

For example:

if [ "$TARGETPLATFORM" = "linux/arm64" ]; then
    WHEEL_PLATFORM_TAG="*manylinux2014_aarch64.whl"
else
    WHEEL_PLATFORM_TAG="*manylinux1_x86_64.whl"
fi
KEEP_WHEEL=$(ls -t dist/${WHEEL_PLATFORM_TAG} 2>/dev/null | head -n1)

This change is necessary to support multi-architecture builds with precompiled wheels.

Contributor Author

In the context of other hardcoding here, we're currently specifying:

https://wheels.vllm.ai/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl

elsewhere as a statically defined string. So, I think we can directly assume the arch.
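
In other words, with the arch statically assumed, the wheel filename can be derived directly from the version (a rough sketch; the values here are illustrative):

arch_tag = "cp38-abi3-manylinux1_x86_64"  # assumed, matching the nightly URL above
version = "1.0.0.dev"  # in practice, read from the precompiled wheel's metadata
wheel_filename = f"vllm-{version}-{arch_tag}.whl"
# -> "vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"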

version = version_line.split(": ")[1].strip()

# Build correct filename using internal version
arch_tag = "cp38-abi3-manylinux1_x86_64"
Contributor

Severity: high

The arch_tag is hardcoded to cp38-abi3-manylinux1_x86_64. This will cause issues when building for other architectures, such as arm64, which is handled in the Dockerfile via the TARGETPLATFORM build argument. This will lead to incorrect wheel names and build failures on non-x86_64 platforms when using precompiled wheels.

To make this more robust, you should determine the architecture tag dynamically. A good approach would be to set an environment variable in the Dockerfile based on TARGETPLATFORM and read it here.

For example, in your Dockerfile:

ARG TARGETPLATFORM
RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
        export VLLM_ARCH_TAG="cp38-abi3-manylinux2014_aarch64" ; \
    else \
        export VLLM_ARCH_TAG="cp38-abi3-manylinux1_x86_64" ; \
    fi && \
    ...
    python3 setup.py bdist_wheel ...

Then in setup.py, you could read this environment variable. This would make the build process platform-aware.
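
A minimal sketch of what reading that variable in setup.py could look like (VLLM_ARCH_TAG is the hypothetical name from the Dockerfile example above; the x86_64 default mirrors the currently hardcoded value):

import os

# Fall back to the current hardcoded tag when the Dockerfile doesn't provide one.
arch_tag = os.environ.get("VLLM_ARCH_TAG", "cp38-abi3-manylinux1_x86_64")
version = "1.0.0.dev"  # placeholder; derived from the wheel metadata in the real flow
wheel_filename = f"vllm-{version}-{arch_tag}.whl"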

Contributor Author

+1 above.

@dougbtv dougbtv force-pushed the use-precompiled-truthiness branch from b34f6f6 to 2c095c4 Compare July 17, 2025 16:32
@simon-mo
Collaborator

Confirming that the release wheel-building workflow still works? (so the per-commit wheel is still published post-merge)

@dougbtv
Contributor Author

dougbtv commented Jul 17, 2025

Confirming that the release wheel-building workflow still works? (so the per-commit wheel is still published post-merge)

Thanks for asking @simon-mo -- that behavior shouldn't change yet with this PR. The gist is that this PR just enables using VLLM_USE_PRECOMPILED during docker builds but doesn't actually call the docker build yet. That change will (I believe) go into ci-infra (however, I haven't found the publish workflow for wheels.vllm.ai -- happy to take a pointer if you have it handy).

That being said, I still need to take another pass at the related ci-infra PR (#125) and ensure that we implement the pre-merge/post-merge logic we discussed (optionally run the wheel build pre-merge; ensure build and publish post-merge).

@dougbtv dougbtv force-pushed the use-precompiled-truthiness branch from 2c095c4 to 6488c11 Compare July 17, 2025 17:43
Main goal is in the context of CI, in order to not build wheels when unnecessary, and speed up CI builds overall.

- added VLLM_DOCKER_BUILD_CONTEXT to envs to skip git + unzip logic in setup.py
- normalized VLLM_USE_PRECOMPILED, treat only "1" or "true" as true
- setup.py now copies contextually-named precompiled wheel into dist/ during docker builds.
- smoother precompiled wheel flow, overall, in docker

Signed-off-by: dougbtv <[email protected]>
@dougbtv dougbtv force-pushed the use-precompiled-truthiness branch from 6488c11 to 3dcc491 Compare July 17, 2025 17:52
@dougbtv
Contributor Author

dougbtv commented Jul 18, 2025

Confirmed, this will not impact the behavior of uploading wheels for availability on wheels.vllm.ai -- the wheel upload is triggered here:

https://github.com/vllm-project/vllm/blob/main/.buildkite/release-pipeline.yaml#L2

The Dockerfile build there doesn't ask for VLLM_USE_PRECOMPILED, so the wheel build will still happen via setup.py when executed at docker build time.

That action in turn uploads the wheel at: https://github.com/vllm-project/vllm/blob/main/.buildkite/scripts/upload-wheels.sh#L77-L78

(Thanks Kevin Luu for the pointer, too)

@simon-mo (Collaborator) left a comment

There's a bug in finding the commit to use, but the other places LGTM.

Comment on lines +301 to +304
# In Docker build context, .git may be immutable or missing.
if envs.VLLM_DOCKER_BUILD_CONTEXT:
    return upstream_main_commit

Collaborator

This will be problematic as the main commit might not have the wheels ready (e.g. when it is just merged).

Collaborator

Hmm we still have nightly in that case.

            # Fallback to nightly wheel if latest commit wheel is unavailable,
            # in this rare case, the nightly release CI hasn't finished on main.
            if not is_url_available(wheel_location):
                wheel_location = "https://wheels.vllm.ai/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"

Collaborator

I guess the same problem will arise if the PR merge base is not compatible with the latest main/nightly. We can address this as a follow-up.

@simon-mo simon-mo merged commit a1873db into vllm-project:main Jul 29, 2025
14 checks passed