docker: docker-aware precompiled wheel support #21127
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; instead, only a limited subset of checks runs. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review
This pull request introduces support for using precompiled wheels within Docker builds to accelerate CI. It adds a VLLM_DOCKER_BUILD_CONTEXT flag to alter setup.py behavior, normalizes the VLLM_USE_PRECOMPILED environment variable, and adds logic to copy the correct wheel into the dist/ directory.
The changes are logical and well-implemented to achieve the goal. However, I've identified a significant issue with hardcoded architecture tags (x86_64) in both setup.py and docker/Dockerfile. This will cause build failures on other architectures like arm64, which the Dockerfile appears to support. I've left comments with suggestions on how to make this logic platform-aware.
RUN if [ "$VLLM_USE_PRECOMPILED" = "1" ]; then \ | ||
echo "Cleaning up extra wheels in dist/..." && \ | ||
# Identify the most recent manylinux1_x86_64 wheel | ||
KEEP_WHEEL=$(ls -t dist/*manylinux1_x86_64.whl 2>/dev/null | head -n1) && \ |
The wheel filename pattern *manylinux1_x86_64.whl is hardcoded. This will not work for other architectures like arm64, for which there is build logic in this Dockerfile (using TARGETPLATFORM).
When using precompiled wheels on arm64, this step will fail to find the correct wheel to keep. If there are multiple wheels in dist/, it might not clean up correctly, potentially leading to the wrong wheel being installed in the final image.
This should be parameterized. You could use a shell variable set based on TARGETPLATFORM to specify the wheel pattern.
For example:
if [ "$TARGETPLATFORM" = "linux/arm64" ]; then
WHEEL_PLATFORM_TAG="*manylinux2014_aarch64.whl"
else
WHEEL_PLATFORM_TAG="*manylinux1_x86_64.whl"
fi
KEEP_WHEEL=$(ls -t dist/${WHEEL_PLATFORM_TAG} 2>/dev/null | head -n1)
This change is necessary to support multi-architecture builds with precompiled wheels.
In the context of the other hardcoding here, we're currently specifying https://wheels.vllm.ai/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl elsewhere as a statically defined string. So I think we can directly assume the arch.
version = version_line.split(": ")[1].strip()

# Build correct filename using internal version
arch_tag = "cp38-abi3-manylinux1_x86_64"
The arch_tag is hardcoded to cp38-abi3-manylinux1_x86_64. This will cause issues when building for other architectures, such as arm64, which is handled in the Dockerfile via the TARGETPLATFORM build argument. This will lead to incorrect wheel names and build failures on non-x86_64 platforms when using precompiled wheels.
To make this more robust, you should determine the architecture tag dynamically. A good approach would be to set an environment variable in the Dockerfile based on TARGETPLATFORM and read it here.
For example, in your Dockerfile:
ARG TARGETPLATFORM
RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
export VLLM_ARCH_TAG="cp38-abi3-manylinux2014_aarch64" ; \
else \
export VLLM_ARCH_TAG="cp38-abi3-manylinux1_x86_64" ; \
fi && \
...
python3 setup.py bdist_wheel ...
Then in setup.py, you could read this environment variable. This would make the build process platform-aware.
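A minimal sketch of that setup.py side, assuming the hypothetical VLLM_ARCH_TAG variable from the Dockerfile example above:
import os

# VLLM_ARCH_TAG is the hypothetical variable exported in the Dockerfile sketch;
# fall back to the currently hardcoded x86_64 tag when it is unset.
arch_tag = os.getenv("VLLM_ARCH_TAG", "cp38-abi3-manylinux1_x86_64")
wheel_name = f"vllm-{version}-{arch_tag}.whl"  # version is parsed earlier in setup.py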
+1 above.
Confirming that the release wheel building workflow still works? (so the per-commit wheel is still published post-merge)
Thanks for asking @simon-mo -- that behavior shouldn't change yet with this PR. Gist is that this PR just enables using precompiled wheels within docker builds.
That being said, I still need to take another pass at the related ci-infra PR -- vllm-project/ci-infra#125 -- and ensure that we implement the pre-merge/post-merge logic we discussed (optionally run wheel build pre-merge, ensure build and publish post-merge).
Main goal is in the context of CI: to not build wheels when unnecessary, and to speed up CI builds overall.
- added VLLM_DOCKER_BUILD_CONTEXT to envs to skip git + unzip logic in setup.py
- normalized VLLM_USE_PRECOMPILED: treat only "1" or "true" as true
- setup.py now copies contextually-named precompiled wheel into dist/ during docker builds
- smoother precompiled wheel flow, overall, in docker

Signed-off-by: dougbtv <[email protected]>
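As a minimal sketch, the normalization described above could look like this (the helper name is illustrative, not necessarily what the PR uses):
import os

def use_precompiled() -> bool:
    # Only "1" or "true" (case-insensitive) enable the precompiled-wheel path.
    return os.getenv("VLLM_USE_PRECOMPILED", "").strip().lower() in ("1", "true")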
Confirmed, this will not impact the behavior of uploading wheels for availability on wheels.vllm.ai -- the wheel upload is triggered here: https://github.com/vllm-project/vllm/blob/main/.buildkite/release-pipeline.yaml#L2
That action in turn uploads the wheel at: https://github.com/vllm-project/vllm/blob/main/.buildkite/scripts/upload-wheels.sh#L77-L78
(Thanks Kevin Luu for the pointer, too)
There's a bug in finding the commit to use, but other places LGTM.
# In Docker build context, .git may be immutable or missing.
if envs.VLLM_DOCKER_BUILD_CONTEXT:
    return upstream_main_commit
This will be problematic, as the main commit might not have the wheels ready (e.g., when it was just merged).
Hmm we still have nightly in that case.
# Fallback to nightly wheel if latest commit wheel is unavailable,
# in this rare case, the nightly release CI hasn't finished on main.
if not is_url_available(wheel_location):
wheel_location = "https://wheels.vllm.ai/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
I guess the same problem will arise if the PR merge base is not compatible with the latest main/nightly. We can address this as a follow-up.
Main goal is in the context of CI: to avoid building wheels when unnecessary, and to speed up CI builds overall.
See also: vllm-project/ci-infra#125
In follow up of: #20943
Notably: setup.py would automatically fetch the upstream main and rebase your work on top of it; in the context of docker, it always takes the remote main commitish and uses that. This does require that your work be rebased if it depends on upstream changes currently in main.
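For illustration, one way to take that remote main commitish (a sketch; not necessarily the exact lookup setup.py performs):
git ls-remote https://github.com/vllm-project/vllm.git refs/heads/main | cut -f1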
Purpose
Allows pre-built wheels to be used in a docker build context, especially for CI build improvements where building wheels isn't necessary (currently, we're building wheels on every CI run).
Test Plan
I'd build like this:
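A build along these lines (a sketch: the image tag is a placeholder, and VLLM_USE_PRECOMPILED=1 is the build arg this PR wires up in docker/Dockerfile):
# Placeholder tag; times the whole build to compare against a full compile.
time DOCKER_BUILDKIT=1 docker build \
  --build-arg VLLM_USE_PRECOMPILED=1 \
  -f docker/Dockerfile \
  -t vllm-precompiled-test .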
And would get resulting times around 3m5.478s on my test system -- the bottleneck is now the downloads from external repositories, such as apt installs and pip installs.
Back-of-the-napkin math (just me looking at a few of my own runs): it's usually about 40 minutes for a full-out build in Buildkite CI the way it stands now.
Test Result
I'd then validate that it would run using:
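Roughly like this (a sketch: assumes the image's default entrypoint is vLLM's OpenAI-compatible server, with a placeholder model):
# Placeholder model; any small model works for a smoke test.
docker run --rm --gpus all vllm-precompiled-test --model facebook/opt-125m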
Currently runs.
As it stands, without implementation in the ci-infra repo, builds should happen the same way they do today (i.e., where VLLM_USE_PRECOMPILED is falsy).
(Optional) Documentation Update