SageMaker deployment errors #94

Open
jonrossclaytor opened this issue Jul 10, 2023 · 2 comments

Comments

@jonrossclaytor

Background

We are attempting to deploy SageMaker endpoints using the code that huggingface.co provides under "Deploy - Amazon SageMaker" for these two models:

https://huggingface.co/Salesforce/codegen25-7b-multi
https://huggingface.co/openchat/opencoderplus

Error

Both endpoints consistently fail to deploy: each one fails its health checks. Error logs are available on request, as it does not appear that I can attach them here.
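
For anyone reproducing this, the container logs for a failed endpoint normally land in CloudWatch under `/aws/sagemaker/Endpoints/<endpoint-name>`. A minimal sketch for pulling a batch of messages with `boto3` follows. The helper names are made up for illustration, the endpoint name is a placeholder, and AWS credentials with read access to CloudWatch Logs are assumed.

```python
def endpoint_log_group(endpoint_name):
    """Return the CloudWatch log group that SageMaker uses for an endpoint."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def fetch_logs(endpoint_name, limit=50):
    """Return up to `limit` log messages from a single CloudWatch call.

    Hypothetical helper: needs boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so the module loads without boto3

    logs = boto3.client("logs")
    response = logs.filter_log_events(
        logGroupName=endpoint_log_group(endpoint_name),
        limit=limit,
    )
    return [event["message"] for event in response["events"]]


# Example (placeholder endpoint name, requires AWS access):
# for line in fetch_logs("my-failing-endpoint"):
#     print(line)
```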

@jonrossclaytor
Author

@philschmid is there any guidance you can provide on these errors?

@JimAllanson

Currently, all models with sharded checkpoints such as these are failing to deploy, as this library is filtering out files that don't match a predefined allowlist, and the sharded format isn't included in that list.
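
To illustrate the failure mode, here is a minimal sketch of how an allowlist filter can silently drop sharded checkpoints. The patterns and the `filter_files` helper are hypothetical and are not the toolkit's actual code. The file names follow the usual `transformers` sharding convention (numbered `pytorch_model-0000X-of-0000Y.bin` shards plus a `pytorch_model.bin.index.json` index).

```python
from fnmatch import fnmatch

# Hypothetical allowlist that only anticipates single-file checkpoints.
ALLOWLIST = ["config.json", "tokenizer*.json", "pytorch_model.bin", "*.txt"]

# Hypothetical fix: also accept the shard files and their index.
PATCHED_ALLOWLIST = ALLOWLIST + [
    "pytorch_model-*.bin",
    "pytorch_model.bin.index.json",
]


def filter_files(filenames, patterns):
    """Keep only the file names that match at least one allowlist pattern."""
    return [name for name in filenames if any(fnmatch(name, p) for p in patterns)]


repo_files = [
    "config.json",
    "tokenizer.json",
    "pytorch_model-00001-of-00002.bin",
    "pytorch_model-00002-of-00002.bin",
    "pytorch_model.bin.index.json",
]

# With the original list, no weight files survive the filter.
kept = filter_files(repo_files, ALLOWLIST)

# With the extended list, the shards and the index are kept.
kept_patched = filter_files(repo_files, PATCHED_ALLOWLIST)
```

With no weights downloaded, the model cannot load inside the container, which would be consistent with the failed health checks reported above.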

I've opened a PR (#93) that fixes this issue, but until it gets merged you might be able to work around it by building a custom Docker image with my fork, like so:

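# Start from the AWS Hugging Face PyTorch inference image.
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.0.0-transformers4.28.1-gpu-py310-cu118-ubuntu20.04
# Install the inference toolkit from the fork's sharded-checkpoint-support branch.
RUN pip install --no-cache-dir \
    git+https://github.com/JimAllanson/sagemaker-huggingface-inference-toolkit@sharded-checkpoint-support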
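
Once that image is built and pushed to a registry your account can pull from, it can be handed to the SageMaker Python SDK through `image_uri`. The following is only a sketch: the ECR URI, IAM role ARN, task, and instance type are placeholders rather than values from this thread, `build_model_kwargs` and `deploy` are illustrative helpers, and I have not verified that either model loads successfully this way.

```python
def build_model_kwargs(image_uri, role_arn, model_id, task="text-generation"):
    """Collect keyword arguments for sagemaker.huggingface.HuggingFaceModel."""
    return {
        "image_uri": image_uri,  # custom image built from the Dockerfile above
        "role": role_arn,        # SageMaker execution role (placeholder)
        "env": {
            "HF_MODEL_ID": model_id,  # Hub model to download at startup
            "HF_TASK": task,          # assumed task for code generation models
        },
    }


def deploy(model_kwargs, instance_type="ml.g5.2xlarge"):
    """Create the endpoint. Needs the sagemaker SDK and AWS credentials."""
    from sagemaker.huggingface import HuggingFaceModel  # lazy import

    model = HuggingFaceModel(**model_kwargs)
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


model_kwargs = build_model_kwargs(
    image_uri="<account-id>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>",
    role_arn="arn:aws:iam::<account-id>:role/<sagemaker-execution-role>",
    model_id="Salesforce/codegen25-7b-multi",
)

# predictor = deploy(model_kwargs)  # uncomment to create a billable endpoint
```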


2 participants