[Feature] create opea's own model downloader image #744

Open
lianhao opened this issue Jan 22, 2025 · 3 comments
Labels
feature New feature or request

Comments

@lianhao
Collaborator

lianhao commented Jan 22, 2025

Priority

Undecided

OS type

Ubuntu

Hardware type

Xeon-GNR

Running nodes

Single Node

Description

Currently, the Helm chart uses the upstream image huggingface/downloader:0.17.3 to download models. This image is buggy and quite old, and according to HF community feedback, there are no plans to maintain it any more.

We need to create our own image to download the model, and leverage the latest HF_HUB_ENABLE_HF_TRANSFER for fast model download.

@eero-t
Contributor

eero-t commented Jan 22, 2025

Something I did not notice earlier: https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hfhubenablehftransfer

hf_transfer lacks several user-friendly features such as resumable downloads and proxies

I think this needs to wait until it has at least proxy support...
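
For illustration, the proxy limitation quoted above could be worked around by enabling hf_transfer only when no proxy is configured. This is a minimal sketch, not an agreed design: the `PROXY_VARS` list and the helper names `should_enable_hf_transfer` and `download_model` are hypothetical, and it assumes the `huggingface_hub` package (plus `hf_transfer` when the flag is on) is installed in the image.

```python
import os

# Proxy variables commonly honored by HTTP clients (assumed list, not exhaustive).
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def should_enable_hf_transfer(env):
    """Return True only when no proxy variable has a non-empty value in `env`."""
    for name in PROXY_VARS:
        # Check both upper- and lower-case spellings, e.g. https_proxy.
        if env.get(name) or env.get(name.lower()):
            return False
    return True


def download_model(repo_id, cache_dir):
    """Download a model snapshot, opting into hf_transfer only without a proxy."""
    # huggingface_hub reads this flag when it is first imported, so set it
    # before the import below. If the library was already imported elsewhere,
    # this assignment has no effect.
    os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = (
        "1" if should_enable_hf_transfer(os.environ) else "0"
    )
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
```

This does not address the missing resumable downloads; with the flag on, an interrupted transfer would still have to start over.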

@lianhao lianhao removed the v1.3 label Jan 22, 2025
@lianhao
Collaborator Author

lianhao commented Jan 22, 2025

No resumable download or proxy support is clearly a no-go.

@eero-t
Contributor

eero-t commented Jan 22, 2025

Btw. a downloader Job that could be run before OPEA applications, to download all specified models to a given hostPath or PVC, would be useful though, as it would:

  • make sure that app startup does not fail due to the disk being full (models filling the disk would have happened earlier)
  • allow the apps' PVCs to use ReadOnlyMany mode (AFAIK supported by all cloud providers, unlike ReadWriteMany mode)
  • require the (token) secret and additional privileges only for a single Job
  • better support offline (no internet / private cloud) use-cases
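
A rough sketch of what such a pre-download Job could look like, generated as a plain manifest dictionary. Everything named here is an assumption for illustration: the `build_downloader_job` helper, the Job and volume names, the `/data` mount path, the `HF_TOKEN` secret key, and the image, which is assumed to ship `huggingface-cli`.

```python
import json
import shlex


def build_downloader_job(models, claim_name, image, token_secret=None):
    """Build a batch/v1 Job manifest that downloads `models` into a PVC."""
    # One download command per model, chained so the Job fails on the first error.
    script = " && ".join(
        f"huggingface-cli download {shlex.quote(m)} --cache-dir /data"
        for m in models
    )
    container = {
        "name": "model-downloader",
        "image": image,
        "command": ["/bin/sh", "-c"],
        "args": [script],
        "volumeMounts": [{"name": "model-volume", "mountPath": "/data"}],
    }
    if token_secret:
        # Only this Job needs the token; the secret key name is an assumption.
        container["env"] = [{
            "name": "HF_TOKEN",
            "valueFrom": {"secretKeyRef": {"name": token_secret, "key": "HF_TOKEN"}},
        }]
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "model-downloader"},
        "spec": {
            "backoffLimit": 3,
            "template": {
                "spec": {
                    "restartPolicy": "OnFailure",
                    "containers": [container],
                    "volumes": [{
                        "name": "model-volume",
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                }
            },
        },
    }


if __name__ == "__main__":
    # Placeholder model ID, PVC name, image and secret name.
    job = build_downloader_job(
        ["org/model-a"], "model-pvc", "example/model-downloader:latest", "hf-token"
    )
    print(json.dumps(job, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`. Applications would then mount the same claim read-only and need neither the token nor write access.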


2 participants