[InferenceClient] Add dynamic inference providers mapping #2836
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
SUPPORTED_MODELS = {
    "automatic-speech-recognition": {
        "openai/whisper-large-v3": "fal-ai/whisper",
    },
    "text-to-image": {
        "black-forest-labs/FLUX.1-dev": "fal-ai/flux/dev",
        "black-forest-labs/FLUX.1-schnell": "fal-ai/flux/schnell",
        "ByteDance/SDXL-Lightning": "fal-ai/lightning-models",
        "fal/AuraFlow-v0.2": "fal-ai/aura-flow",
        "Kwai-Kolors/Kolors": "fal-ai/kolors",
        "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS": "fal-ai/pixart-sigma",
        "playgroundai/playground-v2.5-1024px-aesthetic": "fal-ai/playground-v25",
        "stabilityai/stable-diffusion-3-medium": "fal-ai/stable-diffusion-v3-medium",
        "stabilityai/stable-diffusion-3.5-large": "fal-ai/stable-diffusion-v35-large",
        "Warlord-K/Sana-1024": "fal-ai/sana",
    },
    "text-to-speech": {
        "m-a-p/YuE-s1-7B-anneal-en-cot": "fal-ai/yue",
    },
    "text-to-video": {
        "genmo/mochi-1-preview": "fal-ai/mochi-v1",
        "tencent/HunyuanVideo": "fal-ai/hunyuan-video",
    },
}
I'm wondering whether, for developer experience, we should still support a mapping (empty by default) that partners can use locally before implementing the server-side mapping.
(and in the future let's add a link to some doc on how to implement the server-side mapping here)
Yes, I kept the hardcoded list.
(Updated the PR to align with the hf.js client and did some refactoring. There are still some tests to fix, I suppose ^^)
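To make the idea in this thread concrete, here is a minimal sketch of a lookup that prefers a local mapping (empty by default) and falls back to the server-side mapping. The names `LOCAL_OVERRIDES`, `resolve_provider_model`, and `fetch_remote` are hypothetical and are not taken from the PR; the remote lookup is passed in as a callable so the sketch runs without network access.

```python
from typing import Callable, Dict, Optional

# Hypothetical local mapping, empty by default:
# task -> {Hub model id -> provider-side model id}.
LOCAL_OVERRIDES: Dict[str, Dict[str, str]] = {}


def resolve_provider_model(
    task: str,
    model: str,
    fetch_remote: Callable[[str, str], Optional[str]],
) -> str:
    """Return the provider-side model id for a Hub model.

    The local mapping wins when it has an entry; otherwise the
    server-side mapping (represented by `fetch_remote`, an assumption
    of this sketch) is consulted.
    """
    local = LOCAL_OVERRIDES.get(task, {}).get(model)
    if local is not None:
        return local
    remote = fetch_remote(task, model)
    if remote is None:
        raise ValueError(f"Model {model!r} is not mapped for task {task!r}.")
    return remote
```

A partner could then register an entry in `LOCAL_OVERRIDES` while developing and remove it once the server-side mapping is live.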
I've completed the PR. I still need to check a few things, but it should be nearly done now. I took the opportunity to -once again 🙈- refactor how the providers are defined. We now use a base class that all providers inherit from.
I've also refactored the providers' tests to test only the parts that are overridden in subclasses. The problem with the previous tests was that a lot was duplicated and they required the provider to be live in production (to get the mapping). Since the implementation is now centralized, we can also centralize the tests (still needs to be done). And testing atomic methods avoids having to test end-to-end with the production mapping.
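As a rough illustration of the pattern described in this comment (not the PR's actual classes), the sketch below uses a hypothetical `BaseProvider` with small overridable methods, one subclass that overrides only the payload shape, and a test that exercises just that override without any network call or production mapping. All class names, method names, and the URL are assumptions.

```python
from typing import Any, Dict


class BaseProvider:
    """Hypothetical base class; names are illustrative only."""

    def __init__(self, provider: str, base_url: str) -> None:
        self.provider = provider
        self.base_url = base_url

    def build_route(self, mapped_model: str) -> str:
        # Default route: the provider-side model id appended to the base URL.
        return f"/{mapped_model}"

    def build_payload(self, inputs: Any, parameters: Dict[str, Any]) -> Dict[str, Any]:
        # Default payload shape; subclasses override this when needed.
        return {"inputs": inputs, **parameters}

    def prepare_request(
        self, inputs: Any, parameters: Dict[str, Any], mapped_model: str
    ) -> Dict[str, Any]:
        # Centralized logic shared by every provider.
        return {
            "url": self.base_url + self.build_route(mapped_model),
            "json": self.build_payload(inputs, parameters),
        }


class ExampleTextToImageProvider(BaseProvider):
    """Hypothetical provider that only changes the payload shape."""

    def __init__(self) -> None:
        super().__init__(provider="example", base_url="https://api.example.invalid")

    def build_payload(self, inputs: Any, parameters: Dict[str, Any]) -> Dict[str, Any]:
        # Rename the input field and drop unset parameters.
        return {"prompt": inputs, **{k: v for k, v in parameters.items() if v is not None}}


def test_build_payload() -> None:
    # Only the overridden method is tested; the mapped model id is not needed.
    provider = ExampleTextToImageProvider()
    payload = provider.build_payload("a cat", {"seed": 1, "width": None})
    assert payload == {"prompt": "a cat", "seed": 1}
```

Because the mapped model id is passed into `prepare_request` as an argument, the shared logic can be tested once on the base class with any placeholder id.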
the refactored version is much much better, especially for testing ❤️ thanks!
Current state looks good to me :) @hanouticelina can you have a last look at it?
All good on my side! Some tests are failing, but they seem unrelated. I'm not sure whether they are flaky or not; it's the first time I'm seeing connection errors from the model download CDN https://cdn-lfs-dev.hf.co
Let's merge for now :) I've reported our CI issues internally, but they are indeed unrelated.
Companion PR to huggingface/huggingface.js#1173. This is to give a first idea of how to implement the dynamic inference providers mapping before moving on with the TODOs.
TODOs: (a bit similar to the hf.js PR)
- handle if model supports both text-generation and conversational => not needed for now => let's make them "conversational"-only
- HfApi().model_info call is cached.
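One way the caching TODO could be approached is sketched below, under stated assumptions: `fetch_provider_mapping` is a stand-in for the network call (such as `HfApi().model_info`) and returns made-up data, and `cached_provider_mapping` is a hypothetical helper name. The sketch memoizes the lookup per model id with `functools.lru_cache` and is not the PR's implementation.

```python
from functools import lru_cache
from types import MappingProxyType
from typing import Mapping

# Counter used only to show how often the (stubbed) network call runs.
CALLS = {"count": 0}


def fetch_provider_mapping(model: str) -> Mapping[str, str]:
    """Stand-in for a network call; the returned data is made up."""
    CALLS["count"] += 1
    return {"example-provider": f"example/{model}"}


@lru_cache(maxsize=None)
def cached_provider_mapping(model: str) -> Mapping[str, str]:
    # Wrap the result in a read-only view so that callers cannot
    # accidentally mutate the cached value shared between calls.
    return MappingProxyType(dict(fetch_provider_mapping(model)))
```

Repeated calls with the same model id reuse the first result, and `cached_provider_mapping.cache_clear()` resets the cache, which is useful in tests.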