
[InferenceClient] Add dynamic inference providers mapping #2836

Merged

Wauplin merged 39 commits into main, Feb 11, 2025

Conversation

hanouticelina
Contributor

@hanouticelina hanouticelina commented Feb 5, 2025

Companion PR to huggingface/huggingface.js#1173. This is to give a first idea on how to implement the dynamic inference providers mapping before moving on with the TODOs.

TODOs: (a bit similar to the hf.js PR)

  • Handle prod/staging status -> warn the user when the model is in staging mode.
  • Handle models that support both text-generation and conversational => not needed for now => let's make them "conversational"-only.
  • Add and update tests.
  • The HfApi().model_info call is cached.
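The staging warning and the cached HfApi().model_info lookup from the TODOs above could be sketched roughly like this. This is a minimal, self-contained sketch: the mapping payload shape, the `map_model` helper, and the field names are assumptions for illustration, not the actual huggingface_hub implementation.

```python
# Sketch of the dynamic-mapping idea: resolve a model's provider mapping once
# per process (lru_cache), warn when the mapping is in staging mode.
import warnings
from functools import lru_cache

# Stand-in for the server-side mapping; the real implementation would fetch
# this via HfApi().model_info(model_id) instead of a local dict.
_FAKE_HUB_RESPONSE = {
    "black-forest-labs/FLUX.1-dev": {
        "fal-ai": {"provider_id": "fal-ai/flux/dev", "status": "live"},
    },
}

@lru_cache(maxsize=None)
def _fetch_inference_provider_mapping(model_id: str) -> dict:
    # Cached so repeated inference calls don't re-hit the Hub API.
    return _FAKE_HUB_RESPONSE.get(model_id, {})

def map_model(model_id: str, provider: str) -> str:
    mapping = _fetch_inference_provider_mapping(model_id)
    if provider not in mapping:
        raise ValueError(f"Model {model_id} is not supported by provider {provider}.")
    entry = mapping[provider]
    if entry["status"] == "staging":
        warnings.warn(f"Model {model_id} is in staging mode for provider {provider}.")
    return entry["provider_id"]
```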

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines -15 to -38
SUPPORTED_MODELS = {
    "automatic-speech-recognition": {
        "openai/whisper-large-v3": "fal-ai/whisper",
    },
    "text-to-image": {
        "black-forest-labs/FLUX.1-dev": "fal-ai/flux/dev",
        "black-forest-labs/FLUX.1-schnell": "fal-ai/flux/schnell",
        "ByteDance/SDXL-Lightning": "fal-ai/lightning-models",
        "fal/AuraFlow-v0.2": "fal-ai/aura-flow",
        "Kwai-Kolors/Kolors": "fal-ai/kolors",
        "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS": "fal-ai/pixart-sigma",
        "playgroundai/playground-v2.5-1024px-aesthetic": "fal-ai/playground-v25",
        "stabilityai/stable-diffusion-3-medium": "fal-ai/stable-diffusion-v3-medium",
        "stabilityai/stable-diffusion-3.5-large": "fal-ai/stable-diffusion-v35-large",
        "Warlord-K/Sana-1024": "fal-ai/sana",
    },
    "text-to-speech": {
        "m-a-p/YuE-s1-7B-anneal-en-cot": "fal-ai/yue",
    },
    "text-to-video": {
        "genmo/mochi-1-preview": "fal-ai/mochi-v1",
        "tencent/HunyuanVideo": "fal-ai/hunyuan-video",
    },
}
Member

I'm wondering if, for developer experience, we should still support an (empty by default) mapping that partners can use locally before implementing the server-side mapping

Member

(and in the future let's add a link to some doc on how to implement the server-side mapping here)

Contributor Author

yes, I kept the hardcoded list
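The fallback discussed above could look roughly like this: consult a local, empty-by-default mapping first, then fall back to the dynamic server-side lookup. This is an illustrative sketch under assumed names (`HARDCODED_MODEL_ID_MAPPING`, `resolve_mapped_model`), not the actual huggingface_hub code.

```python
# Local partner mapping, empty by default; a partner can populate it during
# development before the server-side mapping exists.
HARDCODED_MODEL_ID_MAPPING: dict = {
    # provider -> {hub_model_id: provider_model_id}
    # "my-provider": {"org/model": "my-provider/model-id"},
}

def resolve_mapped_model(provider: str, model_id: str, dynamic_lookup) -> str:
    # Prefer the hardcoded local mapping when it has an entry,
    # otherwise defer to the dynamic (server-side) lookup callable.
    local = HARDCODED_MODEL_ID_MAPPING.get(provider, {})
    if model_id in local:
        return local[model_id]
    return dynamic_lookup(provider, model_id)
```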

@Wauplin
Contributor

Wauplin commented Feb 7, 2025

(updated the PR to align with the hf.js client + did some refactoring. Still some tests to fix, I suppose ^^)

@Wauplin Wauplin added the inference Anything related to InferenceClient, providers, etc. label Feb 10, 2025
@Wauplin Wauplin marked this pull request as ready for review February 10, 2025 18:29
@Wauplin
Contributor

Wauplin commented Feb 10, 2025

I've completed the PR. I still need to check a few things, but it should be nearly done now. I took the opportunity to -once again 🙈- refactor how the providers are defined. We now use a base class that all providers inherit from. The steps in prepare_request are split into atomic methods that can be tested individually (to prepare the api_token, mapped_model, base_url, full URL, payload, body, etc.). The goal is to reuse as much as possible and therefore make the introduction of a new provider effortless. The case of LLMs is particularly interesting to optimize (see the sambanova.py implementation).
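The base-class pattern described above could be sketched as follows. This is a minimal illustration, not the actual huggingface_hub code: class names, method names, and the RequestParameters fields are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class RequestParameters:
    url: str
    headers: dict
    json: dict

class TaskProviderHelper:
    """Base class: prepare_request is composed of small, individually
    testable steps that subclasses override as needed."""

    def __init__(self, provider: str, base_url: str):
        self.provider = provider
        self.base_url = base_url

    def _prepare_headers(self, api_key: str) -> dict:
        return {"Authorization": f"Bearer {api_key}"}

    def _prepare_route(self, mapped_model: str) -> str:
        raise NotImplementedError  # provider/task specific

    def _prepare_payload(self, inputs, parameters: dict) -> dict:
        # Drop parameters set to None so server-side defaults apply.
        return {"inputs": inputs, **{k: v for k, v in parameters.items() if v is not None}}

    def prepare_request(self, inputs, parameters, mapped_model, api_key) -> RequestParameters:
        return RequestParameters(
            url=self.base_url + self._prepare_route(mapped_model),
            headers=self._prepare_headers(api_key),
            json=self._prepare_payload(inputs, parameters),
        )

# An LLM provider with an OpenAI-compatible API then only needs the route:
class ChatCompletionHelper(TaskProviderHelper):
    def _prepare_route(self, mapped_model: str) -> str:
        return "/v1/chat/completions"
```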

I've also refactored the providers' tests to cover only the parts that are overridden in subclasses. The problem with the current tests was that a lot was duplicated, and they required the provider to be live in production (to fetch the mapping). Since the implementation is now centralized, we can also centralize the tests (still to be done). And testing atomic methods avoids having to test end-to-end against the production mapping.
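The testing idea above can be illustrated like this: because each step is an atomic method, a unit test can target just one piece with no network call and no production mapping. The helper below is a hypothetical stand-in for one such overridable step, not a test from the actual suite.

```python
# Stand-in for an atomic step of request preparation: building auth headers.
def prepare_headers(api_key: str) -> dict:
    if not api_key:
        raise ValueError("An API token must be provided.")
    return {"Authorization": f"Bearer {api_key}"}

def test_prepare_headers():
    # Happy path: header is built from the token, nothing else needed.
    assert prepare_headers("hf_xxx") == {"Authorization": "Bearer hf_xxx"}
    # Error path: an empty token is rejected locally, no server involved.
    try:
        prepare_headers("")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty token")

test_prepare_headers()
```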

Contributor Author

@hanouticelina hanouticelina left a comment

the refactored version is much much better, especially for testing ❤️ thanks!

Contributor

@Wauplin Wauplin left a comment

Current state looks good to me :) @hanouticelina can you have a last look at it?

@hanouticelina
Contributor Author

All good on my side! Some tests are failing but seem unrelated. I'm not sure whether they are flaky; it's the first time I'm seeing connection errors from the model download CDN https://cdn-lfs-dev.hf.co

@Wauplin
Contributor

Wauplin commented Feb 11, 2025

Let's merge for now :) I've reported our CI issues internally (here); they are indeed unrelated.

@Wauplin Wauplin merged commit 632c04f into main Feb 11, 2025
16 of 17 checks passed
@Wauplin Wauplin deleted the add-dynamic-inference-provider-mapping branch February 11, 2025 17:10
Labels
inference Anything related to InferenceClient, providers, etc.
4 participants