
[InferenceSnippet] Take token from env variable if not set #1514


Merged
merged 10 commits into main from token-from-env-in-snippets on Jun 4, 2025

Conversation


@Wauplin Wauplin commented Jun 3, 2025

Solves #1361.

Long-awaited feature for @gary149. I did not go for the cleanest solution, but it works well and should be robust/flexible enough if we need to fix something in the future.

EDIT: breaking change => the access token must now be passed as `opts.accessToken` in `snippets.getInferenceSnippets`.
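
For downstream callers, a minimal sketch of the new call shape (the import path, the other arguments, and the placeholder variables are assumptions for illustration; only the `opts.accessToken` field comes from this PR):

// Hypothetical usage sketch (TypeScript) — import path and surrounding arguments
// are assumptions; only opts.accessToken is taken from this PR.
import { snippets } from "@huggingface/inference";

const generated = snippets.getInferenceSnippets(
    model,              // model metadata object (placeholder)
    "hf-inference",     // inference provider (placeholder)
    providerMapping,    // provider model mapping (placeholder)
    { accessToken: process.env.HF_TOKEN }, // breaking change: token now goes in opts.accessToken
);

// If opts.accessToken is omitted, the generated snippets fall back to reading the token
// from an environment variable instead (e.g. process.env.HF_TOKEN / os.environ["HF_TOKEN"]).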

TODO

once merged:

Some examples:

JS client

import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
    provider: "hf-inference",
    model: "meta-llama/Llama-3.1-8B-Instruct",
    messages: [
        {
            role: "user",
            content: "What is the capital of France?",
        },
    ],
});

console.log(chatCompletion.choices[0].message);

Python client

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hf-inference",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)

print(completion.choices[0].message)

openai client

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/hf-inference/models/meta-llama/Llama-3.1-8B-Instruct/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ],
)

print(completion.choices[0].message)

curl

curl https://router.huggingface.co/hf-inference/models/meta-llama/Llama-3.1-8B-Instruct/v1/chat/completions \
    -H "Authorization: Bearer $HF_TOKEN" \
    -H 'Content-Type: application/json' \
    -d '{
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ],
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "stream": false
    }'

Check out the PR diff for more examples.

@Wauplin Wauplin requested review from pcuenca and ngxson as code owners June 4, 2025 09:07
@@ -115,7 +115,7 @@ export const bm25s = (model: ModelData): string[] => [
retriever = BM25HF.load_from_hub("${model.id}")`,
];

export const chatterbox = (model: ModelData): string[] => [
Wauplin (Contributor, Author) commented:

Not related to this PR, but fixes a lint issue introduced in #1503.

@SBrandeis SBrandeis left a comment

Excellent!

@hanouticelina hanouticelina left a comment

Reviewed the snippets, looks good to me!

@SBrandeis

Merging

@SBrandeis SBrandeis merged commit 90ce13c into main Jun 4, 2025
7 of 11 checks passed
@SBrandeis SBrandeis deleted the token-from-env-in-snippets branch June 4, 2025 10:36

julien-c commented Jun 4, 2025

nice!

Wauplin added a commit that referenced this pull request Jun 4, 2025
Fix after #1514.

Now that we use a placeholder for the access token loaded from the environment, there
is no direct way to explicitly generate a snippet for either a "direct
request" or a "routed request" (determined
[here](https://github.com/huggingface/huggingface.js/blob/1131b562d74c7c7b95966ec757fea94773a024f1/packages/inference/src/lib/makeRequestOptions.ts#L124-L141)
using `accessToken.startsWith("hf_")`). This PR adds a `directRequest?:
boolean;` option to the parameters, which solves this problem.
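
For context, a rough sketch of the decision this option overrides (hypothetical and simplified; only the `accessToken.startsWith("hf_")` check and the `directRequest` override come from this PR, the function and parameter names are illustrative):

// Hypothetical, simplified sketch — not the actual makeRequestOptions implementation.
function shouldMakeDirectRequest(accessToken: string | undefined, directRequest?: boolean): boolean {
    if (directRequest !== undefined) {
        // Override added by this follow-up: snippet generation, where the token is only
        // an env-var placeholder, can force either mode explicitly.
        return directRequest;
    }
    // Default heuristic: an "hf_" token suggests routing through Hugging Face,
    // anything else is treated as a provider key for a direct request.
    return !(accessToken ?? "").startsWith("hf_");
}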

Will require a follow-up PR in moon-landing.

cc @SBrandeis who found the root cause

### expected behavior

Display the routed request by default on
https://huggingface.co/deepseek-ai/DeepSeek-R1-0528?inference_api=true&inference_provider=fireworks-ai&language=sh


![image](https://github.com/user-attachments/assets/0f2be3d5-9c7a-48a1-bbdb-b6ae5aa78f9d)