
draft: add hyperbolic support #1191

Merged
merged 21 commits into huggingface:main on Feb 14, 2025

Conversation

Kaihuang724
Contributor

Added Hyperbolic as an inference provider

@julien-c julien-c force-pushed the kai/hyperbolic-integration branch from c507b1e to 33771e6 on February 7, 2025 at 23:30

julien-c commented Feb 7, 2025

Thanks @Kaihuang724! I've rebased on top of main, as we changed things a bit in the past few days. Let me know if my commits make sense.

As you pointed out, 2 out of 4 Hyperbolic tests currently don't pass.

I haven't checked why yet, but here's how you can run just the Hyperbolic tests locally (from inside packages/inference):

pnpm test -- -t "Hyperbolic"

When they do pass you can record the VCR tapes (so the CI doesn't run actual requests):

VCR_MODE=record pnpm test -- -t "Hyperbolic"
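For context, the "VCR tapes" are recorded HTTP responses that the test suite replays instead of hitting the live API. A rough sketch of how such record/replay wrapping works (names like `makeVcrFetch` are illustrative, not the actual test harness):

```typescript
// Hypothetical sketch of VCR-style record/replay around a fetch-like
// function. Names (Tape, makeVcrFetch) are illustrative only.
type Tape = Record<string, string>;

function makeVcrFetch(
  mode: "record" | "replay",
  tape: Tape,
  realFetch: (url: string) => Promise<string>
): (url: string) => Promise<string> {
  return async (url: string) => {
    if (mode === "replay") {
      // Replay mode (what CI uses): serve only from the tape, never
      // touch the network.
      if (!(url in tape)) throw new Error(`No tape recorded for ${url}`);
      return tape[url];
    }
    // Record mode: make the real request once and save the response.
    const body = await realFetch(url);
    tape[url] = body;
    return body;
  };
}
```

In record mode the wrapper hits the network and saves each response; in replay mode it serves exclusively from the tape, so CI stays deterministic and offline.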

Will help early next week if you're stuck

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines 1185 to 1188
"meta-llama/Llama-3.2-3B-Instruct": "meta-llama/Llama-3.2-3B-Instruct",
"meta-llama/Llama-3.3-70B-Instruct": "meta-llama/Llama-3.3-70B-Instruct",
"stabilityai/stable-diffusion-2": "stabilityai/stable-diffusion-2",
"meta-llama/Llama-3.1-405B-BASE-FP8": "meta-llama/Llama-3.1-405B-BASE-FP8",
Member

Note that the keys should be HF model ids (it's not the case for the last one at least)
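One way to catch such mismatched keys (e.g. `meta-llama/Llama-3.1-405B-BASE-FP8`, which is not an existing HF repo id) would be to check each key against the public Hub API at `https://huggingface.co/api/models/{id}`. A hedged sketch; `keyExistsOnHub` is a hypothetical helper, not part of the codebase:

```typescript
// Hypothetical helper: returns true if a mapping key resolves to a real
// model repo on the Hugging Face Hub. The fetch function is injectable
// so the check can be exercised without network access.
async function keyExistsOnHub(
  modelId: string,
  fetchFn: (url: string) => Promise<{ ok: boolean }> = fetch
): Promise<boolean> {
  // The Hub API returns 200 for existing repos and 404 otherwise.
  const res = await fetchFn(`https://huggingface.co/api/models/${modelId}`);
  return res.ok;
}
```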

@SBrandeis SBrandeis self-assigned this Feb 10, 2025
@SBrandeis SBrandeis left a comment

Thank you @Kaihuang724 !

Would you mind updating the VCR tapes (=pre-cached API responses for online testing), please?

We need them for the CI tests.

You can do so by running the following command:

VCR_MODE=cache pnpm run test

@julien-c
Member

@Kaihuang724 let us know if any help is needed!

@Kaihuang724
Contributor Author

> Thank you @Kaihuang724 !
>
> Would you mind updating the VCR tapes (= pre-cached API responses for online testing), please?
>
> We need them for the CI tests.
>
> You can do so by running the following command:
>
> VCR_MODE=cache pnpm run test

I went ahead and updated the VCR tapes, but I still can't get the tests to run successfully. I'm not sure why; it seems like it's trying to call our API with this model: mistralai/Mixtral-8x7B-Instruct-v0.1 even though I'm passing the model name meta-llama/Llama-3.1-405B for the textGeneration test. Any ideas why?

@SBrandeis
Contributor

> it seems like it's trying to call our API with this model: mistralai/Mixtral-8x7B-Instruct-v0.1 even though I'm passing the model name meta-llama/Llama-3.1-405B for the textGeneration test. Any ideas why?

It seems that your chat completion API defaults to the Mistral model when no model is provided in the body; is that correct?

Note that we only add the model argument in the body when the task is chatCompletion or chatCompletionStream:
https://github.com/Kaihuang724/huggingface.js/blob/cb1ff636a5170c815f966aea9b0954845b8da77a/packages/inference/src/lib/makeRequestOptions.ts#L146-L150


Make sure the URL and body output by makeRequestOptions match what you expect on your side.
https://github.com/Kaihuang724/huggingface.js/blob/cb1ff636a5170c815f966aea9b0954845b8da77a/packages/inference/src/tasks/custom/request.ts#L18

If not, you will have to implement an adapter in the associated task method, e.g. here for text-to-image:
https://github.com/Kaihuang724/huggingface.js/blob/cb1ff636a5170c815f966aea9b0954845b8da77a/packages/inference/src/tasks/cv/textToImage.ts#L23-L40
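In other words (a simplified, hypothetical rendering of the behavior described above; the real logic is in makeRequestOptions.ts, linked): the body only carries `model` for chat-completion tasks, so a text-generation request reaches the provider without one, and the provider's own default model takes over.

```typescript
// Simplified sketch of the body-building behavior described above.
// Hypothetical names; the actual implementation lives in makeRequestOptions.ts.
type InferenceTask = "chatCompletion" | "chatCompletionStream" | "textGeneration" | "textToImage";

function buildRequestBody(
  task: InferenceTask,
  model: string,
  args: Record<string, unknown>
): Record<string, unknown> {
  const body: Record<string, unknown> = { ...args };
  // Only chat-completion style tasks include the model id in the body.
  // A provider that expects the model in the body will therefore fall
  // back to its own default for other tasks (e.g. Mixtral, as observed).
  if (task === "chatCompletion" || task === "chatCompletionStream") {
    body.model = model;
  }
  return body;
}
```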

@connorch
Contributor

Thank you @SBrandeis ! This PR should be ready for review now.

@SBrandeis
Contributor

Hi @connorch - thank you for your changes!
I pushed a few updates:

  • b2fbceb: Add a type for Hyperbolic's text-to-image generations
  • 6bdd200: Remove some unneeded code in the baseUrl computation

But most importantly, de26ffa:

```ts
describe.concurrent(
  "Hyperbolic",
  () => {
    HARDCODED_MODEL_ID_MAPPING.hyperbolic = {
      "meta-llama/Llama-3.2-3B-Instruct": "meta-llama/Llama-3.2-3B-Instruct",
      "meta-llama/Llama-3.3-70B-Instruct": "meta-llama/Llama-3.3-70B-Instruct",
      "stabilityai/stable-diffusion-2": "stabilityai/stable-diffusion-2",
      "meta-llama/Llama-3.1-405B": "meta-llama/Meta-Llama-3.1-405B-Instruct",
    };

    it("chatCompletion - hyperbolic", async () => {
      const res = await chatCompletion({
        accessToken: env.HF_HYPERBOLIC_KEY,
        model: "meta-llama/Llama-3.2-3B-Instruct",
        provider: "hyperbolic",
        messages: [{ role: "user", content: "Complete this sentence with words, one plus one is equal " }],
        temperature: 0.1,
      });
      expect(res).toBeDefined();
      expect(res.choices).toBeDefined();
      expect(res.choices?.length).toBeGreaterThan(0);
      if (res.choices && res.choices.length > 0) {
        const completion = res.choices[0].message?.content;
        expect(completion).toBeDefined();
        expect(typeof completion).toBe("string");
        expect(completion).toContain("two");
      }
    });

    it("chatCompletion stream", async () => {
      const stream = chatCompletionStream({
        accessToken: env.HF_HYPERBOLIC_KEY,
        model: "meta-llama/Llama-3.3-70B-Instruct",
        provider: "hyperbolic",
        messages: [{ role: "user", content: "Complete the equation 1 + 1 = , just the answer" }],
      }) as AsyncGenerator<ChatCompletionStreamOutput>;
      let out = "";
      for await (const chunk of stream) {
        if (chunk.choices && chunk.choices.length > 0) {
          out += chunk.choices[0].delta.content;
        }
      }
      expect(out).toContain("2");
    });

    it("textToImage", async () => {
      const res = await textToImage({
        accessToken: env.HF_HYPERBOLIC_KEY,
        model: "stabilityai/stable-diffusion-2",
        provider: "hyperbolic",
        inputs: "award winning high resolution photo of a giant tortoise",
        parameters: {
          height: 128,
          width: 128,
        },
      } satisfies TextToImageArgs);
      expect(res).toBeInstanceOf(Blob);
    });

    it("textGeneration", async () => {
      const res = await textGeneration({
        accessToken: env.HF_HYPERBOLIC_KEY,
        model: "meta-llama/Llama-3.1-405B",
        provider: "hyperbolic",
        inputs: "Paris is",
        parameters: {
          temperature: 0,
          top_p: 0.01,
          max_new_tokens: 10,
        },
      });
      expect(res).toMatchObject({ generated_text: "...the capital and most populous city of France," });
    });
  },
  TIMEOUT
);
```

I updated the tests to match our types and expected APIs, which revealed the need to implement adapters that transform inputs and outputs for the text-generation and text-to-image tasks to match what's expected on the Hyperbolic side.

Namely:

  • For text-generation, convert the TextGenerationInput to the expected payload shape on Hyperbolic (which seems to be similar to ChatCompletionInput?)
  • For text-to-image, you seem to expect model_name in the body to determine which model to run inference with. You can implement the transformation here
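For illustration, a minimal sketch of what those two adapters could look like. The `model_name` field comes from the comment above; every other field name on the Hyperbolic side is an assumption:

```typescript
// Hypothetical adapter sketches; field names other than "model_name"
// are assumptions about Hyperbolic's API, not confirmed.
interface TextGenerationInput {
  inputs: string;
  parameters?: { temperature?: number; top_p?: number; max_new_tokens?: number };
}

// text-generation input -> chat-completion-shaped payload
function adaptTextGeneration(args: TextGenerationInput, model: string) {
  return {
    model,
    messages: [{ role: "user", content: args.inputs }],
    temperature: args.parameters?.temperature,
    top_p: args.parameters?.top_p,
    max_tokens: args.parameters?.max_new_tokens,
  };
}

// text-to-image: the provider expects the model under "model_name"
function adaptTextToImage(
  args: { inputs: string; parameters?: Record<string, unknown> },
  model: string
): Record<string, unknown> {
  return { model_name: model, prompt: args.inputs, ...args.parameters };
}
```

The reverse direction (mapping the provider's response back to a `generated_text`-shaped TextGenerationOutput) would follow the same pattern.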

@SBrandeis SBrandeis requested a review from julien-c February 14, 2025 11:03
@julien-c julien-c left a comment

Thanks for pushing this over the finish line, @SBrandeis <3

@julien-c julien-c merged commit 3e78986 into huggingface:main Feb 14, 2025
5 checks passed
@connorch
Contributor

Thank you @SBrandeis ! I appreciate you cleaning things up and getting this merged 🙌
