Commit 5430448: Merge remote-tracking branch 'upstream/main'
2 parents 06905ff + 592a145
31 files changed: +2293 −51 lines

_blog.yml (+116 −18)
@@ -5426,7 +5426,7 @@
   tags:
     - aws
     - partnerships
-
+
 - local: ai-art-newsletter-jan-25
   title: "The AI tools for Art Newsletter - Issue 1"
   author: linoyts
@@ -5439,7 +5439,7 @@
     - community

 - local: dabstep
-  title: "DABStep: Data Agent Benchmark for Multi-step Reasoning"
+  title: "DABStep: Data Agent Benchmark for Multi-step Reasoning"
   thumbnail: /blog/assets/dabstep/thumbnail.png
   author: eggie5
   guest: True
@@ -5450,7 +5450,6 @@
     - research
     - evaluation

-
 - local: pi0
   title: "π0 and π0-FAST: Vision-Language-Action Models for General Robot Control"
   author: danaaubakirova
@@ -5462,7 +5461,7 @@
     - community

 - local: open-deep-research
-  title: "Open-source DeepResearch – Freeing our search agents"
+  title: "Open-source DeepResearch – Freeing our search agents"
   thumbnail: /blog/assets/open-deep-research/thumbnail.png
   author: m-ric
   date: Feb 4, 2025
@@ -5472,19 +5471,6 @@
     - research
     - smolagents

-- local: scaling-secrets-management
-  title: "How Hugging Face Scaled Secrets Management for AI Infrastructure"
-  thumbnail: /blog/assets/infisical/thumbnail.png
-  author: segudev
-  guest: true
-  date: Feb 10, 2025
-  tags:
-    - secrets
-    - security
-    - shift-left
-    - infrastructure
-    - open-source
-
 - local: leaderboard-arabic-v2
   title: "The Open Arabic LLM Leaderboard 2"
   thumbnail: /blog/assets/leaderboards-on-the-hub/thumbnail_arabic.png
@@ -5497,4 +5483,116 @@
     - leaderboard
     - LLM
     - arabic
-
+
+- local: vid_ds_scripts
+  title: "Build awesome datasets for video generation"
+  author: hlky
+  thumbnail: /blog/assets/vid_ds_scripts/thumbnail.png
+  date: Feb 12, 2025
+  tags:
+    - guide
+    - video
+    - datasets
+
+- local: from-chunks-to-blocks
+  title: "From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub"
+  author: jsulz
+  thumbnail: /blog/assets/from-chunks-to-blocks/thumbnail.png
+  date: February 12, 2025
+  tags:
+    - dedupe
+    - storage
+    - content defined chunking
+    - quantization
+
+- local: billion-classifications
+  title: "1 Billion Classifications"
+  author: derek-thomas
+  thumbnail: /blog/assets/billion-classifications/billion-classifications-thumbnail.png
+  guest: true
+  date: Feb 13, 2025
+  tags:
+    - inference-endpoints
+    - classification
+    - embedding
+    - embeddings
+    - nlp
+    - python
+    - cost
+    - enterprise
+
+- local: math_verify_leaderboard
+  title: "Fixing Open LLM Leaderboard with Math-Verify"
+  author: hynky
+  thumbnail: /blog/assets/math_verify_leaderboard/thumbnail.png
+  date: Feb 14, 2025
+  tags:
+    - math-verify
+    - open-llm-leaderboard
+    - leaderboard
+    - evaluation
+
+- local: fireworks-ai
+  title: "Welcome Fireworks.ai on the Hub 🎆"
+  author: julien-c
+  thumbnail: /blog/assets/inference-providers/welcome-fireworks.jpg
+  date: Feb 14, 2025
+  tags:
+    - announcement
+    - hub
+
+- local: inference-providers-nebius-novita-hyperbolic
+  title: "Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥"
+  author: reach-vb
+  thumbnail: /blog/assets/inference-providers/second-batch-thumbnail.webp
+  date: Feb 18, 2025
+  tags:
+    - announcement
+    - hub
+
+- local: paligemma2mix
+  title: "PaliGemma 2 Mix - New Instruction Vision Language Models by Google"
+  thumbnail: /blog/assets/paligemma2/thumbnail.png
+  author: ariG23498
+  date: Feb 19, 2025
+  tags:
+    - multimodal
+    - LLM
+    - vision
+
+- local: smolvlm2
+  title: "SmolVLM2: Bringing Video Understanding to Every Device"
+  author: orrzohar
+  guest: true
+  thumbnail: /blog/assets/smolvlm2/banner.png
+  date: Feb 20, 2025
+  tags:
+    - vlm
+    - multimodal
+    - video
+    - on-device
+    - llm
+    - nlp
+    - vision
+
+- local: siglip2
+  title: "SigLIP 2: A better multilingual vision language encoder"
+  author: ariG23498
+  thumbnail: /blog/assets/siglip2/thumbnail.png
+  date: Feb 21, 2025
+  tags:
+    - multimodal
+    - vision
+
+- local: scaling-secrets-management
+  title: "How Hugging Face Scaled Secrets Management for AI Infrastructure"
+  thumbnail: /blog/assets/infisical/thumbnail.png
+  author: segudev
+  guest: true
+  date: Feb 10, 2025
+  tags:
+    - secrets
+    - security
+    - shift-left
+    - infrastructure
+    - open-source
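The `_blog.yml` entries added above all share a handful of fields. As a quick sanity check, here is a hedged sketch that validates one entry represented as a plain Python dict; the required-field set is inferred from the entries in this diff, not from an official schema, and `guest`/`thumbnail` appear to be optional.

```python
# Minimal sanity check for a _blog.yml entry, represented as a plain dict.
# The required-field set is inferred from the entries in this diff, not
# from an official schema; "guest" and "thumbnail" appear optional.
REQUIRED = {"local", "title", "author", "date", "tags"}

def check_entry(entry: dict) -> list:
    """Return a list of problems found in one blog-index entry."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED - entry.keys())]
    tags = entry.get("tags")
    if not isinstance(tags, list) or not tags:
        problems.append("tags must be a non-empty list")
    return problems

# One of the entries added in this commit, as a dict.
entry = {
    "local": "siglip2",
    "title": "SigLIP 2: A better multilingual vision language encoder",
    "author": "ariG23498",
    "thumbnail": "/blog/assets/siglip2/thumbnail.png",
    "date": "Feb 21, 2025",
    "tags": ["multimodal", "vision"],
}

print(check_entry(entry))  # → [] for a well-formed entry
```

A check like this catches the most common index mistakes (a missing `date`, an empty `tags` list) before the page build does.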
Binary files (not rendered):

(unnamed binary asset, 485 KB)
(binary file not shown)
(unnamed binary asset, 197 KB)
(unnamed binary asset, 197 KB)
assets/paligemma2/thumbnail.png (309 KB)
assets/siglip2/thumbnail.png (85.8 KB)
assets/smolvlm2/banner.png (205 KB)
assets/vid_ds_scripts/thumbnail.png (1.33 MB)

billion-classifications.md (+422; large diff not rendered by default)

dabstep.md (+1 −1)

@@ -8,7 +8,7 @@ authors:
   guest: True
 - user: frisokingma
   guest: True
-- user: andreu-adyen
+- user: andreumora
   guest: True
 - user: lvwerra
 - user: thomwolf

deepseek-r1-aws.md (+13 −1)

@@ -24,6 +24,7 @@ We collaborate with Amazon Web Services to make it easier for developers to depl
 Let’s review how you can deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.
 - [Deploy DeepSeek R1 models](#deploy-deepseek-r1-models)
 - [Deploy on AWS with Hugging Face Inference Endpoints](#deploy-on-aws-with-hugging-face-inference-endpoints)
+- [Deploy on Amazon Bedrock Marketplace](#deploy-on-amazon-bedrock-marketplace)
 - [Deploy on Amazon SageMaker AI with Hugging Face LLM DLCs](#deploy-on-amazon-sagemaker-ai-with-hugging-face-llm-dlcs)
 - [DeepSeek R1 on GPUs](#deepseek-r1-on-gpus)
 - [Distilled models on GPUs](#distilled-models-on-gpus)
@@ -48,6 +49,12 @@ You can find DeepSeek R1 and distilled models, as well as other popular open LLM

 | **Note:** The team is working on enabling DeepSeek models deployment on Inferentia instances. Stay tuned!

+### Deploy on Amazon Bedrock Marketplace
+
+You can deploy the DeepSeek distilled models on Amazon Bedrock via the Marketplace, which deploys an endpoint in Amazon SageMaker AI under the hood. Here is a video of how you can navigate through the AWS console:
+
+![bedrock-deployment.gif](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/deepseek-aws/bedrock-deployment.gif)
+
 ### Deploy on Amazon Sagemaker AI with Hugging Face LLM DLCs

 #### DeepSeek R1 on GPUs
@@ -56,7 +63,12 @@

 #### Distilled models on GPUs

-Let’s walk through the deployment of DeepSeek-R1-Distill-Llama-70B.
+You can deploy the DeepSeek distilled models on Amazon SageMaker AI with Hugging Face LLM DLCs, either through JumpStart directly or using the Python SageMaker SDK.
+Here is a video of how you can navigate through the AWS console:
+
+![jumpstart-deployment.gif](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/deepseek-aws/jumpstart-deployment.gif)
+
+Now that we have seen how to deploy using JumpStart, let’s walk through the Python SageMaker SDK deployment of DeepSeek-R1-Distill-Llama-70B.

 Code snippets are available on the model page under the Deploy button!
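The SDK walkthrough referenced in this diff boils down to pointing a Hugging Face LLM DLC at a model id. Here is a hedged sketch of the configuration such a deployment typically takes; everything except the model id is an illustrative assumption (instance type, GPU count, token limits), and the actual `sagemaker` calls are only named in comments so the snippet runs without AWS credentials.

```python
# Sketch of the configuration a Hugging Face LLM DLC (TGI) deployment
# takes on SageMaker. In an actual deployment this dict is passed as the
# `env` of sagemaker.huggingface.HuggingFaceModel, followed by
# .deploy(initial_instance_count=1, instance_type=...); those calls are
# kept in comments here so the sketch runs without AWS credentials.
hub_config = {
    "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-70B",  # model to serve
    "SM_NUM_GPUS": "8",          # shard the 70B weights across all GPUs on the instance
    "MAX_INPUT_LENGTH": "4096",  # prompt-length limit (assumption)
    "MAX_TOTAL_TOKENS": "8192",  # prompt + generation limit (assumption)
}

instance_type = "ml.g6.48xlarge"  # assumption: an 8-GPU instance

for key, value in hub_config.items():
    print(f"{key}={value}")
```

The exact values to use are shown in the code snippets on the model page under the Deploy button, as the diff notes.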

fireworks-ai.md (new file, +129)

---
title: "Welcome Fireworks.ai on the Hub 🎆"
thumbnail: /blog/assets/inference-providers/welcome-fireworks.jpg
authors:
- user: teofeliu
  guest: true
  org: fireworks-ai
- user: shaunak-fireworks
  guest: true
  org: fireworks-ai
- user: julien-c
---

Following our recent announcement on [Inference Providers on the Hub](https://huggingface.co/blog/inference-providers), we're thrilled to share that **Fireworks.ai** is now a supported Inference Provider on HF Hub!

[Fireworks.ai](https://fireworks.ai) delivers blazing-fast serverless inference directly on model pages, as well as throughout the whole HF ecosystem of libraries and tools, making it easier than ever to run inference on your favorite models.

<img src="https://huggingface.co/blog/assets/inference-providers/welcome-fireworks.jpg" alt="Fireworks.ai supported as Inference Provider on Hugging Face"/>

Among others, starting now, you can run serverless inference on the following models via Fireworks.ai:

- [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)
- [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3)
- [mistralai/Mistral-Small-24B-Instruct-2501](https://huggingface.co/mistralai/Mistral-Small-24B-Instruct-2501)
- [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct)
- [meta-llama/Llama-3.2-90B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-90B-Vision-Instruct)

and many more; you can find the full list [here](https://huggingface.co/models?inference_provider=fireworks-ai).

Light up your projects with Fireworks.ai today!

## How it works

### In the website UI

![Fireworks.ai inference provider UI](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/inference-providers/fireworks.png)

Search for all models supported by Fireworks on HF **[here](https://huggingface.co/models?inference_provider=fireworks-ai)**.

### From the client SDKs

#### From Python, using huggingface_hub

The following example shows how to use DeepSeek-R1 with Fireworks.ai as your inference provider. You can use a [Hugging Face token](https://huggingface.co/settings/tokens) for automatic routing through Hugging Face, or your own Fireworks.ai API key if you have one.

Install `huggingface_hub` from source:

```bash
pip install git+https://github.com/huggingface/huggingface_hub
```

Use the `huggingface_hub` python library to call Fireworks.ai endpoints by defining the `provider` parameter.

```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="fireworks-ai",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx"
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

print(completion.choices[0].message)
```

#### From JS, using @huggingface/inference

```js
import { HfInference } from "@huggingface/inference";

const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
    model: "deepseek-ai/DeepSeek-R1",
    messages: [
        {
            role: "user",
            content: "How to make extremely spicy Mayonnaise?"
        }
    ],
    provider: "fireworks-ai",
    max_tokens: 500
});

console.log(chatCompletion.choices[0].message);
```

### From HTTP calls

Here's how you can call Llama-3.3-70B-Instruct using Fireworks.ai as the inference provider via cURL.

```bash
curl 'https://router.huggingface.co/fireworks-ai/v1/chat/completions' \
  -H 'Authorization: Bearer xxxxxxxxxxxxxxxxxxxxxxxx' \
  -H 'Content-Type: application/json' \
  --data '{
    "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
    "messages": [
        {
            "role": "user",
            "content": "What is the meaning of life if you were a dog?"
        }
    ],
    "max_tokens": 500,
    "stream": false
}'
```
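If you prefer plain Python over cURL, the same routed request can be sketched with only the standard library. The snippet below just constructs the request (the bearer token is a placeholder); actually sending it is left as the commented-out `urlopen` call.

```python
import json
from urllib import request

# Build the same chat-completions request the cURL example sends.
# Router URL and model id are taken from the cURL example; the bearer
# token is a placeholder you must replace with your own.
ROUTER_URL = "https://router.huggingface.co/fireworks-ai/v1/chat/completions"

payload = {
    "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
    "messages": [
        {"role": "user", "content": "What is the meaning of life if you were a dog?"}
    ],
    "max_tokens": 500,
    "stream": False,
}

req = request.Request(
    ROUTER_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer xxxxxxxxxxxxxxxxxxxxxxxx",
        "Content-Type": "application/json",
    },
)

# Sending is left to the reader:
# response = request.urlopen(req)
print(req.get_full_url())
```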
## Billing

For direct requests, i.e. when you use a Fireworks key, you are billed directly on your Fireworks account.

For routed requests, i.e. when you authenticate via the Hub, you only pay the standard Fireworks API rates. There's no additional markup from us; we just pass the provider costs through directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

Important Note ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥

Subscribe to the [Hugging Face PRO plan](https://hf.co/subscribe/pro) to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.
