Update function-calling-support.mdx #4372
Merged
Commits (12)
- 3f0857a Update function-calling-support.mdx (fpagny)
- 9ceacc2 Create deepseek-r1-distill-llama-70b (fpagny)
- e3cd868 Create deepseek-r1-distill-llama-8b (fpagny)
- 7c60080 Create deepseek-r1-distill-llama-70b (fpagny)
- cc83b79 Rename deepseek-r1-distill-llama-70b to deepseek-r1-distill-llama-70b… (fpagny)
- 425fa4c Rename deepseek-r1-distill-llama-8b to deepseek-r1-distill-llama-8b.mdx (fpagny)
- 1543bdc Delete pages/managed-inference/reference-content/deepseek directory (fpagny)
- 8309d34 feat(ai): add pages to navigation (bene2k1)
- 0fa843c Update llama-3-8b-instruct.mdx (fpagny)
- d1662ee Update deepseek-r1-distill-llama-70b.mdx (fpagny)
- c5e4eeb Update deepseek-r1-distill-llama-8b.mdx (fpagny)
- 320e59f Apply suggestions from code review (bene2k1)
pages/managed-inference/reference-content/deepseek-r1-distill-llama-70b.mdx (81 additions, 0 deletions)
---
meta:
  title: Understanding the DeepSeek-R1-Distill-Llama-70B model
  description: Deploy your own secure DeepSeek-R1-Distill-Llama-70B model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the DeepSeek-R1-Distill-Llama-70B model
  paragraph: This page provides information on the DeepSeek-R1-Distill-Llama-70B model
tags:
dates:
  validation: 2025-02-06
  posted: 2025-02-06
categories:
  - ai-data
---
## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [DeepSeek](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |
| License | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) |
| Compatible Instances | H100-2 (BF16) |
| Context Length | up to 56k tokens |

## Model names

```bash
deepseek/deepseek-r1-distill-llama-70b:bf16
```
## Compatible Instances

| Instance type | Max context length |
| ------------- |-------------|
| H100-2 | 56k (BF16) |

## Model introduction

Released January 21, 2025, DeepSeek's R1-Distill-Llama-70B is a distilled model of the Llama family, fine-tuned on reasoning data generated by DeepSeek-R1.
DeepSeek-R1-Distill-Llama-70B is designed to improve the performance of Llama models on reasoning use cases such as mathematics and coding tasks.
## Why is it useful?

It is great to see DeepSeek improving open-weight models, and we are excited to fully support their mission with integration in the Scaleway ecosystem.

- DeepSeek-R1-Distill-Llama was optimized to reach accuracy close to DeepSeek-R1 on tasks such as mathematics and coding, while keeping inference costs low and token generation speed efficient.
- DeepSeek-R1-Distill-Llama supports a context window of up to 56K tokens and tool calling, making interaction with other components possible.
## How to use it

### Sending Managed Inference requests

To perform inference tasks with DeepSeek-R1-Distill-Llama-70B deployed at Scaleway, use the following command:

```bash
curl -s \
-H "Authorization: Bearer <IAM API key>" \
-H "Content-Type: application/json" \
--request POST \
--url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
--data '{"model":"deepseek/deepseek-r1-distill-llama-70b:bf16", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
```

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
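For application code, the same request can be sketched in Python. This is a minimal illustration rather than an official client: it builds the JSON body that the curl example sends (the endpoint follows the OpenAI-compatible `/v1/chat/completions` schema), and the deployment URL and API key remain the placeholders used above.

```python
import json

def build_chat_request(model, user_message, max_tokens=500, temperature=0.7):
    """Return the JSON body for a /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": False,
    }

body = build_chat_request(
    "deepseek/deepseek-r1-distill-llama-70b:bf16",
    "There is a llama in my garden, what should I do?",
)
# Serialize exactly as the curl --data argument does; send it with any
# HTTP client to https://<Deployment UUID>.ifr.fr-par.scaleway.com
payload = json.dumps(body)
print(payload)
```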
<Message type="note">
Ensure that the `messages` array is properly formatted with roles (user, assistant) and content.
</Message>

<Message type="tip">
This model is best used without a system prompt, as suggested by the model provider.
</Message>
### Receiving Inference responses

Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the LLM model based on the input provided in the request.
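As a sketch of that processing step, the snippet below extracts the assistant's text from a response. The sample response here is illustrative, not real output; its shape follows the OpenAI-compatible chat completion schema (`choices`, `message`, `usage`) that the endpoint above uses.

```python
# Illustrative sample only: real responses come from the deployment endpoint.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Stay calm and offer it some hay."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 18, "completion_tokens": 9, "total_tokens": 27},
}

def extract_answer(response):
    """Return the assistant text from the first choice."""
    return response["choices"][0]["message"]["content"]

answer = extract_answer(sample_response)
print(answer)
```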
<Message type="note">
Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
</Message>
pages/managed-inference/reference-content/deepseek-r1-distill-llama-8b.mdx (82 additions, 0 deletions)
---
meta:
  title: Understanding the DeepSeek-R1-Distill-Llama-8B model
  description: Deploy your own secure DeepSeek-R1-Distill-Llama-8B model with Scaleway Managed Inference. Privacy-focused, fully managed.
content:
  h1: Understanding the DeepSeek-R1-Distill-Llama-8B model
  paragraph: This page provides information on the DeepSeek-R1-Distill-Llama-8B model
tags:
dates:
  validation: 2025-02-06
  posted: 2025-02-06
categories:
  - ai-data
---
## Model overview

| Attribute | Details |
|-----------------|------------------------------------|
| Provider | [DeepSeek](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) |
| License | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) |
| Compatible Instances | L4, H100 (BF16) |
| Context Length | up to 131k tokens |

## Model names

```bash
deepseek/deepseek-r1-distill-llama-8b:bf16
```
## Compatible Instances

| Instance type | Max context length |
| ------------- |-------------|
| L4 | 39k (BF16) |
| H100 | 131k (BF16) |

## Model introduction

Released January 21, 2025, DeepSeek's R1-Distill-Llama-8B is a distilled model of the Llama family, fine-tuned on reasoning data generated by DeepSeek-R1.
DeepSeek-R1-Distill-Llama-8B is designed to improve the performance of Llama models on reasoning use cases such as mathematics and coding tasks.
## Why is it useful?

It is great to see DeepSeek improving open-weight models, and we are excited to fully support their mission with integration in the Scaleway ecosystem.

- DeepSeek-R1-Distill-Llama was optimized to reach accuracy close to DeepSeek-R1 on tasks such as mathematics and coding, while keeping inference costs low and token generation speed efficient.
- DeepSeek-R1-Distill-Llama supports a context window of up to 131K tokens and tool calling, making interaction with other components possible.
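The tool calling mentioned above can be sketched as follows. This is an illustration of declaring a tool in the OpenAI-compatible `tools` format used by the chat completions endpoint; the `get_weather` function is hypothetical, and per-model capabilities are documented in the function calling support page.

```python
import json

# Hypothetical tool declaration, for illustration only.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Request body combining a user message with the declared tool.
request_body = {
    "model": "deepseek/deepseek-r1-distill-llama-8b:bf16",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [get_weather_tool],
}
print(json.dumps(request_body, indent=2))
```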
## How to use it

### Sending Managed Inference requests

To perform inference tasks with DeepSeek-R1-Distill-Llama-8B deployed at Scaleway, use the following command:

```bash
curl -s \
-H "Authorization: Bearer <IAM API key>" \
-H "Content-Type: application/json" \
--request POST \
--url "https://<Deployment UUID>.ifr.fr-par.scaleway.com/v1/chat/completions" \
--data '{"model":"deepseek/deepseek-r1-distill-llama-8b:bf16", "messages":[{"role": "user","content": "There is a llama in my garden, what should I do?"}], "max_tokens": 500, "temperature": 0.7, "stream": false}'
```

Make sure to replace `<IAM API key>` and `<Deployment UUID>` with your actual [IAM API key](/iam/how-to/create-api-keys/) and the Deployment UUID you are targeting.
<Message type="note">
Ensure that the `messages` array is properly formatted with roles (user, assistant) and content.
</Message>

<Message type="tip">
This model is best used without a system prompt, as suggested by the model provider.
</Message>

### Receiving Inference responses
Upon sending the HTTP request to the public or private endpoints exposed by the server, you will receive inference responses from the Managed Inference server.
Process the output data according to your application's needs. The response will contain the output generated by the LLM model based on the input provided in the request.
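One processing step specific to R1-family models: they typically emit their chain of thought between `<think>` and `</think>` tags before the final answer. The sketch below separates the two so an application can display only the answer; the sample string is illustrative, and whether the tags appear depends on the model output.

```python
import re

def split_reasoning(content):
    """Return (reasoning, answer); reasoning is "" when no <think> block is present."""
    match = re.match(r"\s*<think>(.*?)</think>\s*(.*)", content, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", content.strip()

# Illustrative sample of R1-style output, not a real model response.
sample = "<think>The user has a llama. Llamas are docile.</think>Keep calm and call a local farm."
reasoning, answer = split_reasoning(sample)
print(answer)
```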
<Message type="note">
Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
</Message>