Commit d9d9030: fix(genapi): supported models (#4667)

* fix(genapi): supported models: Fix Qwen Coder maximum context size
* fix(genapi): qwen maximum context size
* fix(inference): qwen maximum context size

1 parent: 0cc6eb5

3 files changed: +6 −6 lines

pages/generative-apis/reference-content/integrating-generative-apis-with-popular-tools.mdx (+1 −1)

```diff
@@ -202,7 +202,7 @@ Zed is an IDE (Integrated Development Environment) including AI coding assistanc
       {
         "name": "qwen2.5-coder-32b-instruct",
         "display_name": "Qwen 2.5 Coder 32B",
-        "max_tokens": 128000
+        "max_tokens": 32000
       }
     ],
     "version": "1"
```

pages/generative-apis/reference-content/supported-models.mdx (+1 −1)

```diff
@@ -24,7 +24,7 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 | Meta | `llama-3.3-70b-instruct` | 131k | 4096 | [Llama 3.3 Community](https://www.llama.com/llama3_3/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | Meta | `llama-3.1-8b-instruct` | 128k | 16384 | [Llama 3.1 Community](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
 | Mistral | `mistral-nemo-instruct-2407` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
-| Qwen | `qwen2.5-coder-32b-instruct` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
+| Qwen | `qwen2.5-coder-32b-instruct` | 32k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
 | DeepSeek (Preview) | `deepseek-r1` | 20k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
 | DeepSeek (Preview) | `deepseek-r1-distill-llama-70b` | 32k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |
```
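The corrected row pins `qwen2.5-coder-32b-instruct` to a 32k context window alongside its 8192-token output cap. A minimal sketch (illustrative only; the helper name and limits dict are not part of the docs) of clamping a request's `max_tokens` so that prompt plus completion fit both limits:

```python
# Hypothetical helper: clamp a request's max_tokens so that
# prompt + completion stay within the model's context window
# and its output cap (values from the supported-models table).

QWEN_CODER_LIMITS = {
    "context_window": 32_000,    # corrected from 128k in this commit
    "max_output_tokens": 8_192,  # per the supported-models table
}

def clamp_max_tokens(prompt_tokens: int, requested: int,
                     limits: dict = QWEN_CODER_LIMITS) -> int:
    """Return the largest max_tokens that respects both the context
    window (prompt + completion) and the model's output cap."""
    remaining = limits["context_window"] - prompt_tokens
    return max(0, min(requested, remaining, limits["max_output_tokens"]))
```

With a 30k-token prompt, only 2k tokens of output fit the 32k window, so a request for 8192 tokens would be clamped to 2000.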

pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx (+4 −4)

```diff
@@ -20,7 +20,7 @@ categories:
 | Provider | [Qwen](https://qwenlm.github.io/) |
 | License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
 | Compatible Instances | H100, H100-2 (INT8) |
-| Context Length | up to 128k tokens |
+| Context Length | up to 32k tokens |
 
 ## Model names
 
@@ -32,8 +32,8 @@ qwen/qwen2.5-coder-32b-instruct:int8
 
 | Instance type | Max context length |
 | ------------- |-------------|
-| H100 | 128k (INT8)
-| H100-2 | 128k (INT8)
+| H100 | 32k (INT8)
+| H100-2 | 32k (INT8)
 
 ## Model introduction
 
@@ -75,4 +75,4 @@ Process the output data according to your application's needs. The response will
 
 <Message type="note">
 Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
-</Message>
+</Message>
```
