Commit d9d9030: fix(genapi): supported models (#4667)

* fix(genapi): supported models: Fix Qwen Coder maximum context size
* fix(genapi): qwen maximum context size
* fix(inference): qwen maximum context size

1 parent: 0cc6eb5

3 files changed: +6 −6 lines

pages/generative-apis/reference-content/integrating-generative-apis-with-popular-tools.mdx (+1 −1)

```diff
@@ -202,7 +202,7 @@ Zed is an IDE (Integrated Development Environment) including AI coding assistanc
       {
         "name": "qwen2.5-coder-32b-instruct",
         "display_name": "Qwen 2.5 Coder 32B",
-        "max_tokens": 128000
+        "max_tokens": 32000
       }
     ],
     "version": "1"
```

pages/generative-apis/reference-content/supported-models.mdx (+1 −1)

```diff
@@ -24,7 +24,7 @@ Our API supports the most popular models for [Chat](/generative-apis/how-to/quer
 | Meta | `llama-3.3-70b-instruct` | 131k | 4096 | [Llama 3.3 Community](https://www.llama.com/llama3_3/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct) |
 | Meta | `llama-3.1-8b-instruct` | 128k | 16384 | [Llama 3.1 Community](https://llama.meta.com/llama3_1/license/) | [HF](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) |
 | Mistral | `mistral-nemo-instruct-2407` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) |
-| Qwen | `qwen2.5-coder-32b-instruct` | 128k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
+| Qwen | `qwen2.5-coder-32b-instruct` | 32k | 8192 | [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) | [HF](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct) |
 | DeepSeek (Preview) | `deepseek-r1` | 20k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1) |
 | DeepSeek (Preview) | `deepseek-r1-distill-llama-70b` | 32k | 4096 | [MIT](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/mit.md) | [HF](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) |
```
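The corrected row pins `qwen2.5-coder-32b-instruct` to a 32k context window alongside its 8192-token output cap. A minimal sketch (illustrative only; the helper name and limits dict are not part of the docs) of clamping a request's `max_tokens` so that prompt plus completion fit both limits:

```python
# Hypothetical helper: clamp a request's max_tokens so that
# prompt + completion stay within the model's context window
# and its output cap (values from the supported-models table).

QWEN_CODER_LIMITS = {
    "context_window": 32_000,    # corrected from 128k in this commit
    "max_output_tokens": 8_192,  # per the supported-models table
}

def clamp_max_tokens(prompt_tokens: int, requested: int,
                     limits: dict = QWEN_CODER_LIMITS) -> int:
    """Return the largest max_tokens that respects both the context
    window (prompt + completion) and the model's output cap."""
    remaining = limits["context_window"] - prompt_tokens
    return max(0, min(requested, remaining, limits["max_output_tokens"]))
```

With a 30k-token prompt, only 2k tokens of output fit the 32k window, so a request for 8192 tokens would be clamped to 2000.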

pages/managed-inference/reference-content/qwen2.5-coder-32b-instruct.mdx (+4 −4)

```diff
@@ -20,7 +20,7 @@ categories:
 | Provider | [Qwen](https://qwenlm.github.io/) |
 | License | [Apache 2.0](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct/blob/main/LICENSE) |
 | Compatible Instances | H100, H100-2 (INT8) |
-| Context Length | up to 128k tokens |
+| Context Length | up to 32k tokens |
 
 ## Model names
 
@@ -32,8 +32,8 @@ qwen/qwen2.5-coder-32b-instruct:int8
 
 | Instance type | Max context length |
 | ------------- |-------------|
-| H100 | 128k (INT8)
-| H100-2 | 128k (INT8)
+| H100 | 32k (INT8)
+| H100-2 | 32k (INT8)
 
 ## Model introduction
 
@@ -75,4 +75,4 @@ Process the output data according to your application's needs. The response will
 
 <Message type="note">
 Despite efforts for accuracy, the possibility of generated text containing inaccuracies or [hallucinations](/managed-inference/concepts/#hallucinations) exists. Always verify the content generated independently.
-</Message>
+</Message>
```
