@@ -47,7 +45,7 @@ DeepSeek R1 Distill Llama 8B is designed to improve performance of Llama models
 It is great to see Deepseek improving open(weight) models, and we are excited to fully support their mission with integration in the Scaleway ecosystem.
 
 - DeepSeek-R1-Distill-Llama was optimized to reach accuracy close to Deepseek-R1 in tasks like mathematics and coding, while keeping inference costs limited and tokens speed efficient.
-- DeepSeek-R1-Distill-Llama supports a context window up to 32K tokens and tool calling, keeping interaction with other components possible.
+- DeepSeek-R1-Distill-Llama supports a context window up to 131K tokens and tool calling, keeping interaction with other components possible.