Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- Version b4391
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Add support for DeepSeek-V3:
https://huggingface.co/deepseek-ai/DeepSeek-V3
Currently not supported:
ERROR:hf-to-gguf:Model DeepseekV3ForCausalLM is not supported
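The error message suggests the converter picks a model class from the architecture name declared in the checkpoint's `config.json` and gives up when it has no match. Below is a minimal sketch of that kind of lookup. `SUPPORTED_ARCHITECTURES`, `check_architecture`, and `demo` are hypothetical names for illustration and are not the converter's actual code.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical registry for illustration only; the real converter keeps its own list.
SUPPORTED_ARCHITECTURES = {"DeepseekV2ForCausalLM", "LlamaForCausalLM"}


def check_architecture(model_dir: Path) -> str:
    """Return the architecture name from config.json, or raise if it is unknown."""
    config = json.loads((model_dir / "config.json").read_text(encoding="utf-8"))
    arch = config["architectures"][0]  # Hugging Face configs list the model class here
    if arch not in SUPPORTED_ARCHITECTURES:
        raise NotImplementedError(f"Model {arch} is not supported")
    return arch


def demo() -> str:
    """Write a stub config.json for DeepSeek-V3 and report what the check says."""
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp)
        stub = {"architectures": ["DeepseekV3ForCausalLM"]}
        (model_dir / "config.json").write_text(json.dumps(stub), encoding="utf-8")
        try:
            check_architecture(model_dir)
        except NotImplementedError as exc:
            return str(exc)
    return "supported"


if __name__ == "__main__":
    print(demo())
```

Under this assumption, adding support would mean registering `DeepseekV3ForCausalLM` and handling whatever tensors and hyperparameters differ from V2.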
Motivation
DeepSeek-V3 is a large MoE model with 685B parameters. Support would be very useful, because most systems will have to offload it to RAM to run it at all.
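To put the 685B figure in perspective, here is a rough back-of-the-envelope estimate of the memory needed for the weights alone. The bit widths are assumed, illustrative precisions rather than specific quantization formats, and the estimate ignores the KV cache, activations, and any per-block quantization overhead.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 685e9  # parameter count quoted in this issue

if __name__ == "__main__":
    for bits in (16, 8, 4):  # assumed illustrative precisions
        gib = weight_memory_gib(N_PARAMS, bits)
        print(f"{bits:>2} bits/weight: ~{gib:,.0f} GiB")
```

Even at 4 bits per weight this comes to over 300 GiB, which is far beyond the VRAM of typical consumer GPUs.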
Possible Implementation
There is no model card or technical report yet, so I don't know how much it differs from V2.
Edit: they have uploaded the model card and paper:
https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/README.md