Fixed typos and grammar #549

Open
wants to merge 1 commit into base: main

Conversation

Ankush1oo8

Fixed issue #456.
Fixed typos and grammar.

dinithaw commented Feb 3, 2025

Awesome ❤️

musvaage commented Feb 3, 2025

This should be closed.

DeepSeek_V3.pdf

DeepSeek-V3 Technical Report

Abstract

Furthermore, DeepSeek-V3 pioneers...

requires only 2.788M H800 GPU hours for its full...

did not experience any irrecoverable...

1. Introduction

arises from encouraging...

Knowledge Distillation...

DeepSeek-V3 and notably...


cf: #432

  > [!NOTE]
- > The total size of DeepSeek-V3 models on Hugging Face is 685B, which includes 671B of the Main Model weights and 14B of the Multi-Token Prediction (MTP) Module weights.**
+ > The total size of DeepSeek-V3 models on Hugging Face is 685B, which includes 671B of the Main Model weights and 14B of the Multi-Token Prediction (MTP) Module weights.

3 participants