llm_distil

Idea

Given the MiniLLM paper, I see no obstacle to building a general framework for Knowledge Distillation. The only limitation is that the two models should share the same tokenizer, and perhaps both need to have the same CausalLM denoising objective, but I think it would be enough that they both predict token by token. A sketch of what this looks like in practice is below.
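To make the shared-tokenizer requirement concrete, here is a minimal sketch of token-level distillation between two Hugging Face causal LMs. It uses plain forward KL between softened teacher and student distributions (not MiniLLM's reverse-KL objective); the GPT-2 model names and the temperature are placeholder assumptions, not anything this repo prescribes.

```python
# Minimal token-level distillation sketch between two causal LMs that share
# a tokenizer. Forward KL(teacher || student), not MiniLLM's reverse KL.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_name = "gpt2-large"  # assumption: any teacher causal LM
student_name = "gpt2"        # assumption: smaller student with the SAME tokenizer

tokenizer = AutoTokenizer.from_pretrained(teacher_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
teacher = AutoModelForCausalLM.from_pretrained(teacher_name).eval()
student = AutoModelForCausalLM.from_pretrained(student_name)

def distillation_loss(input_ids, attention_mask, temperature=2.0):
    """KL(teacher || student) per token, averaged over non-padding positions."""
    with torch.no_grad():
        t_logits = teacher(input_ids, attention_mask=attention_mask).logits
    s_logits = student(input_ids, attention_mask=attention_mask).logits

    # Soften both distributions with the same temperature.
    t_logprobs = F.log_softmax(t_logits / temperature, dim=-1)
    s_logprobs = F.log_softmax(s_logits / temperature, dim=-1)

    # Per-token KL over the vocabulary; the shared tokenizer means the two
    # logit tensors align index by index, which is what makes this work.
    kl = F.kl_div(s_logprobs, t_logprobs, log_target=True,
                  reduction="none").sum(-1)
    mask = attention_mask.float()
    return (kl * mask).sum() / mask.sum()

batch = tokenizer(["Knowledge distillation in one batch."],
                  return_tensors="pt", padding=True)
loss = distillation_loss(batch["input_ids"], batch["attention_mask"])
loss.backward()
```

Because the tokenizer is shared, every vocabulary index means the same thing to both models, so the token-by-token comparison above is well defined; models with different tokenizers would first need their distributions mapped onto a common vocabulary.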
