Support loading from model.safetensors.index.json
#6
LGTM! It would be good to add tests if possible.
I noticed that this index file is not covered anywhere in https://huggingface.co/docs/safetensors/index or the safetensors repo. Is it a Hugging Face-specific thing? Is there a documentation link we can point users to for it?
Added
The index.json is defined by Hugging Face in their Python binding for sharding the model weights and is not part of the safetensors format/spec. This should not be in this package (or at least, it should be named something like load_shard_safetensors instead of modifying load_safetensors).
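For readers unfamiliar with the index file: in the Hugging Face convention it is a JSON document whose "weight_map" entry maps each tensor name to the shard file that stores it. A minimal Python sketch of reading that map and merging all shards into one in-memory dict might look like the following. The function names shards_from_index and load_sharded are hypothetical, and the shard reader is passed in by the caller rather than taken from any particular library; this is not the code in this PR.

```python
import json
from collections import defaultdict
from pathlib import Path


def shards_from_index(index_path):
    """Group tensor names by the shard file that stores them.

    Assumes the Hugging Face index layout: a top-level "weight_map"
    object mapping tensor name -> shard file name.
    """
    index = json.loads(Path(index_path).read_text())
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


def load_sharded(index_path, load_shard):
    """Eager fallback: read every shard and merge the tensors into one dict.

    `load_shard` is a caller-supplied function (hypothetical here) that
    takes a shard path and returns a dict of tensor name -> tensor.
    Shard files are resolved relative to the index file's directory.
    """
    base = Path(index_path).parent
    tensors = {}
    for shard_file, names in shards_from_index(index_path).items():
        shard = load_shard(base / shard_file)  # each shard is read once
        for name in names:
            tensors[name] = shard[name]
    return tensors
```

Keeping the shard reader as a parameter leaves room for callers that want a different strategy, such as lazy or distributed loading, while the eager merge stays available as a default.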
Hi @chengchingwen, I agree this feature is not part of the spec. The reason is that in real-world cases, different packages may handle the shards differently (for example, loading them in a distributed fashion or running GC during loading). However, I think the modifications I made here provide a nice-to-have fallback (loading them all into memory). Regarding the naming issue, reusing the
It's also part of the reason that I think it should not be in this package, but I agree it would be convenient to have and is a reasonable default for loading sharded weights. Personally, the
Also, some unneeded files should be removed, e.g.
No description provided.