Showing 4 changed files with 254 additions and 62 deletions.
@@ -1,6 +1,7 @@
 *.bin
 *.gguf
 *.safetensors
+tools/llama.cpp/

 # Byte-compiled / optimized / DLL files
 __pycache__/
8d1fbbc
Yeah, this was being used to export FLUX models into various GGUF variants. Please keep supporting the quantization options.
https://civitai.com/articles/6730/flux-gguf
Your changes broke it.
@AbstractEyes This version does support both the normal and the K quants; it just involves applying the provided patch and using the resulting binary. The updated description has all the basic steps, and llama.cpp has an excellent guide on how to build it.
There's no way to support K quants with just the Python code, and duplicating all of that logic would be hard to maintain. It would also let people like the author of that article (which reeks of ChatGPT) easily bypass the key checks and create invalid quantizations (e.g. ones that include the VAE, or use the diffusers UNet), as well as meaningless quantizations of models like SDXL, which most likely barely benefits from them (something he has already attempted to do, lol).
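The basic steps can be sketched roughly as follows. This is a hedged outline, not the repo's exact instructions: the patch filename, model filenames, and paths are assumptions for illustration (the `tools/llama.cpp/` path matches the directory added to the ignore file above), so adjust them to whatever the repo's readme actually specifies:

```shell
# Clone llama.cpp into the ignored tools/ directory
git clone https://github.com/ggerganov/llama.cpp tools/llama.cpp
cd tools/llama.cpp

# Apply the provided patch (filename here is hypothetical; use the patch
# shipped with this repo)
git apply ../../lcpp.patch

# Build the quantization binary (recent llama.cpp names it llama-quantize)
cmake -B build
cmake --build build --config Release --target llama-quantize

# Quantize a converted GGUF model to a K quant type
# (input/output filenames are placeholders)
./build/bin/llama-quantize ../../flux1-dev-F16.gguf ../../flux1-dev-Q5_K_M.gguf Q5_K_M
```

The key point is that the K-quant kernels live in llama.cpp's C/C++ code, which is why the patched binary is needed rather than a pure-Python path.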
At the end of the day, people can do whatever they want; GGUF is just a storage format. This repo, however, will be reserved for the approach that follows the original spec as closely as an image model allows. I'm happy to help with any issues or problems anyone runs into. I've mostly been focusing on features rather than documentation for now, though I'll get around to making the readme more detailed eventually.
You could also ask the author of that article to update his guide on CivitAI; based on his most recent comment, he has apparently gotten K quants to work to at least some extent.