Hi, I have been testing different T5 text encoder variants, and I would very much like to convert and quantize them to llama.cpp GGUF files. After some trial and error, it seems that you added the T5 text encoder architecture to llama.cpp just as you did for the image models. Since the new T5 text encoder still needs to be loaded using your nodes, I think it will be easiest to just ask for the related scripts to convert and quantize the T5 text encoder, if you are willing to share them. Thank you in advance; I look forward to hearing from you.
Hi, T5 models are supported natively by llama.cpp, so the only support in this repo is the loader logic. The conversion was done with vanilla llama.cpp.
I believe I ran T5EncoderModel.from_pretrained on the original weights, then save_pretrained to write them to a folder as safetensors. From there, you can use llama.cpp's default convert_hf_to_gguf.py script, which should give you a valid file that the default llama-quantize binary can handle.
Ah, I see. In my case, I ported the Pile T5 layers to the T5 encoder and merged the two using SVD. It was an experiment, and I didn't expect it to work so well without fine-tuning, since Pile T5 is trained on different datasets and uses a different tokenizer. However, it turned out to unlock Flux a bit, as the filtering shield of T5 was removed.
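The exact SVD merge recipe isn't spelled out above, but one common approach to merging two weight matrices this way is to take the difference between a donor and a base weight, keep only its top-k singular directions, and add that low-rank update back to the base. A minimal sketch under that assumption (function name and parameters are hypothetical):

```python
# Hypothetical low-rank SVD merge: base + alpha * rank-k approx of (donor - base).
import numpy as np

def svd_merge(base, donor, rank=8, alpha=1.0):
    delta = donor - base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the top-`rank` singular directions of the weight difference.
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return base + alpha * low_rank

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))  # stand-in for a base T5 weight
b = rng.standard_normal((64, 64))  # stand-in for a donor (Pile T5) weight
merged = svd_merge(a, b, rank=64)  # full rank reproduces the donor exactly
assert np.allclose(merged, b)
```

Lower `rank` values keep only the strongest directions of the donor's deviation, which is one way to fold in another model's behavior without overwriting the base wholesale.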
I leveraged your convert.py to convert clip_g by adding the architecture (https://huggingface.co/Old-Fisherman/SDXL_Finetune_GGUF_Files/resolve/main/convert_g.py?download=true). But I have no idea how to patch it to make it work with llama-quantize. If I use the same method to convert T5 to an F16 GGUF, could you give me some guidance on how to run llama-quantize on it?