feat: setup tokenizer on the provider model config #656
Conversation
1. Pulls message conversion utils into the format module
2. Overhauls the provider tests, which were flaky in CI
3. Removes moderation references for now, while we investigate how to bring them in without the false positives
4. Removes cost tracking, as we don't want to keep up to date with the pricing details. We will track tokens instead
it was nice to have Ollama testing the format as an integration in CI, but those paths are already covered well by unit tests and the integration run increased test times significantly
crates/goose/build.rs
Outdated
"Xenova/llama3-tokenizer", | ||
"Xenova/gemma-2-tokenizer", | ||
"Qwen/Qwen2.5-Coder-32B-Instruct", | ||
]; |
not sure if it makes sense to embed 5 tokenizer files in the binary. maybe we just need 2 or 3? @baxen @michaelneale
big +1 let's remove all we can!
i just kept the gpt-4o and claude tokenizers and removed the other 3 (fallback is to download, so it's okay)
LGTM
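For context, here is a minimal sketch of what a trimmed embed list in crates/goose/build.rs could look like after this thread. The exact Hugging Face repo ids for the kept gpt-4o and claude tokenizers, the constant name, and the build-script steps are assumptions, not the actual goose build script:

```rust
// build.rs sketch (assumed names): embed only the two tokenizers kept in
// this thread; anything else falls back to a runtime download.
const EMBEDDED_TOKENIZERS: &[&str] = &[
    "Xenova/gpt-4o",           // assumed repo id for the gpt-4o tokenizer
    "Xenova/claude-tokenizer", // assumed repo id for the claude tokenizer
];

fn main() {
    // Re-run the build script when it changes.
    println!("cargo:rerun-if-changed=build.rs");
    for repo in EMBEDDED_TOKENIZERS {
        // Hypothetical step: copy a pre-fetched tokenizer.json for `repo`
        // into OUT_DIR so the crate can include_bytes! it at compile time.
        println!("cargo:warning=embedding tokenizer {repo}");
    }
}
```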
* origin/v1.0: fix: clean up providers (#650)
branched off this PR branch: #650 (will merge after that PR)
only one tokenizer gets used depending on the model, so we put the tokenizer name on the provider model config and do a best-effort mapping from model name to tokenizer name. this way we avoid loading tokenizers that aren't used
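A minimal sketch of that model-to-tokenizer mapping, assuming a `ModelConfig` type and simple prefix matching; the names and fallback choice are illustrative, not the actual goose API:

```rust
/// Sketch (assumed names): the tokenizer name lives on the provider's model
/// config, and we map the model name to a tokenizer on a best-effort basis.
pub struct ModelConfig {
    pub model_name: String,
    pub tokenizer_name: String,
}

impl ModelConfig {
    pub fn new(model_name: &str) -> Self {
        // Best-effort mapping; unknown models fall back to the gpt-4o tokenizer.
        let tokenizer_name = if model_name.starts_with("claude") {
            "Xenova/claude-tokenizer"
        } else if model_name.starts_with("qwen") {
            "Qwen/Qwen2.5-Coder-32B-Instruct"
        } else {
            "Xenova/gpt-4o"
        };
        Self {
            model_name: model_name.to_string(),
            tokenizer_name: tokenizer_name.to_string(),
        }
    }
}

fn main() {
    // Only the tokenizer mapped for the configured model ever gets loaded.
    let config = ModelConfig::new("claude-3-5-sonnet");
    println!("{} -> {}", config.model_name, config.tokenizer_name);
}
```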