Release v0.21.0 · keras-team/keras-hub

Summary

New Models.
- Xception: Added Xception architecture for image classification tasks.
- Qwen: Added Qwen2.5 large language models, with base and instruction-tuned presets ranging from 0.5 to 72 billion parameters.
- Qwen MoE: Added Qwen MoE, a transformer-based, decoder-only Mixture of Experts (MoE) language model; the base variant has 2.7B activated parameters at runtime.
- Mixtral: Added Mixtral LLM, a generative Sparse Mixture of Experts model, with pretrained and instruction-tuned presets having 7 billion activated parameters.
- Moonshine: Added Moonshine, a speech recognition task model.
- CSPNet: Added Cross Stage Partial Network (CSPNet) classification task model.
- Llama3: Added support for Llama 3.1 and 3.2.
Added sharded weight support to KerasPresetSaver and KerasPresetLoader, defaulting to a 10GB maximum shard size.
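
To illustrate what a maximum shard size means in practice, here is a minimal sketch of greedy sharding in plain Python. It is not the KerasPresetSaver implementation: the function name `plan_shards`, the weight names, the sizes, and the choice of 1 GB = 1024**3 bytes are all assumptions made for illustration.

```python
GB = 1024 ** 3  # assumption: binary gigabytes; the library may count differently


def plan_shards(weight_sizes, max_shard_bytes=10 * GB):
    """Greedily group named weights into shards of at most max_shard_bytes.

    weight_sizes maps a (hypothetical) weight name to its size in bytes.
    A single weight larger than the limit is placed alone in its own shard.
    """
    shards = []
    current_names = []
    current_bytes = 0
    for name, size in weight_sizes.items():
        # Start a new shard when adding this weight would exceed the limit.
        if current_names and current_bytes + size > max_shard_bytes:
            shards.append(current_names)
            current_names = []
            current_bytes = 0
        current_names.append(name)
        current_bytes += size
    if current_names:
        shards.append(current_names)
    return shards


# Hypothetical weights totalling 17 GB, split under a 10 GB limit.
example_sizes = {
    "embedding": 6 * GB,
    "block_0": 5 * GB,
    "block_1": 4 * GB,
    "head": 2 * GB,
}
print(plan_shards(example_sizes))
```

Under these assumptions the 17 GB of weights end up in three files, none larger than 10 GB, which is the property a shard-size limit is meant to guarantee.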

What's Changed

Fix Roformer export symbol by @abheesht17 in #2199
Bump up master version to 0.21 by @abheesht17 in #2204
reenable test by @mattdangerw in #2188
Add xception model by @mattdangerw in #2179
Make image converter built by @mattdangerw in #2206
Qwen - Fix Preset Loader + Add Causal LM Test by @kanpuriyanawab in #2193
Update Qwen conversion script by @laxmareddyp in #2207
Revert "Do not export Qwen for release" by @sachinprasadhs in #2208
Fixes compute_output_shape for PaliGemmaVitEncoder and Gemma3VisionEncoderBlock by @JyotinderSingh in #2210
Python 3.12 fix by @mattdangerw in #2211
Small Gemma3 doc-string edits by @abheesht17 in #2214
Llama3.1 by @pctablet505 in #2132
Update gemma3_causal_lm_preprocessor.py by @pctablet505 in #2217
fix: apply weights_only = True by @b8zhong in #2215
Fix the keras_hub package for typecheckers and IDEs by @mattdangerw in #2222
Add utility to map COCO IDs to class names by @mattdangerw in #2219
Set GPU timeouts to 2 hours by @mattdangerw in #2226
Fix nightly by @mattdangerw in #2227
Another fix for nightly builds by @mattdangerw in #2229
Cast a few more input to tensors in SD3 by @mattdangerw in #2234
Fix up package build scripts again by @mattdangerw in #2230
Add qwen presets by @laxmareddyp in #2241
script for converting retinanet weights from torchvision by @sineeli in #2233
Sharded weights support by @james77777778 in #2218
Add Qwen Moe by @kanpuriyanawab in #2163
Add Mixtral by @kanpuriyanawab in #2196
Made label data optional for inference and adopted other required changes by @laxmareddyp in #2183
Fix the layer names by @kanpuriyanawab in #2247
Add new CSPNet preset and add manual padding. by @sachinprasadhs in #2212
Update the int8 quant logic in ReversibleEmbedding by @james77777778 in #2250
Add Moonshine to KerasHub by @harshaljanjani in #2093
Add Kaggle handle for moonshine presets by @laxmareddyp in #2253
Update requirements-jax-cuda.txt by @pctablet505 in #2252
Add Mixtral, Qwen-MoE presets and Update conversion script. by @laxmareddyp in #2248
fix flash attention test by @divyashreepathihalli in #2263
Fix JAX bugs for qwen moe & mixtral by @kanpuriyanawab in #2258
Create pull_request_template.md by @sachinprasadhs in #2262
Update preset versions for sharded models by @laxmareddyp in #2264
Add AudioToText and AudioToTextPreprocessor class stubs to enable auto class functionality by @harshaljanjani in #2265
register moonshine presets by @sachinprasadhs in #2267
Version bump 0.21.0.dev1 by @laxmareddyp in #2273
Version bump to 0.21.0 by @laxmareddyp in #2275

New Contributors

@JyotinderSingh made their first contribution in #2210
@pctablet505 made their first contribution in #2132
@b8zhong made their first contribution in #2215

Full Changelog: v0.20.0...v0.21.0
