Support various Whisper model with Metal backend #113

seyeong-han · 2025-11-13T00:23:23Z

Summary

This PR lets the export step target any Whisper model variant and simplifies the runtime interface by removing the model_name argument from the runner.

Changes

Export Scripts Enhancement

Added model_name argument to export.sh
- Allows specifying any HuggingFace Whisper model (tiny, base, small, medium, large, large-v3, large-v3-turbo)
- Defaults to openai/whisper-large-v3-turbo if not specified
Automatic feature size detection based on model variant
- Uses 128 mel features for large-v3/large-v3-turbo models
- Uses 80 mel features for all other models
- Prevents tensor shape mismatch errors by correctly configuring the preprocessor
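
The selection rule above can be sketched in a few lines. This is only an illustration in Python, not the code in export.sh: the helper name `feature_size_for` and the prefix-stripping logic are assumptions, while the default model and the 128/80 split come from the description above.

```python
DEFAULT_MODEL = "openai/whisper-large-v3-turbo"  # default stated in this PR

# Variants the description lists as using 128 mel features.
LARGE_V3_VARIANTS = ("large-v3", "large-v3-turbo")


def feature_size_for(model_name: str = DEFAULT_MODEL) -> int:
    """Return the mel feature count for a Whisper model name (hypothetical helper)."""
    # Drop an optional "org/" prefix, then an optional "whisper-" prefix.
    variant = model_name.rsplit("/", 1)[-1].lower()
    if variant.startswith("whisper-"):
        variant = variant[len("whisper-"):]
    # Every variant outside the large-v3 family falls back to 80.
    return 128 if variant in LARGE_V3_VARIANTS else 80
```

Passing the wrong value here is what produces the tensor shape mismatch mentioned above, since the preprocessor output must match the encoder's expected input.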

Runtime Simplification

Removed model_name argument from run.sh and main.cpp
- Hardcoded decoder_start_token_id=50258 for all models
- Fixes a tokenizer compatibility issue: all Whisper models from HuggingFace now use the v3 tokenizer format
- Eliminates confusion about which model name to pass at runtime
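
As a rough illustration of what the hardcoded value means at runtime, the decoder can be seeded with the same start token regardless of which model was exported. The sketch below is Python rather than the C++ in main.cpp, and `initial_decoder_input` is a hypothetical helper, not a function from the runner.

```python
DECODER_START_TOKEN_ID = 50258  # value hardcoded by this PR for all models


def initial_decoder_input(batch_size: int = 1) -> list[list[int]]:
    """Hypothetical helper: seed each sequence in the batch with the start token."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # One single-token sequence per batch entry; decoding appends from here.
    return [[DECODER_START_TOKEN_ID] for _ in range(batch_size)]
```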

E2E Script Updates

Updated e2e.sh to support --model-name flag during export
Simplified run step to no longer pass model name
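
The resulting interface, with a model name accepted at export time and none at run time, might look like the following. This is a minimal sketch in Python using `argparse`; the subcommand layout and the `build_parser` name are assumptions for illustration, and only the `--model-name` flag and its default come from this PR.

```python
import argparse

DEFAULT_MODEL = "openai/whisper-large-v3-turbo"  # default stated in this PR


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI mirroring the export/run split described above."""
    parser = argparse.ArgumentParser(prog="e2e")
    steps = parser.add_subparsers(dest="step", required=True)

    # Export accepts the model name and falls back to the default.
    export = steps.add_parser("export")
    export.add_argument("--model-name", default=DEFAULT_MODEL)

    # Run deliberately takes no model name.
    steps.add_parser("run")
    return parser
```

With this layout, `build_parser().parse_args(["export", "--model-name", "openai/whisper-tiny"])` selects the tiny model, while the `run` step has no model-related option to get wrong.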

Documentation

Comprehensive README updates with model comparison table
Added examples for different model variants
Documented mel features and tokenizer configuration

Why These Changes?

Export Flexibility: Users can now easily export any Whisper model variant without modifying scripts
Correct Preprocessing: Automatic feature size detection ensures the preprocessor matches the model's requirements
Tokenizer Fix: All HuggingFace Whisper models now use the updated tokenizer format (token 50257 = <|endoftext|>, token 50258 = <|startoftranscript|>), so hardcoding 50258 works universally
Simplified UX: Removing the model_name runtime argument reduces user confusion and potential errors

Testing

Tested with whisper-tiny and whisper-large-v3-turbo models to verify correct transcription output.

cc. @manuelcandales

seyeong-han added 4 commits November 12, 2025 16:18

fix: rm model_name, executorch #15798

e9f80b4

feat: add model_name param to support various whisper models

fed9527

feat: support standard model's FEATURE_SIZE with 80

88401de

docs: update model_name requirement, different FEATURE_SIZE and various model support

968c34c

meta-cla bot added the CLA Signed label Nov 13, 2025

seyeong-han changed the title ~~docs: update model_name requirement, different FEATURE_SIZE and various model support~~ Support various Whisper model with Metal backend Nov 13, 2025

manuelcandales approved these changes Nov 13, 2025

manuelcandales merged commit 58e8f3a into meta-pytorch:main Nov 15, 2025
1 check passed
