
feat: [EXPERIMENTAL FEATURE] Add ASR Support to Nemo Automodel #1263

Open

rylativity wants to merge 12 commits into NVIDIA-NeMo:main from rylativity:asr

Conversation

@rylativity

What does this PR do?

[EXPERIMENTAL FEATURE] Adds comprehensive ASR support for Whisper (5 variants) and Parakeet CTC (2 variants), with distributed training and PEFT, and lays the groundwork for incorporating additional ASR models into NeMo Automodel.

Changelog

New Model Support:

  • Add NeMoAutoModelForSpeechSeq2Seq for encoder-decoder ASR models (Whisper family: tiny/base/small/medium/large-v3)
  • Add NeMoAutoModelForCTC for CTC-based ASR models (Parakeet CTC: 0.6B/1.1B)
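As a rough illustration of how these two auto classes partition the supported checkpoints (the class names come from this PR, but the selection helper below is hypothetical, not the repo's actual dispatch logic):

```python
# Sketch only: class names are from the PR; this helper is illustrative.
ASR_AUTOCLASS_BY_ARCH = {
    "whisper": "NeMoAutoModelForSpeechSeq2Seq",  # encoder-decoder (seq2seq)
    "parakeet_ctc": "NeMoAutoModelForCTC",       # CTC head
}

def pick_asr_autoclass(model_id: str) -> str:
    """Map a checkpoint id to the auto class that would load it (illustrative)."""
    mid = model_id.lower()
    if "whisper" in mid:
        return ASR_AUTOCLASS_BY_ARCH["whisper"]
    if "parakeet" in mid and "ctc" in mid:
        return ASR_AUTOCLASS_BY_ARCH["parakeet_ctc"]
    raise ValueError(f"Unsupported ASR checkpoint: {model_id}")

print(pick_asr_autoclass("openai/whisper-large-v3"))   # NeMoAutoModelForSpeechSeq2Seq
print(pick_asr_autoclass("nvidia/parakeet-ctc-1.1b"))  # NeMoAutoModelForCTC
```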

New Components:

  • Add ASR dataset component with LibriSpeech, Common Voice, and custom dataset loaders (nemo_automodel/components/datasets/asr/)
  • Add processor-specific collate functions with automatic mel-spectrogram extraction and tokenization (nemo_automodel/components/datasets/asr/collate_fns.py)
  • Implement collate function registry for automatic processor selection
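The registry idea above can be sketched as a processor-keyed mapping; every name below is illustrative, not the actual API of collate_fns.py:

```python
# Hypothetical sketch of a processor-keyed collate-function registry.
from typing import Any, Callable, Dict, List

COLLATE_REGISTRY: Dict[str, Callable] = {}

def register_collate(processor_type: str):
    """Decorator mapping a processor class name to a collate function."""
    def deco(fn: Callable) -> Callable:
        COLLATE_REGISTRY[processor_type] = fn
        return fn
    return deco

@register_collate("WhisperProcessor")
def whisper_collate(batch: List[Any]) -> Dict[str, int]:
    # the real version would extract log-mel features and tokenize transcripts
    return {"n_samples": len(batch)}

def get_collate_fn(processor: Any) -> Callable:
    """Select the collate function automatically from the processor's type."""
    name = type(processor).__name__
    if name not in COLLATE_REGISTRY:
        raise KeyError(f"No collate function registered for {name}")
    return COLLATE_REGISTRY[name]

class WhisperProcessor:  # stand-in for transformers.WhisperProcessor
    pass

fn = get_collate_fn(WhisperProcessor())
print(fn([{"audio": None}, {"audio": None}]))  # {'n_samples': 2}
```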

New Recipe:

  • Add ASR fine-tuning recipe with support for both CTC and Seq2Seq loss computation (nemo_automodel/recipes/asr/finetune.py)
  • Implement validation loop with loss tracking and metrics logging
  • Add pipeline parallelism support via AutoPipeline
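The validation loop's loss tracking can be sketched as a running average over batches; the batch format and "loss" field below are stand-ins, not the recipe's actual API:

```python
# Minimal sketch of validation-loop loss tracking (illustrative names).
from typing import Dict, Iterable

def run_validation(batches: Iterable[Dict[str, float]]) -> float:
    """Accumulate per-batch loss and return the running average."""
    total, count = 0.0, 0
    for batch in batches:
        # a real loop would run a forward pass here (CTC or seq2seq loss)
        total += batch["loss"]
        count += 1
    avg = total / max(count, 1)
    print(f"val_loss: {avg:.4f}")  # stand-in for metrics logging
    return avg

run_validation([{"loss": 2.0}, {"loss": 1.0}])  # val_loss: 1.5000
```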

Example Configurations:

  • Add 8 YAML configs for Whisper and Parakeet models with full and PEFT fine-tuning examples
  • Add finetune.py entry point script for ASR examples (examples/asr_finetune/finetune.py)
  • Include distributed training configurations with device mesh setup

Testing:

  • Add 4 functional tests covering Whisper and Parakeet fine-tuning (full and PEFT) (tests/functional_tests/asr_finetune/)
  • Add comprehensive unit tests for dataset loaders and collate functions (tests/unit_tests/datasets/asr/)
  • Include pytest test class with parameterized model/PEFT configurations
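The four functional tests pair each model family with a tuning mode; this sketch enumerates the same grid, with itertools standing in for pytest.mark.parametrize and illustrative model ids:

```python
# Illustrative parameter grid for the functional tests (names are assumptions).
import itertools

MODELS = ["whisper-small", "parakeet-ctc-0.6b"]
MODES = ["full", "peft"]

cases = list(itertools.product(MODELS, MODES))
print(len(cases))  # 4, matching the four functional tests
for model, mode in cases:
    print(f"finetune[{model}-{mode}]")
```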

Documentation:

  • Add comprehensive README for ASR fine-tuning with quick start examples, PEFT guide, and troubleshooting (examples/asr_finetune/README.md)
  • Update root README with ASR examples and usage
  • Add inline documentation for ASR model classes and dataset utilities

Dependencies:

  • Add librosa and torchcodec as ASR extras in pyproject.toml
  • Update Docker build with ASR-specific dependencies

Other:

  • Update model exports in nemo_automodel/__init__.py and _transformers/__init__.py
  • Ensure component independence (no cross-component imports, verified by lint-imports)
  • Add copyright year 2026 across new files

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Linting/formatting passed
  • Commits DCO signed
  • Confirmed documentation builds successfully

Ryan Stewart added 8 commits February 12, 2026 15:31
…eech dataset

Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
…xample config

Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

…set and model functionality

Signed-off-by: Ryan Stewart <rystewart@nvidia.com>
@rylativity rylativity requested a review from jgerh as a code owner February 12, 2026 21:45
@rylativity rylativity changed the title [EXPERIMENTAL FEATURE] Add ASR Support to Nemo Automodel feat: [EXPERIMENTAL FEATURE] Add ASR Support to Nemo Automodel Feb 13, 2026
@chtruong814 chtruong814 added the needs-follow-up Issue needs follow-up label Feb 14, 2026
@jgerh (Contributor) left a comment

Completed tech pubs review and provided a few copyedits.

git submodule init && git submodule update && \
pip install nvidia-mathdx==25.1.1 && \
env NVTE_CUDA_ARCHS="80;90;100;120" NVTE_BUILD_THREADS_PER_JOB=8 pip install --no-cache-dir --no-build-isolation -v . && \
uv pip install nvidia-mathdx==25.1.1 && \
Contributor


Hi @rylativity, I'm not sure about this one.

@thomasdhc can you provide guidance?

@thomasdhc (Contributor) commented Feb 19, 2026

Please remove all instances of the uv pip install changes here; pip is used by design.

# NeMoAutoModel handles infrastructure internally
model = cfg_model.instantiate(**kwargs)
else:
raise ValueError(
Contributor

ok but this won't allow anyone to bring their own model
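One hedged way to keep bring-your-own-model support is to fall back to a dotted _target_ path when the config does not name a known auto class; the config shape and function name below are hypothetical, not the repo's actual config machinery:

```python
# Hypothetical fallback: import and construct whatever class _target_ names.
import importlib
from typing import Any, Dict

def instantiate_model(cfg: Dict[str, Any], **kwargs: Any) -> Any:
    """Instantiate the class named by the config's dotted _target_ path."""
    target = cfg.get("_target_")
    if target is None:
        raise ValueError("model config must define _target_")
    module_name, _, attr = target.rpartition(".")
    cls = getattr(importlib.import_module(module_name), attr)
    return cls(**kwargs)

# Works for any importable class, not just the registered auto classes:
counter = instantiate_model({"_target_": "collections.Counter"}, a=2)
print(counter)  # Counter({'a': 2})
```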


# Build pipeline config if PP enabled
self.pipeline_config = None
if self.pp_enabled:
Contributor

Do we need PP for ASR models? I would review the models we want to support; if they're <40B, I'd skip PP to simplify the train loop.

)

train_ctx, batch = make_cp_batch_and_ctx(self.device_mesh, batch, labels)
with train_ctx():
Contributor

I would rename this to something else, since train_ctx is inside a validation function.

Also, why not use _forward_backward_step here?

@akoumpa (Contributor) commented Feb 19, 2026

Thanks a lot @rylativity for adding this feature!

Since this is a new model category, I'd like your help with adding a bit more testing. For example, for the dataset and data preprocessing, can we add a functional test with a cached dataset and a cached preprocessor to ensure data is correctly transformed? I also see a few functional tests for fine-tuning; if it's not too much trouble, would you mind adding some loss-matching tests too (to ensure we avoid convergence regressions over time)?

We also have the ability to add longer-running tests to our nightly suite, so I would encourage adding coverage there as well once this PR is merged.

Next steps:

  • Evaluate whether we want to keep pipeline parallelism or not, and proceed accordingly (simpler is better if functionality is not affected).
  • Please consult with @thomasdhc whether the Dockerfile changes are ok.
  • Please include additional dataset, data-preprocessing, and loss-reproducibility tests.
  • Please ping me when ready.
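The loss-reproducibility check requested above can be sketched as comparing a run's losses against stored golden values within a tolerance; the golden numbers below are invented placeholders, not measured baselines:

```python
# Hedged sketch of a loss-matching regression test (golden values invented).
GOLDEN_LOSSES = [3.21, 2.87, 2.64]

def losses_match(observed, golden=GOLDEN_LOSSES, rtol=0.05):
    """True if every observed loss is within rtol of its golden value."""
    return all(abs(o - g) <= rtol * abs(g) for o, g in zip(observed, golden))

print(losses_match([3.20, 2.90, 2.60]))  # True: within 5% of each golden value
print(losses_match([4.00, 2.90, 2.60]))  # False: first loss regressed
```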

akoumpa and others added 3 commits February 18, 2026 23:40
Co-authored-by: jgerh <163925524+jgerh@users.noreply.github.com>
Co-authored-by: jgerh <163925524+jgerh@users.noreply.github.com>
Co-authored-by: jgerh <163925524+jgerh@users.noreply.github.com>
@akoumpa (Contributor) commented Feb 19, 2026

/ok to test 74bc2db


Development

Successfully merging this pull request may close these issues.

plan to support Qwen3-ASR ?
