Add support for Moonshine ONNX export (& seq2seq models with non-legacy cache & Tensor.repeat_interleave) #2162
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Edit: Fixed ✅ (current failing tests unrelated)
# TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
PatchingSpec(torch.Tensor, "__len__", lambda x: x.shape[0], torch.Tensor.__len__),
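The idea behind this patch can be sketched as a minimal, self-contained example: temporarily replacing `torch.Tensor.__len__` so that `len(tensor)` resolves through `tensor.shape[0]` during tracing, which avoids the TracerWarning. The context-manager form below is illustrative only; it is not Optimum's actual `PatchingSpec` machinery.

```python
from contextlib import contextmanager

import torch


@contextmanager
def patch_tensor_len():
    """Illustrative sketch: patch Tensor.__len__ to use shape[0] while tracing.

    This mimics what a PatchingSpec-style patch does; the real implementation
    in the PR applies and reverts the patch around the export call.
    """
    original_len = torch.Tensor.__len__
    torch.Tensor.__len__ = lambda x: x.shape[0]
    try:
        yield
    finally:
        # Always restore the original method, even if export raises.
        torch.Tensor.__len__ = original_len


with patch_tensor_len():
    t = torch.zeros(3, 4)
    print(len(t))  # resolved via t.shape[0]
```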
GREAT! Thanks for omitting this!
@@ -239,7 +283,7 @@ def patched_forward(*args, **kwargs):
     # contains the output names of the model. In the case of Timm classification models, the output
     # is of type tensor. By default, it is assumed that the output names mentioned in the ONNX config
     # match the outputs in order.
-    filterd_outputs = {}
+    filtered_outputs = {}
Nice catch! I'm embarrassed by the number of times I've modified this file without seeing this x)
LGTM
What does this PR do?
This PR does the following:
- Adds support for Moonshine ONNX export.
- Adds support for seq2seq models with a non-legacy cache.
- Adds a patch for models using torch.Tensor.repeat_interleave, which are unable to export due to a bug in PyTorch: torch.onnx.export (dynamo=False) fails with an uninformative error when exporting apply_rotary_pos_emb/repeat_interleave (pytorch/pytorch#145100). Note that this bug most likely won't be fixed, as the PyTorch team transitions to the new dynamo-based exporter.

Fixes # (issue)
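The usual workaround for export-unfriendly `repeat_interleave` calls (with a scalar repeat count) is to rewrite them using `unsqueeze`/`expand`/`reshape`, which the legacy tracer handles cleanly. The helper below is a hedged sketch of that rewrite, not the exact patch this PR applies.

```python
import torch


def repeat_interleave_alt(x: torch.Tensor, repeats: int, dim: int) -> torch.Tensor:
    """Export-friendly equivalent of torch.repeat_interleave(x, repeats, dim=dim)
    for a scalar `repeats` (illustrative sketch, not the PR's actual patch)."""
    # Insert a singleton dimension right after `dim`, then broadcast it
    # to `repeats` copies and fold it back into `dim`.
    x = x.unsqueeze(dim + 1)
    expand_shape = list(x.shape)
    expand_shape[dim + 1] = repeats
    x = x.expand(*expand_shape)
    out_shape = list(x.shape)
    del out_shape[dim + 1]
    out_shape[dim] *= repeats
    return x.reshape(out_shape)


x = torch.arange(6).reshape(2, 3)
print(repeat_interleave_alt(x, 2, 0))  # each row repeated twice along dim 0
```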
Before submitting
Who can review?