Update whisper transformer module to 4.48.0 #24382

jchen351 · 2025-04-10T17:28:13Z

Description

Motivation and Context

Branched off from #24291


onnxruntime/python/tools/transformers/models/llama/llama_inputs.py

@@ -7,6 +7,7 @@

 import numpy as np
 import torch
+import transformers


This file imports the same module in two ways: once with `import transformers` and once with `from transformers import AutoConfig, AutoTokenizer`. The best fix is to remove the `from transformers import AutoConfig, AutoTokenizer` statement and reference `transformers.AutoConfig` and `transformers.AutoTokenizer` directly in the code. This keeps the existing functionality while eliminating the confusion caused by the dual import.

Remove the `from transformers import AutoConfig, AutoTokenizer` statement.

Replace all instances of `AutoConfig` and `AutoTokenizer` with `transformers.AutoConfig` and `transformers.AutoTokenizer`, respectively.
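To make the dual-import pattern concrete, here is a minimal sketch of how such a check could be written with the standard-library `ast` module. It is not the code-scanning query that raised the notice; the helper name `find_dual_imports` and the sample strings are assumptions made for illustration, and the check only matches exact module names.

```python
import ast


def find_dual_imports(source: str) -> set[str]:
    """Return module names imported both as `import X` and `from X import ...`.

    Hypothetical helper for illustration; it compares exact module names only.
    """
    plain, from_style = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import transformers` -> records "transformers"
            plain.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from transformers import AutoConfig` -> records "transformers"
            from_style.add(node.module)
    return plain & from_style


# Sample sources modelled on the change described above.
before = "import transformers\nfrom transformers import AutoConfig, AutoTokenizer\n"
after = "import transformers\n"

print(find_dual_imports(before))  # {'transformers'}
print(find_dual_imports(after))   # set()
```

Once the `from ... import` line is removed, the check reports nothing, which is the state the suggested fix aims for.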

onnxruntime/python/tools/transformers/models/llama/llama_parity.py

 import torch
+import transformers


To fix the problem, we should remove the `from transformers import AutoConfig` statement and use `transformers.AutoConfig` instead. The module is then imported in one style only, which reduces confusion and potential namespace conflicts.

Remove the `from transformers import AutoConfig` statement.

Replace all instances of `AutoConfig` with `transformers.AutoConfig`.

onnxruntime/python/tools/transformers/models/torch_export_patches/patches/patch_torch.py

+def _catch_produce_guards_and_solve_constraints(
+    previous_function: Callable,
+    fake_mode: "FakeTensorMode",
+    gm: "torch.fx.GraphModule",
+    dynamic_shapes: dict[str, Any] | tuple[Any] | list[Any] | None,
+    equalities_inputs: "EqualityConstraint",  # noqa: F821
+    original_signature: inspect.Signature,
+    _is_torch_jit_trace: bool = False,
+    verbose: int = 0,
+):


To fix the problem, we need to add an explicit return statement at the end of the _catch_produce_guards_and_solve_constraints function. This will ensure that the function always returns a value, even when an exception is caught and the if conditions are not met. The explicit return statement should return None to maintain the existing functionality.
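The shape of this fix can be sketched with a small stand-in function. The sketch assumes the patched function returns the wrapped call's result inside `try` and falls through after handling an exception; the name `call_and_report` and its body are hypothetical and are not the code in patch_torch.py.

```python
from typing import Any, Callable, Optional


def call_and_report(
    previous_function: Callable[..., Any], *args: Any, verbose: int = 0
) -> Optional[Any]:
    """Hypothetical wrapper: call previous_function and tolerate failures."""
    try:
        # Explicit return on the success path.
        return previous_function(*args)
    except Exception as exc:
        if verbose:
            print(f"[call_and_report] ignored error: {exc}")
    # Without this line the function would fall off the end here and return
    # None implicitly; stating it makes every exit path explicit.
    return None


print(call_and_report(lambda x: x + 1, 1))
print(call_and_report(lambda: 1 / 0))
```

Behavior is unchanged by the added line, since Python already returns `None` when a function ends without a `return`; the change only makes the intent visible to readers and static analysis.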

onnxruntime/python/tools/transformers/models/torch_export_patches/patches/patch_torch.py

+def patch__check_input_constraints_for_graph(
+    previous_function: Callable,
+    input_placeholders: list[torch.fx.Node],
+    flat_args_with_path,
+    range_constraints,
+    verbose: int = 0,
+) -> None:


To fix the problem, we need to add an explicit return statement at the end of the function patch__check_input_constraints_for_graph. This ensures that the function consistently returns a value, making the code easier to read and understand. The explicit return value should be None to maintain the existing functionality.

onnxruntime/python/tools/transformers/models/torch_export_patches/patches/patch_torch.py

+            # if config.print_specializations:
+            #    self.log.warning(
+            #         "Specializing %s to %s", self.var_to_sources[a][0].name(), tgt


To fix the problem, we should remove the commented-out code. This will make the code cleaner and reduce potential confusion for future developers. If the logging statement is needed in the future, it can be reintroduced with proper documentation.

Remove the commented-out logging statement on lines 304-308.

Ensure that the removal does not affect the existing functionality of the code.

onnxruntime/python/tools/transformers/models/torch_export_patches/patches/patch_transformers.py

+            # if input_ids.shape[1] == 0:
+            #     inputs_embeds = inputs_embeds[:, -cache_position.shape[0] :]
+            # else:
+            #     if cache_position[-1] >= input_ids.shape[1]:
+            #         input_ids = input_ids[:, -cache_position.shape[0] :]
+            #     else:
+            #         if input_ids.shape[1] != cache_position.shape[0]:
+            #             input_ids = input_ids[:, cache_position]


To fix the problem, we should remove the commented-out code. This will make the code cleaner and less confusing for future developers. The removal should be done in the _cache_dependant_input_preparation_exporting method, specifically lines 280 to 288.

xadupre and others added 29 commits March 28, 2025 15:01

first draft to migrate to newer version of transformers

5453405

add patches

31e82a9

Merge branch 'main' of https://github.com/microsoft/onnxruntime into llama2

299f116

fix import

cdec2d0

fix build and import

827d3bd

build

18b649e

fix lint

0e77ed4

lint

4633a3e

lint

b12287a

rename

1b926cb

lint

6646e61

lint

a14b8b3

remove args.dynamo

9f3a816

fix issues

0c88e42

copy inputs

8b60535

fix shape

741285b

fix validation

f8490a5

Merge branch 'main' of https://github.com/microsoft/onnxruntime into llama2

dbe202c

add use_dynamo_export

ca43041

lint

19d4dfb

Merge branch 'llama2' of https://github.com/xadupre/onnxruntime into xadupre/llama

49fb806

fix requirements

8d3b0ba

fix requitmeents

835b76e

fix dynamic shapes

a0a8c21

2.6

f61c27b

remove duplicated section

902c6af

lint

e3188ad

Update whisper transformer module to 4.48.0

ca8233f

Merge remote-tracking branch 'origin/xadupre/llama' into Cjian/whisper

6378cc0

github-advanced-security bot found potential problems Apr 10, 2025

Merge remote-tracking branch 'origin/main' into Cjian/whisper

d5eedbc

@@ -10,3 +10,2 @@
 import transformers
-from transformers import AutoConfig, AutoTokenizer
@@ -32,3 +31,3 @@
 def get_sample_inputs(
-    config: AutoConfig,
+    config: transformers.AutoConfig,
     device: torch.device,
@@ -67,3 +66,3 @@
 def get_sample_with_past_kv_inputs(
-    config: AutoConfig,
+    config: transformers.AutoConfig,
     device: torch.device,

@@ -28,3 +28,3 @@
 from models.torch_export_patches.cache_helper import make_dynamic_cache
-from transformers import AutoConfig
@@ -35,3 +35,3 @@
-def get_sequence_lengths(args: argparse.Namespace, config: AutoConfig):
+def get_sequence_lengths(args: argparse.Namespace, config: transformers.AutoConfig):
     past_sequence_length, curr_sequence_length = (8, 1) if args.use_past_kv else (0, 8)
@@ -41,3 +41,3 @@
-def get_inputs(args: argparse.Namespace, config: AutoConfig):
+def get_inputs(args: argparse.Namespace, config: transformers.AutoConfig):
     # Dummy values for parity
@@ -104,3 +104,3 @@
     pytorch_model: None | torch.nn.Module = None,
-    config: None | AutoConfig = None,
+    config: None | transformers.AutoConfig = None,
 ):

Check notice

Copilot Autofix

@@ -43,3 +43,3 @@
                         )
+                    return None
