
Conversation

@galagam (Contributor) commented Dec 22, 2025

What does this PR do?

Type of change: New feature

Overview:
AutoCast runs full type inference to obtain the new tensor types after casts are inserted. ONNX does not expose a standalone type-inference function; type inference is performed as part of shape inference. Shape inference is a much more complex task than type inference, especially when dynamic shapes are involved, and we are seeing shape-inference-related bugs in AutoCast. We can typically work around them, but it is cumbersome. A local type-inference implementation may allow users to work around shape-inference-related issues entirely. This is opt-in and marked as experimental.
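For context, a minimal sketch of the coupling (the model path is hypothetical): ONNX's only built-in inference entry point is onnx.shape_inference.infer_shapes, and element types come back as a by-product of it:

import onnx
from onnx import shape_inference

model = onnx.load("model.onnx")  # hypothetical path
# There is no standalone infer_types(); dtypes are populated as a side effect
# of shape inference.
inferred = shape_inference.infer_shapes(model)
for vi in inferred.graph.value_info:
    print(vi.name, vi.type.tensor_type.elem_type)  # elem_type is the inferred dtype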

Usage

python -m modelopt.onnx.autocast --onnx_path /path/to/input.onnx [options] --use_standalone_type_inference

Testing

Added use_standalone_type_inference=True to all existing PrecisionConverter tests.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: Yes
  • Did you update Changelog?: Yes

Additional Information

A more permanent fix would be to decouple type and shape inference in ONNX itself; we should invest in that when we have the resources - see onnx/onnx#7100. This is a quick fix, which is also why it is opt-in and not the default mode.

@copy-pr-bot bot commented Dec 22, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@codecov bot commented Dec 22, 2025

Codecov Report

❌ Patch coverage is 69.45607% with 73 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.62%. Comparing base (03dc386) to head (28365ea).
⚠️ Report is 1 commit behind head on main.

Files with missing lines                       Patch %   Lines
modelopt/onnx/utils.py                         67.94%    67 Missing ⚠️
modelopt/onnx/autocast/precisionconverter.py   82.60%    4 Missing ⚠️
modelopt/onnx/autocast/convert.py              66.66%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #719      +/-   ##
==========================================
- Coverage   74.69%   74.62%   -0.08%     
==========================================
  Files         192      192              
  Lines       18946    19169     +223     
==========================================
+ Hits        14152    14305     +153     
- Misses       4794     4864      +70     

☔ View full report in Codecov by Sentry.

Signed-off-by: Gal Hubara Agam <[email protected]>
@galagam galagam marked this pull request as ready for review December 25, 2025 07:39
@galagam galagam requested review from a team as code owners December 25, 2025 07:39
@galagam galagam requested review from ajrasane and gcunhase December 25, 2025 07:39
@galagam galagam changed the title Draft: AutoCast local implementation for type inference [5750013][5591945][5360813]: AutoCast local implementation for type inference Dec 25, 2025
@galagam galagam force-pushed the feat/local-type-infer branch from a659cad to 7caedc7 on December 25, 2025 07:51
@galagam galagam changed the title [5750013][5591945][5360813]: AutoCast local implementation for type inference [5750013][5591945][5360813]: AutoCast standalone implementation for type inference Dec 25, 2025
@ajrasane
Copy link
Contributor

ajrasane commented Jan 9, 2026

This is not related to this PR, but to a shape-inference issue I encountered previously, which was caused by running shape inference in strict mode. Would it be possible to use the default mode instead of strict mode?
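For reference, strict mode is a flag on ONNX's shape-inference entry point; a minimal sketch of the two call forms:

from onnx import shape_inference

# The default (strict_mode=False) skips nodes it cannot infer;
# strict_mode=True raises on any inference failure.
model = shape_inference.infer_shapes(model, strict_mode=False)  # default, forgiving
model = shape_inference.infer_shapes(model, strict_mode=True)   # fails fast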

Comment on lines +139 to +142
if use_standalone_type_inference:
    model = onnx_utils.infer_types(model)
else:
    model = onnx_utils.infer_shapes(model)

Could you create a util function for this as it is reused in multiple places?
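A minimal sketch of such a helper (name and placement are hypothetical):

def _infer_types_or_shapes(model: onnx.ModelProto, use_standalone_type_inference: bool) -> onnx.ModelProto:
    """Dispatch to standalone type inference or full ONNX shape inference."""
    if use_standalone_type_inference:
        return onnx_utils.infer_types(model)
    return onnx_utils.infer_shapes(model)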

Comment on lines +321 to +324
if not self.use_standalone_type_inference:
    for idx, d in enumerate(inp.type.tensor_type.shape.dim):
        if d.dim_value:
            inp.type.tensor_type.shape.dim[idx].dim_param = "unk"

Similarly for this, can we create a util function?
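A sketch of what that helper could look like (hypothetical name):

def _make_dims_symbolic(inp: onnx.ValueInfoProto) -> None:
    """Replace every static dim_value with a symbolic dim_param so shape
    inference cannot over-constrain the input shape."""
    for idx, d in enumerate(inp.type.tensor_type.shape.dim):
        if d.dim_value:
            inp.type.tensor_type.shape.dim[idx].dim_param = "unk"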

    return vi.type.tensor_type.elem_type
return None

def str_to_tensor_dtype(dtype_str: str) -> onnx.TensorProto.DataType:

We have a smaller subset of this mapping elsewhere in the codebase. Would it be possible to update/reuse this?
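For illustration, one compact way to express the mapping without a hand-written table (a sketch, not necessarily the existing util):

import onnx

def str_to_tensor_dtype(dtype_str: str) -> int:
    # TensorProto exposes the dtype enum values as attributes,
    # e.g. onnx.TensorProto.FLOAT16 == 10.
    return getattr(onnx.TensorProto, dtype_str.upper())

assert str_to_tensor_dtype("float16") == onnx.TensorProto.FLOAT16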

node_name_to_onnx = {node.name: node for node in graph.node}

# Get nodes to process (from graphsurgeon if available, otherwise from graph directly)
if gs_graph is not None:

Should we convert the graph back to an onnx graph to avoid this logic?
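For example, onnx_graphsurgeon can export its graph back to ONNX so the rest of the pass only ever handles one representation (a sketch under that assumption):

import onnx_graphsurgeon as gs

# Export the graphsurgeon graph back to an onnx.ModelProto up front,
# so a single code path builds the lookup with no gs/onnx branching below.
graph = gs.export_onnx(gs_graph).graph
node_name_to_onnx = {node.name: node for node in graph.node}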

Comment on lines +867 to +868
if not inp_name:
    continue

Do we expect any inputs with empty names?
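For context, a likely reason for the guard: the ONNX spec encodes an omitted optional input as an empty string, e.g.:

import onnx

# Clip with its optional 'min' input omitted; the empty name is valid ONNX.
node = onnx.helper.make_node("Clip", inputs=["x", "", "x_max"], outputs=["y"])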

Comment on lines +934 to +937
for attr in node.attribute:
    if attr.name == "value" and attr.type == onnx.AttributeProto.TENSOR:
        if attr.t.HasField("data_type"):
            const_type = attr.t.data_type

This pattern is used in multiple places in our codebase, should we create a utility function for it?
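A sketch of such a utility (hypothetical name):

def _constant_value_dtype(node: onnx.NodeProto) -> int | None:
    """Return the data_type of a Constant node's 'value' tensor attribute, if any."""
    for attr in node.attribute:
        if attr.name == "value" and attr.type == onnx.AttributeProto.TENSOR:
            if attr.t.HasField("data_type"):
                return attr.t.data_type
    return None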

tensor_types[init_name] = init.data_type

# Helper function to get tensor type
def get_tensor_type(tensor_name: str) -> int | None:

Can we re-use _get_tensor_type() in modelopt.onnx.utils.py or, if not, move this function to modelopt.onnx.utils.py as _get_tensor_type_from_tensor_name or a variation of that?
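For illustration, the moved utility could look roughly like this (name per the suggestion above, body otherwise hypothetical):

def _get_tensor_type_from_tensor_name(graph: onnx.GraphProto, tensor_name: str) -> int | None:
    """Resolve a tensor's elem_type from initializers, graph I/O, or value_info."""
    for init in graph.initializer:
        if init.name == tensor_name:
            return init.data_type
    for vi in list(graph.input) + list(graph.output) + list(graph.value_info):
        if vi.name == tensor_name:
            return vi.type.tensor_type.elem_type
    return None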

    return vi.type.tensor_type.elem_type
return None

def str_to_tensor_dtype(dtype_str: str) -> onnx.TensorProto.DataType:

This can be replaced by onnx_type_str_to_enum() in modelopt.onnx.utils.
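If so, the call site would shrink to something like this (the accepted string format is an assumption; check the util's docstring):

from modelopt.onnx.utils import onnx_type_str_to_enum

elem_type = onnx_type_str_to_enum("tensor(float16)")  # input format is an assumption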

                break
    assert const_type is not None
    output_types = [const_type]
elif node.op_type == "ConstantOfShape":

Would this PR fix the issue in bug 5763424?
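Related context for this branch: per the ONNX spec, ConstantOfShape takes its output dtype from the optional 'value' tensor attribute and defaults to float32 when the attribute is absent, e.g.:

import onnx

# Output dtype comes from the 'value' attribute (FLOAT16 here); without a
# 'value' attribute the spec defaults the output to float32.
node = onnx.helper.make_node(
    "ConstantOfShape", inputs=["shape"], outputs=["out"],
    value=onnx.helper.make_tensor("v", onnx.TensorProto.FLOAT16, [1], [0.0]),
)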

