fix(ONNX): avoids resizing unsupported dimensions #3945
Conversation
Force-pushed from 6baa8d5 to ab7e021
Force-pushed from ab7e021 to 7aec80b
I think the main structural question is about the need for adding the BaseTensorType method. If it were useful elsewhere (I have some doubts, since we would need to know too much about the two tensor shapes prior to using it, namely that they are present and that they have the same rank), I would consider keeping it; however, the code is simplified here by not using it, and I suspect the same would be true in other circumstances where it might be used.
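For context, a minimal sketch of the inline alternative being described, assuming the pattern has `inputType` and `resultType` in scope and uses the `binder` object common to the ONNX-to-Torch patterns (the variable names are illustrative, not the PR's actual code):

```
// Illustrative sketch only: check both tensor types inline instead of adding a
// helper method to BaseTensorType. Requires that both types carry sizes and
// that the ranks match before the rest of the rewrite proceeds.
auto inputTensorType = dyn_cast<Torch::BaseTensorType>(inputType);
auto outputTensorType = dyn_cast<Torch::BaseTensorType>(resultType);
if (!inputTensorType || !outputTensorType || !inputTensorType.hasSizes() ||
    !outputTensorType.hasSizes() ||
    inputTensorType.getSizes().size() != outputTensorType.getSizes().size())
  return rewriter.notifyMatchFailure(
      binder.op, "expected input and output tensors with sizes of equal rank");
```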
Force-pushed from 7aec80b to a20ee29
Force-pushed from a20ee29 to 574f4fe
Sorry for the misdirect earlier: we need to perform the runtime asserts on the scales or sizes operand values rather than on the output sizes, since we will not have access to the correct output sizes ahead of time.
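For illustration, a hedged sketch of what emitting those runtime asserts on the scale values could look like, assuming `nonSpatialScales` already holds the `!torch.float` scale values extracted for the batch and channel dimensions (that name is hypothetical; the actual PR code may extract and check the values differently):

```
// Illustrative sketch only: assert at runtime that the non-spatial (batch and
// channel) scale factors are exactly 1.0, i.e. those dimensions are not resized.
Value scaleIdentity = rewriter.create<Torch::ConstantFloatOp>(
    loc, rewriter.getF64FloatAttr(1.0));
for (Value scale : nonSpatialScales) { // hypothetical list of !torch.float values
  Value isUnity = rewriter.create<Torch::AtenEqFloatOp>(
      loc, rewriter.getType<Torch::BoolType>(), scale, scaleIdentity);
  rewriter.create<Torch::RuntimeAssertOp>(
      loc, isUnity,
      rewriter.getStringAttr(
          "onnx.Resize: support not present for resizing batch or channel dims"));
}
```

The same shape of check would apply when the sizes operand is provided instead, comparing each non-spatial output size against the corresponding input size.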
Force-pushed from 574f4fe to cb371ea
I think the renaming of
Force-pushed from 05ea165 to cb20894
Force-pushed from 8a78147 to 54cf76e
Okay, @zjgarvey, I think we're in business! Got green on the CI a few hours ago. Just wrapped up self-review. I'm guessing now we'll:
I kept the commits atomic and ordered such that it's easier to propagate changes to the head of the branch in case an earlier commit needs to be inserted/tweaked/excised. Let me know what you think!
Nice! At least for now, please exclude any commits which involve style changes not directly related to the fix content, e.g. widespread enforcement of naming preferences like the
Force-pushed from 54cf76e to 052404a
…in onnx.resize - avoids SSA before match failures
- cast to `ValueTensorType` was overly specific for the methods used
- intellisense is able to infer `unsigned` aspect from `.size()`
…size - emphasizes parallel to `inputTensorType`
- easier to read - allows for cleaner diffs if they ever change
Force-pushed from a824f38 to a230084
Hey, @zjgarvey, I've rebased. To make it easier for you to re-review, I could segment this PR into a stack of 3 PRs. It'll be really easy; the commits are already primed for it! Would you like me to do that for you?
No, this change looks manageable and self-contained. I'll review a bit more carefully today.
Looks good!
Value scaleIdentity = rewriter.create<Torch::ConstantFloatOp>(
    loc, rewriter.getF64FloatAttr(1.0));
FYI this appears to have caused some test failures downstream in the IREE project on iree-org/iree#19976. I did not bisect to this specific change or line of code, but this looked most relevant. These are the logs: https://github.com/iree-org/iree/actions/runs/13292751088/job/37117771168?pr=19976#step:8:50
_ IREE compile and run: test_resize_downsample_scales_cubic_align_corners::model.mlir::model.mlir::cpu_llvm_sync _
[gw2] linux -- Python 3.11.11 /home/runner/work/iree/iree/venv/bin/python
Error invoking iree-compile
Error code: 1
Stderr diagnostics:
<unknown>:0: error: failed to legalize operation 'torch.constant.float'
<unknown>:0: note: see current operation: %6 = "torch.constant.float"() <{value = 1.000000e+00 : f64}> : () -> !torch.float
Stdout diagnostics:
Test case source:
https://github.com/iree-org/iree-test-suites/blob/main/onnx_ops/onnx/node/generated/test_resize_downsample_scales_cubic_align_corners
Input program:
```
module {
func.func @test_resize_downsample_scales_cubic_align_corners(%arg0: !torch.vtensor<[1,1,4,4],f32>, %arg1: !torch.vtensor<[4],f32>) -> !torch.vtensor<[1,1,3,3],f32> attributes {torch.onnx_meta.ir_version = 9 : si64, torch.onnx_meta.opset_version = 19 : si64, torch.onnx_meta.producer_name = "backend-test", torch.onnx_meta.producer_version = ""} {
%none = torch.constant.none
%0 = torch.operator "onnx.Resize"(%arg0, %none, %arg1) {torch.onnx.coordinate_transformation_mode = "align_corners", torch.onnx.mode = "cubic"} : (!torch.vtensor<[1,1,4,4],f32>, !torch.none, !torch.vtensor<[4],f32>) -> !torch.vtensor<[1,1,3,3],f32>
return %0 : !torch.vtensor<[1,1,3,3],f32>
}
}
```
Compiled with:
cd /home/runner/work/iree/iree/iree-test-suites/onnx_ops/onnx/node/generated/test_resize_downsample_scales_cubic_align_corners && iree-compile model.mlir --iree-hal-target-backends=llvm-cpu --iree-input-demote-f64-to-f32=false -o model_cpu_llvm_sync.vmfb
By default, IREE demotes f64 to f32, as 64 bits of precision is rarely needed in ML models and many hardware targets either do not support f64 at all or support it with significant performance penalties. The tests there do override that default by setting `--iree-input-demote-f64-to-f32=false`, though.
Is f64 needed here, or would f32 work? I see lots of uses of f64 in this file 🤔
More context: some of the tests in the ONNX test suite require f64, which is why we run the tests without f64 to f32 demotion: iree-org/iree#18111.
We don't need f64; this is a small bug with the changes. Will post a quick fix in a minute.
If I remember correctly, when writing this, using f32 for `scaleIdentity` caused a test case or two within torch-mlir to fail.
@zjgarvey Any insights here?
Wait, F64 is the correct attr type for constant float ops. I'll take a look at the test failures.
Looks like a simple issue. AtenEqFloatOp doesn't have a lowering, but it should be easy to add. I'll post a PR shortly.
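For reference, a rough sketch of the kind of scalar lowering being described, not the actual follow-up patch: converting `torch.aten.eq.float` into an `arith.cmpf` once the type converter has turned its `!torch.float` operands into `f64` values (the placement, e.g. in the TorchToArith conversion, and the predicate choice are assumptions here):

```
// Illustrative sketch only; the real lowering may differ.
class ConvertAtenEqFloatOp
    : public OpConversionPattern<Torch::AtenEqFloatOp> {
public:
  using OpConversionPattern::OpConversionPattern;
  LogicalResult
  matchAndRewrite(Torch::AtenEqFloatOp op, OpAdaptor adaptor,
                  ConversionPatternRewriter &rewriter) const override {
    // After type conversion the operands are plain f64 values, so a float
    // compare produces the i1 that !torch.bool lowers to.
    rewriter.replaceOpWithNewOp<arith::CmpFOp>(
        op, arith::CmpFPredicate::UEQ, adaptor.getA(), adaptor.getB());
    return success();
  }
};
```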
I re-ran the IREE tests with #4022. The failing tests go back to passing with that change.
Nice, thanks!
Addresses an issue introduced by #3945 in an external test suite.
Prevents #3453