
Conversation

@AdrianLundell (Collaborator) commented Sep 12, 2025

  • Insert transposes for input/output iff the incoming/outgoing data is
    in channels-first format.
  • For testing using the tosa_reference_model, transpose numpy arrays to
    and from the correct data format, since numpy doesn't have the concept
    of dim_order (see the sketch after this list).
  • Remove checks for channels_first-only input.
  • Remove the check for not changing dim_order before the
    to_tosa_memory_format pass, since the behaviour of channels-last
    tensors is unpredictable.
  • Add dim order testing of example networks and mv2.
  • Add a section to the documentation about memory formats.
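
For readers unfamiliar with dim_order, here is a minimal sketch (not part of this PR) of the layout handling described above, assuming a recent PyTorch with `Tensor.dim_order()`; all variable names are illustrative. Torch tensors carry an explicit dim_order, while numpy arrays only have a plain axis order, so data going to and from the TOSA reference model must be physically transposed:

```python
import numpy as np
import torch

x = torch.arange(24, dtype=torch.float32).reshape(1, 2, 3, 4)  # NCHW, contiguous
print(x.dim_order())  # (0, 1, 2, 3): contiguous, i.e. channels first

x_cl = x.to(memory_format=torch.channels_last)
print(x_cl.dim_order())  # (0, 2, 3, 1): channels last; the shape is still NCHW

# numpy has no dim_order, so arrays headed for an NHWC reference run must be
# transposed explicitly, and the result transposed back afterwards:
x_np = x.numpy()
x_nhwc = np.transpose(x_np, (0, 2, 3, 1))    # NCHW -> NHWC
x_back = np.transpose(x_nhwc, (0, 3, 1, 2))  # NHWC -> NCHW
assert np.array_equal(x_np, x_back)
```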

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

mergennachin and others added 2 commits September 11, 2025 15:32
Summary:
Pull Request resolved: pytorch#14191

Redoing pytorch#14111 with additional fixes

Reviewed By: digantdesai

Differential Revision: D82171193
- Insert transposes for input/output iff the incoming/outgoing data is
  in channels-first format.
- For testing using the tosa_reference_model, transpose numpy arrays to
  and from the correct data format, since numpy doesn't have the concept
  of dim_order.
- Remove checks for channels_first-only input.
- Remove the check for not changing dim_order before the
  to_tosa_memory_format pass, since the behaviour of channels-last
  tensors is unpredictable.
- Add dim order testing of example networks and mv2.
- Add a section to the documentation about memory formats.

Signed-off-by: Adrian Lundell <[email protected]>
Change-Id: I05548b9f3b4671da6faad90a9dd7366fda4498d6
@AdrianLundell added this to the 1.0.0 milestone Sep 12, 2025
@AdrianLundell added the partner: arm label Sep 12, 2025
@AdrianLundell added the ciflow/trunk and release notes: arm labels Sep 12, 2025

pytorch-bot bot commented Sep 12, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14259

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Unrelated Failure

As of commit adec5aa with merge base cf6e895:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label Sep 12, 2025
@zingo (Collaborator) commented Sep 14, 2025

As #14191 has been merged, I rebased this PR to "clean" #14191 out of it.

@facebook-github-bot (Contributor) commented

@mergennachin has imported this pull request. If you are a Meta employee, you can view this in D82449155.

@digantdesai (Contributor) left a comment

Review automatically exported from Phabricator review in Meta.

@@ -249,15 +249,6 @@ class EthosUBackend final : public ::executorch::runtime::BackendInterface {
handles.inputs->io[i].elem_size);
return Error::InvalidProgram;
}
supported = executorch::runtime::is_contiguous_dim_order(
Contributor

this implies it can handle anything, i.e. a transpose op is inserted if it was needed. But what about asserting expectations? I.e., if the user exported with NCHW and we inserted a transpose_to_nhwc AoT, what if the user now supplies NHWC (instead of the assumed NCHW)? Shouldn't we validate, since we don't "check and optionally transpose" at runtime?
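
For illustration, a hedged Python sketch of the validation being suggested here, checking a user-supplied input against the dim_order assumed at export time; the function name is hypothetical, and the real check would live in the C++ runtime:

```python
import torch

def validate_input_dim_order(tensor: torch.Tensor, expected: tuple) -> None:
    # If the dim_order the user hands us differs from the one the AoT-inserted
    # transposes assumed, fail loudly instead of silently computing on
    # misinterpreted data.
    actual = tensor.dim_order()
    if actual != expected:
        raise ValueError(
            f"Input dim_order {actual} does not match dim_order {expected} "
            f"assumed at export time."
        )

validate_input_dim_order(torch.randn(1, 3, 8, 8), (0, 1, 2, 3))  # passes
nhwc = torch.randn(1, 3, 8, 8).to(memory_format=torch.channels_last)
# validate_input_dim_order(nhwc, (0, 1, 2, 3))  # would raise ValueError
```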

Collaborator Author

Good point, and I agree. However, since we are past the branch cutoff date and we need this patch to unblock a major use case for us, may I ask to ignore this for now and fix it in a later PR?

Contributor

No worries. I assumed that and stamped already :)

Collaborator

Does the stamp mean we can merge, or do we still wait on Meta to merge?

Contributor

haha good, I was in the non-GA mode already, where stamp --> you merge --> internal failure --> we revert.

But looking at the activity from @mergennachin he is, rightfully, still in GA mental mode for this GA critical PR. So if the internal CI is clean, he or I can merge this.

@zingo (Collaborator) commented Sep 16, 2025

Thanks, I'm happy, and you handled it fast, so no problem. I just want to avoid a "no one does it" situation 😆 as there might be a merge/sync to 1.0 coming up.

And I also don't expect any PR not tagged with the 1.0 milestone to be merged. If so, it's just a bonus.

@AdrianLundell (Collaborator Author) commented

The failing arm-backend test is a very small numerical diff for backends/arm/test/ops/test_var.py::test_var_dim_tosa_INT[var_3d_dim_neg_1_no_keep_dim_biased] -> flaky test, not related.

The other failures are not Arm-specific.

@facebook-github-bot (Contributor) commented

@mergennachin has imported this pull request. If you are a Meta employee, you can view this in D82449155.

@mergennachin (Contributor) commented

The new tests are failing. Here's an example error message:

=================================== FAILURES ===================================
_________________ test_dim_order_u55_INT[channels_last_output] _________________

module = <class 'executorch.backends.arm.test.misc.test_dim_order.ChannelsLastOutput'>

    @common.XfailIfNoCorstone300
    @common.parametrize("module", test_modules)
    def test_dim_order_u55_INT(module):
        pipeline = EthosU55PipelineINT[input_t1](module(), module.inputs, [])
>       pipeline.run()

../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/misc/test_dim_order.py:116: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/tester/test_pipeline.py:281: in run
    raise e
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/tester/test_pipeline.py:278: in run
    stage()
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/tester/test_pipeline.py:89: in __call__
    self.func(*self.args, **self.kwargs)
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/tester/arm_tester.py:491: in run_method_and_compare_outputs
    self._compare_outputs(
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/tester/arm_tester.py:678: in _compare_outputs
    raise e
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/arm/test/tester/arm_tester.py:662: in _compare_outputs
    super()._compare_outputs(
../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/test/harness/tester.py:423: in _compare_outputs
    Tester._assert_outputs_equal(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

model_output = (tensor([[[[ 1.0034, 24.8351],
          [ 4.0138, 35.8729]],

         [[ 9.0310, 48.9177],
          [16.0550, 63.9692]]]]),)
ref_output = (tensor([[[[ 1.0034,  4.0138],
          [ 9.0310, 16.0550]],

         [[24.8351, 35.8729],
          [48.9177, 63.9692]]]]),)
atol = 0.2518597671985626, rtol = 0.001, statistics_callback = None

    @staticmethod
    def _assert_outputs_equal(
        model_output,
        ref_output,
        atol=1e-03,
        rtol=1e-03,
        statistics_callback: Callable[[ErrorStatistics], None] | None = None,
    ):
        """
        Helper testing function that asserts that the model output and the reference output
        are equal with some tolerance. Due to numerical differences between eager mode and
        the XNNPACK backend, we relax the delta such that absolute tolerance is 1e-3 and
        relative tolerance is 1e-3. In the event that the computation was quantized, we
        further relax the tolerance to one quantized step (equal to the quantization scale).
        This allows the quantized value to differ by 1 between the reference and model output.
        """
    
        assert len(model_output) == len(ref_output)
    
        for i in range(len(model_output)):
            model = model_output[i]
            ref = ref_output[i]
    
            error_stats = ErrorStatistics.from_tensors(model, ref)
            if statistics_callback is not None:
                statistics_callback(error_stats)
    
            assert (
                ref.shape == model.shape
            ), f"Output {i} shape {model.shape} does not match reference output shape {ref.shape}"
            if model.dtype == torch.bool:
                assert torch.equal(model, ref), (
                    f"Output {i} (bool tensor) does not match reference output.\n"
                    f"\tShape: {model.shape}\n"
                    f"\tMismatched count: {(model != ref).sum().item()} / {model.numel()}\n"
                )
            else:
>               assert torch.allclose(
                    model,
                    ref,
                    atol=atol,
                    rtol=rtol,
                    equal_nan=True,
                ), (
                    f"Output {i} does not match reference output.\n"
                    f"\tGiven atol: {atol}, rtol: {rtol}.\n"
                    f"\tOutput tensor shape: {model.shape}, dtype: {model.dtype}\n"
                    f"\tDifference: max: {torch.max(model-ref)}, abs: {torch.max(torch.abs(model-ref))}, mean abs error: {torch.mean(torch.abs(model-ref).to(torch.double))}.\n"
                    f"\t-- Model vs. Reference --\n"
                    f"\t Numel: {model.numel()}, {ref.numel()}\n"
                    f"\tMedian: {model.median()}, {ref.median()}\n"
                    f"\t  Mean: {model.to(torch.double).mean()}, {ref.to(torch.double).mean()}\n"
                    f"\t   Max: {model.max()}, {ref.max()}\n"
                    f"\t   Min: {model.min()}, {ref.min()}\n"
                )
E               AssertionError: Output 0 does not match reference output.
E               	Given atol: 0.2518597671985626, rtol: 0.001.
E               	Output tensor shape: torch.Size([1, 2, 2, 2]), dtype: torch.float32
E               	Difference: max: 20.821361541748047, abs: 32.862632751464844, mean abs error: 13.420998275279999.
E               	-- Model vs. Reference --
E               	 Numel: 8, 8
E               	Median: 16.055025100708008, 16.055025100708008
E               	  Mean: 25.462266877293587, 25.462266877293587
E               	   Max: 63.969242095947266, 63.969242095947266
E               	   Min: 1.0034390687942505, 1.0034390687942505

../buck-out/v2/gen/fbcode/6253fc99cd6e7a23/executorch/backends/arm/test/__dim_order__/dim_order#link-tree/executorch/backends/test/harness/tester.py:378: AssertionError

@AdrianLundell (Collaborator Author) commented

@mergennachin The only thing I can come up with that would cause this is that you need to update the way you run the tests internally to match the changes made in this PR. Would it be OK to disable the tests for now and put a ticket on you to fix this, or do you have another solution you prefer?
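
Consistent with this reading, the failing output above is the reference data reinterpreted under a channels-last dim_order rather than a numerical error; a quick check (not from the PR) confirms it:

```python
import torch

ref = torch.tensor([[[[ 1.0034,  4.0138],
                      [ 9.0310, 16.0550]],
                     [[24.8351, 35.8729],
                      [48.9177, 63.9692]]]])
model = torch.tensor([[[[ 1.0034, 24.8351],
                        [ 4.0138, 35.8729]],
                       [[ 9.0310, 48.9177],
                        [16.0550, 63.9692]]]])

# Reading ref's values in NHWC order and pouring them back into the NCHW
# shape reproduces the "wrong" model output exactly:
assert torch.equal(model, ref.permute(0, 2, 3, 1).reshape(ref.shape))
```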

@digantdesai (Contributor) commented

Yeah, likely not a problem with this PR. Let me take a look.

@facebook-github-bot merged commit 5348ea9 into pytorch:main Sep 17, 2025
385 of 392 checks passed
Labels
ciflow/trunk · CLA Signed · partner: arm · release notes: arm