[Backend Tester] Add tensor error statistic reporting #12809
base: gh/GregoryComer/88/head
Conversation
Stack from ghstack (oldest at bottom):
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12809
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit ef7af5c with merge base 4fd2079.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@@ -302,17 +303,15 @@ def run_method_and_compare_outputs(
    atol=1e-03,
    rtol=1e-03,
    qtol=0,
    statistics_callback: Callable[[ErrorStatistics], None] | None = None,
I'm not completely happy with the callback approach for exposing this, but I don't really have a better idea, since the tester relies on a builder-style pattern where it returns self to allow chaining. I'm open to suggestions.
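For illustration, here is a minimal sketch of how a callback parameter can coexist with the builder-style chaining. The class and field definitions below are simplified stand-ins, not the actual tester API:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class ErrorStatistics:  # simplified stand-in for the real stats record
    sqnr: float
    mae: float
    l2_norm: float

class Tester:  # greatly simplified stand-in for the backend tester
    def run_method_and_compare_outputs(
        self,
        atol: float = 1e-03,
        rtol: float = 1e-03,
        qtol: int = 0,
        statistics_callback: Optional[Callable[[ErrorStatistics], None]] = None,
    ) -> "Tester":
        # The real tester computes these from reference vs. backend outputs;
        # the values here are placeholders.
        stats = ErrorStatistics(sqnr=42.0, mae=1e-4, l2_norm=2e-3)
        if statistics_callback is not None:
            statistics_callback(stats)
        return self  # returning self preserves the chaining contract

# Callers can collect stats without breaking the chain:
collected: List[ErrorStatistics] = []
Tester().run_method_and_compare_outputs(statistics_callback=collected.append)
print(collected[0].sqnr)
```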
just update the tester method?
Can you clarify what you're thinking? Do you mean updating the tester's run_method_and_compare_outputs to directly return the error stats, and then updating all of the callers to not use it in a chained fashion? Or something else?
Modify the existing method, keeping the outside behavior, but also add a get_comparison_stats(self) method on that stage or something?
Or if you want to pass a callback for flexibility, that's also fine.
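A sketch of this suggestion, assuming the stats are stashed on the stage during comparison and read back after the chain completes (all names here are hypothetical):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ErrorStatistics:  # stand-in, as in the earlier sketch
    sqnr: float
    mae: float
    l2_norm: float

class Tester:
    def __init__(self) -> None:
        self._last_stats: Optional[ErrorStatistics] = None

    def run_method_and_compare_outputs(self) -> "Tester":
        # In the real tester this would be computed during comparison.
        self._last_stats = ErrorStatistics(sqnr=42.0, mae=1e-4, l2_norm=2e-3)
        return self

    def get_comparison_stats(self) -> Optional[ErrorStatistics]:
        # A getter keeps the chaining contract: callers chain as before,
        # then pull the stats off the tester afterwards.
        return self._last_stats

stats = Tester().run_method_and_compare_outputs().get_comparison_stats()
print(stats)
```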
I'll add some tests for the error statistic calculation to this PR shortly.
Report various error statistics for the test outputs, including SQNR, mean absolute error (MAE), and L2 norm. These are saved in the detail report per test case.
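For reference, these metrics can be computed along the following lines. This is a sketch of the standard definitions, not necessarily the exact formulas the PR uses, and it assumes torch is available:

```python
import torch

def error_statistics(ref: torch.Tensor, actual: torch.Tensor) -> dict:
    """Illustrative computation of the reported metrics (hypothetical helper)."""
    error = actual.float() - ref.float()
    signal_power = torch.mean(ref.float() ** 2)
    noise_power = torch.mean(error ** 2)
    sqnr = 10.0 * torch.log10(signal_power / noise_power)  # in dB
    mae = torch.mean(torch.abs(error))                     # mean absolute error
    l2 = torch.linalg.vector_norm(error)                   # L2 norm of the error
    return {"sqnr_db": sqnr.item(), "mae": mae.item(), "l2_norm": l2.item()}

ref = torch.randn(1, 1000)
actual = ref + 1e-3 * torch.randn_like(ref)
print(error_statistics(ref, actual))
```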
As an example, here is the output from Core ML running MobileNet V2 (roughly formatted from csv -> sheets -> markdown):
TODO: Round to a reasonable number of significant figures. (Or leave in full precision in the report?)
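If rounding is the way to go, one possible approach is to round to a fixed number of significant figures before writing the CSV (a sketch, not a committed design):

```python
def round_sig(x: float, sig: int = 4) -> float:
    # Format with %g to keep `sig` significant figures, then parse back.
    return float(f"{x:.{sig}g}")

print(round_sig(0.000123456789))  # 0.0001235
print(round_sig(12345.6789))      # 12350.0
```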