Improved error reporting for failed tolerance checks #988
just replace the existing error number with the new one?
Pull Request Overview
This PR enhances the tolerance validation system in the MFC test suite by adding comprehensive error diagnostics when tolerance checks fail. The enhancement maintains the existing fast-fail behavior for performance while providing detailed analysis of numerical errors across all test files when failures occur.
Key changes:
- Enhanced error reporting with maximum error analysis across all files
- Added comprehensive diagnostic information including worst-case absolute and relative errors
- Maintained fast-fail performance for passing tests
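The fail-fast-then-diagnose pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: plain dictionaries stand in for MFC's Pack objects, and the names `check_tolerance`, `scan_max_errors`, and `Values` are hypothetical and not taken from tol.py.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a pack: maps a file path to its list of values.
Values = Dict[str, List[float]]


def scan_max_errors(candidate: Values, golden: Values) -> Tuple[float, float]:
    """Slow path: visit every value and return (max_abs_err, max_rel_err)."""
    max_abs = 0.0
    max_rel = 0.0
    for path, gold_vals in golden.items():
        for g, c in zip(gold_vals, candidate[path]):
            abs_err = abs(g - c)
            max_abs = max(max_abs, abs_err)
            if g != 0.0:  # relative error is undefined for a zero reference
                max_rel = max(max_rel, abs_err / abs(g))
    return max_abs, max_rel


def check_tolerance(candidate: Values, golden: Values, tol: float) -> Optional[str]:
    """Return None on success, or an error message with diagnostics."""
    for path, gold_vals in golden.items():
        for idx, (g, c) in enumerate(zip(gold_vals, candidate[path])):
            if not math.isclose(g, c, rel_tol=tol, abs_tol=tol):
                # Fast-fail: stop at the first violation, and pay for the
                # full scan only on this failure path.
                max_abs, max_rel = scan_max_errors(candidate, golden)
                return (
                    f"Variable n°{idx + 1} in {path} is not within tolerance.\n"
                    f"Max absolute error: {max_abs:.2E}\n"
                    f"Max relative error: {max_rel:.2E}"
                )
    return None
```

Passing comparisons never call `scan_max_errors`, so the extra pass over all files costs nothing unless a check has already failed.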
toolchain/mfc/packer/tol.py
    # Return the average relative error
    return avg_err.get(), None


def format_diagnostic_message(max_errors: typing.Tuple[typing.Optional[typing.Tuple[str, int, float, float, float, float]], typing.Optional[typing.Tuple[str, int, float, float, float, float]]]) -> str:
The function signature uses deeply nested tuple types that are difficult to read and maintain. Consider defining named tuple types or dataclasses for the error information to improve code clarity.
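One way to act on this suggestion is sketched below. It assumes the six tuple fields map one-to-one onto named fields; the `ErrorInfo` name and its field names match those used in the suggested change later in this review, while the `MaxErrors` alias and the `describe` helper are hypothetical additions for illustration.

```python
import math
from typing import NamedTuple, Optional, Tuple


class ErrorInfo(NamedTuple):
    """Where an error was found and how large it is."""
    filepath: str
    var_index: int          # zero-based index of the variable in the file
    golden_val: float
    candidate_val: float
    absolute_error: float
    relative_error: float   # may be NaN when no relative error is defined


# Hypothetical alias for the (max_abs_info, max_rel_info) pair.
MaxErrors = Tuple[Optional[ErrorInfo], Optional[ErrorInfo]]


def describe(info: ErrorInfo) -> str:
    """Hypothetical one-line summary; fields are read by name, not position."""
    rel = "NaN" if math.isnan(info.relative_error) else f"{info.relative_error:.2E}"
    return (
        f"{info.filepath}: variable n°{info.var_index + 1}, "
        f"abs={info.absolute_error:.2E}, rel={rel}"
    )
```

Because a `NamedTuple` is still a tuple, code that unpacks the six values positionally would keep working while call sites are migrated to attribute access.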
    Scan all files to find the maximum absolute and relative errors.
    Returns tuple of:
    - max_abs_info: (filepath, var_index, golden_val, candidate_val, absolute_error, relative_error)
    - max_rel_info: (filepath, var_index, golden_val, candidate_val, relative_error, absolute_error)
The return type annotation uses the same deeply nested tuple structure as format_diagnostic_message. This complex type signature makes the code harder to understand and maintain. Consider using named tuples or dataclasses.
Suggested change:
-     - max_rel_info: (filepath, var_index, golden_val, candidate_val, relative_error, absolute_error)
+ def format_diagnostic_message(max_errors: Tuple[Optional[ErrorInfo], Optional[ErrorInfo]]) -> str:
+     """Format the diagnostic message showing maximum errors."""
+     max_abs_info, max_rel_info = max_errors
+     diagnostic_msg = ""
+     if max_abs_info:
+         rel_error_str = f"{max_abs_info.relative_error:.2E}" if not math.isnan(max_abs_info.relative_error) else "NaN"
+         diagnostic_msg += f"\n\nDiagnostics - Maximum absolute error across ALL files:\n" \
+                           f" - File: {max_abs_info.filepath}\n" \
+                           f" - Variable n°{max_abs_info.var_index+1}\n" \
+                           f" - Candidate: {max_abs_info.candidate_val}\n" \
+                           f" - Golden: {max_abs_info.golden_val}\n" \
+                           f" - Absolute Error: {max_abs_info.absolute_error:.2E}\n" \
+                           f" - Relative Error: {rel_error_str}"
+     if max_rel_info:
+         diagnostic_msg += f"\n\nDiagnostics - Maximum relative error across ALL files:\n" \
+                           f" - File: {max_rel_info.filepath}\n" \
+                           f" - Variable n°{max_rel_info.var_index+1}\n" \
+                           f" - Candidate: {max_rel_info.candidate_val}\n" \
+                           f" - Golden: {max_rel_info.golden_val}\n" \
+                           f" - Relative Error: {max_rel_info.relative_error:.2E}\n" \
+                           f" - Absolute Error: {max_rel_info.absolute_error:.2E}"
+     return diagnostic_msg
+ def find_maximum_errors(candidate: Pack, golden: Pack) -> Tuple[Optional[ErrorInfo], Optional[ErrorInfo]]:
+     """
+     Scan all files to find the maximum absolute and relative errors.
+     Returns tuple of:
+     - max_abs_info: ErrorInfo or None
+     - max_rel_info: ErrorInfo or None
    max_abs_error = -1.0
    max_abs_info = None

    max_rel_error = -1.0
Using -1.0 as an initial value for maximum error tracking is unclear and could be problematic if all errors are actually negative (though unlikely in this context). Consider using float('-inf') or None with explicit checks for better clarity.
Suggested change:
-     max_rel_error = -1.0
+     max_abs_error = float('-inf')
+     max_abs_info = None
+     max_rel_error = float('-inf')
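To illustrate the point of this suggestion: with `float('-inf')` as the starting value, any finite error replaces it on the first comparison, and the paired info variable stays `None` when nothing was compared. The sketch below is a standalone illustration; `track_max` and its `(label, error)` input format are hypothetical and do not appear in tol.py.

```python
import math
from typing import Iterable, Optional, Tuple


def track_max(errors: Iterable[Tuple[str, float]]) -> Optional[Tuple[str, float]]:
    """Return the (label, error) pair with the largest error, or None if empty."""
    max_err = float("-inf")  # lower than every finite error, including 0.0
    max_info = None          # stays None if no usable error was seen
    for label, err in errors:
        if math.isnan(err):
            continue  # NaN compares False against everything; skip explicitly
        if err > max_err:
            max_err = err
            max_info = (label, err)
    return max_info
```

Returning `None` for the empty case lets the caller distinguish "no data compared" from "largest error was zero", which a numeric sentinel alone does not make obvious.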
    max_abs_error = -1.0
    max_abs_info = None

    max_rel_error = -1.0
Same issue as with max_abs_error - using -1.0 as initial value is unclear. Consider using float('-inf') or None with explicit checks for better clarity.
Suggested change:
-     max_rel_error = -1.0
+     max_abs_error = float('-inf')
+     max_abs_info = None
+     max_rel_error = float('-inf')
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##           master     #988   +/-   ##
=======================================
  Coverage   40.93%   40.93%
=======================================
  Files          70       70
  Lines       20288    20288
  Branches     2517     2517
=======================================
  Hits         8305     8305
  Misses      10447    10447
  Partials     1536     1536
User description
Description
Please include a summary of the changes and the related issue(s) if they exist.
Please also include relevant motivation and context.
Improves the tolerance validation system by adding comprehensive error diagnostics when tests fail.
Current behavior:
Enhanced behavior:
Benefits:
This enhancement helps developers quickly identify both where tests first fail and where the most significant numerical errors occur in the simulation results.
PR Type
Enhancement
Description
- Enhanced tolerance validation with comprehensive error diagnostics
- Added maximum error scanning across all files after failure
- Improved error reporting with detailed diagnostic information
- Maintained fast-fail performance for passing tests
File Walkthrough
tol.py
Enhanced tolerance validation with comprehensive error diagnostics
toolchain/mfc/packer/tol.py
- find_maximum_errors() function to scan all files for worst-case errors
- format_diagnostic_message() function for detailed error formatting
locations