-
Notifications
You must be signed in to change notification settings - Fork 4k
sql/inspect: clarify error message for internal failures #155766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/inspect: clarify error message for internal failures #155766
Conversation
a2f463e to
5d723f5
Compare
pkg/sql/inspect/inspect_job.go
Outdated
| if jobErr != nil && errors.Is(jobErr, errInspectFoundInconsistencies) { | ||
| // Record RunsWithIssues metric if the job failed due to finding issues (including internal errors). | ||
| if jobErr != nil && | ||
| (errors.Is(jobErr, errInspectFoundInconsistencies) || errors.Is(jobErr, errInspectInternalErrors)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: errors.Is already has a nil check, so we can skip that explicit check here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/sql/inspect/inspect_logger.go
Outdated
| // tracking aggregate state about the issues encountered. | ||
| type inspectLoggerBundle struct { | ||
| loggers []inspectLogger | ||
| foundIssue atomic.Bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do we need foundIssue? it seems like we can always derive it from sawInternalIssue || sawNonInternalIssue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
pkg/sql/inspect/inspect_logger.go
Outdated
| func (l *inspectLoggerBundle) logIssue(ctx context.Context, issue *inspectIssue) error { | ||
| l.foundIssue.Store(true) | ||
| if issue != nil { | ||
| if issue.ErrorType == InternalError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making sure I follow -- the comment on InternalError says
// InternalError occurs when an internal error (e.g., data corruption, encoding
// issues) prevents the check from completing normally. These errors indicate
// potential data corruption or other serious issues that require investigation.
but the commit message here says these are issues with the inspect code itself, and are not data problems.
is it just a matter of updating the comment, or do we need to account for other important cases where InternalError is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to call this out. The answer is that it depends. If you have a corruption such that it causes bad things to happen in the query engine, it will manifest as an internal error. But you can also have errors where the SQL crafted by INSPECT isn't valid. An example of that is seen in #155772: you cannot have an equality comparison between column of types REFCURSOR.
Let me update the commit/PR description.
Previously, when INSPECT encountered internal errors, it always reported "INSPECT found inconsistencies," even if the issue was caused by something other than actual data corruption. This made it hard to tell whether the problem was in user data or within INSPECT itself (for example, a bad query generated internally). This commit refines that behavior. If all observed errors are internal, INSPECT now reports: "INSPECT encountered internal errors." This makes it clear that the problem might stem from an internal failure in INSPECT or from data corruption detected during internal queries, rather than always implying user data inconsistencies. Additionally, a hint is now included in the error message, regardless of error type, guiding users to run SHOW INSPECT ERRORS to retrieve more information. Informs: cockroachdb#155676 Epic: CRDB-55075 Release note: none
5d723f5 to
84ea39c
Compare
spilchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @spilchen)
pkg/sql/inspect/inspect_job.go
Outdated
| if jobErr != nil && errors.Is(jobErr, errInspectFoundInconsistencies) { | ||
| // Record RunsWithIssues metric if the job failed due to finding issues (including internal errors). | ||
| if jobErr != nil && | ||
| (errors.Is(jobErr, errInspectFoundInconsistencies) || errors.Is(jobErr, errInspectInternalErrors)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/sql/inspect/inspect_logger.go
Outdated
| // tracking aggregate state about the issues encountered. | ||
| type inspectLoggerBundle struct { | ||
| loggers []inspectLogger | ||
| foundIssue atomic.Bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
pkg/sql/inspect/inspect_logger.go
Outdated
| func (l *inspectLoggerBundle) logIssue(ctx context.Context, issue *inspectIssue) error { | ||
| l.foundIssue.Store(true) | ||
| if issue != nil { | ||
| if issue.ErrorType == InternalError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to call this out. The answer is that it depends. If you have a corruption such that it causes bad things to happen in the query engine, it will manifest as an internal error. But you can also have errors where the SQL crafted by INSPECT isn't valid. An example of that is seen in #155772: you cannot have an equality comparison between column of types REFCURSOR.
Let me update the commit/PR description.
rafiss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @spilchen)
|
TFTR! bors r+ |
155766: sql/inspect: clarify error message for internal failures r=spilchen a=spilchen Previously, when `INSPECT` encountered internal errors, it always reported "INSPECT found inconsistencies," even if the issue was caused by something other than actual data corruption. This made it hard to tell whether the problem was in user data or within INSPECT itself (for example, a bad query generated internally). This commit refines that behavior. If all observed errors are internal, `INSPECT` now reports: "INSPECT encountered internal errors." This makes it clear that the problem might stem from an internal failure in INSPECT or from data corruption detected during internal queries, rather than always implying user data inconsistencies. Additionally, a hint is now included in the error message, regardless of error type, guiding users to run SHOW INSPECT ERRORS to retrieve more information. Informs: #155676 Epic: CRDB-55075 Release note: none 155824: sql: fix top-level query stats when "inner" plans are present r=yuzefovich a=yuzefovich Previously, whenever we ran an "inner" plan (via `runPlanInsidePlan` helper), we ignored the top-level query stats for the "inner" plan. As a result, reads and writes performed by the inner plan weren't reflected in the "outer" top-level query stats. This is now fixed by adjusting the routines, apply joins, and recursive CTEs to propagate their metrics as ProducerMetadata objects. Note that for routines the access to the DistSQL infra is rather difficult, so we plumbed the access via the planner straight into the DistSQLReceiver, and even though it's ugly, it should work in practice. The only alternative I see is adding the necessary reference into the `eval.Context`, but then it gets tricky to actually set that and ensure the right copy of the eval context is observed by all routines (plus we'd need to make the copy or restore the original state somehow), so I chose to not pursue it. Epic: None Release note (bug fix): Previously, CockroachDB didn't include reads and writes performed by routines (user-defined functions and stored procedures) as well as apply joins into `bytes read`, `rows read`, and `rows written` statement execution statistics, and this is now fixed. The bug has been present since before 23.2 version. Co-authored-by: Matt Spilchen <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
|
Build failed (retrying...): |
Previously, when
INSPECTencountered internal errors, it always reported "INSPECT found inconsistencies," even if the issue was caused by something other than actual data corruption. This made it hard to tell whether the problem was in user data or within INSPECT itself (for example, a bad query generated internally).This commit refines that behavior. If all observed errors are internal,
INSPECTnow reports: "INSPECT encountered internal errors." This makes it clear that the problem might stem from an internal failure in INSPECT or from data corruption detected during internal queries, rather than always implying user data inconsistencies.Additionally, a hint is now included in the error message, regardless of error type, guiding users to run SHOW INSPECT ERRORS to retrieve more information.
Informs: #155676
Epic: CRDB-55075
Release note: none