@spilchen
Contributor

Before performing the expensive full join to compare primary and secondary indexes during the INSPECT index consistency check, this change adds a fast, low-cost precheck step.

We now compute and compare row counts and hash values from both indexes. If the counts and hashes match, we can confidently skip the full check. This optimization should improve performance for healthy indexes.
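
For readers who want a concrete picture of the precheck, here is a minimal in-memory sketch. It is not the code from this PR: the `entry` type, the helper names, and the choice of an XOR of per-row FNV-1a hashes are all assumptions made for illustration, and the real check runs as SQL queries against the indexes rather than over Go slices.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// entry is a hypothetical (primary key, indexed value) pair.
type entry struct {
	pk  string
	val string
}

// summary holds the two cheap aggregates that the precheck compares.
type summary struct {
	count int
	hash  uint64
}

// summarize returns the row count and an order-insensitive hash.
// XOR-ing per-row hashes is an assumption made for this sketch.
func summarize(entries []entry) summary {
	var s summary
	for _, e := range entries {
		h := fnv.New64a()
		h.Write([]byte(e.pk))
		h.Write([]byte{0}) // separator keeps ("ab","c") distinct from ("a","bc")
		h.Write([]byte(e.val))
		s.count++
		s.hash ^= h.Sum64()
	}
	return s
}

// fullCheck stands in for the expensive join. It counts the distinct
// entries that do not appear equally often on both sides.
func fullCheck(primary, secondary []entry) int {
	seen := make(map[entry]int)
	for _, e := range primary {
		seen[e]++
	}
	for _, e := range secondary {
		seen[e]--
	}
	mismatches := 0
	for _, n := range seen {
		if n != 0 {
			mismatches++
		}
	}
	return mismatches
}

// checkIndex runs the cheap precheck first and falls back to the
// full comparison only when the aggregates differ.
func checkIndex(primary, secondary []entry) (usedFullCheck bool, mismatches int) {
	if summarize(primary) == summarize(secondary) {
		return false, 0
	}
	return true, fullCheck(primary, secondary)
}

func main() {
	primary := []entry{{"1", "a"}, {"2", "b"}}
	healthy := []entry{{"2", "b"}, {"1", "a"}}
	corrupt := []entry{{"1", "a"}, {"2", "x"}}

	fmt.Println(checkIndex(primary, healthy))
	fmt.Println(checkIndex(primary, corrupt))
}
```

In this sketch the healthy case returns after one pass over each side, while the corrupt case pays for the precheck and then for the full comparison, which matches the expectation that the gain is for healthy indexes.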

Closes: #150927
Epic: CRDB-55075
Release note: None

@spilchen spilchen self-assigned this Oct 20, 2025

@spilchen spilchen force-pushed the gh-150927/251020/0945/inspect-hash-speedup/pr-ready branch 2 times, most recently from 10e8237 to ee37440 Compare October 21, 2025 14:13
@spilchen
Contributor Author

Here are the results of the performance tests when I ran this in roachtest:


1. Admission Control Test (25M rows)

Fast version:

  • Calibration INSPECT (no workload): 3m58.8s (3,272 rows/s/cpu)
  • INSPECT under load (AC enabled): 3m21.7s (3,874 rows/s/cpu)
  • INSPECT under load (AC disabled): 42.9s (18,223 rows/s/cpu)

Slow version:

  • Calibration INSPECT (no workload): 11m6.6s (1,172 rows/s/cpu)
  • INSPECT under load (AC enabled): 24m55.5s (522 rows/s/cpu)
  • INSPECT under load (AC disabled): 2m24.8s (5,396 rows/s/cpu)

Performance improvement (fast vs slow):

  • Calibration: 2.8x faster (11m6s → 3m59s)
  • Under load with AC: 7.4x faster (24m55s → 3m22s)
  • Under load without AC: 3.4x faster (2m25s → 43s)
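
The ratios above are just slow time divided by fast time. A small sketch for recomputing them from the raw durations follows; the `speedup` helper is a name made up for this example, and it relies only on the standard library's `time.ParseDuration`.

```go
package main

import (
	"fmt"
	"time"
)

// speedup returns slow/fast as a ratio, parsing Go-style duration strings.
func speedup(slow, fast string) (float64, error) {
	s, err := time.ParseDuration(slow)
	if err != nil {
		return 0, err
	}
	f, err := time.ParseDuration(fast)
	if err != nil {
		return 0, err
	}
	if f <= 0 {
		return 0, fmt.Errorf("fast duration must be positive, got %s", fast)
	}
	return s.Seconds() / f.Seconds(), nil
}

func main() {
	// Durations copied from the admission control test results.
	cases := []struct{ name, slow, fast string }{
		{"calibration", "11m6.6s", "3m58.8s"},
		{"under load, AC enabled", "24m55.5s", "3m21.7s"},
		{"under load, AC disabled", "2m24.8s", "42.9s"},
	}
	for _, c := range cases {
		r, err := speedup(c.slow, c.fast)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: %.1fx faster\n", c.name, r)
	}
}
```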

2. Throughput Test - 500M rows, 1 index

Fast version:

  • INSPECT execution: 32m0.4s

Slow version:

  • INSPECT execution: 58m53.9s

Performance improvement:

  • 1.8x faster (58m54s → 32m0s)

3. Throughput Test - 1B rows, 2 indexes

Fast version:

  • INSPECT with 1 check: 1h53m0.5s
  • INSPECT with 2 checks: 2h29m45.4s

Slow version:

  • INSPECT with 1 check: 2h25m15.8s
  • INSPECT with 2 checks: 3h51m30.7s

Performance improvement:

  • 1 check: 1.3x faster (2h25m16s → 1h53m1s)
  • 2 checks: 1.5x faster (3h51m31s → 2h29m45s)

@spilchen spilchen force-pushed the gh-150927/251020/0945/inspect-hash-speedup/pr-ready branch from ee37440 to 380b450 Compare October 22, 2025 11:36
@spilchen spilchen marked this pull request as ready for review October 22, 2025 11:36
@spilchen spilchen requested a review from a team as a code owner October 22, 2025 11:36
@spilchen spilchen requested a review from rafiss October 22, 2025 11:36
Collaborator

@rafiss rafiss left a comment

this is a great improvement, nice work! lgtm -- my comments are just about error wrapping and word choice, so feel free to merge after responding.


pkg/sql/inspect/index_consistency_check.go line 242 at r1 (raw file):

			// be ignored.
			if isQueryConstructionError(hashErr) {
				return hashErr

shouldn't we never expect a query construction error? if so, then this could be wrapped with an assertion failure.


pkg/sql/inspect/index_consistency_check.go line 244 at r1 (raw file):

				return hashErr
			}
			log.Dev.Infof(ctx, "hash precheck for index consistency failed; falling back to full check: %v", hashErr)

super nit: instead of "failed" could we use language closer to "did not match"? i'm thinking that might make this log message more clear, since although the check did "fail" someone who sees this might think that CRDB itself encountered a failure.

@spilchen spilchen force-pushed the gh-150927/251020/0945/inspect-hash-speedup/pr-ready branch from 380b450 to ced327e Compare October 23, 2025 19:32
Contributor Author

@spilchen spilchen left a comment


pkg/sql/inspect/index_consistency_check.go line 242 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

shouldn't we never expect a query construction error? if so, then this could be wrapped with an assertion failure.

Yes, makes sense. We should never expect this since it's an internally generated query.


pkg/sql/inspect/index_consistency_check.go line 244 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

super nit: instead of "failed" could we use language closer to "did not match"? i'm thinking that might make this log message more clear, since although the check did "fail" someone who sees this might think that CRDB itself encountered a failure.

Done.

@spilchen spilchen force-pushed the gh-150927/251020/0945/inspect-hash-speedup/pr-ready branch from ced327e to d5f078b Compare October 23, 2025 23:15
@spilchen spilchen force-pushed the gh-150927/251020/0945/inspect-hash-speedup/pr-ready branch from d5f078b to da01a52 Compare October 24, 2025 11:01
@spilchen
Contributor Author

TFTR!

bors r+

@craig
Contributor

craig bot commented Oct 24, 2025

@craig craig bot merged commit 54a9a86 into cockroachdb:master Oct 24, 2025
24 checks passed
Development

Successfully merging this pull request may close these issues.

sql/inspect: speed up index consistency check with a hash

3 participants