Prefetch all collection items globally before tests #5043

Open
kenyonj wants to merge 2 commits into main from global-graphql-batch-prefetch
Conversation

Contributor

kenyonj commented Feb 13, 2026

Summary

  • Collect all repo/user references across all collections upfront and deduplicate
  • Batch GraphQL queries in chunks of 100 to pre-warm the NewOctokit cache globally
  • Reduces API round-trips from ~300 (3 per collection) to ~10-15 total

Changes

  • test/collections_test_helper.rb: Add a prefetch_all_collection_items! method that collects items from every collection and fetches them in batches
  • test/collections_test.rb: Replace per-collection cache_* calls with single prefetch_all_collection_items! call (idempotent, runs once)

How it works

The prefetch_all_collection_items! method:

  1. Iterates all collections and collects every repo and user reference
  2. Deduplicates the lists
  3. Batches repos into chunks of 100 and runs cache_repos_exist_check! for each batch
  4. Batches users into chunks of 100 and runs cache_users_exist_check! for each batch
  5. Finds the logins that did not resolve as users, batches them in the same way, and runs cache_orgs_exist_check! for each batch
  6. Sets @_prefetched = true to prevent re-running

Since NewOctokit uses class-level caches (@@repos, @@users), subsequent per-collection checks hit the cache with zero API calls.
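
Steps 1 to 4 above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: plan_prefetch, the hash shape of each collection, and the sample names are assumptions made for the example, and it only computes the batches rather than calling the real cache_* helpers.

```ruby
GRAPHQL_BATCH_SIZE = 100

# Hypothetical helper: gathers every repo and user reference, removes
# duplicates, and splits each list into batches of at most batch_size.
def plan_prefetch(collections, batch_size: GRAPHQL_BATCH_SIZE)
  repos = collections.flat_map { |c| c[:repos] }.uniq
  users = collections.flat_map { |c| c[:users] }.uniq
  {
    repo_batches: repos.each_slice(batch_size).to_a,
    user_batches: users.each_slice(batch_size).to_a
  }
end

# Placeholder data; the second collection repeats items from the first.
collections = [
  { repos: ["octo/alpha", "octo/beta"], users: ["alice"] },
  { repos: ["octo/beta", "octo/gamma"], users: ["alice", "bob"] }
]

# A batch size of 2 keeps the example small enough to read.
plan = plan_prefetch(collections, batch_size: 2)
plan[:repo_batches].each { |batch| puts "repo batch: #{batch.inspect}" }
plan[:user_batches].each { |batch| puts "user batch: #{batch.inspect}" }
```

Each batch would correspond to one GraphQL query, so duplicates shared between collections cost nothing extra.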

Instead of making 3 GraphQL API calls per collection (~300 total for
100 collections), collect all repo and user references across all
collections upfront, deduplicate them, and batch the GraphQL queries.

This reduces API round-trips from ~300 to ~10-15 (depending on total
unique items), significantly improving test suite performance while
maintaining the same validation coverage.
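
The arithmetic behind those estimates can be checked with a short calculation. The unique-item counts below are made-up inputs chosen for illustration; the PR only states the approximate totals.

```ruby
# Before: three queries (repos, users, orgs) for every collection.
def per_collection_round_trips(collection_count)
  3 * collection_count
end

# After: one query per batch of unique items, rounding each group up.
def batched_round_trips(unique_repos:, unique_users:, unresolved_users:, batch_size: 100)
  [unique_repos, unique_users, unresolved_users].sum do |count|
    (count + batch_size - 1) / batch_size
  end
end

before = per_collection_round_trips(100)
# Assumed counts: 800 unique repos, 300 unique users, 150 unresolved logins.
after = batched_round_trips(unique_repos: 800, unique_users: 300, unresolved_users: 150)
puts "before: #{before}, after: #{after}"
```

With these assumed counts the batched total is 8 + 3 + 2 = 13 queries against 300, which falls inside the range quoted above.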
kenyonj requested a review from a team as a code owner February 13, 2026 20:42
Copilot AI review requested due to automatic review settings February 13, 2026 20:42
Contributor

Copilot AI left a comment

Pull request overview

This PR aims to reduce GitHub API round-trips in the collections test suite by prefetching and deduplicating all repo/user references across collections, then warming NewOctokit’s caches via batched GraphQL queries before per-collection validations run.

Changes:

  • Added prefetch_all_collection_items! to aggregate/deduplicate all collection items and prefetch them in GraphQL batches.
  • Updated the “renamed or removed” collections test to call prefetch_all_collection_items! and removed per-collection cache_* prewarming calls.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
test/collections_test_helper.rb | Introduces global prefetch helper and batching constant for GraphQL cache warming
test/collections_test.rb | Switches from per-collection cache warming to a single prefetch call before validations

Comment on lines 112 to 114
def prefetch_all_collection_items!
  return if @_prefetched

Copilot AI Feb 13, 2026

@_prefetched is an instance variable on the current Minitest::Spec instance; since Minitest creates a new instance per it, this memoization won’t persist across tests/collections, so the global prefetch will rerun many times (and still issue GraphQL requests each time). Use a shared flag (e.g., a class variable on NewOctokit, a module-level instance variable, or a constant guarded by defined?) so the prefetch truly runs once per test process.
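
The effect described here can be reproduced without Minitest. In this sketch FakeSpec is a hypothetical stand-in for a spec class, and each FakeSpec.new plays the role of the fresh instance Minitest creates per test; the run counter exists only to make the repeated work visible.

```ruby
class FakeSpec
  # Class-level counter of how many times the prefetch body actually ran.
  @prefetch_runs = 0

  class << self
    attr_accessor :prefetch_runs
  end

  def prefetch_with_instance_flag!
    # @_prefetched is nil on every new instance, so this guard never fires
    # across separate tests.
    return if @_prefetched

    self.class.prefetch_runs += 1
    @_prefetched = true
  end
end

# Three "tests", each with its own instance.
3.times { FakeSpec.new.prefetch_with_instance_flag! }
puts "prefetch ran #{FakeSpec.prefetch_runs} times"
```

The guard only helps when the same instance calls the method twice, which is not how the per-test calls arrive.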

Contributor Author

Fixed in 44f56ef. Replaced @_prefetched with a @@global_prefetch_done class variable on NewOctokit, accessed via the NewOctokit.global_prefetch_done? and NewOctokit.global_prefetch_done! class methods. The flag now persists across Minitest instances, since a class variable belongs to the class rather than to any one instance.
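
A minimal sketch of a flag shaped like the one described, for readers who have not seen the commit. PrefetchFlagSketch is a hypothetical class used only for illustration; the method names mirror those mentioned above, but the actual NewOctokit code may differ.

```ruby
class PrefetchFlagSketch
  # One value shared by the class and all of its instances.
  @@global_prefetch_done = false

  def self.global_prefetch_done?
    @@global_prefetch_done
  end

  def self.global_prefetch_done!
    @@global_prefetch_done = true
  end
end

puts PrefetchFlagSketch.global_prefetch_done?
PrefetchFlagSketch.global_prefetch_done!
puts PrefetchFlagSketch.global_prefetch_done?
```

Because the state is read through class methods, any test instance sees the same answer for the lifetime of the process.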

Comment on lines 131 to 147
# Batch repos in chunks to stay within GraphQL query limits
all_repos.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
  cache_repos_exist_check!(batch)
end

# Batch users in chunks
all_users.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
  cache_users_exist_check!(batch)
end

# Check orgs for users not found
not_found_users = users_not_found_from(all_users)
not_found_users.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
  cache_orgs_exist_check!(batch)
end

@_prefetched = true
Copilot AI Feb 13, 2026

The prefetch flag is set unconditionally at the end. If any of the cache_*_exist_check! calls return early (e.g., graphql_query returns nil) or raise partway through, subsequent tests will skip prefetching and fall back to many uncached REST calls. Consider only marking prefetched after all batches succeed, or resetting/avoiding the flag on failure.

Suggested change
- # Batch repos in chunks to stay within GraphQL query limits
- all_repos.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
-   cache_repos_exist_check!(batch)
- end
- # Batch users in chunks
- all_users.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
-   cache_users_exist_check!(batch)
- end
- # Check orgs for users not found
- not_found_users = users_not_found_from(all_users)
- not_found_users.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
-   cache_orgs_exist_check!(batch)
- end
- @_prefetched = true
+ all_prefetch_succeeded = true
+ # Batch repos in chunks to stay within GraphQL query limits
+ all_repos.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
+   all_prefetch_succeeded &&= cache_repos_exist_check!(batch)
+ end
+ # Batch users in chunks
+ all_users.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
+   all_prefetch_succeeded &&= cache_users_exist_check!(batch)
+ end
+ # Check orgs for users not found
+ not_found_users = users_not_found_from(all_users)
+ not_found_users.each_slice(GRAPHQL_BATCH_SIZE) do |batch|
+   all_prefetch_succeeded &&= cache_orgs_exist_check!(batch)
+ end
+ @_prefetched = true if all_prefetch_succeeded
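
One property of the suggested change is worth spelling out: in Ruby, x &&= expr does not evaluate expr once x is falsy, so after the first failing batch the remaining batches would not be attempted at all. The suggestion also assumes the cache_* helpers return a truthy value on success, which this thread does not confirm. The sketch below uses a hypothetical checker that fails on batch :b to show the difference between the two accumulation styles.

```ruby
# Hypothetical checker: records every batch it is asked about and reports
# failure only for :b.
def make_checker(attempted)
  ->(batch) { attempted << batch; batch != :b }
end

# Accumulates with &&=, as in the suggested change.
def short_circuiting(batches)
  attempted = []
  check = make_checker(attempted)
  ok = true
  batches.each { |batch| ok &&= check.call(batch) }
  [ok, attempted]
end

# Calls the checker first, so every batch is attempted regardless.
def always_running(batches)
  attempted = []
  check = make_checker(attempted)
  ok = true
  batches.each { |batch| ok = check.call(batch) && ok }
  [ok, attempted]
end

puts short_circuiting(%i[a b c]).inspect
puts always_running(%i[a b c]).inspect
```

Both report failure; they differ only in whether the batches after the failure still warm the cache. Which behavior is preferable depends on whether a partially warm cache is useful to the fallback path.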

Contributor Author

Fixed in 44f56ef. Wrapped all cache_* calls in a begin/rescue block. NewOctokit.global_prefetch_done! is now only called after ALL batch operations complete successfully. If any call raises, the rescue logs a warning and the flag stays false, so individual tests fall back to their own per-test caching.
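
The control flow described in this reply might look roughly like the sketch below. It is an illustration under assumptions, not the contents of 44f56ef: PrefetchState, prefetch_once, the symbols it returns, and the warning text are all hypothetical, and the block passed in stands in for the real cache_* calls.

```ruby
require "stringio"

# Hypothetical process-wide flag holder.
class PrefetchState
  @done = false

  class << self
    def done?
      @done
    end

    def done!
      @done = true
    end

    def reset!
      @done = false
    end
  end
end

# Runs the block for every batch and marks the state done only if none of
# the calls raised. On failure it logs a warning and leaves the flag false.
def prefetch_once(batches, state: PrefetchState, warn_io: $stderr)
  return :skipped if state.done?

  begin
    batches.each { |batch| yield batch }
    state.done!
    :prefetched
  rescue StandardError => e
    warn_io.puts "prefetch failed, falling back to per-test caching: #{e.message}"
    :failed
  end
end

log = StringIO.new
result = prefetch_once([[1, 2], [3]], warn_io: log) { |batch| raise "boom" if batch == [3] }
puts "#{result}, done? #{PrefetchState.done?}"
```

Because the flag is set as the last step inside the begin block, a failure in any batch leaves it untouched and later callers will try again.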
