Conversation
Add a JSON-based cache for GitHub API results (repo existence/rename checks and user/org lookups) that persists across CI runs using GitHub Actions cache. Each cached entry has a 24-hour TTL. On subsequent runs, items verified within the last 24 hours are loaded from cache, avoiding redundant API calls. This is safe because the collections-renames workflow runs hourly to catch any renames. The cache is scoped per matrix test type (collections vs all) and uses the run ID as the cache key with restore-keys fallback to pick up the most recent cache.
There was a problem hiding this comment.
Pull request overview
This PR adds a JSON-based cache for GitHub API verification results to reduce redundant API calls across CI runs. The cache persists for 24 hours using GitHub Actions' actions/cache, helping to speed up CI while still catching stale references within a reasonable timeframe.
Changes:
- Implements cache loading and saving helpers in test_helper.rb with 24-hour TTL
- Integrates cache restoration/persistence into the CI workflow for collections and all test types
- Adds .api-cache.json to .gitignore to prevent accidental commits
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| test/test_helper.rb | Adds load_api_cache! and save_api_cache! methods to persist @@repos and @@users cache entries to JSON; loads cache at startup and saves after test run |
| .github/workflows/test.yml | Adds cache restoration step using actions/cache@v4 for collections and all test types with run-specific keys and type-based restore keys |
| .gitignore | Excludes .api-cache.json file from version control |
Comments suppressed due to low confidence (3)
test/test_helper.rb:334
- When a cached user entry is rate-limited, it's stored as
truein@@users(line 102 in NewOctokit#user). When saving the cache, ifvalue = true, it will be converted to{ "login" => "true" }(line 334), which may cause issues when loading sincetrue.to_sis"true". This creates a struct withlogin = "true", which is incorrect. Consider handling thetruesentinel value explicitly to avoid caching rate-limited entries or use a different sentinel value that can be properly serialized and deserialized.
users_data[key.to_s] = {
"cached_at" => now,
"value" => value.nil? ? nil : { "login" => value.respond_to?(:login) ? value.login : value.to_s },
test/test_helper.rb:306
- If a cache entry's
valueis not nil but also doesn't have aloginkey (e.g., corrupted data), the code on line 306 will call.new(nil)sinceresult["login"]will be nil. This creates a struct withlogin = nil, which might not match the expected behavior. Consider validating that required keys exist in the cached value before reconstructing the struct.
cached = if result.nil?
nil
else
Struct.new(:login).new(result["login"])
test/test_helper.rb:276
- The method should be defined as
def load_api_cache!instead ofdef self.load_api_cache!. All other helper methods in this file (likeclient,graphql_query,cache_users_exist_check!, etc.) are defined as instance methods withoutself., and this is being called asload_api_cache!on line 344 which expects an instance method. Usingself.makes this a method on the main object's singleton class, which is inconsistent with the rest of the codebase.
def self.load_api_cache!
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
test/test_helper.rb
Outdated
| warn "Failed to load API cache: #{e.message}" | ||
| end | ||
|
|
||
| def self.save_api_cache! |
There was a problem hiding this comment.
The method should be defined as def save_api_cache! instead of def self.save_api_cache!. All other helper methods in this file are defined as instance methods without self., and this is being called as save_api_cache! on line 357 which expects an instance method. Using self. makes this a method on the main object's singleton class, which is inconsistent with the rest of the codebase.
This issue also appears on line 276 of the same file.
There was a problem hiding this comment.
Fixed — removed self. from both load_api_cache! and save_api_cache! so they're instance methods consistent with the rest of the helpers.
| repos_data[key.to_s] = { | ||
| "cached_at" => now, | ||
| "value" => value.nil? ? nil : { "full_name" => value.respond_to?(:full_name) ? value.full_name : value.to_s }, |
There was a problem hiding this comment.
When a cached entry has value.nil?, it's stored as nil in the JSON (line 325), and when loaded back (line 289-290), it's correctly reconstructed as nil. However, if a repository truly doesn't exist (NotFound), the code stores nil in the cache (line 92 in NewOctokit#repository), but this is indistinguishable from rate-limited entries where repos[item] = true is stored (line 90). When saving the cache, a rate-limited entry with value = true will be converted to { "full_name" => "true" } (line 325), which may cause issues when loading since true doesn't respond to .full_name correctly. Consider handling the true sentinel value explicitly.
This issue also appears on line 332 of the same file.
There was a problem hiding this comment.
Fixed — added next if value == true before serializing both repos and users entries, so rate-limit sentinel values are skipped and won't produce corrupted cache data like { "full_name" => "true" }.
test/test_helper.rb
Outdated
| client.messages << "::#{type} file=#{file},line=#{line_number}::#{message}" | ||
| end | ||
|
|
||
| CACHE_FILE = File.expand_path("../.api-cache.json", File.dirname(__FILE__)) |
There was a problem hiding this comment.
The CACHE_FILE path uses File.expand_path("../.api-cache.json", File.dirname(__FILE__)). Since __FILE__ is test/test_helper.rb, File.dirname(__FILE__) is test/, and ../.api-cache.json relative to test/ is .api-cache.json in the repository root. This is correct and matches the .gitignore entry and workflow path. However, using File.expand_path("../.api-cache.json", __dir__) would be more idiomatic Ruby.
| CACHE_FILE = File.expand_path("../.api-cache.json", File.dirname(__FILE__)) | |
| CACHE_FILE = File.expand_path("../.api-cache.json", __dir__) |
There was a problem hiding this comment.
Fixed — changed to File.expand_path("../.api-cache.json", __dir__). More idiomatic Ruby.
test/test_helper.rb
Outdated
|
|
||
| if data["repos"] | ||
| data["repos"].each do |key, entry| | ||
| next if now - entry["cached_at"].to_i > ttl |
There was a problem hiding this comment.
The TTL check compares now - entry["cached_at"].to_i > ttl, which skips entries that are older than the TTL. However, if the cache file is corrupted and entry["cached_at"] is missing or not a valid number, .to_i will return 0, making now - 0 a very large number, which will correctly skip the entry. This is safe, but consider adding an explicit check for missing or invalid cached_at values for clarity.
There was a problem hiding this comment.
Fixed — added explicit next unless cached_at guard before the TTL comparison in both repos and users loops. Makes the intent clear rather than relying on .to_i returning 0 for nil.
| # Reconstruct a minimal object that responds to .full_name | ||
| cached = if result.nil? | ||
| nil | ||
| else | ||
| Struct.new(:full_name).new(result["full_name"]) |
There was a problem hiding this comment.
If a cache entry's value is not nil but also doesn't have a full_name key (e.g., corrupted data), the code on line 292 will call .new(nil) since result["full_name"] will be nil. This creates a struct with full_name = nil, which might not match the expected behavior. Consider validating that required keys exist in the cached value before reconstructing the struct.
This issue also appears on line 303 of the same file.
There was a problem hiding this comment.
Fixed — added next unless result["full_name"] and next unless result["login"] validation before reconstructing the Struct, so corrupted/incomplete cache entries are skipped instead of creating Struct.new(...).new(nil).
Summary
Adds a lightweight JSON cache for GitHub API verification results (repo/user existence checks) that persists across CI runs using
actions/cache. Cache entries expire after 24 hours.Changes
test/test_helper.rb—load_api_cache!/save_api_cache!helpers that read and write.api-cache.jsonwith a 24-hour TTL..github/workflows/test.yml—actions/cache@v4step restores/saves the cache file, scoped tocollectionsandalltest types..gitignore— ignores the local cache file.Why
The test suite verifies that every referenced GitHub repo and user actually exists, which requires many API calls. Most of these results don't change between runs, so caching them for 24 hours avoids redundant requests and speeds up CI while still catching stale references within a day.