Skip to content

chore: update crd status for incremental scrape#1986

Open
yashmehrotra wants to merge 1 commit intomainfrom
inc-js
Open

chore: update crd status for incremental scrape#1986
yashmehrotra wants to merge 1 commit intomainfrom
inc-js

Conversation

@yashmehrotra
Copy link
Member

@yashmehrotra yashmehrotra commented Mar 13, 2026

Summary by CodeRabbit

  • New Features
    • CRD status now records scraping operation results with success/error status and timestamps, providing better visibility into the last execution details.

@yashmehrotra yashmehrotra requested a review from moshloop March 13, 2026 09:57
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 13, 2026

Walkthrough

Enhanced Kubernetes watch consumption flow in the incremental scraper by capturing processing timestamps and adding CRD status update functionality. When processing completes and agent ID is unset, the scraper now patches the CRD status with LastRunStatus containing success state, error details, and execution timestamp.

Changes

Cohort / File(s) Summary
CRD Status Update Enhancement
scrapers/incremental.go
Added timestamp capture and CRD status patching logic to track last run status (success/failure, error details, execution timestamp). Integrated status updates in both primary resource processing and involved-objects paths.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: updating CRD status for incremental scrape operations.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch inc-js
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scrapers/incremental.go`:
- Around line 201-203: The check for unset agent_id is too strict: in the block
where you read source := sc.ScrapeConfig().GetAnnotations()["source"] and
agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"], replace the single
equality check against uuid.Nil.String() with a check that treats missing/empty
values as unset (e.g., if source == models.SourceCRD && (agentID == "" ||
agentID == uuid.Nil.String())). Update the conditional in the function/method
that contains these variables so CRD status updates run when agent_id is absent
or empty as well as when it equals the nil UUID.
- Around line 201-214: The CRD last-run status (v1.LastRunStatus) is only
patched after consumeResources succeeds, so failures before that never write
LastRunStatus; modify the flow around sc.ScrapeConfig()/jobCtx.History to always
attempt updateCRDStatus(jobCtx.Context, sc.ScrapeConfig(), lastRun) regardless
of consumeResources outcome—e.g., compute lastRun using jobCtx.History and start
time and invoke updateCRDStatus in a defer or in both the success and error
return paths (referencing updateCRDStatus, sc.ScrapeConfig, v1.LastRunStatus,
jobCtx.History); ensure if updateCRDStatus itself fails you surface or combine
that error with the original error (using wrapping or errors.Join) rather than
silently dropping it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f070578e-0636-4b50-b7a7-b7d2553151ef

📥 Commits

Reviewing files that changed from the base of the PR and between 2a13c2c and ccd9a27.

📒 Files selected for processing (1)
  • scrapers/incremental.go

Comment on lines +201 to +203
source := sc.ScrapeConfig().GetAnnotations()["source"]
agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"]
if source == models.SourceCRD && agentID == uuid.Nil.String() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unset agent_id check is too strict.

Line 203 only matches the nil UUID literal. If agent_id is absent/empty, this block won’t run and CRD status won’t be updated.

Proposed fix
- if source == models.SourceCRD && agentID == uuid.Nil.String() {
+ if source == models.SourceCRD && (agentID == "" || agentID == uuid.Nil.String()) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
source := sc.ScrapeConfig().GetAnnotations()["source"]
agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"]
if source == models.SourceCRD && agentID == uuid.Nil.String() {
source := sc.ScrapeConfig().GetAnnotations()["source"]
agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"]
if source == models.SourceCRD && (agentID == "" || agentID == uuid.Nil.String()) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/incremental.go` around lines 201 - 203, The check for unset agent_id
is too strict: in the block where you read source :=
sc.ScrapeConfig().GetAnnotations()["source"] and agentID :=
sc.ScrapeConfig().GetAnnotations()["agent_id"], replace the single equality
check against uuid.Nil.String() with a check that treats missing/empty values as
unset (e.g., if source == models.SourceCRD && (agentID == "" || agentID ==
uuid.Nil.String())). Update the conditional in the function/method that contains
these variables so CRD status updates run when agent_id is absent or empty as
well as when it equals the nil UUID.

Comment on lines +201 to +214
source := sc.ScrapeConfig().GetAnnotations()["source"]
agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"]
if source == models.SourceCRD && agentID == uuid.Nil.String() {
lastRun := v1.LastRunStatus{
Success: jobCtx.History.SuccessCount,
Error: jobCtx.History.ErrorCount,
Timestamp: metav1.Time{Time: start},
Errors: jobCtx.History.Errors,
}

if err := updateCRDStatus(jobCtx.Context, sc.ScrapeConfig(), lastRun); err != nil {
return fmt.Errorf("error patching crd status: %w", err)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Failure runs can exit before CRD status is patched.

This block runs only after successful consumeResources; if processing fails earlier (Line 184), LastRunStatus is never written for that run.

Proposed fix
+ source := sc.ScrapeConfig().GetAnnotations()["source"]
+ agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"]
+ shouldPatchCRDStatus := source == models.SourceCRD && (agentID == "" || agentID == uuid.Nil.String())
+
  if err := consumeResources(jobCtx, *sc.ScrapeConfig(), *config, objs, deletedObjects); err != nil {
  	jobCtx.History.AddErrorf("failed to consume resources: %v", err)
+ 	if shouldPatchCRDStatus {
+ 		lastRun := v1.LastRunStatus{
+ 			Success:   jobCtx.History.SuccessCount,
+ 			Error:     jobCtx.History.ErrorCount,
+ 			Timestamp: metav1.Time{Time: start},
+ 			Errors:    jobCtx.History.Errors,
+ 		}
+ 		if patchErr := updateCRDStatus(jobCtx.Context, sc.ScrapeConfig(), lastRun); patchErr != nil {
+ 			jobCtx.Logger.Errorf("error patching crd status after consume failure: %v", patchErr)
+ 		}
+ 	}
  	return err
  }
...
- source := sc.ScrapeConfig().GetAnnotations()["source"]
- agentID := sc.ScrapeConfig().GetAnnotations()["agent_id"]
- if source == models.SourceCRD && agentID == uuid.Nil.String() {
+ if shouldPatchCRDStatus {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/incremental.go` around lines 201 - 214, The CRD last-run status
(v1.LastRunStatus) is only patched after consumeResources succeeds, so failures
before that never write LastRunStatus; modify the flow around
sc.ScrapeConfig()/jobCtx.History to always attempt
updateCRDStatus(jobCtx.Context, sc.ScrapeConfig(), lastRun) regardless of
consumeResources outcome—e.g., compute lastRun using jobCtx.History and start
time and invoke updateCRDStatus in a defer or in both the success and error
return paths (referencing updateCRDStatus, sc.ScrapeConfig, v1.LastRunStatus,
jobCtx.History); ensure if updateCRDStatus itself fails you surface or combine
that error with the original error (using wrapping or errors.Join) rather than
silently dropping it.

@moshloop
Copy link
Member

@yashmehrotra I think we will need to rate-limit this reduce updates to etcd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants