Skip to content

Conversation

@pierDipi
Copy link
Member

@pierDipi pierDipi commented Oct 17, 2025

Notices a few flaky failures on other PRs, this should improve reliability

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved handling of concurrent update conflicts to prevent failed operations during resource modifications.
    • Enhanced status update resilience with patch-based approach for safer concurrent operations.
  • Tests

    • Added synchronization to prevent race conditions in test execution, ensuring reliable test results.

Signed-off-by: Pierangelo Di Pilato <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Oct 17, 2025

Walkthrough

Test files across controller and webhook layers are enhanced with resilience mechanisms for concurrent resource updates. Changes include retry-on-conflict handling for secret updates, patch-based status updates for InferenceService, and synchronization waits for ConfigMap retrieval to mitigate race conditions.

Changes

Cohort / File(s) Summary
Retry-on-conflict for updates
internal/controller/core/secret_controller_test.go
Wraps updateSecretData and updateSecretLabel calls in retry.RetryOnConflict with retry.DefaultRetry; adds k8s.io/client-go/util/retry import; Get and Update operations execute within retry loop.
Patch-based status update
internal/controller/serving/inferenceservice_controller_test.go
Replaces direct Get and status Update with patch-based approach inside Eventually block; retrieves latest InferenceService, DeepCopy, set Status.URL, then Patch with MergeFrom for concurrent-safe updates.
Synchronization for resource availability
internal/webhook/serving/v1alpha1/llminferenceservice_webhook_test.go
Adds controller-runtime client import; introduces wait after ConfigMap creation using k8sClient.Get with client.ObjectKeyFromObject helper to ensure ConfigMap is available before subsequent operations.

Sequence Diagram(s)

sequenceDiagram
    participant Test
    participant k8sClient
    participant Eventually
    participant InferenceService

    Test->>Eventually: Eventually(patch status)
    Eventually->>k8sClient: Get InferenceService
    k8sClient-->>Eventually: inferred service object
    
    rect rgb(220, 240, 255)
    note right of Eventually: Deep Copy & Modify
    Eventually->>InferenceService: DeepCopy()
    InferenceService-->>Eventually: copied instance
    Eventually->>Eventually: set Status.URL
    end
    
    rect rgb(220, 255, 220)
    note right of Eventually: Patch-based Update
    Eventually->>k8sClient: Patch(MergeFrom)
    k8sClient-->>Eventually: success or retry
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Twitches whiskers with delight
Retries patch where conflicts fight,
ConfigMaps synced, no race to fear,
Resilient tests, the path is clear! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "Fix flaky tests" directly aligns with the changeset's primary purpose. All three modified files are test files, and each change implements mechanisms to address race conditions and timing-related issues: adding retry-on-conflict handling for secret updates, implementing patch-based status updates within Eventually blocks for InferenceService, and introducing ConfigMap synchronization waits. The title is concise, specific, and clearly communicates the main objective without requiring implementation details, which is appropriate for a PR title.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c7b42c3 and b30a316.

📒 Files selected for processing (3)
  • internal/controller/core/secret_controller_test.go (2 hunks)
  • internal/controller/serving/inferenceservice_controller_test.go (1 hunks)
  • internal/webhook/serving/v1alpha1/llminferenceservice_webhook_test.go (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
internal/controller/core/secret_controller_test.go (1)
internal/controller/testing/config.go (1)
  • Client (56-60)
🔇 Additional comments (9)
internal/webhook/serving/v1alpha1/llminferenceservice_webhook_test.go (4)

26-26: LGTM: Import added for ObjectKeyFromObject helper.

The controller-runtime client import is necessary for the ObjectKeyFromObject helper used in the synchronization waits.


89-94: Excellent addition of synchronization to prevent race conditions.

The Eventually block ensures the ConfigMap is fully propagated and readable before the test proceeds. This prevents flaky failures when the webhook validation tries to read the ConfigMap before it's available.


189-194: Consistent synchronization pattern applied.

The same Eventually wait is correctly applied here for the second ConfigMap creation.


237-242: Consistent synchronization pattern applied.

The same Eventually wait is correctly applied here for the third ConfigMap creation.

internal/controller/core/secret_controller_test.go (3)

30-30: LGTM: Import added for retry utility.

The k8s.io/client-go/util/retry package is the standard library for handling optimistic locking conflicts.


200-224: Excellent use of RetryOnConflict to handle concurrent updates.

Wrapping the Get-Update sequence in retry.RetryOnConflict correctly handles conflicts when the controller updates the secret concurrently. The retry loop fetches the latest version on each attempt, ensuring the update is applied to the current state.


226-251: Excellent use of RetryOnConflict to handle concurrent updates.

The same robust pattern is applied here for label updates. Fetching the latest secret inside the retry loop ensures each attempt operates on the current resource version.

internal/controller/serving/inferenceservice_controller_test.go (2)

189-204: Excellent refactor to use Eventually with Patch for status updates.

This change replaces a direct Status().Update() call with an Eventually block that:

  1. Fetches the latest InferenceService on each attempt
  2. Creates a patched copy with the updated status
  3. Uses Status().Patch() instead of Update() to reduce conflicts

This pattern is more resilient to concurrent controller reconciliations and directly addresses the flaky test issue.


405-419: Consistent application of Eventually + Patch pattern.

The same robust pattern is applied here in the BeforeEach setup. Fetching the latest version on each retry ensures the status update succeeds even when the controller modifies the resource concurrently.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@Jooho Jooho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 17, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jooho, pierDipi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants