-
Notifications
You must be signed in to change notification settings - Fork 291
fix: invalidate cluster cache after sync operations #745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix: invalidate cluster cache after sync operations #745
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #745 +/- ##
==========================================
- Coverage 54.26% 47.27% -7.00%
==========================================
Files 64 64
Lines 6164 6589 +425
==========================================
- Hits 3345 3115 -230
- Misses 2549 3218 +669
+ Partials 270 256 -14 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my comments.
Signed-off-by: Peter Jiang <[email protected]>
Signed-off-by: Peter Jiang <[email protected]>
Signed-off-by: Peter Jiang <[email protected]>
Signed-off-by: Peter Jiang <[email protected]>
Signed-off-by: Peter Jiang <[email protected]>
Signed-off-by: Peter Jiang <[email protected]>
f303abe
to
6342d8f
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my comments
@@ -74,6 +74,13 @@ func (e *gitOpsEngine) Sync(ctx context.Context, | |||
namespace string, | |||
opts ...sync.SyncOpt, | |||
) ([]common.ResourceSyncResult, error) { | |||
// Ensure cache is synced before getting managed live objects | |||
// This forces a refresh if the cache was invalidated | |||
err := e.cache.EnsureSynced() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that this function was designed to be called just when the gitops engine is being initialized. This will call the clusterCacheSync.synced
function that will just validate on the sync time. The ClusterCache should always be synced because it updates based on resource watches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after looking at argo-cd code more it seems like GetManagedLiveObjs()
actually calls c.getSyncedCluster(destCluster)
which eventually calls clusterCache.EnsureSynced()
anyway.
So this part is definitely not needed
opts = append(opts, sync.WithCacheInvalidationCallback(func(modifiedResources []kube.ResourceKey) { | ||
// Only invalidate the specific resources that were modified | ||
if len(modifiedResources) > 0 { | ||
e.cache.InvalidateResources(modifiedResources) | ||
} | ||
})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we need to invalidate the cache on every sync if the cluster cache is based on resource watches? I feel that the root problem is elsewhere. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah seems like this solution is reactively removing cached resources and is more of a workaround.
The problem is more likely in watchEvents
not properly updating resources like CRDs schema changes.
// Invalidate cache after successful sync | ||
if sc.cacheInvalidationCallback != nil { | ||
modifiedResources := make([]kubeutil.ResourceKey, 0, len(sc.syncRes)) | ||
for _, result := range sc.syncRes { | ||
modifiedResources = append(modifiedResources, result.ResourceKey) | ||
} | ||
sc.cacheInvalidationCallback(modifiedResources) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment applies:
Why would we need to invalidate the cache on every sync if the cluster cache is based on resource watches? I feel that the root problem is elsewhere. WDYT?
fixes argoproj/argo-cd#20458
There have been several times after a sync that the diff shows the live resource before the sync, causing incorrect diffs. This change will ensure if there is a sync with a valid task (non no-op) then we invalidate the cache. We only invalidate cache for resources that were modified to save on operational cost
This is guaranteeing fresh data for subsequent diff calculations
This could also help/fix an issue with CRDs as previously after a CRD schema changes and is synced, we run into a diff error because diff is using stale schema