-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scheduler Framework Logic: Preview Only, Do Not Merge #1
base: main
Are you sure you want to change the base?
Conversation
// EqualCondition compares one condition with another; it ignores the LastTransitionTime and Message fields, | ||
// and will consider the ObservedGeneration values from the two conditions a match if the current | ||
// condition is newer. | ||
func EqualCondition(current, desired *metav1.Condition) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there no such util in the existing k8s code base? feel like something useful. I searched in k8s repo but I can't seem to find one though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it happened back then when fleet networking was under development as well. There isn't one available so eventually we build one of our own.
@@ -32,6 +37,16 @@ type CycleStatePluginReadWriter interface { | |||
type CycleState struct { | |||
// store is a concurrency-safe store (a map). | |||
store sync.Map | |||
|
|||
// skippedFilterPlugins is a set of Filter plugins that should be skipped in the current scheduling cycle. | |||
skippedFilterPlugins sets.String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sets.String is deprecated, why not use the generic set (https://github.com/kubernetes/kubernetes/blob/96d853f4b86b1b44bfb9340e9d05f5fc6b3f4bb9/staging/src/k8s.io/apimachinery/pkg/util/sets/set.go)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Ryan! The reason why generic set is not used is that it is not supported yet with our current dependency version. It was added since 1.26; ours use 1.25. Is it OK if I bump version?
) | ||
|
||
// A no-op, dummy plugin which connects to all extension points. | ||
type DummyAllPurposePlugin struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it will be good to add the var _ Interface = DummyAllPurposePlugin{} to make it clear what interfaces this implements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we create an interface for Profile? I know there are already interface for its individual functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do for static check.
For the interface part, at this moment I couldn't think of any case where multiple implementations of profiles are needed, and there seems to be no need to limit access to this component. Though I don't really have any strong opinion against adding one though.
func ExtractNumOfClustersFromPolicySnapshot(policy *fleetv1beta1.ClusterPolicySnapshot) (int, error) { | ||
numOfClustersStr, ok := policy.Annotations[fleetv1beta1.NumberOfClustersAnnotation] | ||
if !ok { | ||
return 0, fmt.Errorf("cannot find annotation %s", fleetv1beta1.NumberOfClustersAnnotation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will CRP name help the debug or this error will be wrapped with more info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The caller will wrap this with more info I believe.
if diff := maxClusterDecisionCount - len(refreshed); diff > 0 { | ||
current := policy.Status.ClusterDecisions | ||
for _, decision := range current { | ||
if !decision.Selected { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how are we sure that the cluster here is not part of the already existing bindings? We update the status because we think the status might be out of sync with the binding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this one is a missight on my end. It is true that we cannot trust any old decisions on policy snapshot. It has been fixed.
pkg/scheduler/framework/framework.go
Outdated
// Calculate the batch size. | ||
// | ||
// Note that obsolete bindings are not counted. | ||
state.desiredBatchSize = int(*policy.Spec.Policy.NumberOfClusters) - len(active) - len(creating) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NumberOfClusters is optional, what if it is nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have split the flow for PickAll and PickN for better clarity.
For this specific point, it is a leftover missight from earlier code where it is not yet certain whether numOfClusters will be annotated; it should use the extracted numOfClusters instead. Many apologies.
pkg/scheduler/framework/framework.go
Outdated
// | ||
// Note that a race condition may arise here, when a rollout controller attempts to update bindings | ||
// at the same time with the scheduler. An error induced requeue will happen in this case. | ||
if err := f.updateBindings(ctx, toUpdate); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there real race between scheduler/rollout controller in this case as they manage different part of the binding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Ryan! A race could happen if the rollout controller are transitioning a binding from creating stated to active while the scheduler is trying to update the binding at the same time.
pkg/scheduler/framework/framework.go
Outdated
// | ||
// This is set to happen after new bindings are created and old bindings are updated, to | ||
// avoid interruptions (deselected then reselected) in a best effort manner. | ||
if err := f.markAsDeletingFor(ctx, toDelete); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here as scheduler has the truth state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The race could that the scheduler would like to mark a creating binding as deleting while the rollout controller is moving the binding to the active state.
// Trimming obsolete and creating bindings alone is not enough, move on to active bindings. | ||
|
||
// Sort the bindings by their CreationTimestamps. | ||
sorted := sortByCreationTimestampBindings(active) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that the create time of the binding is the best indicator. What about last update time? Need to think through what's the real life implication of picking one bind vs another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Ryan! The objectmeta does not have a lastUpdateTimestamp, there are only Creation and Deletion timestamps I think. We could use lastTransitionTime, but it is in the condition (which is manipulated by the Dispatcher). The alternative is to add a label of our own to mark lastUpdateTimestamp.
* drift detection and takeover implementation #1 * Minor fixes * Included all changes + added/migrated tests * Minor fixes * Minor fixes
No description provided.