Perform a scheduling simulation in dry-run mode. #1585

Open · wants to merge 3 commits into base: master
18 changes: 18 additions & 0 deletions pkg/descheduler/evictions/evictions.go
@@ -38,6 +38,7 @@ import (

"sigs.k8s.io/descheduler/metrics"
eutils "sigs.k8s.io/descheduler/pkg/descheduler/evictions/utils"
"sigs.k8s.io/descheduler/pkg/descheduler/node"
"sigs.k8s.io/descheduler/pkg/features"
"sigs.k8s.io/descheduler/pkg/tracing"
)
@@ -546,6 +547,7 @@ func (pe *PodEvictor) EvictPod(ctx context.Context, pod *v1.Pod, opts EvictOptio
	}

	if pe.dryRun {
		pe.simulateSchedulingInDryRun(pod)
		klog.V(1).InfoS("Evicted pod in dry run mode", "pod", klog.KObj(pod), "reason", opts.Reason, "strategy", opts.StrategyName, "node", pod.Spec.NodeName, "profile", opts.ProfileName)
	} else {
		klog.V(1).InfoS("Evicted pod", "pod", klog.KObj(pod), "reason", opts.Reason, "strategy", opts.StrategyName, "node", pod.Spec.NodeName, "profile", opts.ProfileName)
@@ -561,6 +563,22 @@ func (pe *PodEvictor) EvictPod(ctx context.Context, pod *v1.Pod, opts EvictOptio
	return nil
}

// simulateSchedulingInDryRun reserves the resources of the selected fit node for pods that will be evicted.
// It is only used in dry-run mode.
func (pe *PodEvictor) simulateSchedulingInDryRun(pod *v1.Pod) {
Member: I don't quite understand why we need to create a pod in dry-run mode. The purpose of dry-run mode is to simulate the behavior of the descheduler and preview which pods may be evicted. 🤔

Contributor Author: See comments: #1584 (comment)

> I don't quite understand why we need to create a pod in dry run mode.

The purpose is to reserve the resources of the selected fit node, so that subsequent fit checks in the same dry run account for the pods that would be placed there. If any part is not described clearly, please feel free to comment.

Member: Got it, thanks.

	simulatePod := pod.DeepCopy()
	if nodeName, ok := simulatePod.Annotations[node.FitNodeToPodAnnotationKey]; ok {
		// Pin the copy to the fit node recorded during the fit check, and rename
		// it into a separate namespace so it cannot collide with the original pod.
		simulatePod.Spec.NodeName = nodeName
		simulatePod.Name = pod.Name + "-simulate-scheduler"
		simulatePod.Namespace = "simulate"
		if _, err := pe.client.CoreV1().Pods(simulatePod.Namespace).Create(context.TODO(), simulatePod, metav1.CreateOptions{}); err != nil {
			klog.ErrorS(err, "failed to create simulated scheduling pod in dry-run mode", "pod", klog.KObj(pod))
		} else {
			klog.V(5).InfoS("Simulated scheduling in dry-run mode", "pod", klog.KObj(pod), "node", nodeName)
		}
	}
}

// return (ignore, err)
func (pe *PodEvictor) evictPod(ctx context.Context, pod *v1.Pod) (bool, error) {
	deleteOptions := &metav1.DeleteOptions{}
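To make the intended dry-run behavior concrete, here is a minimal sketch of how it could be exercised with client-go's fake clientset. The test name and setup are illustrative assumptions, not part of this PR; the body mirrors what simulateSchedulingInDryRun does.

```go
package evictions_test

import (
	"context"
	"testing"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes/fake"
)

// TestSimulatedPodIsCreated mirrors simulateSchedulingInDryRun: a pod
// annotated with a fit node yields a renamed copy in the "simulate"
// namespace, pinned to that node, so the node's capacity is accounted for.
func TestSimulatedPodIsCreated(t *testing.T) {
	client := fake.NewSimpleClientset()

	pod := &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "victim",
			Namespace: "default",
			Annotations: map[string]string{
				"descheduler.alpha.kubernetes.io/fit-node-name": "node-2",
			},
		},
	}

	// Equivalent of the PR's simulateSchedulingInDryRun body.
	sim := pod.DeepCopy()
	if nodeName, ok := sim.Annotations["descheduler.alpha.kubernetes.io/fit-node-name"]; ok {
		sim.Spec.NodeName = nodeName
		sim.Name = pod.Name + "-simulate-scheduler"
		sim.Namespace = "simulate"
		if _, err := client.CoreV1().Pods(sim.Namespace).Create(context.TODO(), sim, metav1.CreateOptions{}); err != nil {
			t.Fatalf("creating simulated pod: %v", err)
		}
	}

	got, err := client.CoreV1().Pods("simulate").Get(context.TODO(), "victim-simulate-scheduler", metav1.GetOptions{})
	if err != nil {
		t.Fatalf("simulated pod not found: %v", err)
	}
	if got.Spec.NodeName != "node-2" {
		t.Errorf("simulated pod pinned to %q, want node-2", got.Spec.NodeName)
	}
}
```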
14 changes: 13 additions & 1 deletion pkg/descheduler/node/node.go
@@ -34,7 +34,10 @@ import (
"sigs.k8s.io/descheduler/pkg/utils"
)

const workersCount = 100
const (
	workersCount              = 100
	FitNodeToPodAnnotationKey = "descheduler.alpha.kubernetes.io/fit-node-name"
)

// ReadyNodes returns ready nodes irrespective of whether they are
// schedulable or not.
@@ -161,6 +164,7 @@ func podFitsNodes(nodeIndexer podutil.GetPodsAssignedToNodeFunc, pod *v1.Pod, no
		if err == nil {
			klog.V(4).InfoS("Pod fits on node", "pod", klog.KObj(pod), "node", klog.KObj(node))
			atomic.AddInt32(&filteredLen, 1)
			addFitNodeNameToPod(pod, node)
Member: Are we going to set this annotation on all pods, whether they are processed in dry-run or non-dry-run mode?

Contributor Author: It would be ideal if this behavior only occurred during dry-run, and if the node name could be passed along without setting it as an annotation. However, I haven't figured out how to implement that yet. If this approach is accepted, we can give it a try. (One possible direction is sketched after this hunk.)

			cancel()
		} else {
			klog.V(4).InfoS("Pod does not fit on node", "pod", klog.KObj(pod), "node", klog.KObj(node), "err", err.Error())
@@ -173,6 +177,14 @@ func podFitsNodes(nodeIndexer podutil.GetPodsAssignedToNodeFunc, pod *v1.Pod, no
	return filteredLen > 0
}

// addFitNodeNameToPod records the name of the fit node on the pod via the
// FitNodeToPodAnnotationKey annotation.
func addFitNodeNameToPod(pod *v1.Pod, node *v1.Node) {
Contributor Author: Adding the fit node to the pod's annotations is not the most appropriate practice. It seems more reasonable to add a return value to type FilterFunc func(*v1.Pod) bool, changing it to type FilterFunc func(*v1.Pod) (string, bool). However, many filters do not need this string return value. Is there a better implementation approach? (A sketch of this signature change follows the diff.)

	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	pod.Annotations[FitNodeToPodAnnotationKey] = node.Name
}

// PodFitsAnyOtherNode checks if the given pod will fit any of the given nodes, besides the node
// the pod is already running on. The predicates used to determine if the pod will fit can be found in the NodeFit function.
func PodFitsAnyOtherNode(nodeIndexer podutil.GetPodsAssignedToNodeFunc, pod *v1.Pod, nodes []*v1.Node) bool {
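On the FilterFunc question raised above, here is a minimal sketch of the proposed signature change, with an adapter so the many filters that have no node name to report keep working. Apart from the existing FilterFunc shape, all names here are assumptions:

```go
package podutil

import v1 "k8s.io/api/core/v1"

// FitFilterFunc is the proposed variant of FilterFunc: alongside the
// pass/fail bool it returns the name of the node the pod fits on, or ""
// when the filter does not compute one.
type FitFilterFunc func(pod *v1.Pod) (nodeName string, ok bool)

// WrapFilter adapts a plain FilterFunc (func(*v1.Pod) bool) so existing
// filters can satisfy FitFilterFunc without changes to their logic.
func WrapFilter(f func(pod *v1.Pod) bool) FitFilterFunc {
	return func(pod *v1.Pod) (string, bool) {
		return "", f(pod)
	}
}
```

This keeps the annotation off the pod entirely, at the cost of touching every FilterFunc call site.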