[lownodeutilization]: Actual utilization: integration with Prometheus #1533
Hello, master. Due to the company's busy schedule previously, I only managed to complete half of the related KEP. I'm glad to see that you're working on this. It looks like you're aiming to reuse the current Node utilization logic. I have a few suggestions: It should support different data sources, similar to PayPal's load-watcher. Hope these suggestions help!
Hello sir :) thank you for taking part in composing the out-of-tree descheduling plugin KEP.
You are on the right track here. I'd like to get in touch with the load-watcher maintainers and extend the codebase to provide a generic interface for accessing metrics related to pod utilization as well. Currently, only actual node utilization gets collected. In the meantime, I am shaping the code here so it integrates better with other utilization sources such as metrics.
This is where we can debate more. Thank you for sharing the specifics. There's an open issue for the pod autoscaler suggesting to introduce EMA: kubernetes/kubernetes#62235. Are you aware if there's a similar issue or a discussion for the cluster autoscaler? I'd love to learn more about how it's implemented there. Ultimately, the current plugin just needs to know which pod, when evicted, will improve the overall node/workload utilization when properly re-scheduled. I could see various ways to produce the utilization snapshot using various methods.
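For reference, the EMA idea mentioned above can be sketched in a few lines of Go. This is only an illustration of exponential smoothing over utilization samples; the function name `ema` and the smoothing factor `alpha` are hypothetical and are not part of the descheduler or of the linked issue.

```go
package main

import "fmt"

// ema folds samples (oldest first) into an exponential moving average.
// alpha is the weight given to the newest sample. Both the name and the
// parameter are hypothetical; this is not a descheduler option.
func ema(samples []float64, alpha float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	avg := samples[0]
	for _, s := range samples[1:] {
		avg = alpha*s + (1-alpha)*avg
	}
	return avg
}

func main() {
	// A short utilization history (fractions of capacity) with one spike.
	history := []float64{0.40, 0.42, 0.95, 0.41}
	fmt.Printf("last sample: %.2f, smoothed: %.2f\n",
		history[len(history)-1], ema(history, 0.3))
}
```

A smaller `alpha` damps short spikes more strongly, at the cost of reacting more slowly to a sustained change in load.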
I can see how evicting hotspot pods is related to consuming the metrics/real-time node utilization. In the current plugin context this is more suitable for a new/different plugin. I can also see how
c889a53
to
1f55c4d
Compare
d744a96
to
800c92c
Compare
kubernetes/kubernetes#128663 to address the discrepancy in the fake metrics client node/pod metricses resource name.
f30f8a1
to
2e63411
Compare
0330902
to
baa6650
Compare
/test pull-descheduler-verify-master
Integration with kubernetes metrics in #1555.
baa6650
to
2442967
Compare
477104c
to
e6e5bf9
Compare
d143aad
to
0f0c525
Compare
/cc
pkg/descheduler/descheduler.go
Outdated
}, nil
if namespacedSharedInformerFactory != nil && deschedulerPolicy.Prometheus != nil {
	namespacedSharedInformerFactory.Core().V1().Secrets().Informer().AddEventHandler(desch.eventHandler())
	desch.namespacedSecretsLister = namespacedSharedInformerFactory.Core().V1().Secrets().Lister().Secrets(deschedulerPolicy.Prometheus.AuthToken.SecretReference.Namespace)
nil check for AuthToken?
Done
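A guard of the kind requested here could look roughly like the sketch below. The struct types are simplified, hypothetical stand-ins that only mirror the field path visible in the diff (`Prometheus.AuthToken.SecretReference.Namespace`); whether the real fields are pointers is an assumption, and `secretNamespace` is an illustrative helper name.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the policy types. Only the field
// path seen in the diff is modelled; pointer-ness is an assumption.
type SecretReference struct{ Namespace, Name string }

type AuthToken struct{ SecretReference *SecretReference }

type Prometheus struct {
	URL       string
	AuthToken *AuthToken
}

type DeschedulerPolicy struct{ Prometheus *Prometheus }

// secretNamespace returns the namespace of the referenced secret, and true
// only when every pointer on the path is non-nil.
func secretNamespace(p *DeschedulerPolicy) (string, bool) {
	if p == nil || p.Prometheus == nil || p.Prometheus.AuthToken == nil ||
		p.Prometheus.AuthToken.SecretReference == nil {
		return "", false
	}
	return p.Prometheus.AuthToken.SecretReference.Namespace, true
}

func main() {
	// Prometheus is configured but no auth token is set: no dereference happens.
	noToken := &DeschedulerPolicy{Prometheus: &Prometheus{URL: "https://prom.example"}}
	ns, ok := secretNamespace(noToken)
	fmt.Printf("namespace=%q ok=%v\n", ns, ok)
}
```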
pkg/descheduler/descheduler.go
Outdated
@@ -462,7 +604,19 @@ func RunDeschedulerStrategies(ctx context.Context, rs *options.DeschedulerServer
}
}

if namespacedSharedInformerFactory != nil {
Can we create an extra variable similar to reconcileInClusterSAToken to condition this? At this point it is not entirely clear why it depends on namespacedSharedInformerFactory.
Actually, the SA token and secret reconcilers are mutually exclusive. Can we use an iota enum here?
Done
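The iota suggestion might be sketched as follows. All identifiers here (`tokenReconciliation`, its constants, `selectReconciliation`) are hypothetical and may not match what was merged. The selection rule assumes the behaviour described in the `types.go` comment: when no auth token is set, the in-cluster service account token is used.

```go
package main

import "fmt"

// tokenReconciliation names the mutually exclusive ways of keeping the
// Prometheus auth token fresh. All identifiers are hypothetical.
type tokenReconciliation int

const (
	noReconciliation tokenReconciliation = iota
	inClusterReconciliation
	secretReconciliation
)

func (m tokenReconciliation) String() string {
	switch m {
	case inClusterReconciliation:
		return "in-cluster service account token"
	case secretReconciliation:
		return "secret"
	default:
		return "none"
	}
}

// selectReconciliation picks exactly one mode, so the two reconcilers can
// never be active at the same time.
func selectReconciliation(prometheusEnabled, hasSecretRef bool) tokenReconciliation {
	switch {
	case !prometheusEnabled:
		return noReconciliation
	case hasSecretRef:
		return secretReconciliation
	default:
		return inClusterReconciliation
	}
}

func main() {
	fmt.Println(selectReconciliation(true, false))
	fmt.Println(selectReconciliation(true, true))
}
```

Compared with two independent booleans, a single enum value makes the "both enabled" state unrepresentable.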
README.md
Outdated
metrics outside of the kubernetes metrics server. The query is expected to return a vector of values for
each node. The values are expected to be any real number within <0; 1> interval. During eviction only
a single pod is evicted at most from each overutilized node. There's currently no support for evicting
more. Kubernetes metric server takes precedence over Prometheus.
+1, we can update the text above now
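To make the README contract concrete, here is a minimal sketch of checking that per-node values fall within the closed interval from 0 to 1. The map-based input and the name `validateUtilization` are assumptions for illustration; the plugin itself consumes a Prometheus query result rather than a plain map.

```go
package main

import (
	"fmt"
	"math"
)

// validateUtilization checks that every per-node value lies within [0, 1].
// The map input is a hypothetical simplification of a query result.
func validateUtilization(values map[string]float64) error {
	for node, v := range values {
		if math.IsNaN(v) || v < 0 || v > 1 {
			return fmt.Errorf("node %q: value %v is outside the [0, 1] interval", node, v)
		}
	}
	return nil
}

func main() {
	// One value per node, as the README text requires.
	fmt.Println(validateUtilization(map[string]float64{"node-a": 0.35, "node-b": 0.90}))
	// A percentage instead of a fraction is rejected.
	fmt.Println(validateUtilization(map[string]float64{"node-c": 87}))
}
```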
client._nodeUtilization = make(map[string]map[v1.ResourceName]*resource.Quantity)
client._pods = make(map[string][]*v1.Pod)

results, warnings, err := promv1.NewAPI(client.promClient).Query(context.TODO(), client.promQuery, time.Now())
context passing could be still improved
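One way to read this comment is that the caller's context should be threaded through instead of `context.TODO()`. The sketch below shows that shape with standard-library pieces only. The `querier` interface, `fakeQuerier`, `syncMetrics`, `demo`, and the query string are all hypothetical stand-ins, not the Prometheus client API or the plugin's code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// querier is a hypothetical stand-in for a metrics API client.
type querier interface {
	Query(ctx context.Context, query string, ts time.Time) (string, error)
}

// fakeQuerier honours cancellation, as an HTTP-backed client would.
type fakeQuerier struct{}

func (fakeQuerier) Query(ctx context.Context, query string, ts time.Time) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	return "result of " + query, nil
}

// syncMetrics accepts the caller's context rather than creating
// context.TODO(), and bounds the query with a timeout.
func syncMetrics(ctx context.Context, q querier, query string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	return q.Query(ctx, query, time.Now())
}

// demo runs one sync. When cancelFirst is true the parent context is
// cancelled beforehand, e.g. because the caller is shutting down.
func demo(cancelFirst bool) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	_, err := syncMetrics(ctx, fakeQuerier{}, "placeholder_query", 5*time.Second)
	return err
}

func main() {
	fmt.Println("live parent:", demo(false))
	fmt.Println("cancelled parent:", demo(true))
}
```

With the context passed down, cancelling the parent stops an in-flight query instead of leaving it to run to completion.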
0f0c525
to
4a9a008
Compare
8085495
to
d7421d7
Compare
1832705
to
893bda5
Compare
pkg/api/types.go
Outdated
URL string
// authToken used for authentication with the prometheus server.
// If not set the in cluster authentication token for the descheduler service
// account is read from the container's file system is read.
// account is read from the container's file system is read.
// account is read from the container's file system.
pkg/api/v1alpha2/types.go
Outdated

type Prometheus struct {
	URL string `json:"url,omitempty"`
	// If not set the in cluster authentication token from the container's file system is read.
needs update as well
if d.previousPrometheusClientTransport != nil {
	d.previousPrometheusClientTransport.CloseIdleConnections()
}
d.previousPrometheusClientTransport = nil
893bda5
to
d365253
Compare
LGTM
@atiratree thank you for your patience and expertise. It made the code much better. Squashing the commits before the final merge.
d365253
to
e283c31
Compare
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Extend the actual utilization awareness with Prometheus integration.
For testing purposes:
TODO: