
Restricting VPA recommender scans to specific namespaces #7697

Open
ncmuthu opened this issue Jan 15, 2025 · 12 comments · May be fixed by #7716
Assignees
Labels
area/vertical-pod-autoscaler kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@ncmuthu

ncmuthu commented Jan 15, 2025

Which component are you using?:
/area vertical-pod-autoscaler

What version of the component are you using?:

Component version: 1.0.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: v1.32.0
Kustomize Version: v5.5.0
Server Version: v1.32.0

What environment is this in?:
AWS EKS and local Kind cluster

What did you expect to happen?:
I am using the flag --vpa-object-namespace=vpa to limit VPA functionality to the vpa namespace only. The recommender detects VPA resources only from the specified vpa namespace, but it scans the verticalpodautoscalercheckpoints of all namespaces every 10 minutes instead of only the specified namespace. We have around 3000 namespaces, so scanning all of them every 10 minutes adds load to the kube-apiserver.

What happened instead?:
The VPA recommender scans the verticalpodautoscalercheckpoints of all namespaces every 10 minutes instead of scanning only the specified namespace. We would like to avoid scanning all the namespaces.

How to reproduce it (as minimally and precisely as possible):

  • Install the VPA with default parameters and add --vpa-object-namespace=vpa
  • Create 3000+ empty namespaces in the cluster (see the helper sketch after this list)
  • After about 13 minutes, logs similar to the ones below will appear.
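
For reference, creating the empty namespaces can be scripted; below is a minimal client-go sketch (the program and the ns<N> names are only illustrative and not part of the VPA project, and it assumes a kubeconfig at the default location):

package main

import (
    "context"
    "fmt"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Build a client from the default kubeconfig (~/.kube/config); adjust as needed.
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client, err := kubernetes.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    // Create 3000 empty namespaces named ns1..ns3000.
    for i := 1; i <= 3000; i++ {
        ns := &corev1.Namespace{ObjectMeta: metav1.ObjectMeta{Name: fmt.Sprintf("ns%d", i)}}
        if _, err := client.CoreV1().Namespaces().Create(context.TODO(), ns, metav1.CreateOptions{}); err != nil {
            fmt.Printf("failed to create %s: %v\n", ns.Name, err)
        }
    }
}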

Anything else we need to know?:
Logs:

I0115 14:54:19.448086       1 flags.go:57] FLAG: --add-dir-header="false"
I0115 14:54:19.448217       1 flags.go:57] FLAG: --address=":8942"
I0115 14:54:19.448218       1 flags.go:57] FLAG: --alsologtostderr="false"
I0115 14:54:19.448219       1 flags.go:57] FLAG: --checkpoints-gc-interval="10m0s"
I0115 14:54:19.448220       1 flags.go:57] FLAG: --checkpoints-timeout="1m0s"
I0115 14:54:19.448221       1 flags.go:57] FLAG: --container-name-label="name"
I0115 14:54:19.448222       1 flags.go:57] FLAG: --container-namespace-label="namespace"
I0115 14:54:19.448224       1 flags.go:57] FLAG: --container-pod-name-label="pod_name"
I0115 14:54:19.448225       1 flags.go:57] FLAG: --cpu-histogram-decay-half-life="24h0m0s"
I0115 14:54:19.448226       1 flags.go:57] FLAG: --cpu-integer-post-processor-enabled="false"
I0115 14:54:19.448227       1 flags.go:57] FLAG: --external-metrics-cpu-metric=""
I0115 14:54:19.448228       1 flags.go:57] FLAG: --external-metrics-memory-metric=""
I0115 14:54:19.448229       1 flags.go:57] FLAG: --history-length="8d"
I0115 14:54:19.448230       1 flags.go:57] FLAG: --history-resolution="1h"
I0115 14:54:19.448231       1 flags.go:57] FLAG: --kube-api-burst="20"
I0115 14:54:19.448232       1 flags.go:57] FLAG: --kube-api-qps="5"
I0115 14:54:19.448238       1 flags.go:57] FLAG: --kubeconfig=""
I0115 14:54:19.448240       1 flags.go:57] FLAG: --log-backtrace-at=":0"
I0115 14:54:19.448242       1 flags.go:57] FLAG: --log-dir=""
I0115 14:54:19.448243       1 flags.go:57] FLAG: --log-file=""
I0115 14:54:19.448244       1 flags.go:57] FLAG: --log-file-max-size="1800"
I0115 14:54:19.448246       1 flags.go:57] FLAG: --logtostderr="true"
I0115 14:54:19.448251       1 flags.go:57] FLAG: --memory-aggregation-interval="24h0m0s"
I0115 14:54:19.448252       1 flags.go:57] FLAG: --memory-aggregation-interval-count="8"
I0115 14:54:19.448253       1 flags.go:57] FLAG: --memory-histogram-decay-half-life="24h0m0s"
I0115 14:54:19.448254       1 flags.go:57] FLAG: --memory-saver="false"
I0115 14:54:19.448256       1 flags.go:57] FLAG: --metric-for-pod-labels="up{job=\"kubernetes-pods\"}"
I0115 14:54:19.448257       1 flags.go:57] FLAG: --min-checkpoints="10"
I0115 14:54:19.448258       1 flags.go:57] FLAG: --one-output="false"
I0115 14:54:19.448260       1 flags.go:57] FLAG: --oom-bump-up-ratio="1.2"
I0115 14:54:19.448264       1 flags.go:57] FLAG: --oom-min-bump-up-bytes="1.048576e+08"
I0115 14:54:19.448265       1 flags.go:57] FLAG: --password=""
I0115 14:54:19.448267       1 flags.go:57] FLAG: --pod-label-prefix="pod_label_"
I0115 14:54:19.448275       1 flags.go:57] FLAG: --pod-name-label="kubernetes_pod_name"
I0115 14:54:19.448276       1 flags.go:57] FLAG: --pod-namespace-label="kubernetes_namespace"
I0115 14:54:19.448277       1 flags.go:57] FLAG: --pod-recommendation-min-cpu-millicores="15"
I0115 14:54:19.448279       1 flags.go:57] FLAG: --pod-recommendation-min-memory-mb="100"
I0115 14:54:19.448280       1 flags.go:57] FLAG: --prometheus-address=""
I0115 14:54:19.448281       1 flags.go:57] FLAG: --prometheus-cadvisor-job-name="kubernetes-cadvisor"
I0115 14:54:19.448282       1 flags.go:57] FLAG: --prometheus-query-timeout="5m"
I0115 14:54:19.448285       1 flags.go:57] FLAG: --recommendation-margin-fraction="0.15"
I0115 14:54:19.448292       1 flags.go:57] FLAG: --recommender-interval="1m0s"
I0115 14:54:19.448293       1 flags.go:57] FLAG: --recommender-name="default"
I0115 14:54:19.448294       1 flags.go:57] FLAG: --skip-headers="false"
I0115 14:54:19.448295       1 flags.go:57] FLAG: --skip-log-headers="false"
I0115 14:54:19.448296       1 flags.go:57] FLAG: --stderrthreshold="2"
I0115 14:54:19.448297       1 flags.go:57] FLAG: --storage=""
I0115 14:54:19.448298       1 flags.go:57] FLAG: --target-cpu-percentile="0.9"
I0115 14:54:19.448299       1 flags.go:57] FLAG: --use-external-metrics="false"
I0115 14:54:19.448300       1 flags.go:57] FLAG: --username=""
I0115 14:54:19.448302       1 flags.go:57] FLAG: --v="4"
I0115 14:54:19.448303       1 flags.go:57] FLAG: --vmodule=""
I0115 14:54:19.448304       1 flags.go:57] FLAG: --vpa-object-namespace="vpa"
I0115 14:54:19.448309       1 main.go:110] Vertical Pod Autoscaler 1.0.0 Recommender: default
I0115 14:54:19.448702       1 reflector.go:221] Starting reflector *v1.DaemonSet (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.448715       1 reflector.go:257] Listing and watching *v1.DaemonSet from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.549978       1 shared_informer.go:303] caches populated
I0115 14:54:19.550002       1 controller_fetcher.go:141] Initial sync of DaemonSet completed
I0115 14:54:19.550102       1 reflector.go:221] Starting reflector *v1.Deployment (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.550111       1 reflector.go:257] Listing and watching *v1.Deployment from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.651238       1 shared_informer.go:303] caches populated
I0115 14:54:19.651281       1 controller_fetcher.go:141] Initial sync of Deployment completed
I0115 14:54:19.651474       1 reflector.go:221] Starting reflector *v1.ReplicaSet (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.651489       1 reflector.go:257] Listing and watching *v1.ReplicaSet from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.753745       1 shared_informer.go:303] caches populated
I0115 14:54:19.753790       1 controller_fetcher.go:141] Initial sync of ReplicaSet completed
I0115 14:54:19.754054       1 reflector.go:221] Starting reflector *v1.StatefulSet (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.754082       1 reflector.go:257] Listing and watching *v1.StatefulSet from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.856309       1 shared_informer.go:303] caches populated
I0115 14:54:19.856382       1 controller_fetcher.go:141] Initial sync of StatefulSet completed
I0115 14:54:19.856767       1 reflector.go:221] Starting reflector *v1.ReplicationController (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.856799       1 reflector.go:257] Listing and watching *v1.ReplicationController from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.957446       1 shared_informer.go:303] caches populated
I0115 14:54:19.957476       1 controller_fetcher.go:141] Initial sync of ReplicationController completed
I0115 14:54:19.957626       1 reflector.go:221] Starting reflector *v1.Job (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:19.957638       1 reflector.go:257] Listing and watching *v1.Job from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:20.059084       1 shared_informer.go:303] caches populated
I0115 14:54:20.059119       1 controller_fetcher.go:141] Initial sync of Job completed
I0115 14:54:20.059381       1 reflector.go:221] Starting reflector *v1.CronJob (10m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:20.059402       1 reflector.go:257] Listing and watching *v1.CronJob from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/controller_fetcher/controller_fetcher.go:136
I0115 14:54:20.160830       1 shared_informer.go:303] caches populated
I0115 14:54:20.160883       1 controller_fetcher.go:141] Initial sync of CronJob completed
I0115 14:54:20.161137       1 main.go:148] Using Metrics Server.
I0115 14:54:20.161276       1 reflector.go:221] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/cluster_feeder.go:171
I0115 14:54:20.161295       1 reflector.go:257] Listing and watching *v1.Pod from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/recommender/input/cluster_feeder.go:171
I0115 14:54:20.161542       1 reflector.go:221] Starting reflector *v1.VerticalPodAutoscaler (1h0m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/vpa/api.go:88
I0115 14:54:20.161556       1 reflector.go:257] Listing and watching *v1.VerticalPodAutoscaler from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/utils/vpa/api.go:88
I0115 14:54:20.262025       1 shared_informer.go:303] caches populated
I0115 14:54:20.262112       1 api.go:92] Initial VPA synced successfully
I0115 14:54:20.262410       1 shared_informer.go:303] caches populated
I0115 14:54:20.262456       1 fetcher.go:99] Initial sync of DaemonSet completed
I0115 14:54:20.262494       1 shared_informer.go:303] caches populated
I0115 14:54:20.262501       1 fetcher.go:99] Initial sync of Deployment completed
I0115 14:54:20.262509       1 shared_informer.go:303] caches populated
I0115 14:54:20.262516       1 fetcher.go:99] Initial sync of ReplicaSet completed
I0115 14:54:20.262532       1 shared_informer.go:303] caches populated
I0115 14:54:20.262558       1 fetcher.go:99] Initial sync of StatefulSet completed
I0115 14:54:20.262571       1 shared_informer.go:303] caches populated
I0115 14:54:20.262576       1 fetcher.go:99] Initial sync of ReplicationController completed
I0115 14:54:20.262583       1 shared_informer.go:303] caches populated
I0115 14:54:20.262588       1 fetcher.go:99] Initial sync of Job completed
I0115 14:54:20.262627       1 shared_informer.go:303] caches populated
I0115 14:54:20.262633       1 fetcher.go:99] Initial sync of CronJob completed
W0115 14:54:20.344780       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
W0115 14:54:20.344812       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
W0115 14:54:20.344841       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
W0115 14:54:20.344845       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
W0115 14:54:20.344856       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
W0115 14:54:20.344853       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
W0115 14:54:20.344857       1 shared_informer.go:419] The sharedIndexInformer has started, run more than once is not allowed
I0115 14:54:20.345077       1 recommender.go:210] New Recommender created &{clusterState:0x400001d0e0 clusterStateFeeder:0x4000161a40 checkpointWriter:0x4000310588 checkpointsGCInterval:600000000000 controllerFetcher:0x400068a2d0 lastCheckpointGC:{wall:13968495778911565158 ext:946342960 loc:0x23c8b00} vpaClient:0x4000417490 podResourceRecommender:0x40005121b0 useCheckpoints:true lastAggregateContainerStateGC:{wall:13968495778911564949 ext:946342794 loc:0x23c8b00} recommendationPostProcessor:[0x23fea40]}
I0115 14:54:20.345181       1 cluster_feeder.go:245] Initializing VPA from checkpoints
I0115 14:54:20.345214       1 cluster_feeder.go:317] Start selecting the vpaCRDs.
I0115 14:54:20.345229       1 cluster_feeder.go:352] Fetched 1 VPAs.
I0115 14:54:20.345311       1 cluster_feeder.go:362] Using selector app=nginx for VPA vpa/nginx-vpa
I0115 14:54:20.345362       1 cluster_feeder.go:254] Fetching checkpoints from namespace vpa
I0115 14:54:20.351032       1 cluster_feeder.go:261] Loading VPA vpa/nginx-vpa checkpoint for nginx

I0115 15:04:20.363110       1 recommender.go:155] Recommender Run
I0115 15:04:20.363224       1 cluster_feeder.go:317] Start selecting the vpaCRDs.
I0115 15:04:20.363240       1 cluster_feeder.go:352] Fetched 1 VPAs.
I0115 15:04:20.363347       1 cluster_feeder.go:362] Using selector app=nginx for VPA vpa/nginx-vpa
I0115 15:04:20.375177       1 metrics_client.go:74] 14 podMetrics retrieved for all namespaces
I0115 15:04:20.375355       1 cluster_feeder.go:440] ClusterSpec fed with #36 ContainerUsageSamples for #18 containers. Dropped #0 samples.
I0115 15:04:20.375387       1 recommender.go:165] ClusterState is tracking 14 PodStates and 1 VPAs
I0115 15:04:20.384038       1 checkpoint_writer.go:114] Saved VPA vpa/nginx-vpa checkpoint for nginx
I0115 15:04:20.384091       1 cluster_feeder.go:272] Starting garbage collection of checkpoints
I0115 15:04:20.384110       1 cluster_feeder.go:317] Start selecting the vpaCRDs.
I0115 15:04:20.384116       1 cluster_feeder.go:352] Fetched 1 VPAs.
I0115 15:04:20.384185       1 cluster_feeder.go:362] Using selector app=nginx for VPA vpa/nginx-vpa
I0115 15:04:22.362238       1 request.go:622] Waited for 192.79075ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns101/verticalpodautoscalercheckpoints
I0115 15:04:22.563352       1 request.go:622] Waited for 198.225876ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1010/verticalpodautoscalercheckpoints
I0115 15:04:22.761011       1 request.go:622] Waited for 192.809625ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1011/verticalpodautoscalercheckpoints
I0115 15:04:22.962065       1 request.go:622] Waited for 196.395709ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1012/verticalpodautoscalercheckpoints
I0115 15:04:23.161045       1 request.go:622] Waited for 193.92025ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1013/verticalpodautoscalercheckpoints
I0115 15:04:23.364266       1 request.go:622] Waited for 199.671167ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1014/verticalpodautoscalercheckpoints
I0115 15:04:23.563172       1 request.go:622] Waited for 193.308292ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1015/verticalpodautoscalercheckpoints
I0115 15:04:23.762318       1 request.go:622] Waited for 194.669916ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1016/verticalpodautoscalercheckpoints
I0115 15:04:23.961083       1 request.go:622] Waited for 192.070292ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1017/verticalpodautoscalercheckpoints
I0115 15:04:24.161377       1 request.go:622] Waited for 195.512209ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1018/verticalpodautoscalercheckpoints
I0115 15:04:24.360894       1 request.go:622] Waited for 195.286375ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1019/verticalpodautoscalercheckpoints
I0115 15:04:24.562020       1 request.go:622] Waited for 195.598125ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns102/verticalpodautoscalercheckpoints
I0115 15:04:24.763033       1 request.go:622] Waited for 195.887625ms due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/autoscaling.k8s.io/v1/namespaces/ns1020/verticalpodautoscalercheckpoints


We can avoid the client-side throttling by increasing --kube-api-qps, but we would like to avoid scanning all the namespaces where we are not going to create VPA resources.

@ncmuthu ncmuthu added the kind/bug Categorizes issue or PR as related to a bug. label Jan 15, 2025
@adrianmoisey
Member

Thanks for opening this issue.

Having GarbageCollectCheckpoints() run on all namespaces seems to be an oversight, I think.

The purpose of the garbage collection is, of course, to look for garbage, so it checks all namespaces in case someone previously had the VPA configured in a different namespace.

I assume we could add a flag so that it only looks at the specified namespace. I can open a PR and see what others think.

/assign
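
For context, the per-namespace requests in the logs above correspond to a loop of roughly this shape; this is a simplified sketch, not the actual cluster_feeder.go code, and the function name is illustrative:

package sketch

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"

    vpa_clientset "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/client/clientset/versioned"
)

// garbageCollectCheckpointsSketch mimics the pattern visible in the logs:
// one checkpoint LIST call per namespace returned by the namespace lister.
func garbageCollectCheckpointsSketch(ctx context.Context, kubeClient kubernetes.Interface, vpaClient vpa_clientset.Interface) error {
    nsList, err := kubeClient.CoreV1().Namespaces().List(ctx, metav1.ListOptions{})
    if err != nil {
        return err
    }
    for _, ns := range nsList.Items {
        // Each iteration issues GET .../namespaces/<ns>/verticalpodautoscalercheckpoints;
        // with ~3000 namespaces and the default client QPS this is what surfaces
        // as the client-side throttling messages above.
        checkpoints, err := vpaClient.AutoscalingV1().VerticalPodAutoscalerCheckpoints(ns.Name).List(ctx, metav1.ListOptions{})
        if err != nil {
            continue
        }
        _ = checkpoints // the real GC then removes checkpoints whose VPA no longer exists
    }
    return nil
}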

@adrianmoisey
Member

Hold on, I don't have capacity for this yet. I'll unassign and let someone else take it.

/unassign
/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jan 15, 2025
@omerap12
Member

Thanks for opening this issue.

Having GarbageCollectCheckpoints() run on all namespaces seems to be an oversight, I think.

The purpose of the garbage collection is, of course, to look for garbage, so it checks all namespaces in case someone previously had the VPA configured in a different namespace.

I assume we could add a flag so that it only looks at the specified namespace. I can open a PR and see what others think.

/assign

We already have two flags, VpaObjectNamespace and IgnoredVpaObjectNamespaces (https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/recommender/main.go#L129).
Wouldn't it be better to make use of these existing flags instead of introducing a new one? I prefer to keep the number of flags to a minimum.
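
If the existing settings were reused for the checkpoint GC, the namespace filter could look roughly like the sketch below (an illustrative helper only, not project code; the parameter names mirror the flags for readability):

package sketch

// shouldGCNamespace reports whether the checkpoint garbage collector should
// scan the given namespace, honouring the existing VpaObjectNamespace and
// IgnoredVpaObjectNamespaces settings.
func shouldGCNamespace(namespace, vpaObjectNamespace string, ignoredNamespaces map[string]bool) bool {
    if vpaObjectNamespace != "" && namespace != vpaObjectNamespace {
        return false
    }
    return !ignoredNamespaces[namespace]
}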

@adrianmoisey
Member

We already have two flags, VpaObjectNamespace and IgnoredVpaObjectNamespaces (https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/recommender/main.go#L129).
Wouldn't it be better to make use of these existing flags instead of introducing a new one? I prefer to keep the number of flags to a minimum.

Agreed, I like that option too. But it changes the behaviour of those flags.
I.e., what if someone has the VPA set up to use multiple namespaces, and after some time changes it to a single namespace using VpaObjectNamespace?
If we re-used that flag, then there would be orphaned VPA checkpoint objects in other namespaces that never get cleaned up.

We just need to figure out a solution to handle this sort of thing (it could be just writing some documentation, though).

@omerap12
Member

Hmm, I didn’t think of that. Good point. Not sure what the best solution is here. Do you think just documenting this behavior would be enough?

@omerap12
Member

/assign

@iamzili

iamzili commented Jan 17, 2025

Another option could be to modify the interval (perhaps temporarily) for searching orphaned checkpoint objects across all namespaces by updating the --checkpoints-gc-interval flag. This would help avoid checking all namespaces every 10 minutes.

@ncmuthu what do you think?

@voelzmo
Contributor

voelzmo commented Jan 17, 2025

@omerap12

Hmm, I didn’t think of that. Good point. Not sure what the best solution is here. Do you think just documenting this behavior would be enough?

Yeah, I think rather than introducing yet another flag for this scenario, it should be sufficient to document that narrowing the range of namespaces a recommender operates in can leave objects behind that users need to clean up themselves. This is true for the VPAs and for the VPACheckpoints as well.

@omerap12
Member

@omerap12

Hmm, I didn’t think of that. Good point. Not sure what the best solution is here. Do you think just documenting this behavior would be enough?

Yeah, I think rather than introducing yet another flag for this scenario, it should be sufficient to document that narrowing the range of namespaces a recommender operates in can leave objects behind that users need to clean up themselves. This is true for the VPAs and for the VPACheckpoints as well.

Agreed. I will work on that :)

@voelzmo
Contributor

voelzmo commented Jan 17, 2025

@iamzili

Another option could be to modify the interval (perhaps temporarily) for searching orphaned checkpoint objects across all namespaces by updating the --checkpoints-gc-interval flag. This would help avoid checking all namespaces every 10 minutes.

@ncmuthu what do you think?

Another recommendation is to increase --kube-api-burst and --kube-api-qps when working in large-scale scenarios. The defaults are absolutely not suited for the scale that @ncmuthu mentioned in the original post.
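
For a sense of scale: at the default --kube-api-qps=5 the client-side rate limiter allows roughly one request every 200 ms, which matches the ~195 ms waits in the logs above, so about 3000 per-namespace checkpoint listings take on the order of 10 minutes per GC pass. In client-go such limits typically end up on the rest.Config used to build the clients; the sketch below is only illustrative, not the recommender's actual wiring:

package sketch

import "k8s.io/client-go/rest"

// applyAPIRateLimits shows where --kube-api-qps / --kube-api-burst style values
// land in client-go: the rest.Config used to construct the clientsets.
func applyAPIRateLimits(cfg *rest.Config, qps float32, burst int) {
    cfg.QPS = qps     // recommender default: 5
    cfg.Burst = burst // recommender default: 20
}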

@ncmuthu
Author

ncmuthu commented Jan 17, 2025

Another option could be to modify the interval (perhaps temporarily) for searching orphaned checkpoint objects across all namespaces by updating the --checkpoints-gc-interval flag. This would help avoid checking all namespaces every 10 minutes.

@ncmuthu what do you think?

Thank you for the response and for checking on this. That would help. Even though I do not operate in the other namespaces, I can increase the interval to a much higher value so that it does not affect the performance of the kube-apiserver.

@ncmuthu
Author

ncmuthu commented Jan 17, 2025

@iamzili

Another option could be to modify the interval (perhaps temporarily) for searching orphaned checkpoint objects across all namespaces by updating the --checkpoints-gc-interval flag. This would help avoid checking all namespaces every 10 minutes.
@ncmuthu what do you think?

Another recommendation is to increase --kube-api-burst and --kube-api-qps when working in large-scale scenarios. The defaults are absolutely not suited for the scale that @ncmuthu mentioned in the original post.

Thank you for the response. For now I have mitigated the issue with these settings; the only remaining question is that, since we want to operate on only one or a few namespaces, we are looking for an option to avoid querying all the namespaces at a regular interval.
