[FG:InPlacePodVerticalScaling] When resizing a container, check that the node's available resources are sufficient before CanAdmitPod #134581

@shiya0705

Description

In the current code, canResizePod() checks only the node's allocatable resources and then calls canAdmitPod(); it does not check the node's available resources (allocatable minus the requests already on the node).

For a container resize, if the node's available resources are insufficient, the resize should enter Deferred status and stop there. With the current logic, it still proceeds into canAdmitPod to allocate CPU and memory for the container.

In addition, with cpu-manager-policy=static, CanAdmitPod allocates a cpuset for a Guaranteed pod first and only afterwards verifies the node's available resources.
This creates a potential error when a Guaranteed pod is resized and the node's available resources are insufficient.

Consider the following scenario:
Test configuration

  • Node total CPUs: 20
  • Reserved CPUs: 0, 11
  • Allocatable CPUs: 18
  • Total CPU requests on the node: 1100 m
  • Available CPUs on the node: 18000 m − 1100 m = 16900 m

Step 1: Create container #0 with a CPU request of 16 cores. The assigned cpuset is 1-8,11-18.
Step 2: Resize container #0, increasing the request from 16 cores to 17 cores.
- CanAdmitPod allocates a new cpuset, 1-9,11-18, and writes it to the container's cgroup (cpuset.cpus: 1-9,11-18).
- Afterwards, the node resource check fails because the required 17000 m exceeds the available 16900 m, so the resize enters Deferred status.

The problem is that, although the resize is deferred due to insufficient node resources, the new cpuset has already been applied to the container’s cgroup. This leaves the container with a resource allocation that the node cannot actually provide, which is undesirable.

Suggested improvement:
Perform the node available-resource check before canAdmitPod. The resize then enters Deferred status early, which avoids unnecessary work in canAdmitPod and fixes the potential error in Guaranteed pod resizes.

Labels: needs-triage, sig/node
