@bwsalmon (Contributor) commented Oct 6, 2025

  • One-line PR description: Node resizing via balloons
  • Other comments:

linux-foundation-easycla bot commented Oct 6, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot added the following labels on Oct 6, 2025:

  • cncf-cla: no (indicates the PR's author has not signed the CNCF CLA)
  • kind/kep (categorizes KEP tracking issues and PRs modifying the KEP directory)
  • needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test)
  • sig/node (categorizes an issue or PR as relevant to SIG Node)

@k8s-ci-robot (Contributor)

Hi @bwsalmon. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) on Oct 6, 2025

@haircommander mentioned this pull request on Oct 7, 2025
4 tasks
Comment on lines +203 to +205
#### Multiple Kubelets per node for testing

A user testing a Kubernetes feature would like to run many Kubelets on the same host, to reduce the resources needed to test scenarios with large numbers of Kubelets. By enabling balloons and resizing them so that each Kubelet consumes only 1/N of the host, the user can place N Kubelets on the same host without overloading it.
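
As a rough illustration of the 1/N sizing described in this use case, the sketch below computes how large a balloon each Kubelet would need so that only 1/N of the host remains available to it. The function name `balloon_size` is hypothetical, and so is the underlying assumption that each Kubelet initially reports the full host capacity and that its balloon simply reserves everything beyond its share; neither is taken from the KEP.

```python
def balloon_size(host_capacity: int, num_kubelets: int) -> int:
    """Return the resource units a balloon must reserve so one Kubelet keeps ~1/N of the host.

    Assumption (not from the KEP): each Kubelet reports the full host
    capacity, and its balloon reserves everything beyond its own share.
    """
    if num_kubelets < 1:
        raise ValueError("num_kubelets must be at least 1")
    # Floor division, so the N shares together never exceed the host.
    share = host_capacity // num_kubelets
    return host_capacity - share


# Example: a 64-CPU host (64000 millicores) shared by 4 Kubelets.
host_millicpu = 64_000
balloon = balloon_size(host_millicpu, 4)
print(balloon)                  # millicores reserved by each balloon
print(host_millicpu - balloon)  # millicores left for each Kubelet's pods
```

The same calculation would apply per resource (CPU, memory, and so on), with each balloon resized whenever N changes.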
Contributor

This use case seems to be better served by #5319.

Contributor Author

Let me take a look. Obviously there are a few ways to address the use case.

@ffromani (Contributor)

/ok-to-test

please make sure to fill keps/prod-readiness/sig-node/NNNN.yaml. It's probably too late for the 1.35 cycle.

@k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on Oct 13, 2025

Consider including folks who also work outside the SIG or subproject.
-->

## Design Details

  • Does this mean that the Balloon Pods are DaemonSets that specify the new PriorityClass introduced in this KEP? In other words, this doesn’t involve adding a new core resource like BalloonPod, right?
  • Since they are expected to run in the kube-system namespace, is that achieved by restricting the new PriorityClass so that it can only be used within the kube-system namespace?
  • Since they’re treated like regular pods, that means Balloon Pods are actually launched as containers (e.g. containers that just keep sleeping indefinitely), right?

Contributor Author

Excellent questions:

  1. Yes, I don't believe we need to add a custom resource, although if we find a need I'm open to being convinced otherwise. The hope is for this to be very lightweight.
  2. I think that is reasonable, yes. I don't have enough clarity on the expectations of ACLs to feel confident about that yet, though.
  3. Yes, that is correct; they should be normal containers that do nothing.

Thanks! It's much clearer now.

> Yes, I don't believe we need to add a custom resource

It might be better to specify in the Non-Goals section that new core and custom resources should not be added. Also, you could include a sample manifest of how to use Balloon Pods in the Proposal section or somewhere, such as a DaemonSet with a new priority class.
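
A manifest along those lines might look roughly like the sketch below, which builds a DaemonSet as a Python dictionary and prints it as JSON (which `kubectl apply -f` also accepts). Everything specific here is an assumption for illustration, not something the KEP defines: the priority class name `balloon-pod`, the object name and labels, the resource requests, and the choice of the `registry.k8s.io/pause` image as a container that does nothing.

```python
import json


def build_balloon_daemonset(cpu: str, memory: str) -> dict:
    """Build a DaemonSet manifest for a do-nothing balloon pod.

    All names below are hypothetical placeholders, not defined by the KEP.
    """
    labels = {"app": "balloon"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "balloon", "namespace": "kube-system"},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Hypothetical name for the PriorityClass the KEP would introduce.
                    "priorityClassName": "balloon-pod",
                    "containers": [
                        {
                            "name": "balloon",
                            # Assumed image: the pause container only sleeps.
                            "image": "registry.k8s.io/pause:3.9",
                            # The requests are what "inflate" the balloon.
                            "resources": {
                                "requests": {"cpu": cpu, "memory": memory}
                            },
                        }
                    ],
                },
            },
        },
    }


manifest = build_balloon_daemonset(cpu="2", memory="4Gi")
print(json.dumps(manifest, indent=2))
```

Resizing the balloon would then amount to changing the `requests` values, though how that resize is actually performed is a question for the KEP itself.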

@bwsalmon (Contributor Author) commented Oct 15, 2025

> /ok-to-test
>
> please make sure to fill keps/prod-readiness/sig-node/NNNN.yaml. It's probably too late for the 1.35 cycle.

Sounds good, thanks! I'll fill in the PRR YAML. This should be fine for 1.36, unless @dchen1107's opinion has changed since our last conversation...

@k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) and removed the cncf-cla: no label (indicates the PR's author has not signed the CNCF CLA) on Oct 15, 2025

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bwsalmon
Once this PR has been reviewed and has the lgtm label, please assign jpbetz, mrunalp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

5 participants