feat: add new kubeletconfigs for node hardening #8497

Open: mxj220 wants to merge 4 commits into main from markibrahim/kubeletconfig-node-hdng

mxj220 (Contributor) commented May 12, 2026

What this PR does / why we need it:
Add new kubelet configs for node hardening (soft eviction)
Which issue(s) this PR fixes:

Fixes #

Copilot AI review requested due to automatic review settings May 12, 2026 21:09
Copilot AI left a comment

Pull request overview

This PR introduces additional kubelet configuration surface area to support node hardening (soft eviction thresholds/grace periods and cgroup-tiering via kube/system reserved cgroups), spanning the Go-side kubelet config generation and the Linux CSE bootstrap scripts.

Changes:

  • Add new translatable kubelet flags and render them into the generated kubelet config file (evictionSoft/evictionSoftGracePeriod/evictionMaxPodGracePeriod, kubeReservedCgroup/systemReservedCgroup).
  • Extend the kubelet config datamodel with the new fields.
  • Add Linux CSE logic to pre-create the systemd slice(s) required by kubelet’s reserved cgroup enforcement and add Go unit tests for the new kubelet config fields.
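The flag-to-config translation described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `kubeletConfigSketch`, `splitPairs`, and `renderSketch` are hypothetical names, and the flag values in `main` are made up. Only the JSON field names and the `signal<quantity` / `signal=duration` flag syntax follow upstream kubelet conventions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// kubeletConfigSketch is a hypothetical, trimmed-down stand-in for the PR's
// AKSKubeletConfiguration. omitempty keeps unset fields out of the output.
type kubeletConfigSketch struct {
	EvictionSoft              map[string]string `json:"evictionSoft,omitempty"`
	EvictionSoftGracePeriod   map[string]string `json:"evictionSoftGracePeriod,omitempty"`
	EvictionMaxPodGracePeriod int32             `json:"evictionMaxPodGracePeriod,omitempty"`
	KubeReservedCgroup        string            `json:"kubeReservedCgroup,omitempty"`
	SystemReservedCgroup      string            `json:"systemReservedCgroup,omitempty"`
}

// splitPairs turns "a<1,b<2" (sep "<") or "a=1m,b=2m" (sep "=") into a map.
// It returns nil when no well-formed pair is found, so the field is omitted.
func splitPairs(value, sep string) map[string]string {
	out := map[string]string{}
	for _, pair := range strings.Split(value, ",") {
		key, val, ok := strings.Cut(strings.TrimSpace(pair), sep)
		if ok && key != "" {
			out[key] = val
		}
	}
	if len(out) == 0 {
		return nil
	}
	return out
}

// renderSketch maps command-line style kubelet flags onto config-file fields
// and returns the resulting JSON document.
func renderSketch(flags map[string]string) (string, error) {
	cfg := kubeletConfigSketch{
		EvictionSoft:            splitPairs(flags["--eviction-soft"], "<"),
		EvictionSoftGracePeriod: splitPairs(flags["--eviction-soft-grace-period"], "="),
		KubeReservedCgroup:      flags["--kube-reserved-cgroup"],
		SystemReservedCgroup:    flags["--system-reserved-cgroup"],
	}
	if raw, ok := flags["--eviction-max-pod-grace-period"]; ok {
		n, err := strconv.ParseInt(raw, 10, 32)
		if err != nil {
			return "", fmt.Errorf("invalid --eviction-max-pod-grace-period %q: %w", raw, err)
		}
		cfg.EvictionMaxPodGracePeriod = int32(n)
	}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Illustrative values only; real thresholds come from the node pool settings.
	out, err := renderSketch(map[string]string{
		"--eviction-soft":                 "memory.available<1.5Gi",
		"--eviction-soft-grace-period":    "memory.available=1m30s",
		"--eviction-max-pod-grace-period": "60",
		"--kube-reserved-cgroup":          "/kubelet.slice",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Because every field uses `omitempty`, a non-hardened pool that sets none of these flags renders an empty object, which matches the "rendered/omitted as expected" behavior the unit tests are said to check.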

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/agent/utils.go | Adds new translatable kubelet flags and renders them into the kubelet config JSON. |
| pkg/agent/utils_test.go | Adds unit tests validating new kubelet config fields are rendered/omitted as expected. |
| pkg/agent/datamodel/types.go | Extends AKSKubeletConfiguration with eviction-soft and reserved-cgroup fields. |
| parts/linux/cloud-init/artifacts/cse_helpers.sh | Adds ensureKubeletCgroupHierarchy helper to create/start kubelet.slice under cgroupv2. |
| parts/linux/cloud-init/artifacts/cse_config.sh | Calls the new cgroup hierarchy helper before starting kubelet. |
Comments suppressed due to low confidence (1)

parts/linux/cloud-init/artifacts/cse_config.sh:823

  • ShellSpec coverage: this PR adds new provisioning behavior (ensureKubeletCgroupHierarchy and the new ensureKubelet call path) but does not add/update corresponding ShellSpec tests under spec/parts/linux/ to prevent regressions. Please add ShellSpec tests covering (1) config-file mode vs flags mode cgroup extraction and (2) the expected systemd unit file/start behavior (can be mocked) so future refactors don’t break hardened pools.
    # Node Memory Hardening (F2/F5): if the RP rendered --kube-reserved-cgroup or
    # --system-reserved-cgroup, ensure the corresponding systemd slices exist before
    # kubelet starts so its NodeAllocatable enforcement loop can find them. The
    # helper is a no-op when neither flag is present (back-compat with non-hardened pools).
    KUBE_RESERVED_CGROUP=$(extract_value_from_kubelet_flags "$KUBELET_FLAGS" "kube-reserved-cgroup")
    SYSTEM_RESERVED_CGROUP=$(extract_value_from_kubelet_flags "$KUBELET_FLAGS" "system-reserved-cgroup")
    export KUBE_RESERVED_CGROUP SYSTEM_RESERVED_CGROUP
    if [ -n "${KUBE_RESERVED_CGROUP}" ] || [ -n "${SYSTEM_RESERVED_CGROUP}" ]; then
        if ! logs_to_events "AKS.CSE.ensureKubelet.ensureKubeletCgroupHierarchy" ensureKubeletCgroupHierarchy; then
            exit $ERR_KUBELET_START_FAIL
        fi
    fi

Comment thread parts/linux/cloud-init/artifacts/cse_config.sh
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
mxj220 changed the title from "feat: Add new kubeletconfigs for node hardening" to "feat: add new kubeletconfigs for node hardening" on May 13, 2026
Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

pkg/agent/utils.go:58

  • These cgroup flags are filtered out of KUBELET_FLAGS in config-file mode, but the aks-node-controller config-file proto does not expose kubeReservedCgroup or systemReservedCgroup, so scriptless/aks-node-controller provisioning cannot render them into KUBELET_CONFIG_FILE_CONTENT. Please update the aks-node-controller config model/rendering as well, or avoid filtering these flags until both provisioning paths can carry the values.
	"--kube-reserved-cgroup":              true,
	"--system-reserved-cgroup":            true,

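One way to read the second option in the comment above ("avoid filtering these flags until both provisioning paths can carry the values") is a gate on the filtering step. The sketch below is an assumption-laden illustration only: `translatedFlagsSketch`, `cgroupFlagsSketch`, `remainingCommandLineFlags`, and the `allPathsRenderCgroups` switch are hypothetical and do not appear in the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// translatedFlagsSketch is a hypothetical stand-in for the table of flags that
// are moved off the command line and into the kubelet config file.
var translatedFlagsSketch = map[string]bool{
	"--eviction-soft":          true,
	"--kube-reserved-cgroup":   true,
	"--system-reserved-cgroup": true,
}

// cgroupFlagsSketch lists the flags that should stay on the command line
// until every provisioning path can render them into the config file.
var cgroupFlagsSketch = map[string]bool{
	"--kube-reserved-cgroup":   true,
	"--system-reserved-cgroup": true,
}

// remainingCommandLineFlags returns the flags that still have to be passed on
// the kubelet command line, sorted for deterministic output.
func remainingCommandLineFlags(flags map[string]string, configFileMode, allPathsRenderCgroups bool) []string {
	var kept []string
	for name, value := range flags {
		filtered := configFileMode && translatedFlagsSketch[name]
		// Hypothetical gate: keep the cgroup flags unless all paths can carry them.
		if cgroupFlagsSketch[name] && !allPathsRenderCgroups {
			filtered = false
		}
		if !filtered {
			kept = append(kept, name+"="+value)
		}
	}
	sort.Strings(kept)
	return kept
}

func main() {
	// Illustrative flag values only.
	flags := map[string]string{
		"--eviction-soft":        "memory.available<1.5Gi",
		"--kube-reserved-cgroup": "/kubelet.slice",
		"--v":                    "2",
	}
	fmt.Println(remainingCommandLineFlags(flags, true, false))
	fmt.Println(remainingCommandLineFlags(flags, true, true))
}
```

With the gate closed, the reserved-cgroup flags remain visible in the flag string, so a provisioning path that cannot yet render them into the config file would still pass them to kubelet. Once both paths carry the values, the gate can be dropped and the flags filtered like the others.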
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread parts/linux/cloud-init/artifacts/cse_config.sh Outdated
Comment thread pkg/agent/utils.go
Copilot AI review requested due to automatic review settings May 13, 2026 18:52
@github-actions
The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | May 13, 2026, 6:53 PM |
