feat: add new kubeletconfigs for node hardening #8497
Open
mxj220 wants to merge 4 commits into
Contributor
Pull request overview
This PR introduces additional kubelet configuration surface area to support node hardening (soft eviction thresholds/grace periods and cgroup-tiering via kube/system reserved cgroups), spanning the Go-side kubelet config generation and the Linux CSE bootstrap scripts.
Changes:
- Add new translatable kubelet flags and render them into the generated kubelet config file (evictionSoft/evictionSoftGracePeriod/evictionMaxPodGracePeriod, kubeReservedCgroup/systemReservedCgroup).
- Extend the kubelet config datamodel with the new fields.
- Add Linux CSE logic to pre-create the systemd slice(s) required by kubelet’s reserved cgroup enforcement and add Go unit tests for the new kubelet config fields.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/agent/utils.go | Adds new translatable kubelet flags and renders them into the kubelet config JSON. |
| pkg/agent/utils_test.go | Adds unit tests validating new kubelet config fields are rendered/omitted as expected. |
| pkg/agent/datamodel/types.go | Extends AKSKubeletConfiguration with eviction-soft and reserved-cgroup fields. |
| parts/linux/cloud-init/artifacts/cse_helpers.sh | Adds ensureKubeletCgroupHierarchy helper to create/start kubelet.slice under cgroupv2. |
| parts/linux/cloud-init/artifacts/cse_config.sh | Calls the new cgroup hierarchy helper before starting kubelet. |
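The body of `ensureKubeletCgroupHierarchy` is not shown on this page, so the following is only a minimal sketch of what pre-creating a slice could look like. The function name `sketch_write_slice_unit`, its parameters, and the unit file contents are assumptions made for illustration; the unit directory is passed in so the sketch can be exercised without touching a real node.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the PR's ensureKubeletCgroupHierarchy implementation.
# Writes a minimal systemd slice unit for a top-level cgroup path such as
# "/kubelet.slice" into the given directory and prints the unit name.
sketch_write_slice_unit() {
    local unit_dir="$1" cgroup_path="$2" unit_name

    # "/kubelet.slice" -> "kubelet.slice"
    unit_name="${cgroup_path#/}"

    # Only top-level names ending in ".slice" are handled in this sketch.
    case "$unit_name" in
        */*|"") echo "unsupported cgroup path: $cgroup_path" >&2; return 1 ;;
        *.slice) ;;
        *) echo "unsupported cgroup path: $cgroup_path" >&2; return 1 ;;
    esac

    mkdir -p "$unit_dir" || return 1

    # Minimal unit: an empty [Slice] section is enough for systemd to create the cgroup.
    cat > "${unit_dir}/${unit_name}" <<EOF
[Unit]
Description=Slice for ${cgroup_path} (illustrative)
Before=slices.target

[Slice]
EOF

    printf '%s\n' "$unit_name"
}
```

On a real node the unit directory would typically be `/etc/systemd/system`, followed by `systemctl daemon-reload` and `systemctl start kubelet.slice`; those steps are left out here so the sketch has no side effects outside the directory it is given.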
Comments suppressed due to low confidence (1)
parts/linux/cloud-init/artifacts/cse_config.sh:823
- ShellSpec coverage: this PR adds new provisioning behavior (`ensureKubeletCgroupHierarchy` and the new `ensureKubelet` call path) but does not add/update corresponding ShellSpec tests under `spec/parts/linux/` to prevent regressions. Please add ShellSpec tests covering (1) config-file mode vs flags mode cgroup extraction and (2) the expected systemd unit file/start behavior (can be mocked) so future refactors don’t break hardened pools.
# Node Memory Hardening (F2/F5): if the RP rendered --kube-reserved-cgroup or
# --system-reserved-cgroup, ensure the corresponding systemd slices exist before
# kubelet starts so its NodeAllocatable enforcement loop can find them. The
# helper is a no-op when neither flag is present (back-compat with non-hardened pools).
KUBE_RESERVED_CGROUP=$(extract_value_from_kubelet_flags "$KUBELET_FLAGS" "kube-reserved-cgroup")
SYSTEM_RESERVED_CGROUP=$(extract_value_from_kubelet_flags "$KUBELET_FLAGS" "system-reserved-cgroup")
export KUBE_RESERVED_CGROUP SYSTEM_RESERVED_CGROUP
if [ -n "${KUBE_RESERVED_CGROUP}" ] || [ -n "${SYSTEM_RESERVED_CGROUP}" ]; then
    if ! logs_to_events "AKS.CSE.ensureKubelet.ensureKubeletCgroupHierarchy" ensureKubeletCgroupHierarchy; then
        exit $ERR_KUBELET_START_FAIL
    fi
fi
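The snippet above relies on `extract_value_from_kubelet_flags`, whose implementation is not shown on this page. As a rough illustration of the flags-mode path that the requested ShellSpec tests would exercise, here is a sketch with a hypothetical stand-in, `sketch_extract_flag_value`, plus a guard mirroring the `if` condition. It assumes flags are whitespace-separated `--name=value` tokens with no embedded spaces, which may not match the real helper.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for extract_value_from_kubelet_flags; the real helper
# may parse flags differently.
sketch_extract_flag_value() {
    local flags="$1" name="$2" token

    # Assumption: whitespace-separated "--name=value" tokens, no embedded spaces.
    for token in $flags; do
        case "$token" in
            "--${name}="*)
                # Strip everything up to and including the first "=".
                printf '%s\n' "${token#*=}"
                return 0
                ;;
        esac
    done

    # Flag absent: print nothing so callers can test the result with [ -n ... ].
    return 0
}

# Mirrors the guard in the snippet: true when either reserved-cgroup flag is set.
sketch_needs_cgroup_hierarchy() {
    local flags="$1" kube system
    kube=$(sketch_extract_flag_value "$flags" "kube-reserved-cgroup")
    system=$(sketch_extract_flag_value "$flags" "system-reserved-cgroup")
    [ -n "$kube" ] || [ -n "$system" ]
}
```

With this shape, a pool that sets neither flag never reaches the hierarchy helper, which is the back-compat behavior the code comment describes.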
Contributor
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
pkg/agent/utils.go:58
- These cgroup flags are filtered out of `KUBELET_FLAGS` in config-file mode, but the aks-node-controller config-file proto does not expose `kubeReservedCgroup` or `systemReservedCgroup`, so scriptless/aks-node-controller provisioning cannot render them into `KUBELET_CONFIG_FILE_CONTENT`. Please update the aks-node-controller config model/rendering as well, or avoid filtering these flags until both provisioning paths can carry the values.
"--kube-reserved-cgroup": true,
"--system-reserved-cgroup": true,
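To make the concern concrete, the sketch below resolves a reserved cgroup from the flags first and falls back to the kubelet config content. The function names are hypothetical, and it assumes the config content is plain (not base64-encoded) JSON with simple string values; the actual encoding of `KUBELET_CONFIG_FILE_CONTENT` is not shown on this page. If the flag has been filtered out and the config content does not carry the field, the result is empty and the slice would never be created.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; assumes plain JSON content with simple string values.
sketch_extract_config_value() {
    local content="$1" key="$2"
    # Flatten to one line, then capture the quoted string that follows "key":
    printf '%s' "$content" | tr -d '\n' |
        sed -n "s/.*\"${key}\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Prefer the command-line flag; fall back to the config content.
sketch_resolve_reserved_cgroup() {
    local flags="$1" content="$2" flag_name="$3" config_key="$4" value token
    value=""

    # Assumption: whitespace-separated "--name=value" tokens.
    for token in $flags; do
        case "$token" in
            "--${flag_name}="*) value="${token#*=}" ;;
        esac
    done

    if [ -z "$value" ]; then
        value=$(sketch_extract_config_value "$content" "$config_key")
    fi

    # Empty output means neither source carried the setting.
    printf '%s\n' "$value"
}
```

A test for the scenario in the comment would pass flags without `--kube-reserved-cgroup` and config content without `kubeReservedCgroup`, then assert that the resolved value is empty.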
Contributor
The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).
What this PR does / why we need it:
Add new kubelet configs for node hardening (soft eviction)
Which issue(s) this PR fixes:
Fixes #