📖 Add In-place updates proposal #11029
Conversation
Skipping CI for Draft Pull Request.
Force-pushed c77a225 to be97dc6.
thanks for the write up.
i left some comments, but i did not do a detailed review of the controller interaction (diagrams) part.
Both `KCP` and `MachineDeployment` controllers follow a similar pattern around updates: they first detect if an update is required and then, based on the configured strategy, follow the appropriate update logic (note that today there is only one valid strategy, `RollingUpdate`).
With the `ExternalUpdate` strategy, CAPI controllers will compute the set of desired changes and iterate over the registered external updaters, requesting through the Runtime Hook the set of changes each updater can handle. The changes supported by an updater can be the complete set of desired changes, a subset of them, or an empty set, signaling that it cannot handle any of the desired changes.
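As a rough illustration of that iteration (purely a sketch; `ExternalUpdater`, `CanUpdate`, and `planInPlaceUpdate` are hypothetical names, not part of the proposed API), the controller-side matching could look something like this:

```go
package sketch // hypothetical package, for illustration only

// ExternalUpdater is a simplified, hypothetical view of a registered external
// updater: given the full set of desired changes (as field paths), it returns
// the subset it can apply in place.
type ExternalUpdater interface {
	Name() string
	CanUpdate(desiredChanges []string) []string
}

// planInPlaceUpdate asks each registered updater which of the desired changes
// it can cover. It returns the per-updater assignment and whether the whole
// set of desired changes is covered; if not, the controller would fall back
// (e.g. to replacing the machine).
func planInPlaceUpdate(desiredChanges []string, updaters []ExternalUpdater) (map[string][]string, bool) {
	remaining := make(map[string]bool, len(desiredChanges))
	for _, c := range desiredChanges {
		remaining[c] = true
	}

	plan := map[string][]string{}
	for _, u := range updaters {
		for _, c := range u.CanUpdate(desiredChanges) {
			if remaining[c] {
				plan[u.Name()] = append(plan[u.Name()], c)
				delete(remaining, c)
			}
		}
	}
	return plan, len(remaining) == 0
}
```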
If we're falling back to rolling update, to @neolit123's point, it doesn't make sense to me that ExternalUpdate is a rollout strategy on its own; rather, it should be a field, or set of fields, within rolling update that controls its behavior?
Note that technically a rolling update doesn't have to be a replace operation; it can be done in place, so imo it can be expanded.
That's an interesting point. I'm not against representing external updates as a subtype of the rolling update strategy. You are right that with what we are proposing here, CAPI follows a rolling update process, except that it delegates the machine update instead of replacing the machine itself. But CAPI still orchestrates the rolling process.
As long as we can represent the fallback as optional, I'm ok with this if folks think it makes more sense.
We propose a pluggable update strategy architecture that allows an External Update Extension to handle the update process. The design decouples core CAPI controllers from the specific extension implementation responsible for updating a machine. The external update strategy will be configured by reusing the existing strategy field in the KCP and MD resources and introducing a new strategy type called `ExternalUpdate`. This allows us to provide a consistent user experience: the interaction with the CAPI resources is the same as in rolling updates.
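For illustration only, the new strategy type value could sit next to the existing one roughly like this (the Go names below are assumptions modeled on the existing `RollingUpdate` strategy, not the final API):

```go
package sketch // hypothetical package, for illustration only

// RolloutStrategyType mirrors the kind of strategy type field that already
// exists on KCP and MachineDeployment (illustrative sketch, not the final API).
type RolloutStrategyType string

const (
	// RollingUpdateStrategyType is today's only strategy: machines are
	// replaced one by one, honoring the configured surge/unavailability limits.
	RollingUpdateStrategyType RolloutStrategyType = "RollingUpdate"

	// ExternalUpdateStrategyType is the proposed new value: updates of
	// existing machines are delegated to registered external updaters.
	ExternalUpdateStrategyType RolloutStrategyType = "ExternalUpdate"
)
```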
This proposal introduces a Lifecycle Hook named `ExternalUpdate` for communication between CAPI and external update implementers. Multiple external updaters can be registered, each of them covering only a subset of machine changes. The CAPI controllers will ask the external updaters what kind of changes they can handle and, based on the response, compose and orchestrate them to achieve the desired state.
The proposal is missing details on how the external updater logic would work and on how the "kind of changes they can handle" part is handled. How is that going to work?
I think it'd be good for the proposal to include a reference external updater implementation and to shape it around one common/trivial driving use case, e.g. performing an in-place rolling update of the Kubernetes version for a pool of Nodes. Then we can grasp and discuss design implications for RBAC, drain...
@enxebre In the 'test plan' section we mention a "CAPD Kubeadm Updater", which will be a reference implementation and also used for testing.
What do you mean by "how is that going to work?"? Are you referring to how the external updater knows what the desired changes are? Or how the external updater computes which changes it can perform and which it can't?
Trying to give a generic answer here: the external updater will receive something like "current state" and "desired state" for a particular machine (including machine, infra machine and bootstrap) in the CanUpdateRequest. Then it will respond with something like an array of fields for those objects (kubeadmconfig -> ["spec.files", "spec.mounts"]), which would signal the subset of fields that it can update.
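To make the shape of that exchange concrete, here is a rough sketch; only `CanUpdateRequest` is named in the discussion above, and all other type and field names are hypothetical, meant purely to illustrate the idea:

```go
package sketch // hypothetical package, for illustration only

// CanUpdateRequest is a hypothetical hook payload carrying the current and
// desired state of the machine being considered for an in-place update.
type CanUpdateRequest struct {
	Current MachineState `json:"current"`
	Desired MachineState `json:"desired"`
}

// MachineState groups the objects an external updater may need to inspect:
// the Machine, its infrastructure machine, and its bootstrap config.
type MachineState struct {
	Machine         map[string]interface{} `json:"machine"`
	InfraMachine    map[string]interface{} `json:"infraMachine"`
	BootstrapConfig map[string]interface{} `json:"bootstrapConfig"`
}

// CanUpdateResponse lists, per object, the field paths the updater can change
// in place, e.g. {"kubeadmConfig": ["spec.files", "spec.mounts"]}.
// An empty map signals that the updater cannot handle any of the desired changes.
type CanUpdateResponse struct {
	UpdatableFields map[string][]string `json:"updatableFields"`
}
```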
@enxebre
The idea of opening the draft at this stage for review is to get feedback on the core ideas and high-level flow before we invest more time in this direction. Unless you think that a reference implementation is necessary to have these discussions, I would prefer to avoid that work.
That said, I totally get that the lack of detail in certain areas may be making it difficult to have the high-level discussion. If that's the case, we are happy to add that detail wherever needed.
> Trying to give a generic answer here: the external updater will receive something like "current state" and "desired state" for a particular machine (including machine, infra machine and bootstrap) in the CanUpdateRequest. Then it will respond with something like an array of fields for those objects (kubeadmconfig -> ["spec.files", "spec.mounts"]), which would signal the subset of fields that it can update.
These details must be part of the proposal. The details of how the entire flow goes from the MachineDeployment, to the external request, back to the Machine, and how status is reflected are not present, which makes it hard to understand the technical flow and/or to propose alternative solutions.
Force-pushed 5eb6664 to 472a336.
Hey folks 👋 @g-gaston Dropping by from the Flatcar Container Linux project - we're a container optimised Linux distro; we joined the CNCF a few weeks ago (incubating). We've been driving implementation spikes of in-place OS and Kubernetes updates in ClusterAPI for some time - at the OS level. Your proposal looks great from our point of view.

While progress has been slower in recent months due to project resource constraints, Flatcar has working proof-of-concept implementations for both in-place updating the OS and Kubernetes - independently. Our implementation is near production ready on the OS level, update activation can be coordinated via kured, and the worker cluster control plane picks up the correct versions. We do lack any signalling to the management cluster, as well as more advanced features like coordinated roll-backs (though this would be easy to implement on the OS level). In theory, our approach to in-place Kubernetes updates is distro agnostic (given the "mutable sysext" changes in recent versions of systemd starting with release 256).

We presented our work in a CAPZ office hours call earlier this year: https://youtu.be/Fpn-E9832UQ?feature=shared&t=164 (slide deck: https://drive.google.com/file/d/1MfBQcRvGHsb-xNU3g_MqvY4haNJl-WY2/view). We hope our work can provide some insights that help to further flesh out this proposal. Happy to chat if folks are interested. (CC: @tormath1 for visibility)

EDIT after initial feedback from @neolit123: in-place updates of Kubernetes in CAPI are in "proof of concept" stage. Just using sysexts to ship Kubernetes (with and without CAPI) has been in production on (at least) Flatcar for quite some time. Several CAPI providers (OpenStack, Linode) use sysexts as the preferred mechanism for Flatcar worker nodes.
i don't think i've seen usage of sysext with k8s. its provisioning of image extensions seems like something users can do, but they might as well stick to the vanilla way of using the k8s package registries and employing update scripts for e.g. containerd. the kubeadm upgrade docs just leverage the package manager upgrade way.
one concern i have with systemd-sysext is that you still have an intermediate build process for the extension, while the k8s package build process is already done by the k8s release folks.
On Flatcar, sysexts are the preferred way to run Kubernetes. "Packaging" is straightforward - create a filesystem image from a subdirectory - and does not require any distro specific information. The resulting sysext can be used across many distros. I'd argue that the overhead is negligible: download the release binaries into a sub-directory and run a single command to build the image. Drawbacks of the packaging process are:
Sysexts are already used by the ClusterAPI OpenStack and Linode providers with Flatcar (though without in-place updates).
the kubeadm and kubelet systemd drop-in files (in the official k8s packages) have some distro specific nuances like Debian vs RedHat paths. are sysexts capable of managing different drop-in files if the target distro is different, perhaps even detecting that automatically?
Sysexts focus on shipping application bits (Kubernetes in the case at hand); configuration is usually supplied by separate means. That said, a complementary image-based configuration mechanism ("confext") exists for /etc. Both approaches have their pros and cons; I'd say it depends on the specifics (I'm not very familiar with kubeadm on Debian vs. Red Hat, I'm more of an OS person :) ). But this should by no means be a blocker. (Sorry for the sysext nerd sniping. I think we should stick to the topic of this PR - I merely wanted to raise that we have a working PoC of in-place Kubernetes updates. Happy to discuss Kubernetes sysexts elsewhere.)
[APPROVALNOTIFIER] This PR is NOT APPROVED.
/lgtm
@anmazzotti: changing LGTM is restricted to collaborators.
Would it make sense to move this PR out of draft status?
I think this goes in the right direction.
Awesome work in-place WG team!
Mostly a few nits and cleanup from my side. If we keep iterating quickly on feedback, I really think we can get this merged after KubeCon.
We discussed this today at the office hours; the plan is to merge by lazy consensus 1 or 2 weeks after KubeCon.
FYI we discussed the topic at KubeCon and all the people present confirmed that there are no blockers to merging in 1 or 2 weeks. @g-gaston please close as many comments as possible.
addressed all comments :)
- To provide rollbacks in case of an in-place update failure. Failed updates need to be fixed manually by the user on the machine or by replacing the machine.
- Introduce any changes to KCP (or any other control plane provider), MachineDeployment, MachineSet, Machine APIs.
- Maintain a coherent user experience for both rolling and in-place updates.
Should this really be a non-goal now?
- To provide rollbacks in case of an in-place update failure. Failed updates need to be fixed manually by the user on the machine or by replacing the machine.
- Introduce any changes to KCP (or any other control plane provider), MachineDeployment, MachineSet, Machine APIs.
- Maintain a coherent user experience for both rolling and in-place updates.
- Allow in-place updates for single-node clusters without the requirement to reprovision hosts (future goal).
I think this comment is still valid
What about the OnDelete strategy of MDs?
Probably we just shouldn't try in-place if OnDelete is configured? (so maybe it's a non-goal?)
#11029 (comment)
(just opened a new comment to make it easier to deal with, considering GitHub UI limitations...)
What this PR does / why we need it:
Proposal doc for In-place updates written by the In-place updates feature group.
Starting this as a draft to collect early feedback on the main ideas and high level flow. APIs and some other lower level details are left purposefully as TODOs to focus the conversation on the rest of the doc, speed up consensus and avoid rework.
Fixes #9489
/area documentation