design-proposal: split kubernetes package and add Talos backends #8

kvaps wants to merge 1 commit into cozystack:main
Conversation
Propose extracting node pools from the kubernetes application into a sibling kubernetes-nodes application, modelled on the vm-instance/vm-disk split. Add a backend abstraction that supports the existing KubeVirt+kubeadm flow alongside new Talos backends: KubeVirt+Talos via clastix/talos-csr-signer, and cloud-talos for Hetzner and Azure without Cluster API.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Code Review
This pull request proposes a significant architectural change to split the monolithic kubernetes package into separate control-plane and node-pool components, enabling independent lifecycles and multi-backend support for KubeVirt and cloud-native Talos workers. The review feedback identifies a technical error regarding the gRPC protocol (TCP vs UDP), suggests consolidating the node-lifecycle-controller and Talos tokens at the cluster level for better efficiency and simpler configuration, and requests further implementation details for the dependency discovery mechanism in the proposed admission webhook.
> Renders the same CAPI/CAPK objects as above, but with a `TalosConfigTemplate` (from `cluster-api-bootstrap-provider-talos`) replacing `KubeadmConfigTemplate`. Worker VMs boot from a Talos image. Bootstrap fetches the Talos machineconfig from CAPI and joins the cluster via standard Talos PKI.
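For reference, the bootstrap object from `cluster-api-bootstrap-provider-talos` that would stand in for the `KubeadmConfigTemplate` looks roughly like this (the pool name and Talos version are illustrative):

```yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: TalosConfigTemplate
metadata:
  name: team-a-workers          # illustrative pool name
spec:
  template:
    spec:
      generateType: join        # workers join the existing cluster
      talosVersion: v1.8        # version is illustrative
```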
> The tenant's `KamajiControlPlane` carries an `additionalContainers` entry running `clastix/talos-csr-signer` listening on TCP/50001 (gRPC runs over TCP, not UDP), exposed alongside `:6443` on the tenant API LoadBalancer. This is what allows `talosctl` to operate against worker nodes whose control plane is Kamaji rather than Talos.
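Sketched out, the control-plane side might look like the following; the exact sidecar field path on `KamajiControlPlane` and the image reference are assumptions — only the TCP/50001 listener and co-exposure with `:6443` come from the proposal:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
kind: KamajiControlPlane
metadata:
  name: team-a
spec:
  # The exact sidecar field path should be verified against the
  # KamajiControlPlane CRD; the proposal only states that an
  # `additionalContainers` entry is carried here.
  deployment:
    additionalContainers:
      - name: talos-csr-signer
        image: ghcr.io/clastix/talos-csr-signer:latest   # image ref illustrative
        ports:
          - name: trustd
            containerPort: 50001   # gRPC, hence TCP
            protocol: TCP
```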
> When `cluster-autoscaler` scales a `cloud-talos-*` pool down, it deletes the cloud VM. The tenant's apiserver still has a `Node` object that will linger until something deletes it. CAPI was previously the agent doing this; without CAPI, we need an equivalent.
> The `node-lifecycle-controller` from `cozystack/local-ccm` is a good fit for this role. The `kubernetes-nodes` chart for `cloud-talos-*` backends renders an NLC Deployment that runs in the management cluster but uses a kubeconfig pointing to the **tenant** apiserver. It watches Node objects with the `ToBeDeletedByClusterAutoscaler:NoSchedule` taint and removes them after a configurable grace period and unreachability check.
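A rough shape of what the chart might render for this, assuming it reuses `cozystack/local-ccm` as-is (the image tag, flag names, and kubeconfig Secret name are all illustrative, not confirmed by the proposal):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: team-a-node-lifecycle-controller   # illustrative name
  namespace: tenant-team-a                 # management-cluster namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: team-a-nlc
  template:
    metadata:
      labels:
        app: team-a-nlc
    spec:
      containers:
        - name: node-lifecycle-controller
          image: ghcr.io/cozystack/local-ccm:latest   # tag is illustrative
          args:
            # Flag names are assumptions; the controller watches Nodes
            # carrying the ToBeDeletedByClusterAutoscaler:NoSchedule taint.
            - --kubeconfig=/etc/tenant/kubeconfig     # points at the *tenant* apiserver
            - --grace-period=5m
          volumeMounts:
            - name: tenant-kubeconfig
              mountPath: /etc/tenant
              readOnly: true
      volumes:
        - name: tenant-kubeconfig
          secret:
            secretName: team-a-admin-kubeconfig      # illustrative Secret name
```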
Deploying a separate node-lifecycle-controller (NLC) for each kubernetes-nodes release leads to redundant processes watching the same tenant API server. A more efficient approach would be to manage a single NLC instance per tenant cluster (e.g., within the kubernetes control-plane chart) that handles node cleanup for all associated node pools.
> ## Failure and edge cases
> - **`kubernetes-nodes` HelmRelease created before its parent `kubernetes` HelmRelease** → the chart `fail`s the render with a clear error message identifying the missing parent (see the `lookup` sketch under the system layer below). No partial CAPI/autoscaler resources are created.
> - **Parent `kubernetes` HelmRelease deleted while children exist** → all `kubernetes-nodes` HelmReleases for that cluster fail subsequent reconciles. An admission webhook on `kubernetes` HelmRelease delete blocks the operation if any `kubernetes-nodes` references it; a minimal webhook sketch follows.
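A minimal sketch of such a guard, assuming a hypothetical `delete-guard` service in `cozy-system` that performs the name-based dependency check (the discovery logic itself is the open question raised below):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: kubernetes-release-delete-guard       # illustrative
webhooks:
  - name: delete-guard.cozystack.example      # illustrative
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail
    rules:
      - apiGroups: ["helm.toolkit.fluxcd.io"]
        apiVersions: ["v2"]
        operations: ["DELETE"]
        resources: ["helmreleases"]
    clientConfig:
      service:
        name: delete-guard       # hypothetical backing service that lists
        namespace: cozy-system   # kubernetes-nodes releases and denies the
        path: /validate          # delete when any still reference the parent
```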
The proposed admission webhook for blocking kubernetes HelmRelease deletion requires a mechanism to discover name-based dependencies. Clarifying where this logic will reside and how it will perform discovery would strengthen the proposal, especially since the linkage doesn't use standard Kubernetes owner references.
> **System layer (chart-managed, not exposed to user):**
> - Cluster CA, machine CA, apiserver endpoint — read at template time via `lookup` from the tenant's `KamajiControlPlane` (a minimal template sketch follows).
> - Talos token — generated once per pool, stored alongside the machineconfig.
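As a rough illustration of that template-time discovery (the `.Values.cluster` key and the status field path are assumptions, not confirmed by the proposal), the `kubernetes-nodes` chart could do something like:

```yaml
{{- /*
Sketch only: resolve the parent control plane at render time and fail
fast when it is missing (the "child created before parent" case above).
`.Values.cluster` and the status field path are assumptions.
*/ -}}
{{- $kcp := lookup "controlplane.cluster.x-k8s.io/v1alpha1" "KamajiControlPlane" .Release.Namespace .Values.cluster }}
{{- if not $kcp }}
{{- fail (printf "kubernetes-nodes: parent kubernetes release %q not found in %s" .Values.cluster .Release.Namespace) }}
{{- end }}
{{- $endpoint := $kcp.status.controlPlaneEndpoint }}
```

One caveat the proposal may want to note: `lookup` returns an empty map under client-side `helm template`, so this pattern relies on server-side rendering of the kind Flux's helm-controller performs.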
Generating a TALOS_TOKEN per pool while running a single talos-csr-signer sidecar per cluster (as discussed in Open Question #3 on line 253) introduces a configuration challenge for the signer, as it must be aware of all active tokens. Using a single TALOS_TOKEN per tenant cluster would simplify the sidecar configuration and the kubernetes-nodes chart logic while maintaining sufficient isolation between tenants.
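If the cluster-wide model is adopted, a generate-once pattern in the control-plane chart would keep the signer configured with a single token. A minimal sketch, assuming a per-cluster Secret name and the dotted 6+16-character shape Talos uses for `machine.token`:

```yaml
{{- /*
Sketch: one Talos machine token per tenant cluster, generated on first
render and re-read on later renders. Secret name and token shape are
assumptions.
*/ -}}
{{- $name := printf "%s-talos-token" .Release.Name }}
{{- $token := "" }}
{{- with lookup "v1" "Secret" .Release.Namespace $name }}
{{- $token = index .data "token" | b64dec }}
{{- else }}
{{- $token = printf "%s.%s" (randAlphaNum 6 | lower) (randAlphaNum 16 | lower) }}
{{- end }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
stringData:
  token: {{ $token | quote }}
```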
Adjust the proposal to reflect that the controller will be developed as an independent project under the kilo-io organization, per confirmed interest from Kilo maintainer @squat. Generalize the CRD from a tenant-specific TenantMeshLink to a tenant-agnostic ClusterMesh that references peer clusters through a map of kubeconfig Secrets. Move all tenant semantics into a dedicated Cozystack integration section that also accounts for the kubernetes-nodes split (PR cozystack#8), so a single ClusterMesh covers multi-location, multi-backend tenants.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Summary
Adds a design proposal for extracting worker node pools from the `kubernetes` application into a sibling `kubernetes-nodes` application, modelled on the existing `vm-instance`/`vm-disk` precedent. The split decouples control-plane and node-pool lifecycles, and introduces a backend abstraction so a single tenant cluster can mix:

- `kubevirt-kubeadm` — the existing flow, kept as default for backward compatibility.
- `kubevirt-talos` — new; Talos workers on KubeVirt VMs via `clastix/talos-csr-signer`.
- `cloud-talos-hetzner` and `cloud-talos-azure` — new; no Cluster API. Uses the native cluster-autoscaler with the Talos `machineconfig` injected via cloud-init, mirroring the model already proven for the management cluster (`/docs/v1.3/operations/multi-location/autoscaling/`).

Includes:

- the application split (`vm-instance`/`vm-disk` precedent)
- a Talos `machineconfig` template + user-overlay pattern (system layer hidden from the user)
- `cluster-autoscaler` running in the management cluster
- `cozystack/local-ccm` for the `cloud-talos-*` backends

Looking for feedback on the open questions, especially the long-term CAPI removal scope, the per-pool vs cluster-wide Talos token model, and NLC reuse vs fork.
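As a purely illustrative example of the intended user surface (none of these keys are the proposal's final schema), a `kubernetes-nodes` release mixing the pieces above might be configured as:

```yaml
# Hypothetical kubernetes-nodes values — every key below is illustrative.
cluster: team-a                  # parent `kubernetes` release this pool attaches to
backend: cloud-talos-hetzner     # kubevirt-kubeadm | kubevirt-talos | cloud-talos-*
minReplicas: 1
maxReplicas: 5
instanceType: cpx31              # cloud-specific machine size
machineconfigPatches:            # user overlay merged over the hidden system layer
  - |
    machine:
      kubelet:
        extraArgs:
          max-pods: "200"
```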
Test plan
This is a design proposal; no code yet. Implementation testing is scoped in the proposal:
- `kubernetes-nodes` `values.yaml`