Add 3-node HA + cloud-ui in TF + local-fast SC for DC-Controlplane#102
Add 3-node HA + cloud-ui in TF + local-fast SC for DC-Controlplane#102HiranAdikari wants to merge 7 commits into
Conversation
Replace the list-passthrough machine_pools input with a per-node-count interface (node_count, node_ips, sizing, vips). The module now generates one machine_pool per node, each with quantity=1, stable names node1/node2/node3, and its own cloud-init that pins enp1s0 to the matching static IP from node_ips. node_count is validated to 1/3/5 so callers cannot accidentally pick a count that cannot form an etcd quorum. A kube-vip static-pod manifest is rendered once at module level (ARP mode, leader-elected) and shipped via cloud-init write_files to /var/lib/rancher/rke2/server/manifests/kube-vip.yaml on every server node. The API VIP is added automatically to the apiserver tls-san so clients reaching the cluster through kube-vip see a cert-valid name; operators can add more SANs via tls_san_extra. The API VIP is NOT allocated from the Harvester IPPool — it is reserved out-of-band and claimed by kube-vip via ARP. The IPPool keeps a single ingress VIP range. To support per-node user_data without duplicating the k8s-cluster module, the machine_pools list entry gains an optional user_data field. When set on a pool it overrides the module-level user_data for that pool only; existing single-user_data callers are unaffected. Adds api_vip, ingress_vip, and node_ips outputs so consumer layers can build /etc/hosts entries, kubeconfigs, and Rancher proxy URLs without re-deriving the values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The harvester-cloud-provider Helm chart that RKE2 installs automatically on harvester clusters ships its own kube-vip DaemonSet in kube-system, configured for Service-type LoadBalancer announcements (cp_enable=false, svc_enable=true). Naming our apiserver-VIP static pod kube-vip too is fine technically (different name suffixes) but turns kubectl listings into "which kube-vip am I looking at" guessing games. Rename to kube-vip-apiserver so the role is obvious at a glance. Adds a short comment to the manifest explaining the split. No behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cloud-ui Deployment + Service are applied out-of-band by the cloud-ui GitHub workflow. The matching Ingress was applied the same way, which means a fresh cluster lost cloud-ui routing until the next workflow run — easy to miss during a planned rebuild. Move just the Ingress (not the Deployment/Service) into this module so TF restores routing as part of the cluster bring-up; the Service the Ingress references appears when the workflow next runs. Opt-in: cloudui_hostname empty string means no Ingress is created, so existing consumers without a cloud-ui are unaffected. When set, the Ingress reuses the dc-api self-signed cert via SAN — the existing wildcard in ingress_additional_dns_names already covers sibling hostnames in the same env, so one browser cert warning covers both UIs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds kubernetes_deployment.cloud_ui and kubernetes_service.cloud_ui, gated on a new cloudui_image variable (empty = skip). Mirrors the dc-api dual-ownership pattern: TF owns the Deployment shape, CI's kubectl set image rolls the image, lifecycle ignores image so the two don't fight. Before this, a cluster destroy+recreate left cloud-ui completely missing until the next push to the cloud-ui workflow. With the Deployment in TF the rebuild brings cloud-ui up immediately with whatever cloudui_image is set to (default :latest), and CI takes over rolling forward on the next push. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On slow / overloaded host storage, the default 2-replica Longhorn
StorageClass replicates every etcd fsync across the network, which is
the bottleneck that causes 3-node etcd bootstrap to time out and
single-node steady-state to hang. The fix is to put the control-plane
VM root disks on a 1-replica, strict-local Longhorn volume so fsync
becomes a local-disk call. The application-layer etcd quorum is what
makes the loss of disk-level HA acceptable.
dc-controlplane module:
- Creates a kubernetes_storage_class_v1 on the Harvester host
cluster (numberOfReplicas=1, dataLocality=strict-local) using
the existing kubernetes.harvester provider alias. Gated on a new
create_local_storage_class flag, default true.
- vm_storage_class_override lets operators point at a pre-existing
custom SC if they prefer.
- Plumbs the resolved SC name into each generated machine_pools
entry's storage_class_name field.
k8s-cluster module:
- machine_pools entries gain an optional storage_class_name. When
set, it's threaded into the disk_info JSON as storageClassName,
which Harvester honours per-disk. When unset, the disk_info JSON
omits the field entirely — preserves the original behaviour
(use the host cluster's default SC).
In-cluster PVCs (e.g. dc-postgres) are unaffected — those continue to
use the downstream cluster's default SC, which the Harvester CSI
driver translates to the host's default replicated SC.
Production environments with adequate host IOPS should set
create_local_storage_class = false to recover full per-disk
replication for the control-plane VM root disks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harvester's validator webhook rejects StorageClasses with dataLocality=strict-local because it would forbid VM live-migration of any workload using that SC, which conflicts with Harvester's operational model. Switch to best-effort: Longhorn still tries to place the (single) replica on the same Harvester host as the consuming VM, but allows migration when capacity demands it. For our control-plane use case the practical effect is identical: at volume creation Longhorn picks the local host and stays there; fsyncs are local. The only behavioural difference would be the fallback when local capacity is exhausted — strict-local refuses to place at all, best-effort places remotely as a last resort. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Harvester's VM provisioner ignores the storageClassName field that HarvesterConfig.disk_info advertises — boot-disk PVCs are cloned from the source image's PVC and silently inherit the SOURCE image's StorageClass. So even though our pools requested dcapi-controlplane-local, the cloned PVCs ended up on longhorn-ubuntu-22-04 (the SC the upstream Ubuntu image was imported with) — full 2-replica replicated writes, no fsync win. The fix: when create_local_storage_class is true, the module now also creates its OWN VirtualMachineImage in the project namespace, downloaded from var.node_image_url (default upstream Ubuntu 22.04 cloud image), with storageClassName pinned to the local SC. Clones from this image inherit dcapi-controlplane-local — finally giving us single-replica strict-local fsyncs. node_image_name becomes optional. When the module creates its own image, that image's reference flows into local.effective_image_name and gets passed to each pool. node_image_name is only honored when create_local_storage_class is false (the prod path that uses a pre-existing image, typically from a bootstrap layer). A precondition fails fast if the operator turns off local SC without supplying an image. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR introduces cloud-ui workload support and performs a comprehensive refactor of the control-plane module to enable HA RKE2 cluster provisioning with static node configuration. The cloud-ui module conditionally deploys a hardened nginx container, service, and ingress when configuration variables are provided. The control-plane refactor replaces dynamic machine-pool configuration with per-node static provisioning via cloud-init, introduces kube-vip for API server and ingress VIP announcement, adds optional Longhorn-backed local storage classes for VM root disks, and extends the machine-pool schema to support per-pool cloud-init and storage-class overrides. All changes are additive or refactoring; no existing resources are removed. Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@modules/management/dc-controlplane-services/main.tf`:
- Around line 825-827: The TLS config may not include var.cloudui_hostname in
the certificate SANs, causing mTLS/hostname validation failures; update the
certificate creation so the generated cert covers the cloud-ui hostname by
adding var.cloudui_hostname to the certificate's subjectAltNames/hosts list used
to create kubernetes_secret_v1.dc_api_tls (or the cert-manager Certificate
resource) and ensure the tls block continues to reference
kubernetes_secret_v1.dc_api_tls.metadata[0].name so the ingress uses the secret
containing the cert that includes var.cloudui_hostname.
In `@modules/management/dc-controlplane-services/variables.tf`:
- Around line 64-67: The variable description for cloudui_hostname is out of
date: it still states the cloud-ui Deployment and Service are deployed
separately, but the module now creates them when cloudui_image is set. Update
the description of variable "cloudui_hostname" to state that when non-empty the
module creates an Ingress that routes the hostname to a Service named "cloud-ui"
in the dc-system namespace and that if cloudui_image is provided the module will
also create the cloud-ui Deployment and Service (or set to empty string to opt
out).
In `@modules/management/dc-controlplane/main.tf`:
- Around line 83-127: The module pins the created Harvester image to
var.local_storage_class_name instead of the resolved
local.vm_storage_class_name, so a vm_storage_class_override is ignored; update
the harvester_image.dcapi_node resource to set storage_class_name =
local.vm_storage_class_name, and keep the locals.effective_image_name logic
as-is so the module will prefer the created image when present (ensuring the
image you create is actually pinned to the resolved storage class).
In `@modules/management/dc-controlplane/variables.tf`:
- Around line 193-223: Update the documentation and variable descriptions to
match the actual StorageClass type used (best-effort) instead of saying
strict-local: change the module header comment and the description fields for
variable "create_local_storage_class" and "local_storage_class_name" (and any
nearby comments that mention strict-local) to state that the created Longhorn
StorageClass uses best-effort topology/migration semantics and explain the
resulting placement/migration behavior and tradeoffs; keep
"vm_storage_class_override" behavior note (empty string defers to
create_local_storage_class) but ensure its description does not reference
strict-local either.
- Around line 34-52: Add a validation that enforces node_count matches the
length of node_ips so provisioning can't silently mismatch sizes: in the
variable "node_ips" validation (or as an additional validation on "node_count")
add a condition requiring length(var.node_ips) == var.node_count and provide a
clear error_message like "length(node_ips) must equal node_count"; reference the
variables node_count and node_ips when adding the check so the module fails fast
on mismatched values.
In `@modules/workloads/k8s-cluster/main.tf`:
- Around line 118-122: The user_data assignment currently treats whitespace-only
strings as non-empty; update the conditional checks that use
try(each.value.user_data, "") and try(each.value.user_data, null) to apply
trimspace(...) before comparing to "" (e.g., trimspace(try(each.value.user_data,
"")) != "") so whitespace-only overrides are treated as empty and fall back to
local.effective_node_user_data; make the same change to the other similar block
that uses try(each.value.node_user_data, "") so both override paths use
trimspace checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7052de2a-efb4-4d6c-be83-5463e1382656
📒 Files selected for processing (9)
modules/management/dc-controlplane-services/main.tfmodules/management/dc-controlplane-services/variables.tfmodules/management/dc-controlplane/main.tfmodules/management/dc-controlplane/outputs.tfmodules/management/dc-controlplane/templates/kube-vip-rke2.yaml.tftplmodules/management/dc-controlplane/templates/node-cloud-init.yaml.tftplmodules/management/dc-controlplane/variables.tfmodules/workloads/k8s-cluster/main.tfmodules/workloads/k8s-cluster/variables.tf
| tls { | ||
| hosts = [var.cloudui_hostname] | ||
| secret_name = kubernetes_secret_v1.dc_api_tls.metadata[0].name |
There was a problem hiding this comment.
Ensure cloud-ui TLS hostname is covered by the generated certificate.
Line 826 uses var.cloudui_hostname, but cert SANs are not automatically guaranteed to include it. This can cause certificate mismatch when cloud-ui ingress is enabled.
Suggested fix
resource "tls_self_signed_cert" "dc_api" {
@@
- dns_names = concat(
- [var.dcapi_hostname],
- var.ingress_additional_dns_names,
- )
+ dns_names = distinct(compact(concat(
+ [var.dcapi_hostname],
+ var.cloudui_hostname != "" ? [var.cloudui_hostname] : [],
+ var.ingress_additional_dns_names,
+ )))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/management/dc-controlplane-services/main.tf` around lines 825 - 827,
The TLS config may not include var.cloudui_hostname in the certificate SANs,
causing mTLS/hostname validation failures; update the certificate creation so
the generated cert covers the cloud-ui hostname by adding var.cloudui_hostname
to the certificate's subjectAltNames/hosts list used to create
kubernetes_secret_v1.dc_api_tls (or the cert-manager Certificate resource) and
ensure the tls block continues to reference
kubernetes_secret_v1.dc_api_tls.metadata[0].name so the ingress uses the secret
containing the cert that includes var.cloudui_hostname.
| variable "cloudui_hostname" { | ||
| type = string | ||
| description = "Public hostname for the cloud-ui ingress. When non-empty, this module creates an Ingress (and re-uses the dc-api self-signed cert via SAN) that routes the hostname to a Service named 'cloud-ui' in the dc-system namespace. The cloud-ui Deployment + Service themselves are deployed separately (out-of-band CI workflow); leaving the Ingress here means a cluster rebuild restores routing as soon as the workflow drops the Service back in. Set to empty string to opt out." | ||
| default = "" |
There was a problem hiding this comment.
Update cloudui_hostname description to match current module behavior.
The text says cloud-ui Deployment/Service are deployed separately, but this module now creates them when cloudui_image is set. Please align the description so consumers don’t misconfigure ownership expectations.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/management/dc-controlplane-services/variables.tf` around lines 64 -
67, The variable description for cloudui_hostname is out of date: it still
states the cloud-ui Deployment and Service are deployed separately, but the
module now creates them when cloudui_image is set. Update the description of
variable "cloudui_hostname" to state that when non-empty the module creates an
Ingress that routes the hostname to a Service named "cloud-ui" in the dc-system
namespace and that if cloudui_image is provided the module will also create the
cloud-ui Deployment and Service (or set to empty string to opt out).
| # Resolved SC name for the VM root disks. Empty string = let Harvester use | ||
| # the host cluster's default StorageClass (the original module behaviour). | ||
| locals { | ||
| vm_storage_class_name = ( | ||
| var.vm_storage_class_override != "" ? var.vm_storage_class_override : | ||
| var.create_local_storage_class ? var.local_storage_class_name : | ||
| "" | ||
| ) | ||
| } | ||
|
|
||
| # ── VirtualMachineImage on the local SC ────────────────────────────────────── | ||
| # Harvester's VM provisioner ignores the storageClassName field passed via | ||
| # HarvesterConfig.disk_info — it clones the boot disk from the source image's | ||
| # PVC and the cloned PVC inherits the SOURCE image's StorageClass, not the | ||
| # requested one. To actually land VM root disks on dcapi-controlplane-local | ||
| # we have to give the source image itself that SC. So when the operator opts | ||
| # into the local SC, the module also creates its own VirtualMachineImage in | ||
| # the project namespace, downloaded from the same upstream URL, but with | ||
| # storageClassName pinned. VMs cloned from this image inherit the local SC. | ||
| resource "harvester_image" "dcapi_node" { | ||
| count = var.create_local_storage_class ? 1 : 0 | ||
|
|
||
| name = "dcapi-controlplane-ubuntu" | ||
| namespace = var.project_name | ||
| display_name = var.node_image_display_name | ||
| source_type = "download" | ||
| url = var.node_image_url | ||
|
|
||
| storage_class_name = var.local_storage_class_name | ||
|
|
||
| depends_on = [ | ||
| module.project, | ||
| kubernetes_storage_class_v1.vm_local, | ||
| ] | ||
| } | ||
|
|
||
| # Resolved image reference passed to each machine_pool. When the module | ||
| # created its own image, use that. Otherwise the operator must supply | ||
| # node_image_name themselves (typically from a bootstrap layer's output). | ||
| locals { | ||
| effective_image_name = ( | ||
| var.create_local_storage_class && length(harvester_image.dcapi_node) > 0 | ||
| ? "${var.project_name}/${harvester_image.dcapi_node[0].name}" | ||
| : var.node_image_name | ||
| ) |
There was a problem hiding this comment.
Honor vm_storage_class_override when selecting the source image.
vm_storage_class_override wins in local.vm_storage_class_name, but the generated harvester_image is still pinned to var.local_storage_class_name, and effective_image_name always prefers that image while create_local_storage_class = true. Because the boot disk inherits the source image StorageClass, the override is ignored on the path where it matters most.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/management/dc-controlplane/main.tf` around lines 83 - 127, The module
pins the created Harvester image to var.local_storage_class_name instead of the
resolved local.vm_storage_class_name, so a vm_storage_class_override is ignored;
update the harvester_image.dcapi_node resource to set storage_class_name =
local.vm_storage_class_name, and keep the locals.effective_image_name logic
as-is so the module will prefer the created image when present (ensuring the
image you create is actually pinned to the resolved storage class).
| variable "node_count" { | ||
| type = number | ||
| description = "Number of control-plane nodes. Must be 1, 3 or 5 to form an etcd quorum. 3 is the recommended HA default." | ||
| default = 3 | ||
|
|
||
| validation { | ||
| condition = contains([1, 3, 5], var.node_count) | ||
| error_message = "node_count must be 1, 3 or 5 (etcd quorum requirement)." | ||
| } | ||
| } | ||
|
|
||
| variable "node_ips" { | ||
| type = list(string) | ||
| description = "Per-node static management IPs. List length must equal node_count. Order is significant — index N maps to pool 'node{N+1}'." | ||
|
|
||
| validation { | ||
| condition = length(var.node_ips) == length(distinct(var.node_ips)) | ||
| error_message = "node_ips entries must be unique." | ||
| } |
There was a problem hiding this comment.
Enforce node_count against node_ips.
The module builds per-node cloud-init and machine pools from var.node_ips, so node_count currently does not affect provisioning. A mismatch here can silently create the wrong control-plane size. Add a validation or precondition that requires length(node_ips) == node_count.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/management/dc-controlplane/variables.tf` around lines 34 - 52, Add a
validation that enforces node_count matches the length of node_ips so
provisioning can't silently mismatch sizes: in the variable "node_ips"
validation (or as an additional validation on "node_count") add a condition
requiring length(var.node_ips) == var.node_count and provide a clear
error_message like "length(node_ips) must equal node_count"; reference the
variables node_count and node_ips when adding the check so the module fails fast
on mismatched values.
| # ── VM root-disk storage ───────────────────────────────────────────────────── | ||
| # By default, this module creates a single-replica, strict-local Longhorn | ||
| # StorageClass on the Harvester host cluster and points the control-plane VM | ||
| # root disks at it. The fsync-latency drop versus the default 2-replica SC is | ||
| # 5-10× — necessary for etcd quorum to bootstrap and stay healthy on slow or | ||
| # overloaded host storage. The trade is that losing the Harvester host where | ||
| # a VM lives loses that VM's root disk; etcd quorum at the application layer | ||
| # is what compensates. | ||
| # | ||
| # For environments with fast underlying storage (good NVMe + spare IOPS), | ||
| # set create_local_storage_class = false to fall back to the host cluster's | ||
| # default StorageClass and recover full per-disk replication. | ||
| # | ||
| # In-cluster PVCs (e.g. dc-postgres) are unaffected — they continue to use | ||
| # the downstream cluster's default StorageClass, which the Harvester CSI | ||
| # driver translates to the host's default (replicated) SC. | ||
| variable "create_local_storage_class" { | ||
| type = bool | ||
| description = "When true, create a single-replica strict-local Longhorn StorageClass on the Harvester host cluster for the control-plane VM root disks. Recommended for dev / slow-disk environments where etcd fsync latency is the bottleneck. Set false in production when host storage has adequate IOPS for the default 2-replica SC." | ||
| default = true | ||
| } | ||
|
|
||
| variable "local_storage_class_name" { | ||
| type = string | ||
| description = "Name of the local-fast StorageClass created on the Harvester host cluster when create_local_storage_class = true." | ||
| default = "dcapi-controlplane-local" | ||
| } | ||
|
|
||
| variable "vm_storage_class_override" { | ||
| type = string | ||
| description = "Explicit StorageClass name for the control-plane VM root disks. Takes precedence over create_local_storage_class — use this to point at a pre-existing custom SC on the Harvester cluster. Empty string defers to the create_local_storage_class logic." |
There was a problem hiding this comment.
Align the storage-class description with the implementation.
These comments and the create_local_storage_class description still say strict-local, but the resource now uses best-effort. Keeping the wording stale will mislead operators about the expected placement and migration behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/management/dc-controlplane/variables.tf` around lines 193 - 223,
Update the documentation and variable descriptions to match the actual
StorageClass type used (best-effort) instead of saying strict-local: change the
module header comment and the description fields for variable
"create_local_storage_class" and "local_storage_class_name" (and any nearby
comments that mention strict-local) to state that the created Longhorn
StorageClass uses best-effort topology/migration semantics and explain the
resulting placement/migration behavior and tradeoffs; keep
"vm_storage_class_override" behavior note (empty string defers to
create_local_storage_class) but ensure its description does not reference
strict-local either.
| user_data = ( | ||
| try(each.value.user_data, null) != null && try(each.value.user_data, "") != "" | ||
| ? each.value.user_data | ||
| : local.effective_node_user_data | ||
| ) |
There was a problem hiding this comment.
Treat whitespace-only overrides as empty values.
The checks on Line 119 and Line 134 only compare against "", so whitespace-only values are treated as valid overrides. Use trimspace(...) != "" to match the documented “non-empty” behavior.
Proposed fix
user_data = (
- try(each.value.user_data, null) != null && try(each.value.user_data, "") != ""
+ try(each.value.user_data, null) != null && trimspace(try(each.value.user_data, "")) != ""
? each.value.user_data
: local.effective_node_user_data
)
@@
- try(each.value.storage_class_name, null) != null && try(each.value.storage_class_name, "") != "" ? {
+ try(each.value.storage_class_name, null) != null && trimspace(try(each.value.storage_class_name, "")) != "" ? {
storageClassName = each.value.storage_class_name
} : {}Also applies to: 125-137
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/workloads/k8s-cluster/main.tf` around lines 118 - 122, The user_data
assignment currently treats whitespace-only strings as non-empty; update the
conditional checks that use try(each.value.user_data, "") and
try(each.value.user_data, null) to apply trimspace(...) before comparing to ""
(e.g., trimspace(try(each.value.user_data, "")) != "") so whitespace-only
overrides are treated as empty and fall back to local.effective_node_user_data;
make the same change to the other similar block that uses
try(each.value.node_user_data, "") so both override paths use trimspace checks.
Summary
Three related changes to the
dc-controlplane+dc-controlplane-servicesmodules, plus a small
k8s-clusterchange to support per-pool SC overrides.1. 3-node HA dc-controlplane (2 commits)
dc-controlplanemodule now provisions 3 control-plane VMs.kube-vipruns as a static pod on each control-plane node, exposingthe apiserver via an ARP-announced VIP.
dc-controlplane/templates/kube-vip-rke2.yaml.tftpl.dc-controlplane/templates/node-cloud-init.yaml.tftplpulls in the kube-vip manifest at bootstrap.
Harvester host clusters that ship their own kube-vip.
2. cloud-ui in Terraform (2 commits)
dc-controlplane-servicesmodule now creates the cloud-uiDeployment + Service, with optional Ingress (
cloud_ui_ingress_enabled).imagefield islifecycle.ignoredso CI'skubectl set imagerollouts don't fight Terraform.3. Local-fast StorageClass for control-plane disks (3 commits)
create_local_storage_classflag (defaulttrue) indc-controlplane. When set, the module creates a 1-replica,best-effort-local Longhorn SC on the Harvester host cluster and
threads it through each pool's
storage_class_name.VirtualMachineImagewith that SC pinned. Requiredbecause Harvester ignores
storageClassNameon cloned PVCs — theysilently inherit the source image's SC.
recover full replicated SC for control-plane root disks.
Test plan
terraform fmt -recursive -checkcleanterraform validateclean across affected modulesweeks — control-plane Ready, kube-vip serving the apiserver VIP,
cloud-ui Deployment shape matches the TF, control-plane root disks
on the local-fast SC.
Risks
create_local_storage_class=trueaffectsany environment consuming this module on Harvester host storage
without configured
vm_storage_class_override. Operators upgradingshould review the flag.
apiserver VIP for ~30s during the rolling restart of control-plane
nodes. Plan a maintenance window if applying to a live HA cluster.
🤖 Generated with Claude Code