# docs(cozystack-upgrade): add KubeVirt 1.6→1.8 VM cold-restart workflow (#7)
@@ -253,6 +253,127 @@ Restore from backup. There is no clean in-cluster recovery for a deleted `cozy-s | |||||||||
| 3. Re-apply the Platform Package from rescue.yaml (manual review required; CRD schemas may have moved). | ||||||||||
| 4. Expect tenant disruption; communicate to users. | ||||||||||
|
|
||||||||||
| ## 8. KubeVirt 1.6.x → 1.8.x: live-migration of pre-existing VMs fails on `virtio-net` | ||||||||||
|
|
||||||||||
| ### Symptom | ||||||||||
|
|
||||||||||
| After the Cozystack upgrade rolls out a new KubeVirt version that crosses the QEMU bump boundary (specifically 1.6.x → 1.7+), every live-migration that KubeVirt's `workloadUpdateMethods` triggers fails with: | ||||||||||
|
|
||||||||||
| ```text | ||||||||||
| virError(Code=9, Domain=10, Message='operation failed: job 'migration in' failed: | ||||||||||
| load of migration failed: Operation not permitted') | ||||||||||
| qemu-kvm: error while loading state for instance 0x0 of device '0000:00:02.0:00.0/virtio-net' | ||||||||||
| ``` | ||||||||||
|
|
||||||||||
| `kubectl get vmim -A` shows a growing pile of `Failed` evacuations on every running VM. KubeVirt keeps retrying — VMs stay up but the migration loop never converges. | ||||||||||
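
To gauge the backlog, a quick per-namespace tally works; field positions assume the default `vmim` printer columns, where phase is the third column (verify on your cluster first):

```bash
# Count Failed VirtualMachineInstanceMigrations per namespace
kubectl get vmim -A --no-headers \
  | awk '$3 == "Failed" { c[$1]++ } END { for (ns in c) print ns, c[ns] }'
```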

### Root cause

[kubevirt/kubevirt#16386](https://github.com/kubevirt/kubevirt/issues/16386). When KubeVirt is upgraded across a QEMU version bump (e.g. `qemu-9.1.0-19.el9` → `qemu-9.1.0-20.el9`), VMs that were running before the upgrade carry in-memory device state tied to the old QEMU. The new QEMU cannot reload that state for some devices (notably `virtio-net`), so the incoming `migration in` job fails with `Operation not permitted`.
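
To confirm which QEMU a given VM is still running under, you can query its launcher pod directly. A sketch, assuming a CentOS-based launcher image where the binary lives at `/usr/libexec/qemu-kvm` (adjust the path for your build); `<namespace>` and `<virt-launcher-pod>` are placeholders:

```bash
# Print the QEMU version inside a launcher pod's compute container
kubectl -n <namespace> exec <virt-launcher-pod> -c compute -- \
  /usr/libexec/qemu-kvm --version
```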

This is **not** specific to network/storage configuration. It affects every VM that started under the old QEMU and never restarted. New VMs and VMs restarted after the upgrade are unaffected.

Switching `workloadUpdateMethods` to `[Evict]` does **not** help: the `virt-launcher-eviction-interceptor` webhook converts evictions back into live-migrations, because VMIs have `evictionStrategy: LiveMigrate` (an immutable field on a running VMI).
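
You can confirm the strategy in force on the running VMIs before choosing the restart path:

```bash
# List evictionStrategy for every VMI
kubectl get vmi -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.evictionStrategy}{"\n"}{end}'
```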

### Recovery / workaround

The only fix is to cold-restart every VM that was running before the upgrade; that re-initialises its in-memory state under the new QEMU. The procedure below disables the operator's auto-migration before the upgrade so it doesn't trigger a flapping loop, then restarts VMs in a controlled, paced sequence.

**Run this before the `helm upgrade` (Step 5 of the main skill) when the target version crosses KubeVirt 1.6.x → 1.8.x.**

```bash
# 1. Snapshot baseline so you can verify what changed
kubectl get vmi -A -o wide > /tmp/vmis-pre-upgrade.txt
kubectl get pods -l kubevirt.io=virt-launcher -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.containers[?(@.name=="compute")].image}{"\n"}{end}' \
  > /tmp/launchers-pre-upgrade.txt
kubectl -n cozy-kubevirt get kubevirt kubevirt -o yaml > /tmp/kubevirt-pre.yaml

# 2. Disable workloadUpdateMethods so the new operator doesn't auto-migrate every VM
kubectl -n cozy-kubevirt patch kubevirt kubevirt --type=merge \
  -p '{"spec":{"workloadUpdateStrategy":{"workloadUpdateMethods":[]}}}'

# 3. Suspend the kubevirt HelmRelease so Flux doesn't reconcile
#    workloadUpdateMethods back from the chart values
kubectl -n cozy-kubevirt patch hr kubevirt --type=merge \
  -p '{"spec":{"suspend":true}}'

# 4. Verify both took effect
kubectl -n cozy-kubevirt get kubevirt kubevirt \
  -o jsonpath='{.spec.workloadUpdateStrategy.workloadUpdateMethods}{"\n"}'
# expected: []
kubectl -n cozy-kubevirt get hr kubevirt -o jsonpath='{.spec.suspend}{"\n"}'
# expected: true

# 5. NOW run helm upgrade for cozystack (Step 5 of the main skill).
#    The control plane (virt-api/controller/handler/operator) will roll over to
#    v1.8.x. Existing virt-launcher pods are NOT touched, so VMs keep running
#    on the old QEMU. Live-migration BETWEEN two old launchers still works.
```

After the upgrade reaches `Ready=True`, move on to the phased cold-restart. One way to block until KubeVirt reports ready (the `Available` condition name is an assumption; check the CR's status if yours differs):
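
```bash
# Wait for the KubeVirt CR to report Available after the upgrade
kubectl -n cozy-kubevirt wait kubevirt/kubevirt \
  --for=condition=Available --timeout=15m
```

Then run the phased restart: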

```bash
# 6. Build the worklist of VMIs to restart. Excludes any that the operator
#    must leave alone (replace EXCLUDED_NS as needed).
EXCLUDED_NS=tenant-edoors   # comma-separated if more than one; adjust the awk filter below
kubectl get vmi -A --no-headers \
  | awk -v ex="$EXCLUDED_NS" '
      BEGIN { n=split(ex,e,","); for (i in e) skip[e[i]]=1 }
      $4 == "Running" && !($1 in skip) { print $1"/"$2 }' \
  > /tmp/vms-to-restart.txt
wc -l /tmp/vms-to-restart.txt

# 7. Restart each VMI in turn at 30s spacing. Deleting the launcher pod makes
#    the VM controller create a new launcher on the now-current image.
#    Per-VM downtime ~30-60s. The field selector skips completed/failed pods
#    left over from earlier migration attempts.
while read entry; do
  ns="${entry%%/*}"
  vmi="${entry##*/}"
  pod=$(kubectl -n "$ns" get pods -l kubevirt.io=virt-launcher,vm.kubevirt.io/name="$vmi" \
    --field-selector=status.phase=Running \
    -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)

  if [ -n "$pod" ]; then
    echo "$(date +%H:%M:%S) restart $ns/$vmi (pod $pod)"
    kubectl -n "$ns" delete pod "$pod" --wait=false
  fi
  sleep 30
done < /tmp/vms-to-restart.txt
```
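
While the loop runs, it helps to watch convergence from a second terminal, e.g.:

```bash
# Surface any VMI not in the Running phase, refreshed every 10s
watch -n 10 'kubectl get vmi -A --no-headers | grep -v Running'
```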

**Pacing.** Total wall time is roughly 30s spacing × N VMs: for 161 VMs that is ~80 min of sleeps, ~85 min with per-VM kubectl overhead. Tighter spacing risks storage IO surges (DRBD/LINSTOR resyncs). Loosen the spacing if storage is hot; tighten it if the maintenance window is short.

After the loop:

```bash
# 8. Verify everything landed on the new launcher image
kubectl get pods -l kubevirt.io=virt-launcher -A \
  -o jsonpath='{range .items[*]}{.spec.containers[?(@.name=="compute")].image}{"\n"}{end}' \
  | sort | uniq -c
# expected: only excluded VMs (if any) remain on the old image

# 9. Confirm no VMI is wedged
kubectl get vmi -A --no-headers \
  | awk '$4 != "Running" && $4 != "Pending"'
```
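
To cross-check against the step 1 baseline, regenerate the per-pod listing and diff it (pod names will all have changed after the restarts, so read the diff for image tags, not names):

```bash
# Compare launcher images against the pre-upgrade snapshot
kubectl get pods -l kubevirt.io=virt-launcher -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.containers[?(@.name=="compute")].image}{"\n"}{end}' \
  > /tmp/launchers-post-upgrade.txt
diff /tmp/launchers-pre-upgrade.txt /tmp/launchers-post-upgrade.txt
```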

### Steady state

If any VMs were intentionally skipped (e.g. tenants who couldn't take downtime in this window), leave `workloadUpdateMethods` empty until those VMs are restarted naturally. Once the cluster is uniformly on the new launcher image:

```bash
kubectl -n cozy-kubevirt patch hr kubevirt --type=merge \
  -p '{"spec":{"suspend":false}}'

kubectl -n cozy-kubevirt patch kubevirt kubevirt --type=merge \
  -p '{"spec":{"workloadUpdateStrategy":{"workloadUpdateMethods":["LiveMigrate","Evict"]}}}'
```
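
After resuming, it is worth confirming that the next Flux reconcile leaves the strategy in place. A sketch, assuming the `flux` CLI is available (it just avoids waiting for the reconcile interval):

```bash
# Force a reconcile, then re-check that the strategy survived it
flux reconcile helmrelease kubevirt -n cozy-kubevirt
kubectl -n cozy-kubevirt get kubevirt kubevirt \
  -o jsonpath='{.spec.workloadUpdateStrategy.workloadUpdateMethods}{"\n"}'
# expected: ["LiveMigrate","Evict"]
```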

### Coordination with the user

Before starting, communicate clearly:

- Every VM (except explicit opt-outs) will get **one** ~30-60s downtime during the restart loop.
- The order is alphabetical by namespace; rough ETA is ~30s per VM.
- Tenants with HA workloads on top of single VMIs (e.g. single-replica databases) should be warned individually if their app can't tolerate a brief restart.
- Tenants who need to defer should be added to the exclusion list; their VM will keep running on the old QEMU until they restart it themselves.

## Diagnostic quick reference

| Question | Command |
|---|---|