Commit 1521aaa

Add details about non-migratable VMs
How VM is marked as non-migratable
How VM is processed on upgrade process
How VM is processed on node maintenance process
How `migrate` menu comes

Signed-off-by: Jian Wang <[email protected]>
1 parent 984f147 commit 1521aaa

File tree

6 files changed

+123
-23
lines changed

docs/host/host.md

Lines changed: 3 additions & 13 deletions
@@ -23,6 +23,8 @@ Because Harvester is built on top of Kubernetes and uses etcd as its database, t
2323

2424
Admin users can enable Maintenance Mode (select **⋮ > Enable Maintenance Mode**) to automatically evict all virtual machines from a node. This mode leverages the **live migration** feature to migrate the virtual machines to other nodes, which is useful when you need to reboot, upgrade firmware, or replace hardware components. At least two active nodes are required to use this feature.
2525

26+
Check whether the node hosts any [non-migratable VMs](../vm/live-migration.md#non-migratable-vms) and take the necessary actions.
27+
2628
:::warning
2729

2830
A [bug](https://github.com/harvester/harvester/issues/7128) may cause an I/O error to occur in virtual machines while Maintenance Mode is enabled on the underlying node. To mitigate the issue, you can set a taint on the node before enabling Maintenance Mode.
@@ -135,19 +137,7 @@ Eviction cannot be completed if the remaining nodes cannot accept replicas from
135137
136138
### 4. Manage non-migratable VMs.
137139
138-
[Live migration](../vm/live-migration.md) cannot be performed for VMs with certain properties.
139-
140-
- The VM has PCI passthrough devices or vGPU devices.
141-
142-
A PCI device is bound to a node. You must remove the PCI device from the VM, or delete the VM and then create a new VM from a backup or snapshot.
143-
144-
- The VM has a node selector or affinity rules that bind it to the node to be removed.
145-
146-
You must change the node selector or affinity rules.
147-
148-
- The VM is on a VM network that binds it to the node to be removed.
149-
150-
You must select a different VM network.
140+
Check whether the node hosts any [non-migratable VMs](../vm/live-migration.md#non-migratable-vms) and take the necessary actions.
151141
152142
:::tip
153143
Create a backup or snapshot for each non-migratable VM before modifying the settings that bind it to the node that you want to remove.

docs/upgrade/automatic.md

Lines changed: 16 additions & 0 deletions
@@ -77,6 +77,22 @@ The Harvester and Rancher upgrade processes are independent of each other. Durin
7777

7878
When a Rancher version reaches its End of Maintenance (EOM) date, Harvester only provides fixes for critical security-related issues that affect integration functions (Virtualization Management). For more information, see the [Harvester & Rancher Support Matrix](https://www.suse.com/suse-harvester/support-matrix/all-supported-versions/).
7979

80+
## VM management through the upgrade
81+
82+
### Live-migratable VMs
83+
84+
These VMs are automatically migrated to other nodes before their hosting node is upgraded, so they experience no downtime during the upgrade.
85+
86+
### Non-migratable VMs
87+
88+
When an upgrade is triggered, Harvester performs several checks. How non-migratable VMs are handled depends on the value of the [upgrade-config setting option `restoreVM`](../advanced/settings.md#upgrade-config):
89+
90+
- False: Harvester refuses the upgrade while any [non-migratable VM](../vm/live-migration.md#non-migratable-vms) is still running. You must power off these VMs manually.
91+
92+
- True: Harvester powers off the [non-migratable VMs](../vm/live-migration.md#non-migratable-vms) when the node is upgraded and restores them after the node is rebooted.
93+
94+
See [Phase 4: Upgrade Nodes](./troubleshooting.md#phase-4-upgrade-nodes) for more details.
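
The two `restoreVM` behaviors described above can be sketched as a small decision function. This is an illustrative sketch only, not Harvester's implementation: the `VM` class, the `plan_upgrade` function, and the action strings are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class VM:
    # Hypothetical, simplified view of a VM for this sketch.
    name: str
    migratable: bool
    running: bool


def plan_upgrade(vms, restore_vm):
    """Return (upgrade_allowed, actions_by_vm_name) for the given VMs.

    restore_vm mirrors the `restoreVM` option of the upgrade-config setting.
    """
    blocking = [vm.name for vm in vms if vm.running and not vm.migratable]
    if blocking and not restore_vm:
        # restoreVM is false: the upgrade is refused until these VMs are off.
        return False, {name: "power off manually" for name in blocking}

    actions = {}
    for vm in vms:
        if not vm.running:
            continue  # Stopped VMs need no handling in this sketch.
        if vm.migratable:
            actions[vm.name] = "live-migrate"
        else:
            # restoreVM is true: shut down now, start again after the reboot.
            actions[vm.name] = "power off, then restore"
    return True, actions
```

For example, a cluster with one migratable VM and one running non-migratable VM is refused when `restore_vm` is `False`, and gets a mixed plan when it is `True`.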
95+
8096
## Before starting an upgrade
8197

8298
Check out the available [`upgrade-config` setting](../advanced/settings.md#upgrade-config) to tweak the upgrade strategies and behaviors that best suit your cluster environment.

docs/upgrade/troubleshooting.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -93,7 +93,7 @@ If the upgrade fails at this point, you must generate a [support bundle](../trou
9393
The Harvester controller creates the following jobs on each node:
9494

9595
- Multi-node clusters:
96-
- `pre-drain` job: Live-migrates or shuts down virtual machines on the node. Once completed, the embedded Rancher service upgrades the RKE2 runtime on the node.
96+
- `pre-drain` job: [Live-migrates or shuts down virtual machines](./automatic.md#vm-management-through-the-upgrade) on the node. Once completed, the embedded Rancher service upgrades the RKE2 runtime on the node.
9797
- `post-drain` job: Upgrades and reboots the operating system.
9898
- Single-node clusters:
9999
- `single-node-upgrade` job: Upgrades the operating system and RKE2 runtime. The job name uses the format `hvst-upgrade-xxx-single-node-upgrade-<hostname>`.
@@ -277,4 +277,4 @@ New images are loaded to each Harvester node during upgrades. When disk usage ex
277277
278278
If you encounter the error message `Node xxx will reach xx.xx% storage space after loading new images. It's higher than kubelet image garbage collection threshold 85%.`, run `crictl rmi --prune` to clean up unused images before starting a new upgrade.
279279
280-
![Disk space not enough error message](/img/v1.4/upgrade/disk-space-not-enough-error-message.png)
280+
![Disk space not enough error message](/img/v1.4/upgrade/disk-space-not-enough-error-message.png)

docs/vm/create-vm.md

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -209,8 +209,48 @@ It is also possible to connect VMs using additional networks with Harvester's bu
209209
In bridge VLAN, virtual machines are connected to the host network through a linux `bridge`. The network IPv4 address is delegated to the virtual machine via DHCPv4. The virtual machine should be configured to use DHCP to acquire IPv4 addresses.
210210

211211
## Node Scheduling
212+
212213
`Node Scheduling` allows you to constrain which nodes your VMs can be scheduled on based on node labels.
213214

215+
![vm-node-scheduling](/img/v1.6/vm/vm-node-scheduling.png)
216+
217+
There are three options:
218+
219+
- Run virtual machine on any available node
220+
221+
- Run virtual machine on specific node
222+
223+
The following example shows a VM that targets the node with the hostname `harv21`:
224+
225+
```
nodeSelector:
  kubernetes.io/hostname: harv21
```
229+
230+
- Run virtual machine on node(s) matching scheduling rules
231+
232+
A flexible option that schedules the VM on a group of nodes. The following example shows a VM that targets nodes whose label `harvesterhci.io/group` has the value `engineering` or `qa`.
233+
234+
```
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: harvesterhci.io/group
                operator: In
                values:
                  - engineering
                  - qa
```
247+
248+
:::note
249+
250+
The VM might be [non-migratable](./live-migration.md#non-migratable-vms) when `Run virtual machine on specific node` is selected.
251+
252+
:::
253+
214254
See the [Kubernetes Node Affinity Documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity) for more details.
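
To see which nodes a rule such as `harvesterhci.io/group In (engineering, qa)` would select, the matching can be sketched in a few lines of Python. This is a simplified illustration, not scheduler code: it handles only the `In` operator, follows the Kubernetes convention that `nodeSelectorTerms` are ORed while the `matchExpressions` inside one term are ANDed, and the function name `node_matches` is hypothetical.

```python
def node_matches(node_labels, node_selector_terms):
    """Return True if the node's labels satisfy at least one selector term."""

    def expression_ok(expr):
        if expr["operator"] != "In":
            raise ValueError("this sketch only supports the In operator")
        # A missing label yields None, which is never in the value list.
        return node_labels.get(expr["key"]) in expr["values"]

    for term in node_selector_terms:
        expressions = term.get("matchExpressions", [])
        # An empty term matches no node; otherwise all expressions must hold.
        if expressions and all(expression_ok(e) for e in expressions):
            return True
    return False


def matching_nodes(nodes, node_selector_terms):
    """Return the names of nodes (name -> labels) that satisfy the terms."""
    return [name for name, labels in nodes.items()
            if node_matches(labels, node_selector_terms)]
```

If `matching_nodes` returns a single name for a VM's rules, the VM has nowhere else to go, which is the situation in which it may become non-migratable.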
215255
216256
## VM Scheduling
@@ -296,6 +336,12 @@ No affinity rules are applied when a virtual machine connects to VM networks tha
296336
297337
:::
298338
339+
:::note
340+
341+
The VM might be [non-migratable](./live-migration.md#non-migratable-vms) when only one node participates in the `cluster network`.
342+
343+
:::
344+
299345
#### Related CPU Pinning Concepts
300346
301347
When you enable the [CPU Manager](./cpu-pinning.md#enable-and-disable-cpu-manager) on nodes, Harvester applies the following label to related `node` objects.
@@ -325,6 +371,12 @@ spec:
325371
- 'true'
326372
```
327373
374+
:::note
375+
376+
The VM might be [non-migratable](./live-migration.md#non-migratable-vms) when `CPU Manager` is enabled on only one node.
377+
378+
:::
379+
328380
## Annotations
329381
330382
Harvester allows you to attach custom metadata to virtual machines using annotations. These key-value pairs enable extended features or behaviors without requiring changes to the core virtual machine configuration.

docs/vm/live-migration.md

Lines changed: 50 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -17,15 +17,37 @@ description: Live migration means moving a virtual machine to a different host w
1717

1818
Live migration means moving a virtual machine to a different host without downtime.
1919

20-
:::note
20+
## Non-migratable VMs
2121

22-
- Live migration is not allowed when the virtual machine is using a management network of bridge interface type.
23-
- Live migration is not allowed when the virtual machine has any volume of the `CD-ROM` type. Such volumes should be ejected before live migration.
24-
- Live migration is not allowed when the virtual machine has any volume of the `Container Disk` type. Such volumes should be removed before live migration.
25-
- Live migration is not allowed when the virtual machine has any `PCIDevice` passthrough enabled. Such devices need to be removed before live migration.
26-
- Live migration is not allowed when the volumeAccessMode of any volume in the virtual machine is `ReadWriteOnce`. Such volumes should be removed before live migration.
22+
VM definitions vary widely. A VM cannot be live-migrated when one or more of the following conditions are met.
2723

28-
:::
24+
Removing the related device or adding more schedulable nodes can make the VM live-migratable.
25+
26+
### Has non-migratable devices or node-selector
27+
28+
- The VM has any volume of the `CD-ROM` type.
29+
30+
- The VM has any volume of the `Container Disk` type.
31+
32+
- The VM has any volume with `volumeAccessMode` `ReadWriteOnce`.
33+
34+
- The VM has `PCI passthrough` or `vGPU` devices.
35+
36+
- The VM has a [node selector](./create-vm.md#node-scheduling) that binds it to a specific node.
37+
38+
### Has scheduling rules which can only match one node
39+
40+
The following conditions are checked at runtime (for example, before an upgrade). The VM is marked as non-migratable if only one node matches.
41+
42+
- The VM is on a `cluster network` that spans only one node.
43+
44+
See [Automatically Applied Affinity Rules](./create-vm.md#related-networking-concepts) for more details.
45+
46+
- The VM has `cpu-pinning` enabled and only one node has `CPU Manager` enabled.
47+
48+
See [Automatically Applied Affinity Rules](./create-vm.md#related-cpu-pinning-concepts) for more details.
49+
50+
- Other [node scheduling](./create-vm.md#node-scheduling) rules.
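
The device and node-selector conditions listed under "Has non-migratable devices or node-selector" can be approximated with a check over a VM manifest loaded as a Python dictionary. This is a hedged sketch, not the check Harvester performs. It assumes the KubeVirt `VirtualMachine` layout (`spec.template.spec.domain.devices` with `disks`, `hostDevices`, and `gpus`, plus `volumes` and `nodeSelector`). It also assumes the caller supplies PVC access modes separately, because they are not part of the VM manifest. The function name and reason strings are hypothetical.

```python
def non_migratable_reasons(vm, pvc_access_modes=None):
    """Return a list of reasons why the VM may not be live-migratable.

    vm: a VirtualMachine manifest as a dict.
    pvc_access_modes: optional mapping of PVC claim name -> list of access modes.
    """
    pvc_access_modes = pvc_access_modes or {}
    spec = vm.get("spec", {}).get("template", {}).get("spec", {})
    devices = spec.get("domain", {}).get("devices", {})
    reasons = []

    # Disks attached as CD-ROM.
    if any("cdrom" in disk for disk in devices.get("disks", [])):
        reasons.append("CD-ROM volume")

    for volume in spec.get("volumes", []):
        # Container disk volumes.
        if "containerDisk" in volume:
            reasons.append("Container Disk volume")
        # PVC-backed volumes whose access mode is ReadWriteOnce.
        claim = volume.get("persistentVolumeClaim", {}).get("claimName")
        if claim and "ReadWriteOnce" in pvc_access_modes.get(claim, []):
            reasons.append("ReadWriteOnce volume")

    # PCI passthrough (hostDevices) or vGPU (gpus) devices.
    if devices.get("hostDevices") or devices.get("gpus"):
        reasons.append("PCI passthrough or vGPU device")

    # A hostname node selector binds the VM to one specific node.
    if "kubernetes.io/hostname" in spec.get("nodeSelector", {}):
        reasons.append("node selector bound to a specific node")

    return reasons
```

An empty result does not guarantee that the VM is migratable: the runtime conditions in "Has scheduling rules which can only match one node" depend on the current state of the cluster and are not covered by this sketch.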
2951

3052
## How Migration Works
3153

@@ -51,6 +73,18 @@ However, `host-model` only allows migration of the VM to a node with same CPU mo
5173

5274
![](/img/v1.2/vm/migrate-action.png)
5375

76+
:::note
77+
78+
The `Migrate` menu is not available when:
79+
80+
- This is a single-node cluster.
81+
82+
- The VM is `non-migratable` because it [has non-migratable devices or a node selector](#has-non-migratable-devices-or-node-selector).
83+
84+
- The VM already has a running or pending migration process.
85+
86+
:::
87+
5488
When you have [node scheduling rules](./create-windows-vm.md#node-scheduling-tab) configured for a VM, you must ensure that the target nodes you are migrating to meet the VM's runtime requirements. The list of nodes you get to search and select from will be generated based on:
5589
- VM scheduling rules.
5690
- Possibly node rules from the network configuration.
@@ -62,6 +96,14 @@ When you have [node scheduling rules](./create-windows-vm.md#node-scheduling-tab
6296
1. Go to the **Virtual Machines** page.
6397
1. Find the virtual machine in migrating status that you want to abort. Select **⋮ > Abort Migration**.
6498

99+
:::note
100+
101+
- The `Abort Migration` menu is available when the VM already has a running or pending migration process.
102+
103+
- Do not select `Abort Migration` if the migration was triggered by a [Harvester upgrade](../upgrade/automatic.md#live-migratable-vms) or [node maintenance](../host/host.md#node-maintenance).
104+
105+
:::
106+
65107
## Migration Timeouts
66108

67109
### Completion Timeout
@@ -88,4 +130,4 @@ Migrating a VM with `host-model` is not possible because the values of `host-mod
88130
- Cluster level: Run `kubectl edit kubevirts.kubevirt.io -n harvester-system` and add `spec.configuration.cpuModel: "123"`. This change also affects newly created VMs.
89131
- Individual VMs: Modify the VM configuration to include `spec.template.spec.domain.cpu.model: "123"`.
90132

91-
Both methods require the restarting the VMs. If you are certain that all nodes in the cluster support a specific CPU model, you can define this at the cluster level before creating any VMs. In doing so, you eliminate the need to restart the VMs (to assign the CPU model) during live migration.
133+
Both methods require restarting the VMs. If you are certain that all nodes in the cluster support a specific CPU model, you can define this at the cluster level before creating any VMs. In doing so, you eliminate the need to restart the VMs (to assign the CPU model) during live migration.