|
| 1 | +# Running a live migration "by hand" with a crucible boot disk |
| 2 | + |
| 3 | +In the product, live migration is managed by nexus. Still, it is extremely |
| 4 | +useful for development to be able to test software components in isolation. One |
| 5 | +obstacle for testing inter-machine migration in propolis without a full control |
| 6 | +plane is the need for shared storage — in particular, a source of shared storage |
| 7 | +for the guest's boot disk. |
| 8 | + |
| 9 | +Since crucible will be providing storage in the product, I chose to get |
| 10 | +that working over other options. This document has some instructions on how to |
| 11 | +get propolis to use crucible as a backend for a boot disk. At the moment it's |
| 12 | +not the most user-friendly experience, but since it took some effort to figure |
| 13 | +out, I wanted to at least capture what I did. |
| 14 | + |
| 15 | +## Requirements |
| 16 | + |
| 17 | +For this setup, you'll need: |
| 18 | +- a "source" propolis server |
| 19 | +- a "destination" propolis server (on the same machine or otherwise) |
| 20 | +- a propolis CLI |
| 21 | +- a copy of [crucible](https://github.com/oxidecomputer/crucible) and a place to |
| 22 | + run downstairs processes that the source/destination machines can access |
| 23 | +- an OS image that you'd like to boot |
| 24 | + |
| 25 | +## Setup |
| 26 | + |
| 27 | +### Seed crucible downstairs with the OS image |
| 28 | + |
| 29 | +From the machine where you will run the crucible downstairs, set up 3 crucible |
| 30 | +downstairs regions. Use the `--import-path` flag to specify where the OS image |
| 31 | +is on the filesystem. When you run the downstairs later, you will specify the |
| 32 | +address and port each one listens on using the `-a` and `-p` flags, |
| 33 | +respectively. The address may be `localhost` or the external IP address of the |
| 34 | +machine where the downstairs is running. Note that the same IP:port pairs |
| 35 | +also appear later in the JSON file that gets passed to propolis. |
| 36 | + |
| 37 | +For example: |
| 38 | +``` |
| 39 | +$ ./target/release/crucible-downstairs create --import-path /home/jordan/images/helios-generic-ttya-base_20230109.raw --data region8810 --uuid $(uuidgen) --extent-size 64000 --extent-count 64 |
| 40 | + |
| 41 | +$ ./target/release/crucible-downstairs create --import-path /home/jordan/images/helios-generic-ttya-base_20230109.raw --data region8820 --uuid $(uuidgen) --extent-size 64000 --extent-count 64 |
| 42 | + |
| 43 | +$ ./target/release/crucible-downstairs create --import-path /home/jordan/images/helios-generic-ttya-base_20230109.raw --data region8830 --uuid $(uuidgen) --extent-size 64000 --extent-count 64 |
| 44 | +``` |
| 45 | + |
| 46 | +Each `create` will set up a region directory. In the above example, these |
| 47 | +directories are `region8810`, `region8820`, and `region8830`, respectively. |
| 48 | + |
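As a rough sanity check on the `--extent-size` and `--extent-count` values used above, the sketch below computes the total region capacity. It assumes the extent size is counted in blocks (consistent with the `blocks_per_extent` field and the `dump` output shown later in this document) and a 512-byte block size (the `block_size` used in the JSON later). The helper names are hypothetical, and whether crucible requires the image to fit in the region is an assumption worth checking against your image.

```python
def region_capacity_bytes(extent_size_blocks: int, extent_count: int,
                          block_size: int = 512) -> int:
    """Total bytes addressable by a region with the given geometry."""
    return extent_size_blocks * extent_count * block_size


def image_fits(image_size_bytes: int, extent_size_blocks: int,
               extent_count: int, block_size: int = 512) -> bool:
    """True if an image of the given size fits in the region (assumed requirement)."""
    capacity = region_capacity_bytes(extent_size_blocks, extent_count, block_size)
    return image_size_bytes <= capacity


# Geometry from the example commands: --extent-size 64000 --extent-count 64
capacity = region_capacity_bytes(64000, 64)
print(capacity)  # 2097152000 bytes, roughly 2.1 GB
```

If your image is larger than this, you would presumably need a larger extent count or extent size, and the same values would have to be reflected in the JSON file described later.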
| 49 | +### Run the crucible downstairs |
| 50 | + |
| 51 | +After seeding the downstairs with the image, run the downstairs processes. |
| 52 | + |
| 53 | +For example: |
| 54 | +``` |
| 55 | +$ ./target/release/crucible-downstairs run -d region8810 -p 8810 -a 172.20.3.73 |
| 56 | +$ ./target/release/crucible-downstairs run -d region8820 -p 8820 -a 172.20.3.73 |
| 57 | +$ ./target/release/crucible-downstairs run -d region8830 -p 8830 -a 172.20.3.73 |
| 58 | +``` |
| 59 | + |
| 60 | +### Create a JSON file with disk requests |
| 61 | + |
| 62 | +Now that we've got a crucible volume set up, we need to configure propolis to be |
| 63 | +aware of it as a backend. Do this by passing the `--crucible-disks` flag with |
| 64 | +a JSON file containing an array of `DiskRequest`s when creating or migrating |
| 65 | +a VM. |
| 66 | + |
| 67 | +On the source machine, create a JSON file like this: |
| 68 | + |
| 69 | +``` |
| 70 | +[ |
| 71 | +{ |
| 72 | + "device": "virtio", |
| 73 | + "name": "helios-blockdev", |
| 74 | + "read_only": false, |
| 75 | + "slot": 1, |
| 76 | + "volume_construction_request": { |
| 77 | + "type": "volume", |
| 78 | + "block_size": 512, |
| 79 | + "id": "0cedae45-3d6e-4d90-b2cb-56f1a1a42a89", |
| 80 | + "read_only_parent": null, |
| 81 | + "sub_volumes": [ |
| 82 | + { |
| 83 | + "type": "region", |
| 84 | + "block_size": 512, |
| 85 | + "blocks_per_extent": 64000, |
| 86 | + "extent_count": 64, |
| 87 | + "gen": 1, |
| 88 | + "opts": { |
| 89 | + "cert_pem": null, |
| 90 | + "control": null, |
| 91 | + "flush_timeout": null, |
| 92 | + "id": "0cedae45-3d6e-4d90-b2cb-56f1a1a42a89", |
| 93 | + "key": null, |
| 94 | + "key_pem": null, |
| 95 | + "lossy": false, |
| 96 | + "read_only": false, |
| 97 | + "root_cert_pem": null, |
| 98 | + "target": ["172.20.3.73:8810", |
| 99 | + "172.20.3.73:8820", |
| 100 | + "172.20.3.73:8830" |
| 101 | + ] |
| 102 | + } |
| 103 | + } |
| 104 | + ] |
| 105 | + } |
| 106 | +} |
| 107 | +] |
| 108 | +``` |
| 109 | + |
| 110 | +Several fields in this file must match the parameters specified when the |
| 111 | +crucible downstairs processes were created, specifically: `block_size` (note |
| 112 | +that it occurs twice in the JSON file), `blocks_per_extent`, and |
| 113 | +`extent_count`. The `target` field is an array of IP:port addresses where the |
| 114 | +downstairs are expected to be running. |
| 115 | + |
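Because `block_size` appears twice and the `target` list must match the running downstairs, it can be less error-prone to generate the file than to edit it by hand. The following is a minimal sketch that mirrors the JSON shown above; `make_disk_request` is a hypothetical helper, not part of propolis or crucible, and the field layout is copied from the example rather than from a schema.

```python
import json


def make_disk_request(name, volume_id, targets, gen,
                      block_size=512, blocks_per_extent=64000,
                      extent_count=64, slot=1):
    """Build one DiskRequest-shaped dict mirroring the example JSON."""
    return {
        "device": "virtio",
        "name": name,
        "read_only": False,
        "slot": slot,
        "volume_construction_request": {
            "type": "volume",
            "block_size": block_size,  # first occurrence
            "id": volume_id,
            "read_only_parent": None,
            "sub_volumes": [
                {
                    "type": "region",
                    "block_size": block_size,  # second occurrence, kept in sync
                    "blocks_per_extent": blocks_per_extent,
                    "extent_count": extent_count,
                    "gen": gen,
                    "opts": {
                        "cert_pem": None,
                        "control": None,
                        "flush_timeout": None,
                        "id": volume_id,
                        "key": None,
                        "key_pem": None,
                        "lossy": False,
                        "read_only": False,
                        "root_cert_pem": None,
                        "target": list(targets),
                    },
                }
            ],
        },
    }


# Values taken from the example above.
targets = [f"172.20.3.73:{port}" for port in (8810, 8820, 8830)]
disks = [make_disk_request("helios-blockdev",
                           "0cedae45-3d6e-4d90-b2cb-56f1a1a42a89",
                           targets, gen=1)]
disks_json = json.dumps(disks, indent=2)
```

Writing `disks_json` to a file such as `disks.json` gives an input for the `--crucible-disks` flag.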
| 116 | +One important thing to know is that the generation number field (`gen`) must be |
| 117 | +bumped manually each time a VM is created (or migrated). (In the product, the |
| 118 | +generation number is tracked by nexus.) A fresh crucible downstairs will start |
| 119 | +with generation number 1. |
| 120 | + |
| 121 | +To see the current generation number of a downstairs, you can dump the region |
| 122 | +and check the highest generation number. Use the `-d` flag to select the |
| 123 | +directory containing the region: |
| 124 | + |
| 125 | +``` |
| 126 | +$ ./target/debug/crucible-downstairs dump -d region8810 |
| 127 | +EXT BLOCKS GEN0 FL0 D0 |
| 128 | + 0 0000000-0063999 10 608 F |
| 129 | + 1 0064000-0127999 0 1 F |
| 130 | + 2 0128000-0191999 0 1 F |
| 131 | + 3 0192000-0255999 0 1 F |
| 132 | + |
| 133 | +... (output elided) |
| 134 | + |
| 135 | +Max gen: 11, Max flush: 642 |
| 136 | +``` |
| 137 | + |
| 138 | +Use the highest `Max gen` value across all 3 downstairs as the current generation number. |
| 139 | + |
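Picking out the `Max gen` value from three dumps by eye is easy to get wrong. Here is a small sketch that extracts it from captured `dump` output and takes the maximum across downstairs. It assumes the summary line has the form shown above (`Max gen: N, Max flush: M`); the function names are hypothetical.

```python
import re


def max_gen_from_dump(dump_output: str) -> int:
    """Return the 'Max gen' value from the text of one `dump` run."""
    match = re.search(r"Max gen:\s*(\d+)", dump_output)
    if match is None:
        raise ValueError("no 'Max gen:' line found in dump output")
    return int(match.group(1))


def current_gen(dump_outputs) -> int:
    """Highest 'Max gen' across the dump output of every downstairs."""
    return max(max_gen_from_dump(out) for out in dump_outputs)


# Summary line copied from the example output above.
sample = "Max gen: 11, Max flush: 642"
print(max_gen_from_dump(sample))  # 11
```

In practice you would capture the output of `crucible-downstairs dump -d <region>` for each of the three regions and pass the three strings to `current_gen`.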
| 140 | +### Create the VM on the source server |
| 141 | + |
| 142 | +On the source machine, run the propolis server with whatever TOML configuration |
| 143 | +you desire, except for the boot disk, which will be specified through the API. |
| 144 | + |
| 145 | +Create the VM using the `--crucible-disks` flag and the JSON file. For example: |
| 146 | +``` |
| 147 | +$ ./target/debug/propolis-cli -s 172.20.3.73 -p 8000 new --crucible-disks disks.json vm0 |
| 148 | +``` |
| 149 | + |
| 150 | +Run the VM: |
| 151 | +``` |
| 152 | +$ ./target/debug/propolis-cli -s 172.20.3.73 -p 8000 state run |
| 153 | +``` |
| 154 | + |
| 155 | +You may wish to watch the console to make sure it boots: |
| 156 | +``` |
| 157 | +$ ./target/debug/propolis-cli -s 172.20.3.73 -p 8000 serial |
| 158 | +``` |
| 159 | + |
| 160 | +### Migrate the VM to the destination server |
| 161 | + |
| 162 | +Now it's time to migrate the VM. The destination server will need to have the |
| 163 | +same instance spec as the source server, so run the destination server with the |
| 164 | +same TOML configuration as the source server. Similarly, the destination server |
| 165 | +will need to know about the crucible backend. As with the `new` command, we |
| 166 | +can tell the destination server about this disk by passing the |
| 167 | +`--crucible-disks` flag to the `migrate` command. |
| 168 | + |
| 169 | +Ensure the destination server is running. Make a copy of the JSON file you |
| 170 | +created above and increment the generation number. Then, from the source, run |
| 171 | +something like: |
| 172 | +``` |
| 173 | +$ ./target/debug/propolis-cli -s 172.20.3.73 -p 8000 migrate 172.20.3.71 -p 8000 --crucible-disks disks2.json |
| 174 | +``` |
| 175 | + |
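The copy-and-increment step for the JSON file can be scripted. The sketch below assumes the file has the structure shown earlier in this document (an array of disk requests, each with `sub_volumes` entries of type `region` carrying a `gen` field); `bump_gen` and `write_bumped_copy` are hypothetical helpers, and the file names are only examples.

```python
import copy
import json


def bump_gen(disk_requests):
    """Return a deep copy of the disk requests with every region's gen incremented."""
    bumped = copy.deepcopy(disk_requests)
    for disk in bumped:
        for sub in disk["volume_construction_request"]["sub_volumes"]:
            if sub.get("type") == "region":
                sub["gen"] += 1
    return bumped


def write_bumped_copy(src_path, dst_path):
    """Read src_path, increment the generation numbers, and write dst_path."""
    with open(src_path) as f:
        disks = json.load(f)
    with open(dst_path, "w") as f:
        json.dump(bump_gen(disks), f, indent=2)
```

For example, `write_bumped_copy("disks.json", "disks2.json")` would produce the second file used in the `migrate` command above, leaving the original untouched.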
| 176 | +If successful, you should be able to run the VM and see the serial console on |
| 177 | +the destination side. |
| 178 | + |
| 179 | + |