Restoring from VM Backups on Google Cloud
See https://github.com/codalab/codalab-competitions/wiki/Developers-Instance-Hosts---Commands-and-Reference-Guide for common debug commands + docker container information.
- Start from your Google Cloud console and click `VM Instances` under `Compute Engine` in the side menu.
![img](https://user-images.githubusercontent.com/11784999/61740648-736d9200-ad8f-11e9-85b9-3f9ce1d24f4e.png)
- Select create new instance on the top menu
![img](https://user-images.githubusercontent.com/28552312/61484517-f0de7000-a953-11e9-9dda-a44e4a3c7e98.png)
- Configure your machine options and name. For a production-level Codalab server with heavy traffic, we recommend 4-8 vCPUs and 16-32 GB of RAM.
![img](https://user-images.githubusercontent.com/28552312/61484636-2a16e000-a954-11e9-8392-9efa55d81c32.png)
- Under boot disk, select change
![img](https://user-images.githubusercontent.com/28552312/61484689-46b31800-a954-11e9-96e0-62d1be0cbe8b.png)
- Select the `snapshots` tab.
- Select the snapshot you wish to restore from. The date each snapshot was created is displayed here as well.
![img](https://user-images.githubusercontent.com/11784999/61739200-81191c80-ad7b-11e9-87a0-456351d61c09.png)
- Ensure the disk size and type are correct and submit the form. Note: We recommend at least 50 GB of disk space.
![img](https://user-images.githubusercontent.com/28552312/61484883-a8738200-a954-11e9-9ebb-dc60acacf9a2.png)
- Under firewall configuration, enable HTTP/HTTPS traffic
![img](https://user-images.githubusercontent.com/28552312/61484958-d8228a00-a954-11e9-8182-eadae21efc94.png)
- Expand `Management, Security, Disks, Networking, Sole Tenancy`.
- Add the tag `allow-rabbitmq-and-flower` under network tags. (This allows RabbitMQ, Celery, etc. to communicate with workers.)
- Submit the form to create your new instance from the snapshot backup.
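
For readers who prefer the command line, the console steps above can be approximated with the `gcloud` CLI. This is a minimal sketch and not part of the original procedure: the function name `restore_from_snapshot`, the `-boot` disk suffix, the `e2-standard-4` machine type (4 vCPUs, 16 GB RAM) and the example arguments are all assumptions. The `http-server` and `https-server` tags only open ports 80 and 443 if the matching firewall rules already exist in the project.

```shell
#!/usr/bin/env bash
# Sketch: recreate a Codalab VM from a snapshot with the gcloud CLI.
# Set GCLOUD=echo to print the gcloud arguments instead of running them.

restore_from_snapshot() {
  local snapshot="$1" instance="$2" zone="$3"
  local gcloud="${GCLOUD:-gcloud}"

  # 1. Create a boot disk from the chosen snapshot.
  #    50 GB follows the disk-size note above; the size must not be
  #    smaller than the snapshot itself.
  "$gcloud" compute disks create "${instance}-boot" \
    --source-snapshot="$snapshot" \
    --size=50GB \
    --zone="$zone" || return 1

  # 2. Create the instance on that disk, with HTTP/HTTPS and the
  #    RabbitMQ/Flower network tag attached.
  "$gcloud" compute instances create "$instance" \
    --zone="$zone" \
    --machine-type=e2-standard-4 \
    --disk="name=${instance}-boot,boot=yes" \
    --tags=http-server,https-server,allow-rabbitmq-and-flower
}

# Example (hypothetical names):
#   restore_from_snapshot my-codalab-snapshot codalab-restored us-central1-a
```

Running it once with `GCLOUD=echo` is a cheap way to review the exact arguments before creating any resources.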
- SSH into the instance via the SSH button in the instances menu.
![img](https://user-images.githubusercontent.com/11784999/61739339-c2a9c780-ad7b-11e9-9f46-c8472baeb848.png)
- Change directory to the Ubuntu user's `codalab-competitions` directory. (Note: if you SSH in as the user `Ubuntu`, it is inside the directory you land in.)
![img](https://user-images.githubusercontent.com/28552312/61485163-4bc49700-a955-11e9-9289-e850b8c55ae0.png)
- !OPTIONAL!
- If the DNS is not yet configured to point to the instance IP (e.g. autodl.lri.fr is not pointing to the new IP) and you would like to use the instance in the meantime, you need to edit the project's Docker environment configuration file (`.env`). Sudo is not needed if you are SSH'ed in as Ubuntu. (You need an editor to do this; if you hate vim, run `sudo apt install emacs` to get emacs.)
![img](https://user-images.githubusercontent.com/28552312/61485333-b8d82c80-a955-11e9-916b-e02b7e86736c.png)
- Change all references to your domain name to the instance IP. The most important setting is `CADDY_DOMAIN`, as this is what Caddy will try to serve. If the domain or IP it is trying to serve does not match the IP it is being served from, it will not work, and you will see a message like `autodl.lri.fr is not served from this instance`. If you use an IP, make sure to append `:80` to it so that SSL is not used. Otherwise you will receive an SSL error, because Caddy will try to run with SSL enabled but will not be able to retrieve a certificate.
![img](https://user-images.githubusercontent.com/28552312/61485511-33a14780-a956-11e9-97b5-edd4e7636cb5.png)
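
The edit described above can also be scripted. The following is a minimal sketch, assuming the `.env` file contains a line of the form `CADDY_DOMAIN=<value>`; the function name `point_env_at_ip` and the example IP are hypothetical. It replaces every occurrence of the domain, so review the result if the domain also appears in values that should not change (for example e-mail addresses or `https://` URLs).

```shell
#!/usr/bin/env bash
# Sketch: point a Codalab .env file at a bare IP instead of a domain name.

point_env_at_ip() {
  local env_file="$1" domain="$2" ip="$3"
  local escaped_domain

  # Escape the dots so sed matches the domain literally.
  escaped_domain="$(printf '%s' "$domain" | sed 's/\./\\./g')"

  # Keep a backup, then rewrite the file from it.
  cp "$env_file" "${env_file}.bak" || return 1

  # First swap the domain for the IP everywhere, then force
  # CADDY_DOMAIN to "<ip>:80" so Caddy does not attempt SSL.
  sed -e "s/${escaped_domain}/${ip}/g" \
      -e "s/^CADDY_DOMAIN=.*/CADDY_DOMAIN=${ip}:80/" \
      "${env_file}.bak" > "$env_file"
}

# Example (203.0.113.10 is a documentation-only address):
#   point_env_at_ip .env autodl.lri.fr 203.0.113.10
```

The original file is kept as `.env.bak`, so the change can be undone once the DNS points at the instance.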
- WARNING: Google Cloud creates a NEW DYNAMIC IP every time you restore your VM, so unless you had a static IP assigned, you need to redo this procedure every time you restart your VM.
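
One way to avoid repeating the procedure is to promote the instance's current ephemeral IP to a reserved static address. This is a hedged sketch using the `gcloud` CLI, not a step from the original guide; the function name `promote_ip_to_static`, the address name and the region are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: reserve the instance's current external IP as a static address.
# Set GCLOUD=echo to print the gcloud arguments instead of running them.

promote_ip_to_static() {
  local address_name="$1" ip="$2" region="$3"
  "${GCLOUD:-gcloud}" compute addresses create "$address_name" \
    --addresses="$ip" \
    --region="$region"
}

# The current external IP can be looked up with (hypothetical names):
#   gcloud compute instances describe codalab-restored --zone=us-central1-a \
#     --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
#
# Example:
#   promote_ip_to_static codalab-ip 203.0.113.10 us-central1
```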
- If you want to assign a new URL, put the new URL in `CADDY_DOMAIN` instead of the IP address. Do not forget to create an A record with your DNS provider so that your domain points to your IP address. Here is how it is done at Moniker (it may take a while for the DNS to propagate):
- Run `docker-compose up -d` to update all containers. (Note: your output should show the containers as having been recreated.)
![img](https://user-images.githubusercontent.com/28552312/61485638-719e6b80-a956-11e9-9800-fc7f0f780cd6.png)
- Verify you can connect to the instance via web browser.
- If you get the error below after following these steps, make sure you are not trying to use https.
![img](https://user-images.githubusercontent.com/28552312/61490595-2f7b2700-a962-11e9-9b84-f6f3d8d9f222.png)
- If you get a message similar to "The Codalab site is not currently available", then Codalab is probably still starting up. Check the `django` container logs with `docker-compose logs -f django`. The last few steps should be checking static files and running any migrations.
- OPTIONAL: If you are not restoring a specific domain, or are only doing a test, it is very likely that the instance will not have the default worker enabled and that no compute workers will be attached. To re-enable the default worker, you can simply rename or delete `docker-compose.override.yaml`. It may also be necessary to ensure the correct worker version is specified.
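
Renaming the override file rather than deleting it keeps the previous configuration around. A minimal sketch follows; the function name `disable_compose_override` and the `.disabled` suffix are assumptions, and the containers still need to be recreated with `docker-compose up -d` afterwards.

```shell
#!/usr/bin/env bash
# Sketch: move docker-compose.override.yaml out of the way so that
# docker-compose no longer picks it up.

disable_compose_override() {
  local project_dir="${1:-.}"
  local override="${project_dir}/docker-compose.override.yaml"

  if [ -f "$override" ]; then
    mv "$override" "${override}.disabled"
    echo "renamed ${override} -> ${override}.disabled"
  else
    echo "no override file found in ${project_dir}"
  fi
}

# Example, run from the codalab-competitions directory:
#   disable_compose_override .
#   docker-compose up -d
```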
- Verify all services are running with `docker ps`.
![img](https://user-images.githubusercontent.com/28552312/61485701-ac080880-a956-11e9-8d88-0dcd78706036.png)
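
Instead of reading the `docker ps` table by eye, the check can be reduced to a list of containers that are not up. This is a small sketch; the helper name `list_not_up` and the `|` separator are assumptions, while `{{.Names}}` and `{{.Status}}` are standard `docker ps --format` placeholders.

```shell
#!/usr/bin/env bash
# Sketch: print the names of containers whose status does not start
# with "Up". Reads "name|status" lines on stdin and exits non-zero
# if any such container was found.

list_not_up() {
  awk -F'|' '$2 !~ /^Up/ { print $1; bad = 1 } END { exit bad }'
}

# Example (-a also lists stopped containers):
#   docker ps -a --format '{{.Names}}|{{.Status}}' | list_not_up
```

An empty result with exit status 0 means every listed container reports an `Up` status.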
- Access the logs with the command `docker-compose logs -f`. (Use `docker-compose logs -f <container>` to view a specific container's logs.)
![img](https://user-images.githubusercontent.com/28552312/61485941-2a64aa80-a957-11e9-9811-b83831b20ba0.png)
- Upload test competitions + submissions and verify everything is working correctly. If your submissions get stuck, make sure you're submitting to the default queue.