# Administrator guide

Welcome to the systems administrator guide to the [ELIXIR Cloud][elixir-cloud].
Whether you would like to onboard your data or compute center, set up your own
[GA4GH][ga4gh]-based cloud, or simply play around with our compute and storage
solutions, this is the right place to get you off the ground.

## General deployment notes

Most of our services (see our [GitHub organization][elixir-cloud-aai-github]
for a comprehensive list) come with [Helm](https://helm.sh/) charts for
deployment on Cloud Native infrastructure and [Docker
Compose](https://docs.docker.com/compose/) configurations for
testing/development deployments. If you do not have experience with these
technologies, you will find brief primers with references to additional
documentation below.

### Using Helm

[Helm][helm] is an infrastructure-as-code (IaC) tool that is often described
as the "package manager for Kubernetes". It manages the lifecycle of a
Kubernetes application, i.e., its deployment, configuration, upgrade,
retirement, etc. Applications are packaged into "Charts". Using Helm Charts
allows us to version-control an application and therefore follow its evolution
over time, make identical copies (e.g., development, staging, production),
make predictable upgrades, and share/publish the application.

Some useful Helm commands to manage a Chart are:

- `helm create`: Create a Helm Chart
- `helm install`: Install an application
- `helm upgrade`: Upgrade an application
- `helm uninstall`: Uninstall an application
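
As a sketch, the lifecycle of a hypothetical chart might look as follows; the
chart name (`my-service`) and release name (`my-service-prod`) are
placeholders:

```sh
# Scaffold a new chart in ./my-service
helm create my-service

# Install it into the cluster under the release name "my-service-prod"
helm install my-service-prod ./my-service

# Roll out a configuration change to the running release
helm upgrade my-service-prod ./my-service --set replicaCount=2

# Remove the release and the resources it created
helm uninstall my-service-prod
```

Because the release name identifies the deployment, several independent copies
of the same chart (e.g., staging and production) can coexist in one cluster.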

### Using Docker Compose

Most of our services provide a [Docker Compose][docker-compose] configuration
file for easy deployment of the software on a local machine. If the [Docker
Engine][docker-engine] and [Docker Compose][docker-compose] are already
installed on your system, it is as simple as cloning the service's Git
repository, changing into the folder where the Docker Compose file resides
(typically `docker-compose.yml` in a repository's root directory) and running
the following:

```sh
docker-compose up -d
```

!!! note "Non-standard name or location of config file"

    The command will be different if the Docker Compose config file is _not_ in
    the current working directory and/or is _not_ called `docker-compose.yml`.

This will bring the service up. The argument `-d` (or `--detach`) starts the
app in daemonized mode, i.e., all containers that Compose creates run in the
background.
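
For example, a deployment whose config file has a non-standard name or
location can be started and then inspected like this (the file path below is a
placeholder):

```sh
# Point Compose at a config file outside the current working directory
docker-compose -f deployments/compose.dev.yml up -d

# List the containers of the deployment and follow their logs
docker-compose ps
docker-compose logs -f
```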

In order to stop the deployment, simply run:

```sh
docker-compose down
```
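
By default, `down` preserves named volumes. To remove those as well, e.g. to
reset the state of a test deployment, add the `--volumes` flag:

```sh
# Stop the deployment and also remove its named volumes
docker-compose down --volumes
```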

## Onboarding your compute center

Follow the instructions below to onboard your compute node with the [ELIXIR
Cloud][elixir-cloud]. Afterwards, your compute cluster will be accessible
through the [GA4GH][ga4gh] Task Execution Service ([TES][ga4gh-tes]) API and,
optionally, available in the ELIXIR Cloud compute network.

### Deploying compute

Depending on whether you have a Native Cloud cluster or an HPC/HTC cluster,
you will need to follow the instructions for deploying [TESK][tesk] or
[Funnel][funnel] below, respectively.

#### Deploying TESK

[TESK][tesk] uses the Kubernetes Batch API ([Jobs][k8s-jobs]) to schedule the
execution of TES tasks. This means that it should be possible to deploy TESK in
any flavor of Kubernetes, but tests are currently only performed with
[Kubernetes][k8s], [OpenShift][openshift], and [Minikube][minikube]. Follow
these instructions if you wish to deploy a TES endpoint on your Native Cloud
cluster, and please let us know if you deploy TESK on any new and interesting
platform.

TESK currently does not use any storage (database) other than Kubernetes
itself. [Persistent Volume Claims][k8s-pvc] are used as temporary storage to
handle the input and output files of a task and to pass them between the
executors of a task. Because several pods need to access the same volume, your
cluster will need to provide a ReadWriteMany
[StorageClass][k8s-storage-class]; commonly used options are [NFS][nfs] and
[CephFS][cephfs]. Note that PVCs are destroyed immediately after task
completion!

Here is an overview of TESK's architecture:

<div>
  <a href="https://github.com/elixir-cloud-aai/TESK">
    <img src="images/tesk_architecture.png" alt="TESK architecture" width="627"/>
  </a>
</div>

A [Helm][helm] chart is provided for the convenient deployment of TESK. The
chart is available in the [TESK code repository][tesk-helm].

Follow these steps:

1. [Install Helm][helm-install]
2. Clone the [TESK repository][tesk]:

   ```sh
   git clone https://github.com/elixir-cloud-aai/TESK.git
   ```

3. Find the Helm chart at `charts/tesk`
4. Edit the file `values.yaml`
   (see [notes](#notes-for-editing-chart-values) below)
5. Log into the cluster and install TESK with:

   ```sh
   helm install -n TESK-NAMESPACE TESK-DEPLOYMENT-NAME . \
     -f secrets.yaml \
     -f values.yaml
   ```

   * Replace `TESK-NAMESPACE` with the name of the namespace where you want to
     install TESK. If the namespace is not specified, the default namespace
     will be used.
   * The argument provided for `TESK-DEPLOYMENT-NAME` will be used by Helm to
     refer to the deployment, for example when upgrading or deleting the
     deployment. You can choose whichever name you like.

You should now have a working TESK instance!
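
Later on, the same placeholder names used during installation can be reused to
upgrade or remove the deployment:

```sh
# Apply configuration changes to the running deployment
helm upgrade -n TESK-NAMESPACE TESK-DEPLOYMENT-NAME . \
  -f secrets.yaml \
  -f values.yaml

# Remove the deployment again
helm uninstall -n TESK-NAMESPACE TESK-DEPLOYMENT-NAME
```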

##### Notes for editing chart values

The [TESK deployment documentation][tesk-docs-deploy] contains a [description
of every value][tesk-docs-deploy-values]. Briefly, the most important ones
are:

1. `host_name`: Will be used to serve the API.
2. `storageClass`: Specify the storage class. If left empty, TESK will use the
   default one configured in the Kubernetes cluster.
3. `auth.mode`: Enable (`auth`) or disable (`noauth`; default) authentication.
   When enabled, OIDC client credentials **must** be provided in a file
   `secrets.yaml`, with the following format:

   ```yaml
   auth:
     client_id: <client_id>
     client_secret: <client_secret>
   ```

4. `ftp`: Which FTP credentials mode to use. Two options are supported:
   `.classic_ftp_secret` for basic authentication (username and password) or
   `.netrc_secret` for using a [`.netrc`][netrc] file.

   For the classic approach, add the following to `values.yaml`:

   ```yaml
   ftp:
     classic_ftp_secret: ftp-secret
   ```

   And in the file `secrets.yaml`, write down the username and password as:

   ```yaml
   ftp:
     username: <username>
     password: <password>
   ```

   For the `.netrc` approach, create a `.netrc` file in the `ftp` folder with
   the connection details in the correct format.
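
   For reference, a minimal `.netrc` file might look as follows; the host name
   and credentials are placeholders:

   ```
   machine ftp.example.org
   login <username>
   password <password>
   ```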

5. `clusterType`: Type of Kubernetes flavor. Currently supported: `kubernetes`
   (default) and `openshift`.

!!! warning "Careful"
    When creating a `secrets.yaml` file, ensure that the file is never shared
    or committed to a code repository!

#### Deploying Funnel

Follow these instructions if you wish to deploy a TES endpoint in front of your
HPC/HTC cluster (currently tested with [Slurm][slurm] and [OpenPBS][openpbs]).

1. Make sure the build dependencies `make` and [Go 1.11+][go-install] are
   installed, `GOPATH` is set and `GOPATH/bin` is added to `PATH`.

   For example, in Ubuntu this can be achieved via:

   ```sh
   sudo apt update
   sudo apt install make golang-go
   export GOPATH=/your/desired/path
   export PATH=$GOPATH/bin:$PATH
   go version
   ```

2. Clone the repository:

   ```sh
   git clone https://github.com/ohsu-comp-bio/funnel.git
   ```

3. Build Funnel:

   ```sh
   cd funnel
   make
   ```

4. Test the installation by starting the Funnel server with:

   ```sh
   funnel server run
   ```

If all works, Funnel should be ready for deployment on your HPC/HTC cluster.
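
With the server running, you can optionally submit a minimal TES task from
another shell to verify that the endpoint responds. This sketch assumes the
server's HTTP API is reachable on port `8000` (Funnel's default); adjust the
URL to your configuration:

```sh
# Submit a trivial TES task that echoes a message
curl -X POST http://localhost:8000/v1/tasks \
  -H "Content-Type: application/json" \
  -d '{
        "name": "hello-test",
        "executors": [
          {
            "image": "alpine",
            "command": ["echo", "hello from funnel"]
          }
        ]
      }'

# List known tasks to check that the submission was accepted
curl http://localhost:8000/v1/tasks
```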

##### Slurm

To use Funnel with Slurm, make sure the following conditions are met:

1. The `funnel` binary must be placed on a server with access to Slurm.
2. A config file must be created and placed on the same server. [This
   file][funnel-config-slurm] can be used as a starting point.
3. To deploy Funnel as a systemd service,
   [this file][funnel-config-slurm-service] can be used as a template. Set the
   correct paths to the `funnel` binary and config file.

If successful, Funnel should be listening on port `8080`.
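
If you went the systemd route, a quick check might look as follows; the unit
name `funnel.service` is an assumption and depends on how you named the unit
file:

```sh
# Enable the service at boot and start it immediately
sudo systemctl enable --now funnel.service

# Check that the service is running and the API answers on port 8080
systemctl status funnel.service
curl http://localhost:8080/v1/tasks
```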

##### OpenPBS

!!! warning "Under construction"
    More info coming soon...

### Deploying storage

Follow the instructions below to connect your TES endpoint to one or more
ELIXIR Cloud storage solutions. The currently supported solutions are:

- [MinIO][minio] (Amazon S3)
- [`vsftpd`][vsftp] (FTP)

!!! note "Other storage solutions"

    Other S3 and FTP implementations may work but have not been tested.

#### Deploying MinIO (Amazon S3)

To deploy the [MinIO][minio] server, follow the [official
documentation][minio-docs-k8s]; the process is straightforward.

If you are deploying MinIO to OpenShift, you may find this
[MinIO-OpenShift][minio-deploy-openshift-template] template useful.
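
Once the server is up, a bucket for task data can be created, for example with
the MinIO client (`mc`). This is a sketch that assumes `mc` is installed; the
endpoint, credentials, alias, and bucket name are all placeholders:

```sh
# Register the new server under the alias "elixir"
mc alias set elixir http://minio.example.org:9000 <access_key> <secret_key>

# Create a bucket for task inputs/outputs and verify it exists
mc mb elixir/tes-bucket
mc ls elixir
```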

#### Deploying `vsftpd` (FTP)

Many guides for deploying [`vsftpd`][vsftpd] are available online, for
example [this one][vsftpd-deploy]. There are only two considerations:

1. Secure FTP support must be activated with `ssl_enable=YES`.
2. For onboarding with the ELIXIR Cloud, the server currently needs one
   account created with a specific username and password. Please [contact
   us][elixir-cloud-aai-email] for details.
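
A minimal fragment of `/etc/vsftpd.conf` covering the first consideration
might look as follows; the certificate and key paths are placeholders for
wherever your TLS material lives:

```
# Enable secure FTP (FTPS), as required for ELIXIR Cloud onboarding
ssl_enable=YES
rsa_cert_file=/etc/ssl/certs/vsftpd.pem
rsa_private_key_file=/etc/ssl/private/vsftpd.key
```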

### Registering your TES service

We are currently working on implementing access control mechanisms and
providing a user interface for the [ELIXIR Cloud
Registry][elixir-cloud-registry]. Once available, we will add registration
instructions here. For now, please let us know about your new TES endpoint by
[email][elixir-cloud-aai-email].

## Custom cloud deployments

!!! warning "Under construction"
    More info coming soon...