Commit d2a0f95

uniqueg and lvarin authored
docs(admin): add first draft (#9)
Co-authored-by: Alvaro Gonzalez <[email protected]>
1 parent 276b506 commit d2a0f95

3 files changed

+308
-0
lines changed

docs/guides/guide-admin/index.md

Lines changed: 278 additions & 0 deletions
@@ -1,4 +1,282 @@
11
# Administrator guide
22

3+
Welcome to the systems administrator guide to the [ELIXIR Cloud][elixir-cloud].
4+
Whether you would like to onboard your data or compute center, set up your own
5+
[GA4GH][ga4gh]-based cloud or simply play around with our compute and storage
6+
solutions, this is the right place to get you off the ground.
7+
8+
## General deployment notes
9+
10+
Most of our services (see our [GitHub organization][elixir-cloud-aai-github]
11+
for a comprehensive list) come with [Helm](https://helm.sh/) charts for
12+
deployment on Cloud Native infrastructure and [Docker
13+
Compose](https://docs.docker.com/compose/) configurations for
14+
testing/development deployments. If you do not have experience with these
15+
technologies, please find some brief primers with references to additional
16+
documentation below.
17+
18+
### Using Helm
19+
20+
[Helm][helm] is an IaC tool that is described as the "package manager for
21+
Kubernetes". It allows the management of the lifecycle of a Kubernetes
22+
application, i.e., its deployment, configuration, upgrade, retiring, etc.
23+
Applications are packaged into "Charts". Using Helm Charts allows us to
24+
version control an application and therefore follow its evolution over time,
25+
make identical copies (e.g., development, staging, production), make
26+
predictable upgrades, and share/publish the application.
27+
28+
Some useful Helm commands to manage a Chart are:
29+
30+
- `helm create`: Create a Helm Chart
31+
- `helm install`: Install an application
32+
- `helm upgrade`: Upgrade an application
33+
- `helm uninstall`: Uninstall an application
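
As a rough illustration of how these commands fit together over the lifetime of a release, the sketch below wraps them in small shell functions. The release name, chart path and namespace are placeholders passed as arguments, not values taken from any ELIXIR Cloud service. `helm upgrade --install` is used here so that one call covers both the first installation and later upgrades; this is a design choice of the sketch, not a requirement.

```sh
# Install a chart, or upgrade the release if it already exists.
# Arguments: release name, chart path or reference, namespace (placeholders).
helm_deploy() {
  helm upgrade --install "$1" "$2" --namespace "$3" --create-namespace
}

# Show the current state of a release.
# Arguments: release name, namespace.
helm_check() {
  helm status "$1" --namespace "$2"
}

# Uninstall a release.
# Arguments: release name, namespace.
helm_remove() {
  helm uninstall "$1" --namespace "$2"
}
```

For example, `helm_deploy my-release ./my-chart my-namespace` would install or upgrade a hypothetical chart in the current directory.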
34+
35+
### Using Docker Compose
36+
37+
Most of our services provide a [Docker Compose][docker-compose] configuration
38+
file for easy deployment of the software on a local machine. If the [Docker
39+
Engine][docker-engine] and [Docker Compose][docker-compose] are already
40+
installed on your system, it is as simple as cloning the service's Git
41+
repository, changing into the folder where the Docker Compose file resides
42+
(typically `docker-compose.yml` in a repository's root directory) and running
43+
the following:
44+
45+
```sh
docker-compose up -d
```
48+
49+
!!! note "Non-standard name or location of config file"
50+
51+
The command will be different if the Docker Compose config file is _not_ in
52+
the current working directory and/or is _not_ called `docker-compose.yml`.
53+
54+
This will bring the service up. The argument `-d` (or `--detach`) starts the
55+
app in detached mode, i.e., all containers that Compose creates run
56+
in the background.
57+
58+
In order to stop the deployment, simply run:
59+
60+
```sh
docker-compose down
```
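
If the Compose file has a non-standard name or location, as mentioned in the note above, its path can be passed with the `-f` option. The following is a minimal sketch that wraps the three most common calls in shell functions; the file path is a placeholder argument. Newer Docker installations ship Compose as a plugin invoked as `docker compose`, in which case the command name would need to be adapted.

```sh
# Start the services defined in a given Compose file in the background.
# Argument: path to the Compose file (placeholder).
compose_up() {
  docker-compose -f "$1" up -d
}

# List the containers of the deployment and their state.
compose_status() {
  docker-compose -f "$1" ps
}

# Stop and remove the containers of the deployment.
compose_down() {
  docker-compose -f "$1" down
}
```

For example, `compose_up deployment/docker-compose.dev.yml` would start a deployment from a hypothetical file in a `deployment` subfolder.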
63+
64+
## Onboarding your compute center
65+
66+
Follow the instructions below to onboard your compute node with the [ELIXIR
67+
Cloud][elixir-cloud]. Afterwards, your compute cluster will be accessible
68+
through the [GA4GH][ga4gh] Task Execution Service ([TES][ga4gh-tes]) API and,
69+
optionally, available in the ELIXIR Cloud compute network.
70+
71+
### Deploying compute
72+
73+
Depending on whether you have a Cloud Native cluster or an HPC/HTC, you will
74+
need to follow the instructions for deploying [TESK][tesk] or [Funnel][funnel]
75+
below, respectively.
76+
77+
#### Deploying TESK
78+
79+
[TESK][tesk] uses the Kubernetes Batch API ([Jobs][k8s-jobs]) to schedule
80+
execution of TES tasks. This means that it should be possible to deploy TESK in
81+
any flavor of Kubernetes, but tests are currently only performed with
82+
[Kubernetes][k8s], [OpenShift][openshift], and [Minikube][minikube]. Follow
83+
these instructions if you wish to deploy a TES endpoint on your Cloud Native
84+
cluster, and please let us know if you deploy TESK on any new and interesting
85+
platform.
86+
87+
TESK currently does not use any other storage (DB) than Kubernetes itself.
88+
[Persistent Volume Claims][k8s-pvc] are used as a temporary storage to handle
89+
input and output files of a task and pass them over between executors of a
90+
task. Note that PVCs are destroyed immediately after task completion! This
91+
means your cluster will need to provide a ReadWriteMany
92+
[StorageClass][k8s-storage-class]. Commonly used storage classes are
93+
[NFS][nfs] and [CephFS][cephfs].
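
Before deploying, it may help to check which storage classes your cluster offers. The sketch below assumes that `kubectl` is configured for the target cluster; the storage class name is a placeholder argument. Whether a class supports ReadWriteMany depends on its provisioner, so the output is only a starting point for that check.

```sh
# List the storage classes available in the cluster.
list_storage_classes() {
  kubectl get storageclass
}

# Show the details (provisioner, parameters) of one storage class.
# Argument: storage class name (placeholder).
show_storage_class() {
  kubectl describe storageclass "$1"
}
```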
94+
95+
Here is an overview of TESK's architecture:
96+
97+
<div>
98+
<a href="https://github.com/elixir-cloud-aai/TESK">
99+
<img src="images/tesk_architecture.png" alt="TESK architecture" width="627"/>
100+
</a>
101+
</div>
102+
103+
A [Helm][helm] chart is provided for the convenient deployment of TESK. The
104+
chart is available in the [TESK code repository][tesk-helm].
105+
106+
Follow these steps:
107+
108+
1. [Install Helm][helm-install]
109+
2. Clone the [TESK repository][tesk]:
110+
111+
```sh
git clone https://github.com/elixir-cloud-aai/TESK.git
```
114+
115+
3. Find the Helm chart at `charts/tesk` and change into that directory
116+
4. Edit file
117+
[`values.yaml`][tesk-helm-values]
118+
(see [notes](#notes-for-editing-chart-values) below)
119+
5. Log into the cluster and install TESK with:
120+
121+
```sh
helm install -n TESK-NAMESPACE TESK-DEPLOYMENT-NAME . \
  -f secrets.yaml \
  -f values.yaml
```
126+
127+
* Replace `TESK-NAMESPACE` with the name of the namespace where you want to
128+
install TESK. If the namespace is not specified, the default namespace will
129+
be used.
130+
* The argument provided for `TESK-DEPLOYMENT-NAME` will be used by Helm to
131+
refer to the deployment, for example when upgrading or deleting the
132+
deployment. You can choose whichever name you like.
133+
134+
You should now have a working TESK instance!
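
One way to confirm this is to inspect the Helm release and the pods in the namespace. The following sketch assumes that `helm` and `kubectl` are both configured for the cluster; the namespace and release name are the values you chose for `TESK-NAMESPACE` and `TESK-DEPLOYMENT-NAME`.

```sh
# Show the release status and the pods running in the TESK namespace.
# Arguments: namespace, release name (the values chosen at install time).
tesk_check() {
  helm status "$2" --namespace "$1"
  kubectl get pods --namespace "$1"
}
```

A release reported as deployed together with pods in the `Running` state is a good first sign, although it does not by itself prove that the API is reachable from outside the cluster.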
135+
136+
##### Notes for editing chart values
137+
138+
In the [TESK deployment documentation][tesk-docs-deploy] there is
139+
a [description of every value][tesk-docs-deploy-values]. Briefly, the most
140+
important are:
141+
142+
1. `host_name`: Will be used to serve the API.
143+
2. `storageClass`: Specify the storage class. If left empty, TESK will use the
144+
default one configured in the Kubernetes cluster.
145+
3. `auth.mode`: Enable (`auth`) or disable (`noauth`; default) authentication.
146+
When enabled, the OIDC client credentials **must** be provided in a file `./secrets.yaml`, with
147+
the following format:
148+
149+
```yaml
auth:
  client_id: <client_id>
  client_secret: <client_secret>
```
154+
155+
4. `ftp`: Which FTP credentials mode to use. Two options are supported:
156+
`.classic_ftp_secret` for basic authentication (username and password) or
157+
`.netrc_secret` for using a [`.netrc`][netrc] file.
158+
159+
For the classic approach, you must write in `values.yaml`:
160+
161+
```yaml
ftp:
  classic_ftp_secret: ftp-secret
```
165+
166+
Then, in the file `secrets.yaml`, specify the username and password as:
167+
168+
```yaml
ftp:
  username: <username>
  password: <password>
```
173+
174+
For the `.netrc` approach, create a `.netrc` file in the `ftp` folder with
175+
the connection details in the correct format.
176+
177+
5. `clusterType`: Type of Kubernetes flavor. Currently supported: `kubernetes`
178+
(default) and `openshift`.
179+
180+
!!! warning "Careful"
181+
When creating a `secrets.yaml` file, ensure that the file is never shared
182+
or committed to a code repository!
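
As one possible safeguard, the sketch below writes the OIDC client credentials to a file named `secrets.yaml` (the name passed to `helm install` above) with owner-only permissions, and adds the file name to `.gitignore`. The function name and its arguments are illustrative only and are not part of TESK.

```sh
# Write OIDC client credentials to secrets.yaml, readable by the owner only,
# and make sure Git ignores the file.
# Arguments: target directory, client ID, client secret (placeholders).
write_secrets() {
  (
    # Restrict permissions for files created in this subshell.
    umask 077
    printf 'auth:\n  client_id: %s\n  client_secret: %s\n' "$2" "$3" \
      > "$1/secrets.yaml"
  )
  # Also tighten permissions in case the file already existed.
  chmod 600 "$1/secrets.yaml"
  # Add the ignore rule only if it is not present yet.
  grep -qxF 'secrets.yaml' "$1/.gitignore" 2>/dev/null ||
    echo 'secrets.yaml' >> "$1/.gitignore"
}
```

Note that passing a secret as a command-line argument can leave it in the shell history, so reading it from a prompt or an environment variable may be preferable in practice.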
183+
184+
#### Deploying Funnel
185+
186+
Follow these instructions if you wish to deploy a TES endpoint in front of your
187+
HPC/HTC cluster (currently tested with [Slurm][slurm] and [OpenPBS][openpbs]).
188+
189+
1. Make sure the build dependencies `make` and [Go 1.11+][go-install] are
190+
installed, `GOPATH` is set and `$GOPATH/bin` is added to `PATH`.
191+
192+
For example, in Ubuntu this can be achieved via:
193+
194+
```sh
sudo apt update
sudo apt install make golang-go
export GOPATH=/your/desired/path
export PATH="$GOPATH/bin:$PATH"
go version
```
201+
202+
2. Clone the repository:
203+
204+
```sh
git clone https://github.com/ohsu-comp-bio/funnel.git
```
207+
208+
3. Build Funnel:
209+
210+
```sh
cd funnel
make
```
214+
215+
4. Test the installation by starting the Funnel server with:
216+
217+
```sh
funnel server run
```
220+
221+
If all works, Funnel should be ready for deployment on your HPC/HTC.
222+
223+
##### Slurm
224+
225+
For the use of Funnel with Slurm, make sure the following conditions are met:
226+
227+
1. The `funnel` binary must be placed on a server with access to Slurm.
228+
2. A config file must be created and placed on the same server. [This
229+
file][funnel-config-slurm] can be used as a starting point.
230+
3. If you would like to deploy Funnel as a systemd service,
231+
[this file][funnel-config-slurm-service] can be used as a template. Set the
232+
correct paths to the `funnel` binary and config file.
233+
234+
If successful, Funnel should be listening on port `8080`.
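
A quick way to check this is to send a request to the TES API. The sketch below assumes that the server exposes the TES task list under `/v1/tasks`, as in version 1.0 of the TES specification; adjust the path if your Funnel version serves the API elsewhere. The base URL is a placeholder argument.

```sh
# Request the task list from a TES endpoint. With --fail, curl exits with a
# non-zero status if the server answers with an HTTP error.
# Argument: base URL of the Funnel server (placeholder).
funnel_check() {
  curl --fail --silent --show-error "$1/v1/tasks"
}
```

For example, `funnel_check http://localhost:8080` run on the server itself should print a JSON document if Funnel is up.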
235+
236+
##### OpenPBS
237+
238+
!!! warning "Under construction"
239+
More info coming soon...
240+
241+
242+
### Deploying storage
243+
244+
Follow the instructions below to connect your TES endpoint to one or more
245+
ELIXIR Cloud storage solutions. The currently supported solutions are:
246+
247+
- [MinIO][minio] (Amazon S3)
248+
- [`vsftpd`][vsftpd] (FTP)
249+
250+
!!! note "Other storage solutions"
251+
252+
Other S3 and FTP implementations may work but have not been tested.
253+
254+
#### Deploying MinIO (Amazon S3)
255+
256+
In order to deploy the [MinIO][minio] server, follow the [official
257+
documentation][minio-docs-k8s].
258+
259+
If you are deploying MinIO to OpenShift, you may find this
260+
[Minio-OpenShift][minio-deploy-openshift-template] template useful.
261+
262+
#### Deploying `vsftpd` (FTP)
263+
264+
There are a lot of guides available online to deploy [`vsftpd`][vsftpd], for
265+
example [this one][vsftpd-deploy]. There are only two considerations:
266+
267+
1. It is required to activate secure FTP support with `ssl_enable=YES`.
268+
2. For onboarding with the ELIXIR Cloud, currently the server should have one
269+
account with a specific username and password created. Please [contact
270+
us][elixir-cloud-aai-email] for details.
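
To double-check the first consideration, the following sketch tests whether a `vsftpd` configuration file contains an active `ssl_enable=YES` line. The location of the configuration file varies between distributions, so the path is passed as an argument; the function name is illustrative only.

```sh
# Succeed if the given vsftpd configuration file enables SSL/TLS.
# Commented-out lines (starting with "#") do not match.
# Argument: path to the vsftpd configuration file (placeholder).
vsftpd_ssl_enabled() {
  grep -Eq '^ssl_enable=YES[[:space:]]*$' "$1"
}
```

For example, `vsftpd_ssl_enabled /etc/vsftpd.conf && echo "SSL enabled"` prints a message only when the setting is active in that file.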
271+
272+
### Registering your TES service
273+
274+
We are currently working on implementing access control mechanisms and
275+
providing a user interface for the [ELIXIR Cloud
276+
Registry][elixir-cloud-registry]. Once available, we will add registration
277+
instructions here. For now, please let us know about your new TES endpoint by
278+
[email][elixir-cloud-aai-email].
279+
## Custom cloud deployments
280+
3281
!!! warning "Under construction"
4282
More info coming soon...

includes/abbreviations.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44
*[FOSS]: Free & Open Source Software
55
*[GA4GH]: The Global Alliance for Genomics and Health is a policy-framing and technical standards-setting organization, seeking to enable responsible genomic data sharing within a human rights framework.
66
*[GSoC]: Google Summer of Code
7+
*[IaC]: Infrastructure as Code
78
*[LIMS]: Laboratory Information Management System
89
*[NBDC]: National Bioscience Database Center
910
*[TES]: GA4GH Task Execution Service API

includes/references.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,12 +6,16 @@
66
[bh-denbi]: <https://www.denbi.de/de-nbi-events/1454-biohackathon-germany>
77
[bh-elixir]: <https://www.biohackathon-europe.org/>
88
[bh-mena]: <https://cbrcconferences.kaust.edu.sa/bio-hackathon-2023>
9+
[cephfs]: <https://docs.ceph.com/en/quincy/cephfs/>
910
[contributor-covenant]: <https://www.contributor-covenant.org>
1011
[conv-commits]: <https://www.conventionalcommits.org/en/v1.0.0-beta.2/#specification>
1112
[conv-commits-blog]: <https://nitayneeman.com/posts/understanding-semantic-commit-messages-using-git-and-angular/>
1213
[conv-commits-lint]: <https://github.com/conventional-changelog/commitlint>
1314
[cwl-tes]: <https://github.com/ohsu-comp-bio/cwl-tes>
15+
[docker-compose]: <https://docs.docker.com/compose/>
16+
[docker-engine]: <https://docs.docker.com/engine/>
1417
[elixir]: <https://elixir-europe.org/>
18+
[elixir-cloud]: <https://elixir-cloud.dcc.sib.swiss/>
1519
[elixir-cloud-aai]: <https://elixir-cloud.dcc.sib.swiss/>
1620
[elixir-cloud-aai-contributors]: <https://elixir-cloud.dcc.sib.swiss/contributors>
1721
[elixir-cloud-aai-github]: <https://github.com/elixir-cloud-aai/>
@@ -28,6 +32,8 @@
2832
[elixir-cloud-services]: <https://github.com/elixir-cloud-aai/elixir-cloud-aai/blob/dev/resources/resources.md>
2933
[fair]: <https://www.go-fair.org/fair-principles/>
3034
[funnel]: <https://ohsu-comp-bio.github.io/funnel/>
35+
[funnel-config-slurm]: <https://raw.githubusercontent.com/lvarin/test-funnel-slurm/main/funnel_config.yml>
36+
[funnel-config-slurm-service]: <https://raw.githubusercontent.com/ohsu-comp-bio/funnel/52ef90fb76e620226f2af1bca5d14d35e1c4ad4a/deployments/systemd/funnel-server.service>
3137
[ga4gh]: <https://ga4gh.org/>
3238
[ga4gh-cloud]: <https://ga4gh-cloud.github.io/>
3339
[ga4gh-dps]: <https://www.ga4gh.org/how-we-work/driver-projects/>
@@ -48,6 +54,8 @@
4854
[github-merge-squash]: <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/incorporating-changes-from-a-pull-request/about-pull-request-merges#squash-and-merge-your-commits>
4955
[github-merge-rebase]: <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/incorporating-changes-from-a-pull-request/about-pull-request-merges#rebase-and-merge-your-commits>
5056
[github-pr]: <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request>
57+
[go-gopath]: <https://go.dev/doc/gopath_code#GOPATH>
58+
[go-install]: <https://go.dev/doc/install>
5159
[good-issues]: <https://medium.com/nyc-planning-digital/writing-a-proper-github-issue-97427d62a20f>
5260
[good-bug-reports]: <http://testthewebforward.org/docs/bugs.html>
5361
[gsoc]: <https://summerofcode.withgoogle.com/>
@@ -62,8 +70,14 @@
6270
[gsoc-ga4gh]: <https://summerofcode.withgoogle.com/organizations/6274606475771904/>
6371
[gsoc-stipends]: <https://developers.google.com/open-source/gsoc/help/student-stipends>
6472
[gsoc-timeline]: <https://developers.google.com/open-source/gsoc/timeline>
73+
[helm]: <https://helm.sh/>
74+
[helm-install]: <https://helm.sh/docs/intro/install/>
6575
[issue-tracker-example]: <https://github.com/elixir-cloud-aai/elixir-cloud-aai.github.io/issues>
6676
[jsdoc]: <https://jsdoc.app/index.html>
77+
[k8s]: <https://kubernetes.io/>
78+
[k8s-jobs]: <https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion>
79+
[k8s-pvc]: <https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims>
80+
[k8s-storage-class]: <https://kubernetes.io/docs/concepts/storage/storage-classes/>
6781
[linkedin-vani]: <https://www.linkedin.com/in/vani-s-78701315b/>
6882
[linkedin-sarthak]: <https://www.linkedin.com/in/sarthakgupta072/>
6983
[linkedin-akash]: <https://www.linkedin.com/in/akash-saini-ak7778/>
@@ -72,7 +86,15 @@
7286
[linkedin-ayush]: <https://www.linkedin.com/in/ayush-kumar-514a17197/>
7387
[linkedin-lakshya]: <https://www.linkedin.com/in/lakshyaagarg/>
7488
[linkedin-suyash]: <https://www.linkedin.com/in/sgalpha01/>
89+
[minikube]: <https://minikube.sigs.k8s.io/>
90+
[minio]: <https://min.io/>
91+
[minio-deploy-openshift-template]: <https://github.com/CSCfi/Minio-OpenShift>
92+
[minio-docs-k8s]: <https://min.io/docs/minio/kubernetes/upstream/index.html>
7593
[nextflow]: <https://www.nextflow.io/>
94+
[netrc]: <https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html>
95+
[nfs]: <https://en.wikipedia.org/wiki/Network_File_System>
96+
[openpbs]: <https://www.openpbs.org/>
97+
[openshift]: <https://www.redhat.com/en/technologies/cloud-computing/openshift>
7698
[osi]: <https://opensource.org/>
7799
[py]: <https://www.python.org/>
78100
[py-black]: <https://github.com/psf/black>
@@ -88,6 +110,13 @@
88110
[py-pytest]: <https://docs.pytest.org/en/latest/>
89111
[py-typing]: <https://docs.python.org/3/library/typing.html>
90112
[sem-ver]: <https://semver.org/>
113+
[slurm]: <https://slurm.schedmd.com/>
91114
[snakemake]: <https://snakemake.readthedocs.io/en/stable/>
92115
[snakemake-docs]: <https://snakemake.readthedocs.io/en/stable/executing/cloud.html#executing-a-snakemake-workflow-via-ga4gh-tes>
93116
[tesk]: <https://github.com/elixir-cloud-aai/TESK>
117+
[tesk-docs-deploy]: <https://github.com/elixir-cloud-aai/TESK/blob/master/charts/tesk/README.md>
118+
[tesk-docs-deploy-values]: <https://github.com/elixir-cloud-aai/TESK/tree/master/charts/tesk#description-of-values>
119+
[tesk-helm]: <https://github.com/elixir-cloud-aai/TESK/tree/master/charts/tesk>
120+
[tesk-helm-values]: <https://github.com/elixir-cloud-aai/TESK/blob/master/charts/tesk/values.yaml>
121+
[vsftpd]: <https://security.appspot.com/vsftpd.html>
122+
[vsftpd-deploy]: <https://www.digitalocean.com/community/tutorials/how-to-set-up-vsftpd-for-a-user-s-directory-on-ubuntu-20-04>
