This repository was archived by the owner on May 31, 2024. It is now read-only.

Commit 4a90087

Add Toil engine documentation
1 parent d00c516 commit 4a90087

2 files changed: +64, -0 lines

site/content/en/docs/Concepts/engines.md

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ Currently, Amazon Genomics CLI's officially supported engines can be used to run
 | [Nextflow](https://www.nextflow.io) | [Nextflow DSL](https://www.nextflow.io/docs/latest/script.html) | Standard and DSL 2 | Head Process |
 | [miniwdl](https://miniwdl.readthedocs.io/en/latest/) | [WDL](https://openwdl.org) | [documented here](https://miniwdl.readthedocs.io/en/latest/runner_reference.html?highlight=errata#wdl-interoperability) | Head Process |
 | [Snakemake](https://snakemake.readthedocs.io/en/stable/) | [Snakemake](https://snakemake.readthedocs.io/en/stable/snakefiles/writing_snakefiles.html) | All versions | Head Process |
+| [Toil](http://toil.ucsc-cgl.org/) | [CWL](https://www.commonwl.org/) | All versions up to 1.2 | Server |

 Over time we plan to add additional engine and language support and provide the ability for third-party developers to
 develop engine plugins.
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
---
title: "Toil"
date: 2022-04-26T15:34:00-04:00
draft: false
weight: 20
description: >
  Details on the Toil engine deployed by Amazon Genomics CLI
---

## Description

[Toil](http://toil.ucsc-cgl.org/) is a workflow engine developed by the
[Computational Genomics Lab](https://cglgenomics.ucsc.edu/) at the
[UC Santa Cruz Genomics Institute](https://genomics.ucsc.edu/). In Amazon Genomics
CLI, Toil can be deployed in a
[context]( {{< relref "../Concepts/contexts" >}} ) as an
[engine]( {{< relref "../Concepts/engines">}} ) to run workflows based on the
[CWL](https://www.commonwl.org/) specification.

Toil is an open source project distributed by UC Santa Cruz under the [Apache 2
license](https://github.com/DataBiosphere/toil/blob/master/LICENSE) and
available on
[GitHub](https://github.com/DataBiosphere/toil).

## Architecture

There are two components of a Toil engine as deployed in an Amazon Genomics
CLI context:

### Engine Service

The Toil engine runs in "server mode" as a container service in ECS. The
engine can run multiple workflows asynchronously. Workflow tasks are run in an
elastic [compute environment]( #compute-environment ) and monitored by Toil.
Amazon Genomics CLI communicates with the Toil engine through the GA4GH
[WES](https://github.com/ga4gh/workflow-execution-service-schemas) REST API
that the server exposes via API Gateway.
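
As a rough illustration of the WES interface, the sketch below assembles the form fields for a WES `RunWorkflow` request (`POST /ga4gh/wes/v1/runs`) without sending it. The field names come from the GA4GH WES schema. The base URL, the workflow file name, and the helper `build_wes_run_request` are hypothetical, and whatever authentication the API Gateway endpoint requires is not shown.

```python
import json

# Path of the RunWorkflow operation in the GA4GH WES v1 schema.
WES_RUNS_PATH = "/ga4gh/wes/v1/runs"


def build_wes_run_request(base_url, workflow_url, params, cwl_version="v1.2"):
    """Return (url, form_fields) for a WES RunWorkflow POST.

    Hypothetical helper: it only builds the request and sends nothing.
    """
    form_fields = {
        "workflow_url": workflow_url,
        "workflow_type": "CWL",
        "workflow_type_version": cwl_version,
        # WES expects the workflow inputs as a JSON-encoded string.
        "workflow_params": json.dumps(params),
    }
    return base_url.rstrip("/") + WES_RUNS_PATH, form_fields


# Placeholder endpoint and inputs, for illustration only.
url, fields = build_wes_run_request(
    "https://example.invalid/toil",
    "workflow.cwl",
    {"message": "hello"},
)
print(url)
print(fields)
```

The fields would be sent as `multipart/form-data`. In normal use Amazon Genomics CLI talks to this service for you, so the sketch is only meant to show the shape of the exchange.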

### Compute Environment

Workflow tasks are submitted by Toil to an AWS Batch queue and run in
Toil-provided containers using an AWS Compute Environment. Tasks which use the
[CWL `DockerRequirement`](https://www.commonwl.org/user_guide/07-containers/index.html)
will additionally be run under
[Singularity](https://github.com/sylabs/singularity#readme). AWS Batch
coordinates the elastic provisioning of EC2 instances (container hosts) based
on the available work in the queue. Batch will place containers on container
hosts as space allows.

#### Disk Expansion

Container hosts in the Batch compute environment use EBS volumes as local
scratch space. As an EBS volume approaches a capacity threshold, new EBS
volumes will be attached and merged into the file system. These volumes are
destroyed when AWS Batch terminates the container host. CWL disk space
requirements are ignored by Toil when running against AWS Batch.
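
For reference, CWL expresses disk space requests through the `tmpdirMin` and `outdirMin` fields of `ResourceRequirement`, both in mebibytes. The sketch below assumes a tool description has already been loaded into a Python dict and uses a hypothetical helper, `declared_disk_mib`, to report what the tool declares. The example tool is made up for illustration.

```python
def declared_disk_mib(tool):
    """Return (tmpdirMin, outdirMin) from the first ResourceRequirement found.

    Looks in both `requirements` and `hints`, and accepts the list form
    ([{"class": "ResourceRequirement", ...}]) as well as the map form
    ({"ResourceRequirement": {...}}). Returns (None, None) if nothing is declared.
    """
    for section in ("requirements", "hints"):
        entries = tool.get(section) or []
        if isinstance(entries, dict):
            # Normalize the map form into the list form.
            entries = [{**(body or {}), "class": name} for name, body in entries.items()]
        for entry in entries:
            if isinstance(entry, dict) and entry.get("class") == "ResourceRequirement":
                return entry.get("tmpdirMin"), entry.get("outdirMin")
    return None, None


# Hypothetical tool that requests 2048 MiB of temporary space and no output space.
tool = {
    "cwlVersion": "v1.2",
    "class": "CommandLineTool",
    "baseCommand": "sort",
    "requirements": {"ResourceRequirement": {"tmpdirMin": 2048}},
    "inputs": {},
    "outputs": {},
}
tmp_mib, out_mib = declared_disk_mib(tool)
print(tmp_mib, out_mib)
```

A tool for which both values come back as `None` declares no disk request at all, so how it behaves depends entirely on how each runner provisions scratch space.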

This setup means that workflows that succeed on AGC may fail on other CWL
runners (because they do not request enough disk space) and workflows that
succeed on other CWL runners may fail on AGC (because they allocate disk space
faster than the expansion process can react).
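
The second failure mode can be pictured with a toy model. This is not how the expansion process is implemented, and every number in it (initial volume size, threshold, size of each added volume, attach delay) is an invented assumption, chosen only to show a fast writer outrunning a threshold-triggered expansion.

```python
def survives(write_mib_per_s, seconds, initial_mib=1000, threshold_pct=80,
             step_mib=1000, attach_delay_s=30):
    """Toy model: does a steady writer run out of disk before expansion lands?

    All parameters are hypothetical. One new volume of `step_mib` is requested
    when usage reaches `threshold_pct` of capacity, and it becomes usable
    `attach_delay_s` seconds later.
    """
    capacity = initial_mib
    used = 0
    pending = None  # seconds until the volume being attached becomes usable
    for _ in range(seconds):
        used += write_mib_per_s
        if used > capacity:
            return False  # the disk filled before the new volume arrived
        if pending is not None:
            pending -= 1
            if pending <= 0:
                capacity += step_mib
                pending = None
        if pending is None and used * 100 >= threshold_pct * capacity:
            pending = attach_delay_s
    return True


slow_ok = survives(write_mib_per_s=5, seconds=600)
fast_ok = survives(write_mib_per_s=10, seconds=600)
print(slow_ok, fast_ok)
```

Under these made-up parameters the slower writer always leaves enough headroom for the next volume to arrive, while the faster one fills the remaining space during the attach delay.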
