Commit

Merge branch 'master' of github.com:DataBiosphere/toil into issues/4632-good-config-wdl-cwl
stxue1 committed Nov 16, 2023
2 parents 3ab06c4 + d710213 commit bbaa068
Showing 16 changed files with 15 additions and 648 deletions.
15 changes: 6 additions & 9 deletions attic/README.md
@@ -1,5 +1,5 @@
#Toil
- Python based pipeline management software for clusters that makes running recursive and dynamically scheduled computations straightforward. So far works with gridEngine, lsf, parasol and on multi-core machines.
+ Python based pipeline management software for clusters that makes running recursive and dynamically scheduled computations straightforward. So far works with gridEngine, lsf, and on multi-core machines.

##Authors
[Benedict Paten](https://github.com/benedictpaten/), [Dent Earl](https://github.com/dentearl/), [Daniel Zerbino](https://github.com/dzserbino/), [Glenn Hickey](https://github.com/glennhickey/), other UCSC people.
@@ -28,9 +28,9 @@ The following walks through running a toil script and using the command-line too

Once toil is installed, running a toil script is performed by executing the script from the command-line, e.g. (using the file sorting toy example in **tests/sort/scriptTreeTest_Sort.py**):

- <code>[]$ scriptTreeTest_Sort.py --fileToSort foo --toil bar/toil --batchSystem parasol --logLevel INFO --stats</code>
+ <code>[]$ scriptTreeTest_Sort.py --fileToSort foo --toil bar/toil --batchSystem slurm --logLevel INFO --stats</code>

- Which in this case uses the parasol batch system, and INFO level logging and where foo is the file to sort and bar/toil is the location of a directory (which should not already exist) from which the batch will be managed. Details of the toil options are described below; the stats option is used to gather statistics about the jobs in a run.
+ Which in this case uses the slurm batch system, and INFO level logging and where foo is the file to sort and bar/toil is the location of a directory (which should not already exist) from which the batch will be managed. Details of the toil options are described below; the stats option is used to gather statistics about the jobs in a run.

The script will return a zero exit value if the toil system is successfully able to run to completion, else it will create an exception. If the script fails because a job failed then the log file information of the job will be reported to std error.
The toil directory (here 'bar/toil') is not automatically deleted regardless of success or failure, and contains a record of the jobs run, which can be enquired about using the **toilStatus** command. e.g.
@@ -150,17 +150,14 @@ The important arguments to **toilStats** are:

--batchSystem=BATCHSYSTEM
The type of batch system to run the job(s) with,
- currently can be
- 'singleMachine'/'parasol'/'acidTest'/'gridEngine'/'lsf'.
+ currently can be 'singleMachine'/'gridEngine'/'lsf'.
default=singleMachine
--maxThreads=MAXTHREADS
The maximum number of threads (technically processes
at this point) to use when running in single machine
mode. Increasing this will allow more jobs to run
concurrently when running on a single machine.
default=4
- --parasolCommand=PARASOLCOMMAND
- The command to run the parasol program default=parasol

Options to specify default cpu/memory requirements (if not
specified by the jobs themselves), and to limit the total amount of
@@ -202,7 +199,7 @@ The important arguments to **toilStats** are:
--bigBatchSystem=BIGBATCHSYSTEM
The batch system to run for jobs with larger
memory/cpus requests, currently can be
- 'singleMachine'/'parasol'/'acidTest'/'gridEngine'.
+ 'singleMachine'/'gridEngine'.
default=none
--bigMemoryThreshold=BIGMEMORYTHRESHOLD
The memory threshold above which to submit to the big
@@ -240,7 +237,7 @@ The important arguments to **toilStats** are:

The following sections are for people creating toil scripts and as general information. The presentation **[docs/toilSlides.pdf](https://github.com/benedictpaten/toil/blob/master/doc/toilSlides.pdf)** is also a quite useful, albeit slightly out of date, guide to using toil. -

- Most batch systems (such as LSF, Parasol, etc.) do not allow jobs to spawn
+ Most batch systems (such as LSF) do not allow jobs to spawn
other jobs in a simple way.

The basic pattern provided by toil is as follows:
1 change: 0 additions & 1 deletion contrib/admin/mypy-with-ignore.py
@@ -51,7 +51,6 @@ def main():
'src/toil/batchSystems/slurm.py',
'src/toil/batchSystems/gridengine.py',
'src/toil/batchSystems/singleMachine.py',
- 'src/toil/batchSystems/parasol.py',
'src/toil/batchSystems/torque.py',
'src/toil/batchSystems/options.py',
'src/toil/batchSystems/registry.py',
6 changes: 3 additions & 3 deletions docs/contributing/contributing.rst
@@ -52,10 +52,10 @@ depend on a currently installed *feature*, use
This will run only the tests that don't depend on the ``aws`` extra, even if
that extra is currently installed. Note the distinction between the terms
*feature* and *extra*. Every extra is a feature but there are features that are
- not extras, such as the ``gridengine`` and ``parasol`` features. To skip tests
- involving both the ``parasol`` feature and the ``aws`` extra, use the following::
+ not extras, such as the ``gridengine`` feature. To skip tests
+ involving both the ``gridengine`` feature and the ``aws`` extra, use the following::

- $ make test tests="-m 'not aws and not parasol' src"
+ $ make test tests="-m 'not aws and not gridengine' src"

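As an illustration of how a marker expression such as ``not aws and not gridengine`` selects tests, here is a minimal sketch. It only approximates pytest-style ``-m`` semantics (a marker name is true when the test carries that marker) and is not Toil's or pytest's actual implementation. The function name ``matches_marker_expression`` and the example marker sets are hypothetical.

```python
import re


def matches_marker_expression(expression, markers):
    """Return True if a test carrying `markers` is selected by `expression`.

    Hypothetical helper: each marker name in the expression is replaced by
    True or False depending on whether the test has that marker, and the
    resulting boolean expression is then evaluated.
    """
    def substitute(match):
        word = match.group(0)
        if word in ("and", "or", "not"):
            return word  # keep boolean operators as they are
        return str(word in markers)  # marker name becomes "True" or "False"

    boolean_expression = re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, expression)
    # Only True/False, and/or/not and parentheses remain, so no builtins are needed.
    return eval(boolean_expression, {"__builtins__": {}}, {})


expression = "not aws and not gridengine"
print(matches_marker_expression(expression, {"slurm"}))       # test would run
print(matches_marker_expression(expression, {"aws"}))         # test would be skipped
print(matches_marker_expression(expression, {"gridengine"}))  # test would be skipped
```

A test is deselected as soon as it carries either excluded marker, which is why both ``not`` clauses are joined with ``and``.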


1 change: 0 additions & 1 deletion docs/index.rst
@@ -15,7 +15,6 @@ If using Toil for your research, please cite
.. _website: http://toil.ucsc-cgl.org/
.. _announce: https://groups.google.com/forum/#!forum/toil-announce
.. _GridEngine: http://gridscheduler.sourceforge.net/
- .. _Parasol: http://genecats.soe.ucsc.edu/eng/parasol.html
.. _Apache Mesos: http://mesos.apache.org/
.. _spot market: https://aws.amazon.com/ec2/spot/
.. _Amazon Web Services: https://aws.amazon.com/
12 changes: 2 additions & 10 deletions docs/running/cliOptions.rst
@@ -180,7 +180,7 @@ levels in toil are based on priority from the logging module:

--batchSystem BATCHSYSTEM
The type of batch system to run the job(s) with,
- currently can be one of aws_batch, parasol, single_machine,
+ currently can be one of aws_batch, single_machine,
grid_engine, lsf, mesos, slurm, tes, torque,
htcondor, kubernetes. (default: single_machine)
--disableAutoDeployment
@@ -220,14 +220,6 @@ levels in toil are based on priority from the logging module:
unset, the Toil work directory will be used. Only
works for grid engine batch systems such as gridengine,
htcondor, torque, slurm, and lsf.
- --parasolCommand PARASOLCOMMAND
- The name or path of the parasol program. Will be
- looked up on PATH unless it starts with a
- slash. (default: parasol)
- --parasolMaxBatches PARASOLMAXBATCHES
- Maximum number of job batches the Parasol batch is
- allowed to create. One batch is created for jobs with
- a unique set of resource requirements. (default: 1000)
--mesosEndpoint MESOSENDPOINT
The host and port of the Mesos server separated by a
colon. (default: <leader IP>:5050)
@@ -280,7 +272,7 @@ Allows configuring Toil's data storage.
filesystem-based job stores only. (Default=False)
--caching BOOL Set caching options. This must be set to "false"
to use a batch system that does not support
- cleanup, such as Parasol. Set to "true" if caching
+ cleanup. Set to "true" if caching
is desired.

**Autoscaling Options**
2 changes: 1 addition & 1 deletion docs/running/hpcEnvironments.rst
@@ -51,7 +51,7 @@ Then make sure to log out and back in again for the setting to take effect.
Standard Output/Error from Batch System Jobs
--------------------------------------------

- Standard output and error from batch system jobs (except for the Parasol and Mesos batch systems) are redirected to files in the ``toil-<workflowID>`` directory created within the temporary directory specified by the ``--workDir`` option; see :ref:`optionsRef`.
+ Standard output and error from batch system jobs (except for the Mesos batch system) are redirected to files in the ``toil-<workflowID>`` directory created within the temporary directory specified by the ``--workDir`` option; see :ref:`optionsRef`.
Each file is named as follows: ``toil_job_<Toil job ID>_batch_<name of batch system>_<job ID from batch system>_<file description>.log``, where ``<file description>`` is ``std_output`` for standard output, and ``std_error`` for standard error.
HTCondor will also write job event log files with ``<file description> = job_events``.

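The naming scheme described above can be sketched as a small helper. This is an illustrative sketch only, not code from Toil: the function names and the example IDs are hypothetical, and the pattern is taken directly from the sentence above.

```python
import os


def batch_log_file_name(toil_job_id, batch_system, batch_job_id, file_description):
    """Build a log file name following the pattern described in the text.

    Hypothetical helper; `file_description` is expected to be "std_output",
    "std_error" or, for HTCondor, "job_events".
    """
    return (
        f"toil_job_{toil_job_id}_batch_{batch_system}"
        f"_{batch_job_id}_{file_description}.log"
    )


def batch_log_file_path(work_dir, workflow_id, file_name):
    """Place the log file inside the toil-<workflowID> directory under workDir."""
    return os.path.join(work_dir, f"toil-{workflow_id}", file_name)


# Example values below are made up for illustration.
name = batch_log_file_name(42, "slurm", 1234567, "std_error")
print(name)  # toil_job_42_batch_slurm_1234567_std_error.log
print(batch_log_file_path("/tmp", "abc123", name))
```

Knowing the pattern makes it straightforward to locate the output of a single failed job by its Toil job ID or by the job ID assigned by the batch system.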
4 changes: 2 additions & 2 deletions docs/running/introduction.rst
@@ -12,7 +12,7 @@ Toil is built in a modular way so that it can be used on lots of different syste
The three configurable pieces are the

  - :ref:`jobStoreInterface`: A filepath or url that can host and centralize all files for a workflow (e.g. a local folder, or an AWS s3 bucket url).
- - :ref:`batchSystemInterface`: Specifies either a local single-machine or a currently supported HPC environment (lsf, parasol, mesos, slurm, torque, htcondor, kubernetes, or grid_engine).
+ - :ref:`batchSystemInterface`: Specifies either a local single-machine or a currently supported HPC environment (lsf, mesos, slurm, torque, htcondor, kubernetes, or grid_engine).
  - :ref:`provisionerOverview`: For running in the cloud only. This specifies which cloud provider provides instances to do the "work" of your workflow.

.. _jobStoreOverview:
@@ -53,7 +53,7 @@ Batch System
------------

A Toil batch system is either a local single-machine (one computer) or a
- currently supported cluster of computers (lsf, parasol, mesos, slurm, torque,
+ currently supported cluster of computers (lsf, mesos, slurm, torque,
htcondor, or grid_engine) These environments manage individual worker nodes
under a leader node to process the work required in a workflow. The leader and
its workers all coordinate their tasks and files through a centralized job
1 change: 0 additions & 1 deletion setup.cfg
@@ -28,7 +28,6 @@ markers =
local_cuda
lsf
mesos
- parasol
rsync
server_mode
slow

0 comments on commit bbaa068
