If you are new to our pipeline ecosystem, we recommend you first check out our general setup guide here. That said, the instructions below will probably be sufficient for most users.

Installing Nextflow

Nextflow is a highly portable pipeline engine. Please see the official installation guide to learn how to set it up.

This pipeline expects Nextflow version 25.04.5, available here. More recent versions will probably also work.
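To confirm which Nextflow version is active before launching the pipeline, a small shell check along these lines may help. It is only a sketch: version_ge is a hypothetical helper name, the comparison relies on `sort -V` from GNU coreutils, and NXF_VER is the environment variable Nextflow reads to pin a specific release.

```shell
#!/usr/bin/env bash
# Version this pipeline expects (taken from the text above).
REQUIRED="25.04.5"

# version_ge A B: succeeds when version A is greater than or equal to B.
# Hypothetical helper; relies on GNU `sort -V` for version ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nextflow >/dev/null 2>&1; then
  # Pull the first x.y.z string out of the `nextflow -version` banner.
  installed="$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  if version_ge "$installed" "$REQUIRED"; then
    echo "Nextflow $installed is recent enough"
  else
    echo "Nextflow '$installed' is older than $REQUIRED; consider: export NXF_VER=$REQUIRED"
  fi
else
  echo "nextflow not found on PATH"
fi
```

Setting NXF_VER in the shell before calling nextflow makes the launcher fetch and use that release, which can be handy when a newer version turns out not to work.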

Software provisioning

This pipeline is set up to work with a range of software provisioning technologies, so there is no need to install packages manually.

Supported container options include Singularity, Docker and Podman.

There is also the option to use Conda, but we strongly discourage this because Conda environments are not guaranteed to be reproducible.

The pipeline comes with simple pre-set profiles for all of these as described here; if you plan to use this pipeline regularly, consider adding your own custom profile to our central repository to better leverage your available resources.

Installing the references

This pipeline requires locally stored references, matched to the (major!) pipeline version you plan to use (-r). To build these, do:

nextflow run marchoeppner/gmo-check -r 1.0 -profile singularity \
  --build_references \
  --run_name build_refs \
  --reference_base /path/to/references

where /path/to/references could be something like /data/pipelines/references or whatever is most appropriate on your system. On a distributed compute environment, this directory needs to live on a shared file system. If you already use a site-specific config file, the --reference_base option does not need to be set.

If you do not have singularity on your system, you can also specify docker, podman or conda for software provisioning - see the usage information.
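If you are unsure which of these options is available on a given machine, a quick look at the PATH can help. The sketch below uses a hypothetical helper, pick_profile, and the order of preference (singularity, then docker, then podman) is an assumption rather than a recommendation made by the pipeline.

```shell
#!/usr/bin/env bash
# pick_profile: print the name of the first container engine found on PATH.
# Hypothetical helper; the order of preference is an assumption.
pick_profile() {
  local tool
  for tool in singularity docker podman; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  # Nothing found: signal failure so the caller can decide what to do.
  return 1
}

if profile="$(pick_profile)"; then
  echo "Suggested option: -profile $profile"
else
  echo "No container engine found; conda remains as a fallback"
fi
```

The printed name can then be used in place of singularity in the build command above.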

Please note that the build process will create a pipeline-specific subfolder (gabi) that must not be given as part of the --reference_base argument. GMO-Check is part of a collection of pipelines that use a shared reference directory and it will choose/create the appropriate subfolder automatically.

Finally, depending on your internet connection, the installation process can take a while, since some of the reference databases are fairly large (8-10 GB). Once the references are installed, you are ready to go.
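Given the size of the downloads, it may be worth checking the target directory before starting the build. The following is a minimal sketch: free_gb is a hypothetical helper, the 10 GB threshold is an assumption based on the upper end of the estimate above, and /path/to/references is the same placeholder used in the build command.

```shell
#!/usr/bin/env bash
# free_gb DIR: print the free space on DIR's file system in whole GB.
# Uses POSIX `df -P` output, where column 4 is the available space in KB.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Placeholder location; replace with your own --reference_base.
REFERENCE_BASE="${REFERENCE_BASE:-/path/to/references}"
NEEDED_GB=10   # assumption: upper end of the size estimate above

if [ -d "$REFERENCE_BASE" ] && [ -w "$REFERENCE_BASE" ]; then
  available="$(free_gb "$REFERENCE_BASE")"
  if [ "$available" -ge "$NEEDED_GB" ]; then
    echo "OK: ${available} GB free in $REFERENCE_BASE"
  else
    echo "Only ${available} GB free in $REFERENCE_BASE; ${NEEDED_GB} GB or more is advisable"
  fi
else
  echo "$REFERENCE_BASE does not exist or is not writable"
fi
```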

Site-specific config file

If you run on anything other than a local system, this pipeline requires a site-specific configuration file to be able to talk to your cluster or compute infrastructure. Nextflow supports a wide range of such infrastructures, including Slurm, LSF and SGE - but also Kubernetes and AWS. For more information, see here.

Site-specific config files for our pipeline ecosystem are stored centrally on GitHub. Please talk to us if you want to add your system.

Custom config

If you absolutely do not want to add your system to this repository, you can manually pass a compatible configuration to nextflow using the -c command line option:

nextflow -c my.config run marchoeppner/gmo-check -profile myprofile -r 1.0 \
  --input samples.tsv \
  --run_name my_run_name \
  --reference_base /path/to/references

A basic example using Singularity may look as follows:

process {
  resourceLimits = [ cpus: 16, memory: 64.GB, time: 72.h ]
}

singularity {
  enabled = true
  cacheDir = "/path/to/singularity_cache"
}

This would suit a single computer with 16 cores and 64 GB of RAM, using Singularity. Containers are cached in the specified location so they can be re-used on subsequent pipeline runs.

Or with the Conda/Mamba package manager:

process {
  resourceLimits = [ cpus: 16, memory: 64.GB, time: 72.h ]
}

conda {
  enabled = true
  useMamba = true
  cacheDir = "/path/to/conda_cache"
}
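For a cluster, a configuration passed with -c would typically also name an executor. The sketch below writes such a file for Slurm from the shell. The executor and queue settings are standard Nextflow process options, but the queue name normal, the file name my.config and the cache path are placeholders rather than values defined by this pipeline, and the resource limits are simply copied from the examples above.

```shell
#!/usr/bin/env bash
# Write a hypothetical Slurm configuration to my.config.
# Queue name and cache path are placeholders; adjust them to your site.
cat > my.config <<'EOF'
process {
  executor = 'slurm'
  queue = 'normal'
  resourceLimits = [ cpus: 16, memory: 64.GB, time: 72.h ]
}

singularity {
  enabled = true
  cacheDir = "/path/to/singularity_cache"
}
EOF

echo "Wrote my.config"
```

The resulting file can be passed to Nextflow with the -c option as shown earlier. On a distributed system, the cache directory should be on a file system that all compute nodes can reach.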
