1.1.0
Quality of life update with a completely reworked report, new options to provide input data and some more experimental options for primer trimming and read merging. Details below:
Features
- New argument `--reads` is mutually exclusive with `--input` and takes a glob path as argument. File names are parsed with the glob pattern and paired, which bypasses the need for a sample sheet. This is not the recommended way to provide input data but can be helpful in many cases:
  `nextflow run bio-raum/FooDMe2 -profile singularity --reads '/path/to/reads/*_R{1,2}_001.fastq.gz'`
- Add the parameter `--non_overlapping` to simply concatenate R1 and R2 reads instead of merging them over an overlapping sequence. This is useful when the amplicon or sequence length produces reads with no overlap.
- Add the argument `--cutadapt_trim_flex` to attempt trimming on both the 5' and 3' ends of reads but also keep reads where only 5' trimming was performed.
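
To illustrate what pairing files from a glob such as `*_R{1,2}_001.fastq.gz` amounts to, here is a minimal Python sketch. It is not the pipeline's implementation: the helper `pair_reads` and the regular expression are hypothetical, and the sketch assumes that the part of the file name before `_R1`/`_R2` is used as the sample name.

```python
import re
from collections import defaultdict

# Hypothetical pattern mirroring the glob '*_R{1,2}_001.fastq.gz' (assumption).
PAIR_RE = re.compile(r"^(?P<sample>.+)_R(?P<mate>[12])_001\.fastq\.gz$")


def pair_reads(file_names):
    """Group file names into {sample: (R1, R2)}; hypothetical helper."""
    mates = defaultdict(dict)
    for name in sorted(file_names):
        match = PAIR_RE.match(name)
        if match is None:
            continue  # name does not follow the pattern, ignored in this sketch
        mates[match.group("sample")][match.group("mate")] = name
    # Keep only samples for which both mates were found.
    return {
        sample: (found["1"], found["2"])
        for sample, found in mates.items()
        if len(found) == 2
    }


files = [
    "sampleA_R1_001.fastq.gz",
    "sampleA_R2_001.fastq.gz",
    "sampleB_R1_001.fastq.gz",  # no R2 mate, so sampleB is left out
]
pairs = pair_reads(files)
```

A check like this can help confirm that file names follow the expected pattern before passing the glob to `--reads`.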
Reporting
- Complete rework of the HTML report: now uses a fully customized markdown template.
- Disabled hard filtering of samples based on read number after the primer trimming step. Samples will now soft fail in subsequent steps (e.g. clustering) or keep going to the end.
- All samples, including failed samples, now appear in the end report. Samples where no primer trimming or clustering could be performed are marked as fail.
- Added read counts to the Excel report.
Documentation
- Added resource usage arguments to the usage documentation
- Added some information on the `--reference_base` argument in the troubleshooting section
- Added information on the new `--reads` argument
- Added information on the use of a local configuration file
- Added information on BLAST- and clustering-specific arguments
Bugs
- Fix conda environment definition path for module DADA2:RMCHIMERA that could lead to a failure to generate the environment for conda users, depending on the channel settings.
- Actually implement the chimera removal skipping behaviour for `--remove_chimera false`
- Enforce a similar filtering procedure for both VSEARCH and DADA2 workflows:
  - Filtering of sequences based on expected amplicon size (`amplicon_min_size` and `amplicon_max_size`) now correctly happens after read merging instead of being applied to the read length.
  - Filtering based on maxEE and maxNs now happens at the read level, prior to merging, for VSEARCH.
  - For both workflows, read pairs are filtered based on maxEE and maxNs, then merged, then merged pairs are filtered for expected size.
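
The filtering order described above (read-level maxEE/maxNs filter, then merging, then size filter on merged sequences) can be sketched as follows. This is an illustration only, not the VSEARCH or DADA2 code: all function and parameter names are hypothetical, the merge step is a caller-supplied stand-in, and the sketch assumes a pair is discarded when either mate fails the read-level filter. Expected errors are computed from Phred scores as the sum of 10^(-Q/10).

```python
def expected_errors(quals):
    """Sum of per-base error probabilities derived from Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_read_filter(seq, quals, max_ee, max_ns):
    """Read-level check on the number of Ns and on expected errors."""
    return seq.upper().count("N") <= max_ns and expected_errors(quals) <= max_ee


def filter_merge_filter(pairs, merge, max_ee, max_ns, min_size, max_size):
    """Hypothetical sketch: filter read pairs, merge, then filter by size."""
    kept = []
    for (r1_seq, r1_qual), (r2_seq, r2_qual) in pairs:
        # Step 1: both mates must pass the read-level filter (assumption).
        if not (passes_read_filter(r1_seq, r1_qual, max_ee, max_ns)
                and passes_read_filter(r2_seq, r2_qual, max_ee, max_ns)):
            continue
        # Step 2: merge the pair; `merge` returns None when merging fails.
        merged = merge(r1_seq, r2_seq)
        if merged is None:
            continue
        # Step 3: the size filter applies to the merged sequence, not the reads.
        if min_size <= len(merged) <= max_size:
            kept.append(merged)
    return kept


def toy_merge(r1_seq, r2_seq):
    """Stand-in for a real merger: plain concatenation, for illustration."""
    return r1_seq + r2_seq


pairs = [
    (("ACGT", [40] * 4), ("TTGA", [40] * 4)),  # passes every step
    (("ACNN", [40] * 4), ("TTGA", [40] * 4)),  # too many Ns in R1
    (("ACGT", [2] * 4), ("TTGA", [40] * 4)),   # expected errors too high in R1
    (("AC", [40] * 2), ("TT", [40] * 2)),      # merged sequence too short
]
result = filter_merge_filter(pairs, toy_merge, max_ee=1, max_ns=0,
                             min_size=6, max_size=10)
```

Applying the size bounds only after merging means a short read is not discarded just because it is shorter than the expected amplicon.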