1.1.0

@gregdenay released this on 11 Dec 12:42

A quality-of-life update with a completely reworked report, new options for providing input data, and some more experimental options for primer trimming and read merging. Details below:

Features

  • New argument --reads is mutually exclusive with --input and takes a glob path as argument. File names are parsed with the glob pattern and paired, which allows bypassing the need for a sample-sheet. It is not the recommended way to provide input data but can be helpful in many cases.

    nextflow run bio-raum/FooDMe2 -profile singularity --reads '/path/to/reads/*_R{1,2}_001.fastq.gz'
    
  • Add the parameter --non_overlapping to simply concatenate R1 and R2 reads instead of merging them by their overlapping sequence. This is useful when the amplicon or sequencing length produces reads with no overlap.
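
    For example (an untested sketch that simply combines the glob-style input shown above with the new flag; adapt the profile and path to your setup):

    nextflow run bio-raum/FooDMe2 -profile singularity --reads '/path/to/reads/*_R{1,2}_001.fastq.gz' --non_overlapping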

  • Add the argument --cutadapt_trim_flex to attempt trimming on both the 5' and 3' ends of reads, while also keeping reads where only a 5' trim could be performed.
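
    Similarly, a hedged example of enabling flexible trimming, again only combining options already shown in these notes:

    nextflow run bio-raum/FooDMe2 -profile singularity --reads '/path/to/reads/*_R{1,2}_001.fastq.gz' --cutadapt_trim_flex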

Reporting

  • Complete rework of the HTML report: now uses a fully customized markdown template.
  • Disabled hard filtering of samples based on read number after the primer trimming step. Samples will now soft-fail in subsequent steps (e.g. clustering) or continue to the end of the workflow.
  • All samples (including failed samples) now appear in the final report. Samples where no primer trimming or clustering could be performed are marked as failed.
  • Added read counts to the Excel report

Documentation

  • Added resource usage arguments to the usage documentation
  • Added some information on the --reference_base argument in the troubleshooting section
  • Added information on the new --reads argument
  • Added information on the use of a local configuration file
  • Added information on BLAST- and clustering-specific arguments

Bugs

  • Fix the conda environment definition path for the DADA2:RMCHIMERA module, which could lead to a failure to generate the environment for conda users depending on their channel settings.
  • Actually implement the chimera-removal skipping behaviour for --remove_chimera false (see the example at the end of this section)
  • Enforce the same filtering procedure for both the VSEARCH and DADA2 workflows:
    • Filtering of sequences based on expected amplicon size (amplicon_min_size and amplicon_max_size) now correctly happens after read merging instead of being applied to the read length.
    • Filtering based on maxEE and maxNs now happens at the read level, prior to merging, for VSEARCH.
    • For both workflows, read pairs are filtered based on maxEE and maxNs, then merged, and the merged pairs are then filtered for expected amplicon size.
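
As a hedged illustration of the behaviour described above, the following untested invocation disables chimera removal and sets the expected amplicon size window. The --amplicon_min_size/--amplicon_max_size flag spelling is assumed from the parameter names mentioned in these notes, and the size values are placeholders to replace with your own limits:

    nextflow run bio-raum/FooDMe2 -profile singularity \
        --reads '/path/to/reads/*_R{1,2}_001.fastq.gz' \
        --remove_chimera false \
        --amplicon_min_size <MIN> --amplicon_max_size <MAX>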