Usage and config file parameters
Håkon Kaspersen edited this page Jan 27, 2021 · 1 revision
If you use either of the assembly tracks of the pipeline, the read files must follow a specific naming convention so that they can be matched in the pipeline. Avoid dots (.) in the read file names, except in the extension at the end (.fastq.gz). Short reads must follow this naming convention:
*_R{1,2}.fastq.gz
where "*" is the sample ID. Long reads must follow this naming convention:
*.fastq.gz
where "*" must exactly match the sample ID used in the names of the corresponding short reads.
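The naming rules above can be checked before a run with a small script. The following is a minimal sketch, not part of the pipeline: the function name `check_read_names` is hypothetical, and it assumes that every sample has both an R1 and an R2 file and that the sample ID itself contains no dots.

```python
import re
from collections import defaultdict

# Patterns derived from the naming convention above (assumption: no dots in the sample ID).
SHORT_READ = re.compile(r"^(?P<sample>[^.]+)_R(?P<mate>[12])\.fastq\.gz$")
LONG_READ = re.compile(r"^(?P<sample>[^.]+)\.fastq\.gz$")


def check_read_names(short_files, long_files=None):
    """Return a list of problems; an empty list means all file names conform.

    short_files and long_files are lists of file names (not full paths).
    Pass long_files=None when no long reads are used.
    """
    problems = []
    mates = defaultdict(set)

    for name in short_files:
        m = SHORT_READ.match(name)
        if m is None:
            problems.append(f"short read file does not match *_R{{1,2}}.fastq.gz: {name}")
        else:
            mates[m["sample"]].add(m["mate"])

    # Assumption: reads are paired, so each sample needs both R1 and R2.
    for sample, found in sorted(mates.items()):
        if found != {"1", "2"}:
            problems.append(f"sample {sample} is missing R1 or R2")

    if long_files is not None:
        long_samples = set()
        for name in long_files:
            m = LONG_READ.match(name)
            if m is None:
                problems.append(f"long read file does not match *.fastq.gz: {name}")
            else:
                long_samples.add(m["sample"])
        # Sample IDs must be identical between short and long reads.
        for sample in sorted(set(mates) ^ long_samples):
            problems.append(f"sample {sample} lacks matching short or long reads")

    return problems
```

For example, `check_read_names(["s1_R1.fastq.gz", "s1_R2.fastq.gz"], ["s1.fastq.gz"])` returns an empty list, while a file such as `s1.v2_R1.fastq.gz` is reported because of the extra dot.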
To run the pipeline, copy the main.config or the plasmap.config file and edit it to suit your needs. Then run one of the following commands.
To run the main Ellipsis pipeline:
path/to/ellipsis.sh main config_file.config output_folder
To run the PlasMap pipeline:
path/to/ellipsis.sh plasmap plasmap.config output_folder
Java is automatically activated and deactivated.
- params.track: Which workflow to run; either "hybrid", "short_assembly", or "no_assembly".
- params.reads: The path to the directory that holds the Illumina reads. The path must match the read files inside the directory, not just the directory itself (see the example in the config file).
- params.longreads: The path to the directory that holds the long reads. See the naming convention above.
- params.assemblies: The path to the directory holding the assemblies if params.track = "no_assembly".
- params.*db: Path to each database for each respective program.
- params.chrom: If true, the chromosome is included in the annotations downstream.
- params.trim: If true, trimming is run on both long and short reads (Canu and Trim-galore).
- params.illumina_filtering: If true, use Illumina reads as a quality reference for filtering long reads.
- params.phred_score: Which phred score cutoff is used to trim Illumina reads (default 15).
- params.genomesize: Approximate size of the organism's genome.
- params.sequencer: Whether the long reads are nanopore or pacbio reads.
- params.minlen: Minimum length of long reads to keep.
- params.keep_percent: Filter away low-quality long reads until x percent remain.
- params.target_bases: Maximum number of bases to keep after filtering.
- params.mode: Unicycler mode of assembly; see the Unicycler documentation for more information.
- params.min_fasta_length: All contigs below this size threshold will be removed.
- params.prokka_additional: Any additional Prokka options may be added here. They must be added as-is, i.e. exactly as they are typed when running Prokka in a terminal. For example, --proteins <path> may be used here.
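Some of the constraints listed above can be checked mechanically before starting a run. The sketch below is illustrative only and is not part of the pipeline: `check_params` is a hypothetical helper that takes the parameter values as a plain dictionary (keys without the `params.` prefix). It assumes that the "hybrid" track requires both `longreads` and `sequencer` to be set, which the list above implies but does not state outright.

```python
# Allowed values taken from the parameter list above.
ALLOWED_TRACKS = {"hybrid", "short_assembly", "no_assembly"}
ALLOWED_SEQUENCERS = {"nanopore", "pacbio"}


def check_params(params):
    """Return a list of problems found in a dict of parameter values."""
    problems = []
    track = params.get("track")

    if track not in ALLOWED_TRACKS:
        problems.append(f"track must be one of {sorted(ALLOWED_TRACKS)}, got {track!r}")

    # params.assemblies is only needed for the no_assembly track.
    if track == "no_assembly" and not params.get("assemblies"):
        problems.append("assemblies must be set when track is 'no_assembly'")

    # Assumption: the hybrid track needs long reads and a sequencer type.
    if track == "hybrid":
        if not params.get("longreads"):
            problems.append("longreads must be set when track is 'hybrid'")
        if params.get("sequencer") not in ALLOWED_SEQUENCERS:
            problems.append(f"sequencer must be one of {sorted(ALLOWED_SEQUENCERS)}")

    # keep_percent is a percentage, so it should lie in (0, 100].
    keep = params.get("keep_percent")
    if keep is not None and not (0 < keep <= 100):
        problems.append(f"keep_percent must be between 0 and 100, got {keep}")

    return problems
```

An empty list means that none of these checks failed; it does not guarantee that the configuration is otherwise valid.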