This repository contains the scripts required to reproduce the analyses in Schultz, Kotlobay, et al. 2018.
Specifically, it contains the scripts to recreate figure 2 (the transcript alignment figure), as well as the orthology search with other polychaetes.
The pipeline is reproducible, provided that all of the required software and sequencing reads are present on your file system.
Required software:
- Python 3.x - I recommend Anaconda
- Snakemake
- Trimmomatic v0.35
- a local copy of the NCBI nr database
- a working local installation of BLAST
- pauvre plotting software
- the Trinity RNAseq assembler docker
- bioawk and normal awk
- hisat2
- bwa
- samtools
- minimap2
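
Before starting a run that may take days, it can help to confirm that the command-line tools above are reachable on your PATH. The following is a minimal sketch, not part of the pipeline itself. The executable names are assumptions (for example, `blastp` stands in for the BLAST installation), so adjust them to match your setup. Trimmomatic is usually run as a Java jar and the NCBI nr database is not an executable, so neither is checked here.

```python
import shutil

# Executable names are assumptions; edit them to match your installation.
REQUIRED_TOOLS = [
    "snakemake", "blastp", "pauvre", "docker", "bioawk", "awk",
    "hisat2", "bwa", "samtools", "minimap2",
]


def find_missing(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

A tool reported as missing may still be installed under a different name or inside a conda environment that is not currently active.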
For now you will need to edit config.yaml and specify the location of the read files from the O. undecimdonta sequencing project on ENA/SRA.
An automatic download feature will be added when the reads become available.
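
After editing config.yaml, a quick check that the paths you entered exist can save a failed run. The sketch below is hypothetical and makes no assumption about the key names used in config.yaml: it walks every value in the file and reports any string containing a path separator that does not exist on disk. It assumes PyYAML is available, which is normally the case when Snakemake is installed. Strings that merely look like paths (URLs, for instance) will also be reported.

```python
import os


def iter_strings(node):
    """Yield every string value found in nested dicts and lists."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from iter_strings(item)


def missing_paths(config):
    """Return path-like strings in `config` that do not exist on disk."""
    return [s for s in iter_strings(config)
            if os.sep in s and not os.path.exists(s)]


def check_config_file(path="config.yaml"):
    """Load a YAML file and return its path-like values that are missing."""
    import yaml  # PyYAML; assumed to be installed alongside Snakemake
    with open(path) as handle:
        config = yaml.safe_load(handle)
    return missing_paths(config)


if __name__ == "__main__":
    if os.path.exists("config.yaml"):
        for missing in check_config_file("config.yaml"):
            print("Missing:", missing)
    else:
        print("config.yaml not found in the current directory.")
```

Run it from the repository directory so that relative paths in config.yaml resolve the same way they will for Snakemake.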
After installing all of the required software above, clone this repository and run the pipeline with the following commands:
git clone https://github.com/conchoecia/odontosyllis_undecimdonta_luciferase.git
cd odontosyllis_undecimdonta_luciferase/
snakemake --cores <desired number of cores>
This analysis took approximately 10 days on a Linux server with 90 threads. Most of the computation time is spent assembling transcriptomes.