Input data
Freya Arthen edited this page Mar 3, 2022 · 6 revisions
- genomic FASTA file
  - nucleotide sequence of your assembly, divided into contigs/scaffolds
- GFF3 file
- proteins FASTA file (optional)
  - protein sequences for all protein-coding genes in your data set
  - if not provided, they are extracted within the taXaminer pipeline using the tool gffread
- coverage information (optional, but recommended)
  - taXaminer can run with or without coverage information
  - if you wish to include it, you need one of the following:
    - per-base coverage (PBC) file: tab-separated file with 3 columns (scaffold name, base position, and coverage at that position)
    - mapping file (BAM format): sorted and indexed
    - raw read FASTA files: paired (forward and reverse) or unpaired read files
  - multiple coverage data sets can be used
- config file
  - YAML format
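
To illustrate the per-base coverage (PBC) layout described above, here is a minimal Python sketch that reads a 3-column, tab-separated stream and computes the mean coverage per scaffold. It is not part of taXaminer: the function name `mean_coverage_per_scaffold` and the example scaffold names are hypothetical, and the sketch assumes the file has no header line.

```python
import csv
import io
from collections import defaultdict


def mean_coverage_per_scaffold(handle):
    """Return {scaffold name: mean coverage} from a 3-column, tab-separated PBC stream.

    Assumption: columns are scaffold name, base position, coverage; no header line.
    """
    totals = defaultdict(float)  # summed coverage per scaffold
    counts = defaultdict(int)    # number of positions seen per scaffold
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
        scaffold, _position, coverage = row
        totals[scaffold] += float(coverage)
        counts[scaffold] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical example data in the PBC layout (not real taXaminer output)
example = "scaffold_1\t1\t10\nscaffold_1\t2\t20\nscaffold_2\t1\t5\n"
result = mean_coverage_per_scaffold(io.StringIO(example))
print(result)
```

For a file on disk, the same function can be called with an open file handle, for example `open(path, newline="")`, where `path` points to your own PBC file. This kind of check can help confirm that a PBC file has the expected three columns before running the pipeline.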