Annotation

Vinh Tran edited this page Jun 12, 2023 · 4 revisions

Introduction

fas.doAnno is used to perform feature annotation for a list of input protein sequences in FASTA format.

Currently, the protein sequences are annotated with 7 annotation tools/databases: cast, TMHMM, COILS, SignalP, SEG, PFAM and SMART.

NOTE: fas.doAnno requires hmmscan (part of the HMMER package) to do the annotation. Please install it before running fas.doAnno if it is not already available.
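A quick way to confirm that hmmscan is reachable before starting an annotation run is to look it up on the PATH. The snippet below is only a sketch; `require_tool` is a hypothetical helper written for this page and is not part of FAS.

```shell
# Hypothetical helper (not part of FAS): report whether a program is on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Check for hmmscan and print a hint if it is missing.
require_tool hmmscan || echo "Install HMMER (which provides hmmscan) before running fas.doAnno"
```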

Usage

Annotate input sequences

fas.doAnno -i input.fasta -o /annotation/path/

The annotation output (input.json by default) will be saved in /annotation/path/. A custom output file name can be specified using the option -n/--name, e.g.:

fas.doAnno -i input.fasta -o /annotation/path/ -n myname
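If a script needs to know where the annotation file will end up, the naming rule described above can be sketched as follows. This assumes that the default name is the input file's base name with a .json extension (as in the input.fasta to input.json example) and that a name given with -n also receives the .json extension; `expected_annotation_file` is a hypothetical helper, not a FAS command.

```shell
# Hypothetical helper (not part of FAS): build the expected annotation path.
# Arguments: input FASTA, output directory, optional custom name (as for -n).
expected_annotation_file() {
    input=$1
    outdir=$2
    name=${3:-}
    if [ -z "$name" ]; then
        # Assumption: default name = input base name without its extension.
        base=$(basename "$input")
        name=${base%.*}
    fi
    # Strip one trailing slash from the directory before joining.
    printf '%s/%s.json\n' "${outdir%/}" "$name"
}

expected_annotation_file input.fasta /annotation/path/
expected_annotation_file input.fasta /annotation/path/ myname
```

The two calls mirror the two fas.doAnno commands above, with and without a custom name.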

Re-annotate existing annotation

If you want to redo the annotation for one of the annotation tools/databases, you can use the --redo option:

fas.doAnno -i input.fasta -o /annotation/path/ --redo flps

The value for this option must be one of: flps, tmhmm, signalp, coils2, seg, smart, pfam. Only one tool can be selected per run, and it must be listed in /path/to/annotation_fas/annoTools.txt (check here for more info about this file).
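Because --redo accepts exactly one value from a fixed list, a wrapper script can reject a bad value before calling fas.doAnno. The sketch below only checks the value against the seven choices listed above; it does not inspect annoTools.txt. `is_valid_redo_tool` is a hypothetical helper, not part of FAS.

```shell
# Hypothetical helper (not part of FAS): accept only the documented --redo values.
is_valid_redo_tool() {
    case "$1" in
        flps|tmhmm|signalp|coils2|seg|smart|pfam) return 0 ;;
        *) return 1 ;;
    esac
}

tool=flps
if is_valid_redo_tool "$tool"; then
    # Print the command instead of running it, so the sketch works without FAS installed.
    echo "would run: fas.doAnno -i input.fasta -o /annotation/path/ --redo $tool"
else
    echo "unsupported --redo value: $tool" >&2
fi
```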

Extract annotations from existing file

In case you need to extract the annotations for a subset of proteins from an existing file, you can use this command:

fas.doAnno -i input.fasta -o /annotation/path/ --extract --annoFile /path/to/existing/Json_file
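Before extracting, it can help to list which proteins the subset FASTA file actually contains. The sketch below prints the first word of every header line; how fas.doAnno matches these identifiers against the existing JSON file is not described here, so treat this purely as a sanity check on the input. `list_fasta_ids` is a hypothetical helper, and the demo file is made up for illustration.

```shell
# Hypothetical helper (not part of FAS): print the ID (first word) of each FASTA header.
list_fasta_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

# Demo on a small, made-up FASTA file.
tmp=$(mktemp)
printf '>protA some description\nMKV\n>protB\nMAL\n' > "$tmp"
list_fasta_ids "$tmp"
rm -f "$tmp"
```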

More

To see the complete list of available options, please read the help message of fas.doAnno:

fas.doAnno -h