HIV_Tools

This repository stores tools for various tasks related to HIV sequencing data.

The tools in this repository will grow and change over time as new problems arise and new solutions are found. This README will be kept up to date with descriptions, instructions and documentation for all the tools present.

Typing with COMET: (COntext-based Modeling for Expeditious Typing)

COMET is a platform provided by the Luxembourg Institute of Health (https://comet.lih.lu/) for HIV-1 and HIV-2 typing. It is currently available only through their web platform, which is practical for some users but impractical for others.

typing_using_comet.py is a Python script meant to be run in the terminal (command line). It types all sequences present in a fasta file and produces an excel or csv report with the typing results. By default, after typing, the script automatically creates a new fasta file in which each header consists of the identified type, a separator (a period in the example below), and then the original header. This can be turned off using an optional argument described below.
The script provides a -h (or --Help) option that prints instructions and descriptions for the arguments.

E.g. (original fasta file)

>IMCJ_KR020_1
tggcgcccgaacagggacttgaggaagagtgagagtcttcggagcacggctgagtgagggcagtaagggcggcaggaatc
aaccacgacggagagctcctgtaaaagcgcaggccggtaccaggcagcgtgaggagcgggaggagaagaggcctccggga

(new fasta file)

>B.IMCJ_KR020_1
tggcgcccgaacagggacttgaggaagagtgagagtcttcggagcacggctgagtgagggcagtaagggcggcaggaatc
aaccacgacggagagctcctgtaaaagcgcaggccggtaccaggcagcgtgaggagcgggaggagaagaggcctccggga
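The renaming step illustrated above can be sketched in a few lines of Python. This is only an illustration, not the script's actual code: the function name `rename_headers` and the `subtypes` mapping are hypothetical, the headers are assumed to be standard FASTA headers starting with `>`, and the separator is assumed to be a period, as in the example.

```python
def rename_headers(fasta_text, subtypes, separator="."):
    """Prefix each FASTA header with its assigned subtype.

    `subtypes` (hypothetical) maps an original header, without the
    leading '>', to a subtype label. Headers with no entry are kept as-is.
    """
    renamed = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            subtype = subtypes.get(header)
            if subtype:
                line = f">{subtype}{separator}{header}"
        renamed.append(line)
    return "\n".join(renamed) + "\n"


original = ">IMCJ_KR020_1\ntggcgcccgaacagg\n"
print(rename_headers(original, {"IMCJ_KR020_1": "B"}), end="")
```

Sequence lines are passed through untouched, so only the headers differ between the original and the new file.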

It has the following mandatory arguments:

--hiv_type: specify the type of sequences to type (1 for HIV-1, 2 for HIV-2).
--output_type: specify the output type, xlsx or csv.
--fasta_file: the path to the fasta (or fasta.gz) file.
--output_directory: the directory for the output. A folder called "Comet_Output" will be created in this directory, storing the results.
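As a rough sketch of how the two path arguments above might be handled, the snippet below reads a plain or gzip-compressed fasta file and creates the "Comet_Output" folder. The helper names `read_fasta` and `prepare_output_dir` are hypothetical, and the script itself may well rely on Biopython instead of the standard library used here.

```python
import gzip
from pathlib import Path


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a .fasta or .fasta.gz file."""
    path = Path(path)
    # Assumption: compressed input is recognised by its .gz suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    records, header, chunks = [], None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def prepare_output_dir(output_directory):
    """Create the Comet_Output folder inside output_directory if needed and return it."""
    out = Path(output_directory) / "Comet_Output"
    out.mkdir(parents=True, exist_ok=True)
    return out
```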

It has the following optional arguments:

--disable_auto_rename: Y to disable the creation of the new fasta file.
--split_per_subtype: one or more subtypes for the per-subtype split analysis (e.g. --split_per_subtype A or --split_per_subtype B C).
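The README does not spell out what the per-subtype split produces. One plausible reading, sketched below, is that the sequences assigned to each requested subtype are collected into their own group. The function name `split_by_subtype` and the `(subtype, header, sequence)` record layout are assumptions made for illustration only.

```python
from collections import defaultdict


def split_by_subtype(typed_records, wanted):
    """Group (subtype, header, sequence) records by subtype, keeping only `wanted` subtypes."""
    wanted = set(wanted)
    groups = defaultdict(list)
    for subtype, header, sequence in typed_records:
        if subtype in wanted:
            groups[subtype].append((header, sequence))
    return dict(groups)


typed = [
    ("A", "seq1", "ACGT"),
    ("B", "seq2", "TTGA"),
    ("G", "seq3", "CCAA"),
    ("B", "seq4", "GGTT"),
]
# Equivalent to passing: --split_per_subtype A B
print(split_by_subtype(typed, ["A", "B"]))
```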

Example of usage:
[Windows]
>python typing_using_comet.py --hiv_type 1 --fasta_file unknown_seqs.fasta --output_directory C:\Users\results --output_type xlsx --split_per_subtype A B

[MAC]
>python3 typing_using_comet.py --hiv_type 2 --fasta_file unknown_seqs.fasta.gz --output_directory /Users/results --output_type csv --disable_auto_rename Y --split_per_subtype G
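For readers who want to see how command lines like the ones above map onto arguments, here is a minimal `argparse` sketch. The argument names come from this README; the types, `choices` and defaults are assumptions and may differ from what typing_using_comet.py actually does.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments documented in this README."""
    parser = argparse.ArgumentParser(description="Type HIV sequences with COMET.")
    # Mandatory arguments
    parser.add_argument("--hiv_type", required=True, choices=["1", "2"],
                        help="1 for HIV-1, 2 for HIV-2")
    parser.add_argument("--output_type", required=True, choices=["xlsx", "csv"],
                        help="format of the report")
    parser.add_argument("--fasta_file", required=True,
                        help="path to the fasta (or fasta.gz) file")
    parser.add_argument("--output_directory", required=True,
                        help="directory in which Comet_Output is created")
    # Optional arguments
    parser.add_argument("--disable_auto_rename", default=None,
                        help="Y to skip creating the renamed fasta file")
    parser.add_argument("--split_per_subtype", nargs="+", default=[],
                        help="one or more subtypes, e.g. A or B C")
    return parser


args = build_parser().parse_args([
    "--hiv_type", "1",
    "--fasta_file", "unknown_seqs.fasta",
    "--output_directory", "results",
    "--output_type", "xlsx",
    "--split_per_subtype", "A", "B",
])
print(args.hiv_type, args.output_type, args.split_per_subtype)
```

Using `nargs="+"` is what lets a single flag accept several subtypes in a row.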

Dependencies

It's always advisable to use a virtual environment. The script depends on the following packages:

Bio=1.6.2
biopython=1.83
pandas=2.0.3
Requests=2.28.1
selenium=4.4.0
webdriver_manager=4.0.2

[Windows]
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt

[MAC]
python3 -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt
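After installing, the environment can be checked against the versions listed above with a short script. This is an optional sketch, not part of the repository: `check_versions` is a hypothetical helper, and the names in `PINNED` are assumed to match the installed distribution names.

```python
from importlib import metadata

# Versions as listed in the Dependencies section of this README.
PINNED = {
    "Bio": "1.6.2",
    "biopython": "1.83",
    "pandas": "2.0.3",
    "Requests": "2.28.1",
    "selenium": "4.4.0",
    "webdriver_manager": "4.0.2",
}


def check_versions(pinned):
    """Return {name: (wanted, installed, matches)}; installed is None if the package is missing."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions(PINNED).items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name}: wanted {wanted}, installed {installed} [{status}]")
```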
