forked from medvir/MinVar

A tool to detect minority variants in HIV-1 and HCV populations

MinVar: automatic detection of drug-resistance mutations in HIV-1

MinVar is a command-line tool to discover mutations conferring drug resistance in HIV-1 and HCV populations using deep sequencing data.


The simplest example

[user@host ~]$ minvar -f sample_file.fastq
... a few minutes later ...
[user@host ~]$ column -t -s ',' merged_muts_drm_annotated.csv
gene      pos  mut  freq    category
...
RT        238  T    1.0     NNRTI
RT        250  N    0.9547  unannotated
RT        272  P    1.0     unannotated
RT        293  V    1.0     unannotated
RT        297  A    1.0     unannotated
RT        333  D    0.9384  unannotated
RT        333  E    0.0354  unannotated
RT        335  C    1.0     unannotated
protease  10   P    0.0223  Other
protease  10   Q    0.0185  Other
protease  10   S    0.0741  Other
protease  10   T    0.0468  Other
protease  10   V    0.5948  PIMinor
protease  11   L    1.0     PIMinor
protease  13   V    1.0     unannotated
protease  14   R    1.0     unannotated
protease  15   V    0.7143  unannotated
protease  20   T    1.0     PIMinor
protease  32   I    1.0     PIMajor
...
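
The annotated table can also be filtered programmatically. The following is a minimal sketch, assuming the CSV has exactly the columns shown in the header above (gene, pos, mut, freq, category). The helper names and the 0.2 frequency cutoff are hypothetical and are not part of MinVar.

```python
import csv
import io

# Sample rows copied from the example output above; a real run would open
# merged_muts_drm_annotated.csv instead of this in-memory string.
SAMPLE = """gene,pos,mut,freq,category
RT,238,T,1.0,NNRTI
RT,333,E,0.0354,unannotated
protease,10,S,0.0741,Other
protease,10,V,0.5948,PIMinor
protease,32,I,1.0,PIMajor
"""


def load_mutations(handle):
    """Read mutation rows, converting pos to int and freq to float."""
    rows = []
    for row in csv.DictReader(handle):
        row["pos"] = int(row["pos"])
        row["freq"] = float(row["freq"])
        rows.append(row)
    return rows


def annotated_minority(rows, max_freq=0.2):
    """Keep annotated mutations whose frequency is below max_freq.

    max_freq is a hypothetical cutoff chosen for illustration only.
    """
    return [
        r for r in rows
        if r["category"] != "unannotated" and r["freq"] < max_freq
    ]


mutations = load_mutations(io.StringIO(SAMPLE))
for r in annotated_minority(mutations):
    print(f'{r["gene"]} {r["pos"]}{r["mut"]} {r["freq"]:.4f} {r["category"]}')
```

To run it on real output, replace `io.StringIO(SAMPLE)` with `open("merged_muts_drm_annotated.csv", newline="")`.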

Important features

  • MinVar is opinionated software: it takes a FASTQ file as input and does not ask the standard user to set any parameters at run time. Experienced users and developers can still change some of its settings in the source code.
  • It has been tested with HIV-1 on both Illumina MiSeq and Roche/454 sequencing reads. HCV has been tested on MiSeq only.
  • It uses state-of-the-art third-party tools to filter, recalibrate, and align reads and to call variants.
  • Single nucleotide variants are phased at codon level, and amino acid mutations are then called and annotated.
  • HIV-1 drug-resistance mutations are annotated according to the Stanford HIV Drug Resistance Database (HIVDB).
  • The annotated mutations are saved in a CSV file (see example above) and also included in a report in Markdown format, which is then converted to PDF.
  • The PDF report can be customized by adding contact information in the file ~/.minvar/contact.ini with the following syntax (only change what comes after the = sign):
[contact]
unit = name_of_your_unit_here
phone = phone_number
fax = fax_number
email = your_unit@your_company
logo = filename_without_extension

The logo file, in PDF format, must be present in the same directory. For example, to use the file ~/.minvar/company_logo_bw.pdf, write logo = company_logo_bw in the INI file.
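
The contact file can be written and checked with Python's standard configparser module. This is a sketch only: the two helper functions are hypothetical, MinVar itself may read the file differently, and the demonstration uses a temporary directory in place of ~/.minvar.

```python
import configparser
import tempfile
from pathlib import Path


def write_contact_ini(directory, unit, phone, fax, email, logo):
    """Write contact.ini with a single [contact] section into directory."""
    config = configparser.ConfigParser()
    config["contact"] = {
        "unit": unit,
        "phone": phone,
        "fax": fax,
        "email": email,
        "logo": logo,  # file name without the .pdf extension
    }
    path = Path(directory) / "contact.ini"
    with open(path, "w") as handle:
        config.write(handle)
    return path


def missing_logo(directory):
    """Return the expected logo path if it does not exist, else None."""
    config = configparser.ConfigParser()
    config.read(Path(directory) / "contact.ini")
    logo_path = Path(directory) / (config["contact"]["logo"] + ".pdf")
    return None if logo_path.exists() else logo_path


# Demonstrated in a temporary directory; the real location is ~/.minvar.
with tempfile.TemporaryDirectory() as tmp:
    write_contact_ini(tmp, "name_of_your_unit_here", "phone_number",
                      "fax_number", "your_unit@your_company",
                      "company_logo_bw")
    # Prints the path of the logo PDF that still has to be copied in.
    print(missing_logo(tmp))
```

Checking for the logo up front avoids discovering a missing file only when the report is being built.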

Documentation

See the official documentation.

API documentation can be created by cloning this repo, cd-ing into apidoc and running make html.

Citation

MinVar (version 1, HIV-1 support only) was introduced and validated in
Huber, Metzner et al. (2017). MinVar: A rapid and versatile tool for HIV-1 drug resistance genotyping by deep sequencing. Journal of Virological Methods 240:7-13, doi:10.1016/j.jviromet.2016.11.008

Output files

Created by prepare.py

  • subtype_evidence.csv: percentage of reads best aligned to each subtype (or genotype),
  • subtype_ref.fasta: reference sequences of the identified subtype,
  • cns_final.fasta: sample consensus created by iteratively aligning reads and writing variants into the sequence.

Created by callvar.py

  • hq_2_cns_final_recal.bam: sorted BAM alignment of reads to the consensus sequence, recalibrated with either GATK or lofreq (indels only),
  • hq_2_cns_final_recal.vcf: VCF file of mutations found on reads with respect to the consensus in cns_final.fasta.

Created by annotate.py

  • merged_mutations_nt.csv: a list of all variants observed at single positions,
  • max_freq_muts_aa.csv: the amino acid found at maximum frequency at each codon,
  • final.csv: mutations at amino acid level, with the gene, the position on the gene, the wild type, and the frequency.
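
Selecting the amino acid at maximum frequency for each codon can be sketched as follows. This is an illustration under stated assumptions, not the code in annotate.py: the column layout of the real files is not documented here, so the sketch works on in-memory records with hypothetical field names (gene, pos, mut, freq) and values taken from the example output above.

```python
from collections import namedtuple

# Hypothetical record type; the real CSV columns may differ.
Mutation = namedtuple("Mutation", "gene pos mut freq")

rows = [
    Mutation("RT", 333, "D", 0.9384),
    Mutation("RT", 333, "E", 0.0354),
    Mutation("protease", 10, "S", 0.0741),
    Mutation("protease", 10, "V", 0.5948),
]


def max_freq_per_codon(mutations):
    """Map (gene, pos) to the listed mutation with the highest frequency."""
    best = {}
    for m in mutations:
        key = (m.gene, m.pos)
        if key not in best or m.freq > best[key].freq:
            best[key] = m
    return best


for (gene, pos), m in sorted(max_freq_per_codon(rows).items()):
    print(gene, pos, m.mut, m.freq)
```

The sketch only compares the entries it is given, so an amino acid that is absent from the input can never be selected.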

Created by reportdrm.py

  • merged_muts_drm_annotated.csv: the join of final.csv with the DRM/RAS annotation,
  • report.md and report.pdf: final report with the subtype estimate, based on alignment of reads to different references, and tables of mutations. The PDF report is created from the template minvar/db/template.tex.
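
Because each stage writes a fixed set of files, an incomplete run can be diagnosed by checking which of them exist. The sketch below uses the file names listed above and assumes they are all written to a single run directory. The function name and the dictionary are hypothetical and not part of MinVar.

```python
import tempfile
from pathlib import Path

# Output files per stage, as listed in this section.
EXPECTED_OUTPUTS = {
    "prepare.py": ["subtype_evidence.csv", "subtype_ref.fasta",
                   "cns_final.fasta"],
    "callvar.py": ["hq_2_cns_final_recal.bam", "hq_2_cns_final_recal.vcf"],
    "annotate.py": ["merged_mutations_nt.csv", "max_freq_muts_aa.csv",
                    "final.csv"],
    "reportdrm.py": ["merged_muts_drm_annotated.csv", "report.md",
                     "report.pdf"],
}


def missing_outputs(run_dir):
    """Return {stage: [missing file names]} for stages with absent files."""
    run_dir = Path(run_dir)
    missing = {}
    for stage, names in EXPECTED_OUTPUTS.items():
        absent = [n for n in names if not (run_dir / n).exists()]
        if absent:
            missing[stage] = absent
    return missing


# Demonstration: simulate a run that stopped after prepare.py.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_OUTPUTS["prepare.py"]:
        (Path(tmp) / name).touch()
    for stage, names in missing_outputs(tmp).items():
        print(stage, "is missing", ", ".join(names))
```

The first stage reported as missing files is usually the place to start looking when a run ends without a report.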
