31 Dec 12:10

antonylebechec

c08a91e

v0.12.1.1

A few updates and fixes.

Updates

Improve transcripts table creation
Improve Docker image:
- Using Micromamba instead of mamba
- Reduce size by combining layers and cleaning caches

Fixes

Fix tag of howard tool and docker images


16 Dec 15:36

antonylebechec

v0.12.0

064468c

v0.12.0

This release introduces 'BigWig' annotation, prioritization options and a transcripts view, and improves samples management, INFO/tags renaming, annotation databases generation and operations, configuration files in YAML format, and Python packages stability.

News

Add annotation method 'BigWig'
Add prioritization options:
- SQL syntax available to define filters
- New 'Class' prioritization field
New transcripts view:
- Create a transcript view, using a structure from multiple source types (e.g. snpEff, external annotation databases)
- Mapping between multiple transcript ID sources (e.g. refSeq, Ensembl)
- Transcripts prioritization, using the same prioritization process as for variants
- Export transcripts table as a file, in multiple formats such as VCF, TSV, Parquet
Export with a specific sample list
Rename or remove INFO/tags before exporting
Configuration and parameters files in YAML format allowed
Add dynamic transcript column for NOMEN calculation (using transcript prioritization column)
Add plugins:
- 'update_databases'

Updates

Improve snpEff annotations operations
New option 'uniquify' for dbNSFP generation, and identification of column types
Management, check and export of samples columns
Improve query type mode
Improve splice annotation
Improve NOMEN generation

Fixes

Genotype format detection
Fix packages releases
Fix parameters and configuration files options
Fix calculations list and parametrization
Fix empty file export
Fix BED annotation with parquet method
More explicit log messages


12 Jul 10:27

antonylebechec

v0.11.0

d35f922

v0.11.0

This release introduces a splice annotation tool and updates the DuckDB Python package to improve stability.

News

Add splice tool with docker image
Add snpSift tool to annotate with VCF databases
Add quick annotation tool option (e.g. --annotation_parquet, --annotation_snpsift)
Add database generation from gene annotations database
Add plugins:
- 'genebe' (GeneBe annotation using REST API)
- 'minimalize' (Minimalize a VCF file, such as removing INFO/Tags or samples)

Updates

DuckDB 1.0.0 stable Snow Duck (Anas Nivis) release
Add API Documentation
Improve tests

Fixes

Paths parameters check fixed (genome and genomes-folders)
Fix snpEff download error with databases list


08 May 16:55

antonylebechec

v0.10.0

0726681

v0.10.0

This release is a refactor of HOWARD (Highly Open Workflow for Annotation & Ranking toward genomic variant Discovery) in Python, using Parquet and duckDB.

HOWARD annotates and prioritizes genetic variations, calculates and normalizes annotations, translates files in multiple formats (e.g. vcf, tsv, parquet) and generates variants statistics.

See the README and GitHub for more explanations.


21 Sep 16:23

bioinfo-chru-strasbourg

0.9.15.6

f22da3c

HOWARD 0.9.15.6

HOWARD

HOWARD annotates and prioritizes genetic variations, calculates and normalizes annotations, translates vcf format and generates variants statistics.

HOWARD annotation is mainly based on ANNOVAR and snpEff tools to annotate, using available databases (see ANNOVAR and snpEff) and home made databases. It also uses BCFTOOLS to annotate variants with a VCF file. ANNOVAR and snpEff databases are automatically downloaded if needed.

HOWARD calculation harmonizes allele frequency (VAF), extracts NOMEN (transcript, cNomen, pNomen...) from HGVS fields with an optional list of personalized transcripts, and generates a VaRank format barcode.
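As an illustration of the VAF calculation, here is a minimal Python sketch. It assumes the caller reports per-allele read depths in an `AD`-style field (reference first, then alternates). The function name `vaf_from_ad` is hypothetical and this is not HOWARD's own implementation, which harmonizes values across callers.

```python
def vaf_from_ad(ad_value):
    """Return the variant allele frequency from an 'AD'-style string.

    Assumption: ad_value looks like "ref_depth,alt_depth[,alt_depth...]".
    Returns None when no reads are reported.
    """
    depths = [int(depth) for depth in ad_value.split(",")]
    total = sum(depths)
    if total == 0:
        return None
    # Every depth after the first belongs to an alternate allele.
    return sum(depths[1:]) / total


print(vaf_from_ad("30,10"))
```

Callers that report the frequency directly, or that use other field names, would need their own extraction step before harmonization.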

HOWARD prioritization algorithm uses profiles to flag variants (as passed or filtered), calculate a prioritization score, and automatically generate a comment for each variant (example: 'polymorphism identified in dbSNP. associated to Lung Cancer. Found in ClinVar database').

Prioritization profiles are defined in a configuration file. A profile is defined as a list of annotation/value pairs, using wildcards and comparison options (contains, lower than, greater than, equal...). Annotation fields may be quality values (usually from callers, such as 'GQ', 'DP') or other annotation fields provided by annotation tools, such as HOWARD itself (example: COSMIC, ClinVar, 1000genomes, PolyPhen, SIFT). Multiple profiles can be used simultaneously, which is useful to define multiple validation/prioritization levels (example: 'standard', 'stringent', 'rare variants', 'low allele frequency').
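The profile mechanism can be sketched in Python as follows. This is a simplified illustration, not HOWARD's actual algorithm or configuration format: the rule layout, the thresholds, the `prioritize` function and the `PZComment` key are assumptions, while `PZFlag` and `PZScore` are the field names used in the example commands.

```python
# Comparison options named in the text, implemented in a simplified form.
COMPARATORS = {
    "contains": lambda value, ref: str(ref).lower() in str(value).lower(),
    "lower than": lambda value, ref: float(value) < float(ref),
    "greater than": lambda value, ref: float(value) > float(ref),
    "equal": lambda value, ref: str(value) == str(ref),
}

# Hypothetical profile: a list of annotation/value rules.
PROFILE = [
    {"annotation": "DP", "test": "lower than", "value": 30,
     "flag": "FILTERED", "score": 0, "comment": "low depth"},
    {"annotation": "CLINVAR", "test": "contains", "value": "pathogenic",
     "flag": "PASS", "score": 100, "comment": "Found in ClinVar database"},
]


def prioritize(variant, profile):
    """Flag a variant, sum rule scores, and build a comment."""
    flag, score, comments = "PASS", 0, []
    for rule in profile:
        if rule["annotation"] not in variant:
            continue  # annotation missing for this variant, rule skipped
        value = variant[rule["annotation"]]
        if COMPARATORS[rule["test"]](value, rule["value"]):
            score += rule["score"]
            comments.append(rule["comment"])
            if rule["flag"] == "FILTERED":
                flag = "FILTERED"  # one filtering rule filters the variant
    return {"PZFlag": flag, "PZScore": score, "PZComment": ". ".join(comments)}


print(prioritize({"DP": 50, "CLINVAR": "Pathogenic"}, PROFILE))
```

Running several profiles at once then amounts to calling `prioritize` once per profile and storing each result under a profile-specific name.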

HOWARD translates VCF format into TSV format, by sorting variants using specific fields (example: 'prioritization score', 'allele frequency', 'gene symbol'), including/excluding annotations/fields, including/excluding variants, and adding fixed columns.

HOWARD generates statistics files with a specific algorithm, snpEff and BCFTOOLS.

HOWARD is multithreaded through the number of variants and by database (data-scaling).

Getting Started

In order to build, set up and create a persistent CLI (running container), the docker-compose command builds images and launches services as containers.

$ docker-compose up

A setup container (HOWARD-setup) automatically downloads required databases according to a HOWARD VCF example annotation using ANNOVAR and snpEff. Configure the host data and databases folders (default ${HOME}/HOWARD), the assembly and the databases to download in the .env file. See HOWARD, ANNOVAR and snpEff documentation for custom databases download.

A Command Line Interface container (HOWARD-CLI) is started with host data and databases folders mounted. Execute a command, or connect to the CLI as a terminal, and let's start with HOWARD!

Using a HOWARD VCF example, this command:

1- annotates with HGVS (variation identification), outcome and location (functional annotation), and clinical databases (ClinVar and COSMIC),
2- calculates the Variant Allele Frequency (VAF), a genotype barcode (BARCODE), and processes HGVS to extract NOMEN information,
3- prioritizes variations according to prioritization rules specific to somatic focus (quality, functional and clinical annotations),
4- translates into TSV format, with a specific field order for the first 3 columns (ALL for the rest), and a sorting to focus on interesting variations (flagged as PASS, with the best score),
5- generates the final file in the host data folder (e.g. ${HOME}/HOWARD/data/example.howard.tsv)

$ docker exec HOWARD-CLI HOWARD --input=/tool/docs/example.vcf --output=/data/example.howard.tsv --annotation=snpeff,hgvs,symbol,outcome,location,CLINVAR,CLINVAR_CLNDN,COSMIC --calculation=VAF,BARCODE,NOMEN --prioritization=SOMATIC --translation=TSV --fields=NOMEN,PZFlag,PZScore,ALL --sort=PZFlag::DESC,PZScore:n:DESC

$ docker exec -ti HOWARD-CLI bash
[data]# HOWARD --help
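The `--sort` value used in the annotation command above (`PZFlag::DESC,PZScore:n:DESC`) can be read as a comma-separated list of sort keys. The Python sketch below assumes each key has the form `field:type:order`, where `n` means numeric comparison and an empty type means text comparison. That reading is inferred from the example only, and the function names are hypothetical.

```python
def parse_sort_spec(spec):
    """Split a sort specification into (field, numeric, descending) keys."""
    keys = []
    for item in spec.split(","):
        parts = item.split(":")
        field = parts[0]
        kind = parts[1] if len(parts) > 1 else ""
        order = parts[2] if len(parts) > 2 else "ASC"
        keys.append((field, kind == "n", order.upper() == "DESC"))
    return keys


def sort_rows(rows, spec):
    """Sort row dictionaries by every key, the first key taking priority."""
    # Python's sort is stable, so applying keys from last to first
    # gives the first key the highest priority.
    for field, numeric, descending in reversed(parse_sort_spec(spec)):
        rows = sorted(
            rows,
            key=lambda row: float(row[field]) if numeric else str(row[field]),
            reverse=descending,
        )
    return rows


rows = [
    {"PZFlag": "FILTERED", "PZScore": "90"},
    {"PZFlag": "PASS", "PZScore": "5"},
    {"PZFlag": "PASS", "PZScore": "100"},
]
for row in sort_rows(rows, "PZFlag::DESC,PZScore:n:DESC"):
    print(row["PZFlag"], row["PZScore"])
```

Under this reading, variants flagged PASS come first (text order, descending), and within each flag the highest score leads.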

Docker

The HOWARD image provides a container that runs on CentOS and includes yum modules and other tool dependencies:

Java [1.8]
bcftools/htslib [1.12]
ANNOVAR [2019Oct24]
snpEff [5.0e]

Docker Build - Image

The Dockerfile provided with this package provides everything that is needed to build the image. The build system must have Docker installed in order to build the image.

$ cd ${HOME}/HOWARD
$ docker build -t howard:latest .

Docker Run - Container

The container host must have Docker installed in order to run the image as a container. Then the image can be pulled and a container can be started directly. Any standard Docker switches may be provided on the command line when running a container.

$ docker run howard:latest

Mount Data and Databases volumes

In order to make data and databases persistent, host volumes can be mounted. Content may also be copied directly into the running container using a docker cp ... command.

-v ${HOME}/HOWARD/data:/data
-v ${HOME}/HOWARD/databases:/databases
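When HOWARD is launched from a script, the two mounts can be assembled programmatically. The Python sketch below only builds and prints the command line. The helper name `docker_run_command` and the `/home/user/HOWARD` path are illustrative assumptions, while the image tag and container paths are those used in these notes.

```python
import shlex
from pathlib import PurePosixPath


def docker_run_command(host_root, howard_args, image="howard:latest"):
    """Build a 'docker run' argument list with data and databases mounted."""
    root = PurePosixPath(host_root)
    return [
        "docker", "run", "--rm",
        "-v", f"{root / 'data'}:/data",            # host data folder
        "-v", f"{root / 'databases'}:/databases",  # host databases folder
        image,
        *howard_args,
    ]


cmd = docker_run_command(
    "/home/user/HOWARD",
    ["--input=/tool/docs/example.vcf", "--output=/data/example.howard.tsv"],
)
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be passed as is to `subprocess.run` on a host where Docker is installed.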

Run as a terminal

In order to execute commands directly in a container, start the HOWARD container with a terminal interface:

$ docker run --name howard --entrypoint=bash -ti howard:latest

Example

Run HOWARD as a single command.

$ docker run --rm -v ${HOME}/HOWARD/data:/data -v ${HOME}/HOWARD/databases:/databases howard:latest --input=/tool/docs/example.vcf --output=/data/example.howard.tsv --annotation=snpeff,hgvs,symbol,outcome,location,CLINVAR,CLINVAR_CLNDN,COSMIC --calculation=VAF,BARCODE,NOMEN --prioritization=SOMATIC --translation=TSV --fields=NOMEN,PZFlag,PZScore,ALL --sort=PZFlag::DESC,PZScore:n:DESC

Database download

Databases are downloaded automatically by using the annotation configuration file, or options on the command line (--annovar_databases, --snpeff_databases, assembly...).

Use a VCF file, such as the HOWARD VCF example, to download ANNOVAR and snpEff databases (WITHOUT multithreading; "ALL" for all databases, "core" for core databases, "snpeff" for the snpEff database, or a list of databases or ANNOVAR codes). Run this command once for each needed set of databases and assembly (such as hg19, hg38, mm9).

$ docker run howard:latest --input=/tool/docs/example.vcf --output=/tool/docs/example.annotated.vcf --annotation=ALL,snpeff --thread=1 --verbose

Note: For home made databases, refer to config.annotation.ini file to construct and configure your own database.

Note: Beware of proxy configuration!


12 Apr 23:38

bioinfo-chru-strasbourg

0.9.15.4

dfba316

HOWARD 0.9.15.4

HOWARD

HOWARD annotates and prioritizes genetic variations, calculates and normalizes annotations, translates vcf format and generates variants statistics.

HOWARD generates statistics files with a specific algorithm, snpEff and BCFTOOLS.

HOWARD is multithreaded through the number of variants and by database (data-scaling).

Getting Started

In order to build, set up and create a persistent CLI (running container), the docker-compose command builds images and launches services as containers.

$ docker-compose up

A Command Line Interface container (HOWARD-CLI) is started with host data and databases folders mounted. Execute a command, or connect to the CLI as a terminal, and let's start with HOWARD!

Using a HOWARD VCF example, this command:

1- annotates with HGVS (variation identification), outcome and location (functional annotation), and clinical databases (ClinVar and COSMIC),
2- calculates the Variant Allele Frequency (VAF), a genotype barcode (BARCODE), and processes HGVS to extract NOMEN information,
3- prioritizes variations according to prioritization rules specific to somatic focus (quality, functional and clinical annotations),
4- translates into TSV format, with a specific field order for the first 3 columns (ALL for the rest), and a sorting to focus on interesting variations (flagged as PASS, with the best score),
5- generates the final file in the host data folder (e.g. ${HOME}/HOWARD/data/example.howard.tsv)

$ docker exec HOWARD-CLI HOWARD --input=/tool/docs/example.vcf --output=/data/example.howard.tsv --annotation=hgvs,symbol,outcome,location,CLINVAR,CLINVAR_CLNDN,COSMIC --calculation=VAF,BARCODE,NOMEN --prioritization=SOMATIC --translation=TSV --fields=NOMEN,PZFlag,PZScore,ALL --sort=PZFlag::DESC,PZScore:n:DESC

$ docker exec -ti HOWARD-CLI bash
[data]# HOWARD --help

Docker

The HOWARD image provides a container that runs on CentOS and includes yum modules and other tool dependencies:

Java [1.8]
bcftools/htslib [1.12]
ANNOVAR [2019Oct24]
snpEff [5.0e]

Docker Build - Image

The Dockerfile provided with this package provides everything that is needed to build the image. The build system must have Docker installed in order to build the image.

$ cd ${HOME}/HOWARD
$ docker build -t howard:latest .

Docker Run - Container

$ docker run howard:latest

Mount Data and Databases volumes

In order to make data and databases persistent, host volumes can be mounted. Content may also be copied directly into the running container using a docker cp ... command.

-v ${HOME}/HOWARD/data:/data
-v ${HOME}/HOWARD/databases:/databases

Run as a terminal

In order to execute commands directly in a container, start the HOWARD container with a terminal interface:

$ docker run --name howard --entrypoint=bash -ti howard:latest

Example

Run HOWARD as a single command.

$ docker run --rm -v ${HOME}/HOWARD/data:/data -v ${HOME}/HOWARD/databases:/databases howard:latest --input=/tool/docs/example.vcf --output=/data/example.howard.tsv --annotation=hgvs,symbol,outcome,location,CLINVAR,CLINVAR_CLNDN,COSMIC --calculation=VAF,BARCODE,NOMEN --prioritization=SOMATIC --translation=TSV --fields=NOMEN,PZFlag,PZScore,ALL --sort=PZFlag::DESC,PZScore:n:DESC

Database download

Databases are downloaded automatically by using the annotation configuration file, or options on the command line (--annovar_databases, --snpeff_databases, assembly...).

$ docker run howard:latest --input=/tool/docs/example.vcf --output=/tool/docs/example.annotated.vcf --annotation=ALL,snpeff --thread=1 --verbose

Note: For home made databases, refer to config.annotation.ini file to construct and configure your own database.

Note: Beware of proxy configuration!

Releases: bioinfo-chru-strasbourg/howard