Release notes

Update to how phylo supplementary data are handled.

An optional set of local sequences can be supplied to supplement the phylogenetic analysis. To supply them to piranha, point to the correct directory using -sd,--supplementary-datadir. The sequence files should be in FASTA format, but do not need to be aligned. So that piranha can assign each sequence to the relevant phylogeny, the reference group must be annotated in the FASTA header in the format display_name=X, for example display_name=Sabin1-related.

These supplementary sequence files can be accompanied by csv metadata files (one row per supplementary sequence). This metadata can be included in the final report and annotated onto the phylogenies (-smcol/--supplementary-metadata-columns). By default, the metadata is matched to the FASTA sequence name using a column titled sequence_name, but this header name can be configured with -smid/--supplementary-metadata-id-column.

Piranha will iterate across the directory supplied and amalgamate the FASTA files, retaining any sequences with display_name=X in the header description, where X can be one of Sabin1-related, Sabin2-related, Sabin3-related or WPV1. It will then read in every csv file it detects in this directory and attempt to match any metadata to the gathered FASTA records. These records will be added to the relevant phylogenies.
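The gathering step described above can be sketched roughly as follows. This is an illustrative reading of the notes, not piranha's actual code: the function names, the *.fasta and *.csv glob patterns, and the returned dictionary layout are all assumptions made for the example.

```python
import csv
import re
from pathlib import Path

# Reference groups listed in the release notes.
VALID_GROUPS = {"Sabin1-related", "Sabin2-related", "Sabin3-related", "WPV1"}
DISPLAY_NAME = re.compile(r"display_name=(\S+)")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file (hypothetical helper)."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gather_supplementary(datadir, id_column="sequence_name"):
    """Collect annotated sequences, then attach any matching metadata rows."""
    datadir = Path(datadir)
    records = {}
    # Assumption: supplementary FASTA files use the .fasta extension.
    for fasta in sorted(datadir.glob("*.fasta")):
        for header, seq in read_fasta(fasta):
            match = DISPLAY_NAME.search(header)
            # Keep only sequences annotated with a recognised reference group.
            if match and match.group(1) in VALID_GROUPS:
                name = header.split()[0]
                records[name] = {
                    "group": match.group(1),
                    "sequence": seq,
                    "metadata": {},
                }
    for table in sorted(datadir.glob("*.csv")):
        with open(table, newline="") as handle:
            for row in csv.DictReader(handle):
                name = row.get(id_column)
                # Rows that match no gathered sequence are skipped.
                if name in records:
                    records[name]["metadata"].update(row)
    return records
```

Calling gather_supplementary("path/to/supplementary_dir", id_column="sequence_name") would return one entry per retained sequence, keyed by the FASTA sequence name.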

Update to how the local database is updated

If you supply a path to -sd,--supplementary-datadir for the phylogenetics module, you have the option of updating this data directory with the new consensus sequences generated during the piranha analysis. If you run with the -ud,--update-local-database flag, piranha will write out the new sequences and any accompanying metadata supplied into the directory provided.

The files written out will be in the format runname.today.fasta and runname.today.csv. For example, if the runname supplied is MIN001 and today's date is 2023-11-05, the files written will be:

MIN001.2023-11-05.fasta
MIN001.2023-11-05.csv

These contain the newly generated consensus sequences and accompanying metadata from that run.

Note: if you supply the supplementary directory to piranha on a subsequent run, your updated local database will be included in the phylogenetics. However, piranha will ignore any files with the same runname.today pattern as the active run. So, if your current run would produce files called MIN001.2023-11-05.fasta and MIN001.2023-11-05.csv and those files already exist in the supplementary data directory, they will be ignored. This avoids conflicts if piranha is run multiple times on the same data.
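As a minimal sketch of the naming and skipping rule described above: the function names below are hypothetical, and the logic is one plausible way to express the rule rather than piranha's own implementation.

```python
from datetime import date
from pathlib import Path


def run_prefix(runname, today=None):
    """Build the runname.today prefix, e.g. MIN001.2023-11-05."""
    today = today or date.today()
    return f"{runname}.{today.isoformat()}"


def usable_database_files(datadir, runname, today=None):
    """List database files, skipping those the active run would write itself."""
    prefix = run_prefix(runname, today)
    keep = []
    for path in sorted(Path(datadir).iterdir()):
        # Assumption: only .fasta and .csv files belong to the local database.
        if path.suffix not in {".fasta", ".csv"}:
            continue
        # Files matching the active run's runname.today pattern are ignored.
        if path.stem == prefix:
            continue
        keep.append(path)
    return keep
```

With a runname of MIN001 on 2023-11-05, any existing MIN001.2023-11-05.fasta and MIN001.2023-11-05.csv would be left out, while files from other runs or dates would still be returned.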

Piranha now runs on EPI2ME
Phylo pipeline added to GitHub Actions

02 Nov 09:50

aineniamh

1.1.2

ecdca9b

piranha v1.1.2

Release notes

Added a temp output file that describes the treatment of every read (mapped, unmapped, filtered, and why it was filtered)
aln_block_len is now configurable (default is still 0.6 of the min read length)
Logos in the report are now clickable and link to the website and to the repo
Identified a bug in config parsing where newer arguments were being ignored. Parsing now uses Python's globals() to find the arguments defined in config, rather than an explicit set of arguments that needs maintaining
Two new lang dict keys for info on piranha and how to cite it in the report (currently biorxiv link)
show only flagged seqs table in report if there are flagged seqs
Colour by whether cns could be generated (issue #131)
Removed the English- and French-specific reports
Update local database flag added, but supplementary sequences need to be supplied
Display name added to local db, but think this should change (legacy from rampart)
Fixing positive and negative parsing when using config.yaml
Updating example reports on the website to contain phylo and configuration table
fixing metadata merging bug between barcodes.csv and supplementary_metadata.csv
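The config parsing fix noted in the list above (discovering valid arguments through globals() instead of a hand-maintained set) could look something like the sketch below. The KEY_ naming convention, the key names and both functions are assumptions for illustration; the notes do not describe how piranha's config module is actually laid out.

```python
# Hypothetical module-level key definitions; adding a new KEY_ constant is
# enough for the parser below to accept the corresponding argument.
KEY_RUNNAME = "runname"
KEY_MIN_READ_LENGTH = "min_read_length"
KEY_ALN_BLOCK_LEN = "aln_block_len"


def valid_config_keys(namespace=None):
    """Collect every string constant whose name starts with KEY_."""
    namespace = globals() if namespace is None else namespace
    return {
        value
        for name, value in namespace.items()
        if name.startswith("KEY_") and isinstance(value, str)
    }


def parse_config(raw):
    """Split a raw config mapping into accepted and unrecognised entries."""
    valid = valid_config_keys()
    accepted = {key: value for key, value in raw.items() if key in valid}
    ignored = sorted(set(raw) - valid)
    return accepted, ignored
```

The point of this design is that the set of accepted arguments is derived from the definitions themselves, so a newly added argument cannot be silently dropped because someone forgot to update a separate list.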

24 Oct 16:13

aineniamh

1.1.1

826c743

piranha v1.1.1

Release notes

Fix for issue #162, which was introduced with the update to the fasta header format
Big alteration to fasta header formatting for issue #159
The consensus fasta header format now includes a record ID and a record description set of fields. The ID always has the same set of fields, which are left empty if not provided. The description is dynamic, and can have extra info included with the --all-metadata-to-header flag

>SAMPLE|REFERENCE_GROUP|CNS_ID|EPID|SAMPLE_DATE barcode=barcode01 variant_count=8 variants=17:CT;161:CT;427:GA;497:AC;507:CT;772:AG;822:CT;870:CA reference=Poliovirus3-Sabin_AY184221

Where the ID is: SAMPLE|REFERENCE_GROUP|CNS_ID|EPID|SAMPLE_DATE
And the description is: barcode=barcode01 variant_count=8 variants=17:CT;161:CT;427:GA;497:AC;507:CT;772:AG;822:CT;870:CA reference=Poliovirus3-Sabin_AY184221
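To illustrate the ID and description split shown above, here is a small parsing sketch. The function name and the lowercase field names are assumptions made for the example; only the pipe-separated ID layout and the key=value description come from the header shown in these notes.

```python
# Field order taken from the ID layout shown in the notes.
ID_FIELDS = ("sample", "reference_group", "cns_id", "epid", "sample_date")


def parse_cns_header(header):
    """Split a consensus FASTA header into ID fields and description pairs."""
    header = header.lstrip(">").strip()
    record_id, _, description = header.partition(" ")
    values = record_id.split("|")
    # Pad so that missing trailing fields come back as empty strings.
    values += [""] * (len(ID_FIELDS) - len(values))
    id_fields = dict(zip(ID_FIELDS, values))
    info = {}
    for token in description.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            info[key] = value
    return id_fields, info
```

For the example header above, info["variants"].split(";") would give the individual variant strings, and an ID such as S1|Sabin1-related|CNS1|| would yield empty epid and sample_date fields.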

For issue #156, the phylogeny can now also be coloured by call

20 Oct 15:48

aineniamh

1.1

253608e

piranha v1.1

Release notes

Piranha now has a phylogenetics module that can be triggered with the -rp/--run-phylogenetics flag. This module allows the user to generate an alignment (mafft) per reference group detected within a given sequencing run and estimate a maximum likelihood phylogenetic tree (iqtree2).
The alignment will contain any consensus VP1 sequences generated for a given reference group and the appropriate reference (i.e. Sabin2 if a VDPV2 set of sequences).
The user can supply an additional set of sequences, optionally with associated metadata, that could be a local FASTA database with previously generated sequences. These sequences are added to the respective alignment, and then phylogeny, enabling the user to immediately and easily compare within the current sequencing run and across previous sequencing runs.
To ensure each phylogeny is rooted correctly, an appropriate outgroup is included in the alignment and tree building (the respective Sabin sequence of a given reference group), and is then pruned off the phylogeny afterwards.
The phylogeny can be annotated with metadata supplied in the barcodes.csv file.
A configuration table in the report now lists the config settings of the piranha run.
Some code refactoring also included in this release:
- Cleaning up the command.py file and porting the snakemake api calls to a function
- There is now only a single report template and a single barcode report template. This allows easier maintenance, and the words in the reports are substituted in from the relevant language dictionary. This has led to some rephrasing, so that templating can be consistent between the two languages.
Positive control added to ref db
Two new dependencies introduced for the phylo pipeline: iqtree and jclusterfunk
Switching default cns header output to not include the variant string. If the `--add-all-metadata-to-header` flag is used, it will show up then instead.
The field for number of mutations in the cns fasta header is now blank if the reference is not a Sabin, as the number of mutations relative to a general reference isn't really relevant.

18 Aug 15:53

aineniamh

1.0.13

f99ac45

piranha v1.0.13

Release notes

Resolved feature request #143 to switch default plate orientation
Resolved issue #141 to only link to the barcode report if it exists
Resolved #139, allowing missing barcodes that don't have sample name
Adding qc for well entry to make sure correct format and non-duplicate well names
Updates to report plate viz colouring and sorting
For plate viz, now showing distinction between present samples and non-present samples (#139)
Allowing multiple negative and positive controls (#88 and #138). They still need to have distinct names; flag them on the command line or in the config.yaml file
Modifying the minimap2 parameters to catch divergent VDPV sequences (now up to 20% divergence with -x asm20, but this will include the noise of the read as well)
Note that macOS actions have been removed in this release, as issues with mamba installation on GitHub Actions were breaking the macOS tests. Ubuntu tests still passing.
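The well entry QC mentioned in the list above (correct format and non-duplicate well names) might be sketched as below. The expected well format is not stated in these notes, so the 96-well pattern used here (a row letter A to H followed by a column number 1 to 12, with or without zero padding) is purely an assumption, as is the function name.

```python
import re
from collections import Counter

# Assumed format: 96-well plate names such as A01, A1 or H12.
WELL_PATTERN = re.compile(r"^[A-H](0?[1-9]|1[0-2])$")


def check_wells(wells):
    """Return (malformed, duplicated) well names from a list of entries."""
    malformed = [well for well in wells if not WELL_PATTERN.match(well)]
    counts = Counter(wells)
    duplicated = sorted(well for well, n in counts.items() if n > 1)
    return malformed, duplicated
```

A run would then be stopped, or a warning raised, if either list is non-empty. Note that this sketch compares names literally, so A1 and A01 are not treated as duplicates of each other.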