Report alleles (#4)
* Add v0.2 of pipeline

* Use new version of screen_abricate_report and write alleles to line list

* Add updated images and prepare for v0.2.0 release

* Write CHANGELOG.md
dfornika authored Jan 3, 2020
1 parent de64703 commit 7fa8b54
Showing 15 changed files with 125 additions and 98 deletions.
19 changes: 18 additions & 1 deletion CHANGELOG.md
@@ -1,3 +1,20 @@
# 0.2.0

- For each gene in the 'gene screening file', both the 'detection status' (True/False) and the detected allele
are written to the IRIDA line-list. Alleles are also reported in the 'gene_detection_status.tsv' output file.
- Limited pipeline parameters to simplify and standardize operation
  - Resistance gene database is fixed to the CARD database.
  - Cannot disable post-assembly correction or read trimming.
  - Cannot change contig name format
  - Cannot provide 'extra spades options' to shovill assembler
- Added thresholds for resistance gene %Coverage and %Identity during secondary screening phase
- The gene screening file used for the analysis is included in the pipeline output
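
The %Coverage and %Identity thresholds mentioned above can be pictured with a minimal Java sketch. The `90.0` defaults are taken from `irida_workflow.xml` in this commit; the class name, method name, and the inclusive (`>=`) comparison are assumptions made for illustration, and the actual filtering happens inside the `screen_abricate_report` Galaxy tool, not in this plugin.

```java
public class ThresholdSketch {

    // Defaults as set in irida_workflow.xml (min_coverage and min_identity).
    static final double MIN_COVERAGE = 90.0;
    static final double MIN_IDENTITY = 90.0;

    // Assumption: a hit is kept when both values meet or exceed the thresholds.
    static boolean passes(double percentCoverage, double percentIdentity) {
        return percentCoverage >= MIN_COVERAGE && percentIdentity >= MIN_IDENTITY;
    }

    public static void main(String[] args) {
        // Hypothetical hits, for illustration only.
        System.out.println(passes(100.0, 99.8)); // both above threshold
        System.out.println(passes(85.0, 99.8));  // coverage too low
        System.out.println(passes(95.0, 80.0));  // identity too low
    }
}
```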

# 0.1.1

- Fixed [issue](https://github.com/Public-Health-Bioinformatics/irida-plugin-resistance-screen/issues/1) where sequence data was not being transferred to galaxy

# 0.1.0

* Initial release of example plugin.
- Initial release of example plugin.

17 changes: 9 additions & 8 deletions README.md
@@ -36,7 +36,7 @@ managers within your Galaxy instance. These can be found at:
| shovill | `1.0.4` | `iuc` | 3 (2018-11-13) | [shovill-3:865119fcb694](https://toolshed.g2.bx.psu.edu/view/iuc/shovill/865119fcb694) |
| quast | `5.0.2` | `iuc` | 5 (2018-12-04) | [quast-5:81df4950d65b](https://toolshed.g2.bx.psu.edu/view/iuc/quast/81df4950d65b) |
| abricate | `0.9.8` | `iuc` | 7 (2019-10-29) | [abricate-7:4efdca267d51](https://toolshed.g2.bx.psu.edu/view/iuc/abricate/4efdca267d51) |
| screen_abricate_report | `0.1.0` | `public-health-bioinformatics` | 0 (2019-10-31) | [screen_abricate_report-0:b2d56a44a872](https://toolshed.g2.bx.psu.edu/view/public-health-bioinformatics/screen_abricate_report/b2d56a44a872) |
| screen_abricate_report | `0.4.0` | `public-health-bioinformatics` | 4 (2020-01-02) | [screen_abricate_report-4:22247b1a59d5](https://toolshed.g2.bx.psu.edu/view/public-health-bioinformatics/screen_abricate_report/22247b1a59d5) |
| data_manager_manual | `0.0.2` | `iuc` | 5 (2019-10-21) | [data_manager_manual-5:744f607fac50](https://toolshed.g2.bx.psu.edu/view/iuc/data_manager_manual/744f607fac50) |

## Installing to IRIDA
@@ -49,13 +49,13 @@ Please download the provided `irida-plugin-resistance-screen-[version].jar` from
## Setting up your abricate report screening file(s)

Abricate report screening files have a simple tabular format, and can be created with Excel, another spreadsheet application,
or a plaintext editor. They consist of two columns, with headings `gene_name` and `regex`. All fields should be tab-delimited.
or a plaintext editor. They consist of two columns, with headings `gene_name` and `regex`. All fields must be tab-delimited.

```
gene_name regex
KPC KPC
OXA-48 OXA\-48
NDM NDM
KPC ^KPC-\d+$
OXA ^OXA-\d+$
NDM ^NDM-\d+$
```
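The difference between the loose patterns (`KPC`) and the anchored ones (`^KPC-\d+$`) can be sketched in Java as follows. This is only an illustration: the class and method names are hypothetical, the gene names are made up, and it assumes the screening tool searches for the pattern anywhere in the gene name, which may not be how `screen_abricate_report` applies it.

```java
import java.util.regex.Pattern;

public class ScreeningRegexSketch {

    // Returns true if the pattern is found anywhere in the gene name.
    // Assumption: "search" semantics; the real tool may match differently.
    static boolean matches(String regex, String geneName) {
        return Pattern.compile(regex).matcher(geneName).find();
    }

    public static void main(String[] args) {
        // Hypothetical gene names, for illustration only.
        String[] reported = {"KPC-2", "KPC-2-like", "OXA-48", "NDM-1"};
        for (String gene : reported) {
            System.out.printf("%-12s loose=%b anchored=%b%n",
                    gene,
                    matches("KPC", gene),          // unanchored pattern
                    matches("^KPC-\\d+$", gene));  // anchored pattern with allele number
        }
    }
}
```

Under that assumption, the anchored pattern accepts only names of the form `KPC-<number>`, which is what makes the matched name usable as an allele.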

## Preparing the 'abricate_report_screening_files' Tool Data Table in Galaxy
@@ -86,9 +86,10 @@ report, and a screened `abricate` report that includes only your genes of intere
And, you should be able to save and view these results in the IRIDA metadata table. The following fields are written to
the IRIDA 'Line List':

| Field Name | Description |
|--------------------------------------------|-----------------------------------------------------------|
| resistance-screen/<GENE_NAME>/detected | Whether or not `GENE_NAME` was detected (True/False) |
| Field Name | Description |
|--------------------------------------------|------------------------------------------------------------------------------------------------------------|
| resistance-screen/<GENE_NAME>/detected | Whether or not `GENE_NAME` was detected (True/False) |
| resistance-screen/<GENE_NAME>/alleles | Any allele(s) detected for `GENE_NAME`. If multiple alleles detected, comma-delimited (eg: `KPC-2,KPC-3`)  |
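
As a rough sketch of how the two field names in the table are formed for one gene, the Java below builds both keys from a workflow name and a gene name. It mirrors the string concatenation in this commit's updater code, but uses plain strings instead of `PipelineProvidedMetadataEntry`; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LineListKeySketch {

    // Builds the two line-list fields for a single gene.
    // Plain String values stand in for IRIDA's metadata entry objects.
    static Map<String, String> lineListFields(String workflowName, String geneName,
                                              String detected, String alleles) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put(workflowName + "/" + geneName + "/detected", detected);
        fields.put(workflowName + "/" + geneName + "/alleles", alleles);
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields =
                lineListFields("resistance-screen", "KPC", "True", "KPC-2,KPC-3");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            System.out.println(field.getKey() + " = " + field.getValue());
        }
    }
}
```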

**Note**: If your abricate report screening file contains many genes, this will result in many columns

Binary file modified doc/images/pipeline-parameters.png
Binary file modified doc/images/plugin-metadata.png
Binary file modified doc/images/plugin-results-1.png
Binary file modified doc/images/plugin-results-2.png
Binary file modified doc/images/plugin-results-3.png
4 changes: 2 additions & 2 deletions pom.xml
@@ -6,14 +6,14 @@

<groupId>org.publichealthbioinformatics</groupId>
<artifactId>irida-plugin-resistance-screen</artifactId>
<version>0.2.0-SNAPSHOT</version>
<version>0.2.0</version>

<!-- Please fill out these properties with information about your particular plugin -->
<properties>
<!-- Information used to define properties about a plugin. Please see the PF4J docs for more details https://pf4j.org/doc/getting-started.html -->
<plugin.id>resistance-screen</plugin.id>
<plugin.class>org.publichealthbioinformatics.irida.plugin.resistancescreen.ResistanceScreenPlugin</plugin.class>
<plugin.version>0.2.0-SNAPSHOT</plugin.version>
<plugin.version>0.2.0</plugin.version>
<plugin.provider>Dan Fornika</plugin.provider>
<plugin.dependencies></plugin.dependencies>
<plugin.requires.runtime>1.0.0</plugin.requires.runtime>
@@ -96,11 +96,15 @@ public void update(Collection<Sample> samples, AnalysisSubmission analysis) thro
for (Map<String, String> geneDetectionStatus : geneDetectionStatuses) {
String geneName = geneDetectionStatus.get("gene_name");
String geneDetected = geneDetectionStatus.get("detected");
PipelineProvidedMetadataEntry geneDetectedEntry = new PipelineProvidedMetadataEntry(geneDetected, "boolean", analysis);

// key will be string like 'abricate-screen/KPC/detected'
String key = workflowName + "/" + geneName + "/" + "detected";
String alleles = geneDetectionStatus.get("alleles");
PipelineProvidedMetadataEntry geneDetectedEntry = new PipelineProvidedMetadataEntry(geneDetected, "xs:boolean", analysis);
PipelineProvidedMetadataEntry allelesEntry = new PipelineProvidedMetadataEntry(alleles, "xs:string", analysis);
// key will be string like 'resistance-screen/KPC/detected'
String key;
key = workflowName + "/" + geneName + "/" + "detected";
metadataEntries.put(key, geneDetectedEntry);
key = workflowName + "/" + geneName + "/" + "alleles";
metadataEntries.put(key, allelesEntry);
}

Map<MetadataTemplateField, MetadataEntry> metadataMap = metadataTemplateService.getMetadataMap(metadataEntries);
@@ -125,8 +129,8 @@ public void update(Collection<Sample> samples, AnalysisSubmission analysis) thro
* the pipeline. This file should contain contents like:
*
* <pre>
* gene_name detected
* KPC True
* gene_name detected alleles
* KPC True KPC-2
* OXA False
* </pre>
*
@@ -147,7 +151,7 @@ List<Map<String, String>> parseGeneDetectionStatusFile(Path geneDetectionStatusF
HashMap<String, String> geneDetectionStatus = new HashMap<>();
String geneDetectionStatusLine;
while (( geneDetectionStatusLine = geneDetectionStatusReader.readLine()) != null) {
String[] geneDetectionStatusEntries = geneDetectionStatusLine.split("\t");
String[] geneDetectionStatusEntries = geneDetectionStatusLine.split("\t", -1);
for (int i = 0; i < fieldNames.length; i++) {
geneDetectionStatus.put(fieldNames[i], geneDetectionStatusEntries[i]);
}
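The switch from `split("\t")` to `split("\t", -1)` matters when the last column is empty. The sketch below shows the standard `String.split` behaviour on a row whose `alleles` field is blank; it assumes such rows end with a trailing tab, which the hunk itself does not show, and the class and method names are hypothetical.

```java
public class TrailingFieldSketch {

    // Counts the fields produced by splitting a row on tabs with the given limit.
    // A limit of 0 is equivalent to the one-argument split("\t").
    static int fieldCount(String row, int limit) {
        return row.split("\t", limit).length;
    }

    public static void main(String[] args) {
        // Assumed shape of a row with no detected alleles: trailing tab, empty last field.
        String row = "OXA\tFalse\t";

        // Default behaviour drops trailing empty strings, leaving two fields.
        System.out.println(fieldCount(row, 0));

        // A negative limit keeps the trailing empty string, leaving three fields.
        System.out.println(fieldCount(row, -1));
    }
}
```

With only two entries and three field names, indexing `geneDetectionStatusEntries[i]` for the third column would throw an `ArrayIndexOutOfBoundsException`, which is the failure the `-1` limit avoids.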
36 changes: 11 additions & 25 deletions src/main/resources/workflows/0.2.0/irida_workflow.xml
@@ -2,51 +2,36 @@
<iridaWorkflow>
<id>1f40bbf2-4080-4cf7-a846-988e602eaa30</id>
<name>resistance-screen</name>
<version>0.1.1</version>
<version>0.2.0</version>
<analysisType>RESISTANCE_SCREEN</analysisType>
<inputs>
<sequenceReadsPaired>sequence_reads_paired</sequenceReadsPaired>
<requiresSingleSample>true</requiresSingleSample>
</inputs>
<parameters>
<parameter name="shovill-1-adv.nocorr" defaultValue="true">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.nocorr" label="Disable post-assembly correction" type="boolean"/>
</parameter>
<parameter name="shovill-1-adv.mincov" defaultValue="2">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.mincov" label="Minimum contig coverage" type="integer"/>
</parameter>
<parameter name="shovill-1-trim" defaultValue="true">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="trim" label="Trim reads" type="boolean"/>
</parameter>
<parameter name="shovill-1-adv.namefmt" defaultValue="contig%05d">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.namefmt" label="Contig name format" type="text"/>
</parameter>
<parameter name="shovill-1-adv.depth" defaultValue="100">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.depth" label="Depth" type="integer"/>
</parameter>
<parameter name="shovill-1-adv.gsize" defaultValue="">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.gsize" label="Estimated genome size" type="text"/>
</parameter>
<parameter name="shovill-1-adv.opts" defaultValue="">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.opts" label="Extra SPAdes options" type="text"/>
</parameter>
<parameter name="shovill-1-adv.minlen" defaultValue="0">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="adv.minlen" label="Minimum contig length" type="integer"/>
</parameter>
<parameter name="shovill-1-assembler" defaultValue="spades">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/shovill/shovill/1.0.4" parameterName="assembler" label="Assembler to use" type="select"/>
</parameter>
<parameter name="abricate-2-adv.db" defaultValue="card">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/abricate/abricate/0.9.8" parameterName="adv.db"/>
</parameter>
<parameter name="abricate-2-adv.min_dna_id" defaultValue="75.0">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/iuc/abricate/abricate/0.9.8" parameterName="adv.min_dna_id"/>
</parameter>
<parameter name="screen_abricate_report-3-abricate_screening_file" required="true">
<parameter name="screen_abricate_report-4-abricate_screening_file" required="true">
<dynamicSource>
<galaxyToolDataTable name="abricate_report_screening_files" displayColumn="name" parameterColumn="value" />
</dynamicSource>
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/public-health-bioinformatics/screen_abricate_report/screen_abricate_report/0.1.0" parameterName="screening_file"/>
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/public-health-bioinformatics/screen_abricate_report/screen_abricate_report/0.4.0+galaxy0" parameterName="screening_file_source.screening_file"/>
</parameter>
<parameter name="screen_abricate_report-4-min_coverage" defaultValue="90.0">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/public-health-bioinformatics/screen_abricate_report/screen_abricate_report/0.4.0+galaxy0" parameterName="min_coverage"/>
</parameter>
<parameter name="screen_abricate_report-4-min_identity" defaultValue="90.0">
<toolParameter toolId="toolshed.g2.bx.psu.edu/repos/public-health-bioinformatics/screen_abricate_report/screen_abricate_report/0.4.0+galaxy0" parameterName="min_identity"/>
</parameter>
</parameters>
<outputs>
@@ -55,6 +40,7 @@
<output name="abricate_report_full" fileName="abricate_report_full.tsv"/>
<output name="quast" fileName="quast.tsv"/>
<output name="assembly" fileName="assembly.fasta"/>
<output name="abricate_report_screening_file" fileName="abricate_report_screening_file.tsv"/>
</outputs>
<toolRepositories>
<repository>
@@ -79,7 +65,7 @@
<name>screen_abricate_report</name>
<owner>public-health-bioinformatics</owner>
<url>https://toolshed.g2.bx.psu.edu</url>
<revision>b2d56a44a872</revision>
<revision>22247b1a59d5</revision>
</repository>
<repository>
<name>data_manager_manual</name>