bmatern-nmdp edited this page Sep 25, 2015 · 1 revision

MIRING Rules

MIRING rules are based on 8 general elements:

MIRING Element 1 - Message Annotation

  • Message Generator Contact Information & Document Identification

MIRING Element 2 - Reference Context

  • Reference Sequences and Databases

MIRING Element 3 - Full Genotype

  • GLStrings and Typings

MIRING Element 4 - Consensus Sequence

  • Gene-length sequences

MIRING Element 5 - Novel Polymorphisms

  • Sequence variants relative to the reference sequence

MIRING Element 6 - Platform Documentation

  • Citations to reference each instrument and method

MIRING Element 7 - Read Processing Documentation

  • References for processing methodologies and software

MIRING Element 8 - Primary Data

  • References to raw NGS data in a public database

This validator attempts to verify that MIRING elements are represented in an HML document. To accomplish this goal, the elements are distilled into actionable rules that can be enforced on data in HML format.
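As an illustration of what "actionable rule" can mean here, the following minimal sketch expresses rules 1.1.a and 1.2.a from the list below (exactly one hmlid node and exactly one reporting-center node directly under the root hml node) as checks on a parsed document. It is not the validator's actual implementation: the function names, the violation dictionary layout, and the sample document are all hypothetical, and XML namespaces are simply stripped rather than verified.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_exactly_one_child(root, child_name, rule_id, severity):
    """Return a list of violations (empty when the rule passes).

    Only direct children of `root` are counted, matching the
    "directly under the root hml node" wording of the rules.
    """
    count = sum(1 for child in root if local_name(child.tag) == child_name)
    if count == 1:
        return []
    return [{
        "rule": rule_id,
        "severity": severity,
        "message": f"expected exactly one {child_name} node under "
                   f"{local_name(root.tag)}, found {count}",
    }]


def validate_message_annotation(hml_text):
    """Apply rules 1.1.a and 1.2.a to an HML document given as a string."""
    root = ET.fromstring(hml_text)
    violations = []
    violations += check_exactly_one_child(root, "hmlid", "1.1.a", "miring")
    violations += check_exactly_one_child(
        root, "reporting-center", "1.2.a", "miring")
    return violations


# Hypothetical, heavily simplified document: it has an hmlid node
# but no reporting-center node, so rule 1.2.a is violated.
SAMPLE = """<hml>
  <hmlid root="1.2.3" extension="abc"/>
</hml>"""

for v in validate_message_annotation(SAMPLE):
    print(v["rule"], v["severity"], v["message"])
```

Returning a list of violations instead of raising on the first failure lets a single pass report every broken rule together with its severity.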

Rule List

| MIRING Element | MIRING Description | Rule ID | Severity | Rule Description |
| --- | --- | --- | --- | --- |
| 1 | Message Annotation | | | |
| 1.1 | Unique MIRING Message Identifier | 1.1.a | miring | Document has exactly one hmlid node directly under the root hml node. |
| | | 1.1.c | info | If hmlid is an OID, the hmlid node should have a root and an extension attribute. |
| 1.2 | Message Generator Contact Information | 1.2.a | miring | Document has exactly one reporting-center node directly under the root hml node. |
| | | 1.2.b | warning | reporting-center has one reporting-center-id and one reporting-center-context attribute. |
| 1.3 | Platform Documentation (MIRING element 6) Reference | 1.3.a | miring | Every sso, ssp, sbt-sanger, and sbt-ngs node should have test-id and test-id-source attributes. |
| | | 1.3.b | info | test-id and test-id-source are in a recognized format (NCBI-GTR). |
| 1.4 | Read Processing Documentation (MIRING element 7) Reference | | | |
| 1.5 | Primary Data Availability | 1.5.a | miring | Every sbt-ngs node has at least one raw-reads node underneath. |
| | | 1.5.b | miring | Every raw-reads node has an availability attribute, a string: public, private, or permission. |
| 2 | Reference Context | | | |
| 2.1 | Reference Sequence Database Version for Allele Calling | 2.1.a | warning | Every typing node must have at least one allele-assignment node. |
| | | 2.1.b | miring | Every allele-assignment must have an allele-db attribute. |
| | | 2.1.c | miring | Every allele-assignment must have an allele-version attribute. |
| 2.2 | Individual Reference Sequences Applied | 2.2.a | | Every reference-database has at least one reference-sequence node. |
| | | 2.2.b | miring, warning | Reference sequence should have id, name, start, end, accession, and URI attributes. |
| | | 2.2.c | miring | end >= start |
| 2.2.1 | Reference Sequence Identifier | 2.2.1.a | | Every reference-sequence node must have an id attribute. |
| | | 2.2.1.c | warning | Every reference-sequence id corresponds to at least one ID under a consensus-sequence-block node. |
| 2.3 | Reference Sequence Source Type | 2.3.a | | Every reference-database node has an availability attribute. |
| | | 2.3.b | miring | Every reference-database node has a curated attribute: true/false. |
| 3 | Full Genotype | | | |
| 3.1 | Pertinent Locus/Loci | 3.1.a | warning | Locus is reported either in a glstring node or as a locus attribute under the typing method node (sbt-ngs, etc.). |
| 3.2 | Formatted Genotype | 3.2.a | warning | glstring node should have either text or a URI attribute pointing to a valid GLString. |
| 3.3 | Genotype Uniform Resource Identifier (URI) | 3.3.a | | Look up the URI to acquire the GLString. It should be valid. |
| 4 | Consensus Sequence | | | |
| 4.a | Sequence Quality Node | 4.a | warning | end > start |
| | | 4.b | warning | sequence-quality:start and sequence-quality:end each fall within the parent CSB's end-start. |
| 4.1 | Consensus Sequence Block (CSB) | 4.1.a | | Every consensus-sequence node should have at least one consensus-sequence-block node. |
| 4.2 | Consensus Sequence Descriptor | 4.2.a | warning | Every consensus-sequence-block node should have a description attribute. |
| 4.2.1 | Consensus Sequence Block Identifier | 4.2.1.a | | The consensus-sequence-block nodes are sequential under the consensus-sequence node. |
| 4.2.2 | Reference Sequence Identifier | 4.2.2.a | | Each consensus-sequence-block node has a reference-sequence-id attribute. |
| 4.2.3 | Reference Sequence Coordinate | 4.2.3.a | miring | Every consensus-sequence-block node must have start and end attributes. |
| | | 4.2.3.b | miring | end >= start |
| | | 4.2.3.d | miring | CSB:start >= refseq:start && CSB:end <= refseq:end |
| | | 4.2.3.e | miring | Length of the sequence node text (trimmed) should equal end-start. |
| 4.2.4 | Phase Set | 4.2.4.a | warning | Every consensus-sequence-block node must have a phase-set attribute. |
| | | 4.2.4.b | warning | Warn if a consensus-sequence-block node has a phasing-group attribute. |
| 4.2.5 | Copy Number | 4.2.5.a | miring | Every consensus-sequence-block node must have an expected-copy-number attribute. |
| 4.2.6 | Reference Sequence Match | | | |
| 4.2.7 | Sequence Continuity | 4.2.7.a | miring | Every consensus-sequence-block node must have a continuity attribute: true/false. |
| | | 4.2.7.b | miring | If continuity is true, there must be no sequence gaps between this CSB and the preceding CSB. |
| 5 | Novel Polymorphisms | | | |
| 5.1 | Reference | 5.1.a | warning | Variant nodes can have a variant-effect node to specify effects of variants. |
| 5.2 | Position | 5.2.a | | Every variant node should have start and end attributes. |
| | | 5.2.b | miring | end >= start |
| | | 5.2.d | miring | variant:start >= refseq:start && variant:end <= refseq:end |
| 5.3 | Variant Identifier | 5.3.a | miring | Each variant node must have an ID attribute. |
| | | 5.3.b | miring | IDs are non-negative integers. |
| | | 5.3.c | miring | IDs range from 0 to n-1. |
| 5.4 | Reference Sequence | 5.4.a | | Each variant node should have a reference-bases attribute. |
| 5.5 | Variant Sequence | 5.5.a | | Each variant node should have an alternate-bases attribute. |
| 5.6 | Quality Score | 5.6.a | miring | Each variant node should have a quality-score attribute. |
| 5.7 | Quality Filter Status | 5.7.a | miring | Each variant node should have a filter attribute. |
| 5.8 | INSDC Accession Number | | | |
| 6 | Platform Documentation | | | |
| 7 | Read Processing Documentation | | | |
| 8 | Primary Data | | | |
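The coordinate rules for consensus-sequence-block nodes (4.2.3.a, 4.2.3.b, and 4.2.3.e) lend themselves to a short worked example. The sketch below is illustrative only and is not taken from the validator's source: the function name, the message strings, and the sample fragment are hypothetical, and it assumes the bases are held in a child element named sequence and that start and end are integers.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_csb_coordinates(csb):
    """Apply rules 4.2.3.a, 4.2.3.b, and 4.2.3.e to one block.

    Returns a list of problem strings; an empty list means the
    block passed all three checks.
    """
    start_text, end_text = csb.get("start"), csb.get("end")
    if start_text is None or end_text is None:
        return ["4.2.3.a: start and end attributes are required"]
    try:
        start, end = int(start_text), int(end_text)
    except ValueError:
        # Assumption: non-integer coordinates are reported under 4.2.3.a.
        return ["4.2.3.a: start and end must be integers"]

    if end < start:
        return [f"4.2.3.b: end ({end}) is less than start ({start})"]

    # Assumption: the bases live in a child element named "sequence".
    seq = next((c for c in csb if local_name(c.tag) == "sequence"), None)
    bases = (seq.text or "").strip() if seq is not None else ""
    if len(bases) != end - start:
        return [f"4.2.3.e: sequence length {len(bases)} "
                f"does not equal end-start ({end - start})"]
    return []


# Hypothetical fragment: the first block is consistent, the second
# declares 4 positions (8 to 12) but carries only 3 bases.
SAMPLE = """<consensus-sequence>
  <consensus-sequence-block start="0" end="8">
    <sequence>ACGTACGT</sequence>
  </consensus-sequence-block>
  <consensus-sequence-block start="8" end="12">
    <sequence>ACG</sequence>
  </consensus-sequence-block>
</consensus-sequence>"""

root = ET.fromstring(SAMPLE)
for block in root.iter():
    if local_name(block.tag) == "consensus-sequence-block":
        for problem in check_csb_coordinates(block):
            print(problem)
```

The checks run in order and stop at the first failure for a block, because the length comparison in 4.2.3.e is only meaningful once start and end are present and correctly ordered.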
