Commit 5125b18

adaptation from issues #13 #14
1 parent a93387d commit 5125b18

1 file changed: +19 -15 lines changed
format/definition.md

@@ -21,25 +21,29 @@ Note: Keep in mind this is for the output of a pipeline, so we know there will b
 Please add description for each columnd/attribute
 
 * header:
-* database: `##source-ontology LINK TO DATABASE` include version
-* commands used to generate the file. At least information about adapter removal and filtering
+* database: `##source-ontology LINK TO DATABASE` include version and link
+* commands used to generate the file. At least information about adapter removal, filtering, aligner, mirna tool. All of them starting like: `## CMD: `
 * genome version used (maybe try to get from BAM file if GFF3 generated from it)
-* sample names used in attribute:Expression
-* column1: seqID:
-* column2: source: databases used for the annotation (miRBase, mirDBgene,tRNA...etc): https://github.com/miRTop/incubator/issues/13
-* column3: type: ref_miRNA, isomiRs: https://github.com/miRTop/incubator/issues/13 (SO:0002166 ref_miRNA and SO:0002167)
-* column4/5: start/end: question about precursor position or genomic position?
+* sample names used in attribute:Expression: `## colData:` separated by spaces
+* small RNA GFF version `## version: 0.9`
+* column1: seqID: precursor name
+* column2: source: databases (lower case) used for the annotation (miRBase, mirDBgene,tRNA...etc): https://github.com/miRTop/incubator/issues/13. With the version number after `_` character: `mirbase_21`
+* column3: type: `ref_miRNA, isomiR`: https://github.com/miRTop/incubator/issues/13 (SO:0002166 ref_miRNA and SO:0002167 isomiR)
+* column4/5: start/end: precursor start/end as indicated by alignment tool
 * column6: score:
-* column7: strand:
+* column7: strand
 * column8: phase: (For features of type "CDS", the phase indicates where the feature begins with reference to the reading frame)
-* column9: attributes
-* ID: unique ID based on sequence like mintmap has for tRNA: prefix-22-BZBZOS4Y1 (https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates). good way to use it as cross-mapper ID between different naming or future changes.
-* Name:
+* column9: attributes:
+* ID: unique ID based on sequence like mintmap has for tRNA: prefix-22-BZBZOS4Y1 (https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates). good way to use it as cross-mapper ID between different naming or future changes. The tool will implement this, so an API can be used to fill this field.
+* Name: mature name
 * Parent: hairpin precursor name
-* Alias: get names from miRBase/miRgeneDB
-* Expression: raw counts separated by `,`
-* Filter: PASS or REJECT (this allow to keep all the data and select the one you really want to conside as valid features)
-
+* Variant: categorical types: iso_5p, iso_3p, iso_snp(_seed/_central_supp), iso_add (adapted from isomiR-SEA)
+* Cigar: CIGAR string as indicated here: []
+* Alias: get names from miRBase/miRgeneDB or other database separated by `,`
+* Genomic: positions on the genome in the following format: `chr:start-end,chr:start-end`
+* Expression: raw counts separated by `,`. It should be in the same order than `colData` in the header.
+* Filter: PASS or REJECT (this allow to keep all the data and select the one you really want to conside as valid features). PASS can have subclases: `PASS:te`: meaning the sequence pass but the tools consider variants showed here are not trusted. REJECT can go with any short word explaining why it was rejected: `REJECT:lowcounts`. In this case the sequence will be skipped for data mining of the file when quering counts or summarize miRNA expression.
+* Seed_fam: in the format of 2-8 nts and reference miRNA sharing the seed. Usefull to go for pre-computed target predictions: `ATGCTGT:mir34a_5p`
 
 **API**
 
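To make the layout introduced in this hunk concrete, here is a minimal Python sketch of how a reader might consume such a file. It is not the project's API. It assumes the usual GFF3 conventions that the definition does not spell out (tab-separated columns, `key=value` attributes joined by `;`). The function names, the sample names, the command in the `## CMD:` line, and the example record are all hypothetical.

```python
def parse_header(lines):
    """Collect the `## CMD:`, `## colData:` and `## version:` header fields."""
    header = {"cmd": [], "coldata": [], "version": None}
    for line in lines:
        if not line.startswith("##"):
            continue
        key, _, value = line[2:].strip().partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "cmd":
            header["cmd"].append(value)
        elif key == "coldata":
            # Sample names are separated by spaces.
            header["coldata"] = value.split()
        elif key == "version":
            header["version"] = value
    return header


def parse_attributes(column9):
    """Split column 9 into a dict (assumes GFF3-style `key=value;key=value`)."""
    attrs = {}
    for item in column9.strip().strip(";").split(";"):
        key, _, value = item.strip().partition("=")
        if key:
            attrs[key] = value
    return attrs


def parse_genomic(value):
    """Parse `chr:start-end,chr:start-end` into (chrom, start, end) tuples."""
    positions = []
    for item in filter(None, value.split(",")):
        chrom, _, span = item.rpartition(":")
        start, _, end = span.partition("-")
        positions.append((chrom, int(start), int(end)))
    return positions


def parse_record(line, samples):
    """Parse one feature line; `samples` is the colData list from the header."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, found {len(cols)}")
    attrs = parse_attributes(cols[8])
    counts = [int(c) for c in attrs["Expression"].split(",")]
    if len(counts) != len(samples):
        raise ValueError("Expression must have one count per colData sample")
    # Filter is PASS or REJECT, optionally followed by `:reason`.
    status, _, reason = attrs.get("Filter", "PASS").partition(":")
    return {
        "precursor": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "strand": cols[6],
        "expression": dict(zip(samples, counts)),
        "keep": status == "PASS",
        "filter_reason": reason or None,
        "genomic": parse_genomic(attrs.get("Genomic", "")),
    }


# Hypothetical input, for illustration only.
header = parse_header([
    "## CMD: hypothetical_trimmer --adapter TGGAATTC",
    "## colData: sampleA sampleB",
    "## version: 0.9",
])
record = parse_record(
    "hsa-let-7a-1\tmirbase_21\tisomiR\t4\t25\t.\t+\t.\t"
    "ID=prefix-22-BZBZOS4Y1;Name=hsa-let-7a-5p;Parent=hsa-let-7a-1;"
    "Variant=iso_5p;Expression=10,0;Filter=REJECT:lowcounts;"
    "Genomic=chr9:96938242-96938263",
    header["coldata"],
)
```

Tying `Expression` to `colData` by position is why the sketch rejects a record whose count list has a different length from the sample list. The `keep` flag shows how a `REJECT:...` record could be skipped when querying counts.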