format/definition.md
+19-15
@@ -21,25 +21,29 @@ Note: Keep in mind this is for the output of a pipeline, so we know there will b
21
21
Please add a description for each column/attribute
22
22
23
23
* header:
24
-
* database: `##source-ontology LINK TO DATABASE` include version
25
-
* commands used to generate the file. At least information about adapter removal and filtering
24
+
* database: `##source-ontology LINK TO DATABASE` include version and link
25
+
* commands used to generate the file: at least adapter removal, filtering, aligner, and miRNA tool. Each command line starts with `## CMD: `
26
26
* genome version used (maybe try to get from BAM file if GFF3 generated from it)
27
-
* sample names used in attribute:Expression
28
-
* column1: seqID:
29
-
* column2: source: databases used for the annotation (miRBase, mirDBgene,tRNA...etc): https://github.com/miRTop/incubator/issues/13
30
-
* column3: type: ref_miRNA, isomiRs: https://github.com/miRTop/incubator/issues/13 (SO:0002166 ref_miRNA and SO:0002167)
31
-
* column4/5: start/end: question about precursor position or genomic position?
27
+
* sample names used in the Expression attribute: `## colData:` followed by the names separated by spaces
28
+
* small RNA GFF version `## version: 0.9`
29
+
* column1: seqID: precursor name
30
+
* column2: source: databases (lower case) used for the annotation (miRBase, mirDBgene, tRNA, etc.): https://github.com/miRTop/incubator/issues/13. The version number follows a `_` character: `mirbase_21`
* column4/5: start/end: precursor start/end as indicated by alignment tool
32
33
* column6: score:
33
-
* column7: strand:
34
+
* column7: strand
34
35
* column8: phase: (For features of type "CDS", the phase indicates where the feature begins with reference to the reading frame)
35
-
* column9: attributes
36
-
* ID: unique ID based on sequence like mintmap has for tRNA: prefix-22-BZBZOS4Y1 (https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates). good way to use it as cross-mapper ID between different naming or future changes.
37
-
* Name:
36
+
* column9: attributes:
37
+
* ID: unique ID based on the sequence, like MINTmap has for tRNA: prefix-22-BZBZOS4Y1 (https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates). A good way to provide a cross-mapping ID across different naming schemes or future changes. The tool will implement this, so an API can be used to fill this field.
38
+
* Name: mature name
38
39
* Parent: hairpin precursor name
39
-
* Alias: get names from miRBase/miRgeneDB
40
-
* Expression: raw counts separated by `,`
41
-
* Filter: PASS or REJECT (this allow to keep all the data and select the one you really want to conside as valid features)
* Alias: names from miRBase/miRgeneDB or other databases, separated by `,`
43
+
* Genomic: positions on the genome in the following format: `chr:start-end,chr:start-end`
44
+
* Expression: raw counts separated by `,`. They must be in the same order as `colData` in the header.
45
+
* Filter: PASS or REJECT (this allows keeping all the data and selecting the features you consider valid). PASS can have subclasses, e.g. `PASS:te`, meaning the sequence passes but the tool considers the variants shown here untrusted. REJECT can be followed by any short word explaining why the sequence was rejected, e.g. `REJECT:lowcounts`. In this case the sequence will be skipped during data mining of the file, when querying counts or summarizing miRNA expression.
46
+
* Seed_fam: the seed (nucleotides 2-8) and a reference miRNA sharing that seed, in the format `ATGCTGT:mir34a_5p`. Useful for looking up pre-computed target predictions.
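
To make the header and attribute conventions above concrete, here is a minimal Python sketch of how a reader might parse them. It is not part of the proposed format. It assumes that column 9 uses GFF3-style `key=value` pairs separated by `;`, which the definition above does not state. The function names, sample names, command string, and coordinates are all hypothetical.

```python
def parse_header(lines):
    """Collect the `## CMD:`, `## colData:` and `## version:` header lines."""
    header = {"CMD": [], "colData": [], "version": None}
    for line in lines:
        if not line.startswith("##"):
            continue
        key, sep, value = line[2:].partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "CMD":
            header["CMD"].append(value)
        elif key == "colData":
            header["colData"] = value.split()  # sample names, space separated
        elif key == "version":
            header["version"] = value
    return header


def parse_attributes(column9):
    """Split column 9 into a dict (ASSUMPTION: `key=value` pairs joined by `;`)."""
    attrs = {}
    for field in column9.strip().split(";"):
        if not field.strip():
            continue
        key, _, value = field.strip().partition("=")
        attrs[key] = value
    return attrs


def expression_by_sample(attrs, samples):
    """Pair the comma-separated raw counts with the colData sample names."""
    counts = [int(c) for c in attrs["Expression"].split(",")]
    if len(counts) != len(samples):
        raise ValueError("Expression and colData have different lengths")
    return dict(zip(samples, counts))


def parse_filter(value):
    """Split `PASS`, `PASS:te` or `REJECT:lowcounts` into (status, detail)."""
    status, _, detail = value.partition(":")
    return status, detail or None


def parse_genomic(value):
    """Turn `chr:start-end,chr:start-end` into (chrom, start, end) tuples."""
    positions = []
    for item in value.split(","):
        chrom, _, span = item.rpartition(":")
        start, _, end = span.partition("-")
        positions.append((chrom, int(start), int(end)))
    return positions


if __name__ == "__main__":
    # Hypothetical header and attribute column, for illustration only.
    header = parse_header([
        "## CMD: hypothetical_trimmer --in reads.fq",
        "## colData: sampleA sampleB",
        "## version: 0.9",
    ])
    attrs = parse_attributes(
        "ID=prefix-22-BZBZOS4Y1;Expression=10,3;"
        "Filter=REJECT:lowcounts;Genomic=chr1:100-121"
    )
    print(expression_by_sample(attrs, header["colData"]))
    print(parse_filter(attrs["Filter"]))
    print(parse_genomic(attrs["Genomic"]))
```

A consumer summarizing miRNA expression would skip any feature whose `parse_filter` status is `REJECT`, which is the behavior the Filter attribute describes.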