Commit 6165ad3

add Ensembl mm10
1 parent f9b26c5 commit 6165ad3

2 files changed: 30 additions & 3 deletions

Makefile

Lines changed: 27 additions & 2 deletions
@@ -5,7 +5,7 @@
 none:
 
 # make all sets of annotations
-all: gencode-hg19 ensembl-hg19 gencode-hg38 ensembl-hg38
+all: gencode-hg19 ensembl-hg19 gencode-hg38 ensembl-hg38 ensembl-mm10
 
 gencode-hg19: gencode.v19.annotation.genes.bed
 
@@ -15,6 +15,8 @@ ensembl-hg19: Homo_sapiens.GRCh37.82.chr.bed
 
 ensembl-hg38: Homo_sapiens.GRCh38.91.chr.bed
 
+ensembl-mm10: Mus_musculus.GRCm38.91.chr.bed
+
 
 
 
@@ -72,6 +74,26 @@ Homo_sapiens.GRCh38.91.chr.bed: Homo_sapiens.GRCh38.91.chr.gtf
 	gtf2bed < Homo_sapiens.GRCh38.91.chr.gtf > Homo_sapiens.GRCh38.91.chr.bed
 
 
+
+
+# ~~~~~ ENSEMBL mm10 ~~~~~ #
+# generate the Ensembl mm10 annotations .bed file
+Mus_musculus.GRCm38.91.chr.gtf.gz:
+	wget ftp://ftp.ensembl.org/pub/release-91/gtf/mus_musculus/Mus_musculus.GRCm38.91.chr.gtf.gz
+
+# remove comment lines
+# extract only 'gene' entries
+# add 'chr' to first entry, change 'chrMT' to 'chrM'
+Mus_musculus.GRCm38.91.chr.gtf: Mus_musculus.GRCm38.91.chr.gtf.gz
+	zcat Mus_musculus.GRCm38.91.chr.gtf.gz | grep -Ev '^#' | grep -w 'gene' | sed -e 's/^/chr/' -e 's/^chrMT/chrM/' > Mus_musculus.GRCm38.91.chr.gtf
+
+# convert to .bed
+Mus_musculus.GRCm38.91.chr.bed: Mus_musculus.GRCm38.91.chr.gtf
+	gtf2bed < Mus_musculus.GRCm38.91.chr.gtf > Mus_musculus.GRCm38.91.chr.bed
+
+
+
+
 # ~~~~~ CLEAN UP ~~~~~ #
 .INTERMEDIATE: gencode.v19.annotation.gtf.gz \
 Homo_sapiens.GRCh37.82.gtf.gz \
@@ -82,6 +104,9 @@ Homo_sapiens.GRCh38.91.chr.bed: Homo_sapiens.GRCh38.91.chr.gtf
 Homo_sapiens.GRCh38.91.chr.gtf \
 Homo_sapiens.GRCh38.91.chr.gtf.gz \
 Homo_sapiens.GRCh37.82.chr.gtf \
-Homo_sapiens.GRCh37.82.chr.gtf.gz
+Homo_sapiens.GRCh37.82.chr.gtf.gz \
+Mus_musculus.GRCm38.91.chr.gtf.gz \
+Mus_musculus.GRCm38.91.chr.gtf
+
 
 
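
The filtering recipe for `Mus_musculus.GRCm38.91.chr.gtf` can be tried on a small input without downloading the full Ensembl archive. The following is a minimal sketch, not part of the commit: the GTF rows, the `geneA`/`geneB` IDs, the coordinates, and the `sample_gtf` helper are all hypothetical, and `printf` stands in for `zcat`.

```shell
# Hypothetical GTF excerpt: one header comment, two 'gene' rows, one 'transcript' row.
# IDs and coordinates are invented for illustration; printf stands in for zcat.
sample_gtf() {
  printf '#!genome-build GRCm38\n'
  printf '1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id "geneA";\n'
  printf '1\tensembl\ttranscript\t100\t200\t.\t+\t.\tgene_id "geneA";\n'
  printf 'MT\tensembl\tgene\t1\t68\t.\t+\t.\tgene_id "geneB";\n'
}

# Same pipeline as the Makefile recipe:
#   1. drop comment lines, 2. keep rows containing the whole word 'gene',
#   3. prefix 'chr' to every line, 4. rename 'chrMT' to 'chrM'.
filtered=$(sample_gtf | grep -Ev '^#' | grep -w 'gene' | sed -e 's/^/chr/' -e 's/^chrMT/chrM/')

printf '%s\n' "$filtered"
```

Only the two `gene` rows should survive, with `chr1` and `chrM` in the first column. Note that `grep -w 'gene'` matches the whole word anywhere on the line, not only in the feature column; the sketch relies on attribute names such as `gene_id` not counting as a whole-word match, since `_` is a word character.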

README.md

Lines changed: 3 additions & 1 deletion
@@ -12,7 +12,7 @@ cd reference-annotations
 
 Generate the desired annotation files from the available entries:
 
-- `all`, `gencode-hg19`, `gencode-hg38`, `ensembl-hg19`, `ensembl-hg38`
+- `all`, `gencode-hg19`, `gencode-hg38`, `ensembl-hg19`, `ensembl-hg38`, `ensembl-mm10`
 
 ```
 make all
@@ -40,6 +40,8 @@ The following files are created:
 
 - `ensembl-hg38`: `Homo_sapiens.GRCh38.91.chr.bed`; Ensembl hg38 gene annotations & genomic regions
 
+- `ensembl-mm10`: `Mus_musculus.GRCm38.91.chr.bed`; Ensembl mm10 gene annotations & genomic regions
+
 # Notes
 
 Intermediate files are removed by default. If you want to keep them, then comment out the `.INTERMEDIATE` section in the `Makefile`.
