A script to create a gene-focussed BridgeDb database based on Ensembl BioMart.
Java 11 is required.
Compile the code with:
mvn clean install
cp target/org.bridgedb.genedb-jar-with-dependencies.jar BioMart2BridgeDb.jar
Run the tool from your terminal:
java -jar BioMart2BridgeDb.jar <configFile> <outputPath> <oldDB> <inclusive>
<configFile>: location of the configuration file
<outputPath>: path for the new database
<oldDB>: (optional) directory of the old database, used to run QC
<inclusive>: (optional) use the inclusive BridgeDb list
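As a rough sketch of the usage above, the call can be wrapped in a small shell function that checks the number of arguments before starting Java. The function name `run_genedb` is hypothetical and not part of this repository. The sketch assumes the jar sits in the current directory under the name used in the compile step.

```shell
# Hypothetical wrapper around the command shown above (not part of the repository).
# Two arguments are required (<configFile> <outputPath>); <oldDB> and <inclusive>
# are passed through only when they are given.
run_genedb() {
    if [ "$#" -lt 2 ] || [ "$#" -gt 4 ]; then
        echo "usage: run_genedb <configFile> <outputPath> [<oldDB>] [<inclusive>]" >&2
        return 1
    fi
    # Assumes BioMart2BridgeDb.jar is in the current directory.
    java -jar BioMart2BridgeDb.jar "$@"
}
```

It would be called with the same arguments as the jar itself, for example `run_genedb <configFile> <outputPath>`. Quoting `"$@"` keeps paths that contain spaces intact.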
Configuration files can be found at: https://github.com/bridgedb/create-bridgedb-genedb-config/tree/master/resource
Example: Bos taurus config file
Give the URL of the Ensembl BioMart version to query:
e.g. http://www.ensembl.org/biomart/, http://oct2014.archive.ensembl.org/biomart/, http://metazoa.ensembl.org/biomart/
You can find an overview of releases in the Ensembl Archive.
The MartRegistry name can be found there:
e.g. protists_mart_27, metazoa_mart_27, default
The code name of the species can be looked up at: http://www.ensembl.org/biomart/martservice?type=datasets&mart=ENSEMBL_MART_ENSEMBL
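The dataset listing URL above follows a simple pattern: the BioMart endpoint, then `martservice?type=datasets&mart=`, then the mart name. A minimal sketch that builds this URL for other endpoints is given here. The helper name `datasets_url` is hypothetical. It is an assumption that the archive and metazoa hosts answer the same query form as the main site.

```shell
# Hypothetical helper: build the dataset-listing URL for a BioMart endpoint
# and a mart name, following the pattern of the URL shown above.
datasets_url() {
    endpoint=$1
    mart=$2
    # Make sure the endpoint ends with a slash, as in ".../biomart/".
    case $endpoint in
        */) ;;
        *) endpoint="$endpoint/" ;;
    esac
    printf '%smartservice?type=datasets&mart=%s\n' "$endpoint" "$mart"
}

# Print the listing URL for the main Ensembl mart.
datasets_url "http://www.ensembl.org/biomart/" "ENSEMBL_MART_ENSEMBL"
```

The printed URL can then be fetched with a tool such as `curl -s "$(datasets_url <endpoint> <mart>)"`. The species code name should appear in the returned listing.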
The name of the bridge database
database_name=Arabidopsis thaliana genes and proteins
The name of the .bridge file to create
The code names of the different data sources can be found in the same way:
probe_set=affy_ath1_121501
gene_datasource=refseq_mrna,refseq_ncrna,refseq_peptide,uniprot_sptrembl,pdb,tair_locus,go_accession,unigene,entrezgene,wikigene_id,nasc_gene_id,uniprot_swissprot_accession
Optional filters (chromosome list)
e.g. chromosome_name=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,X,MT
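Several of the settings above are comma-separated lists, so it can help to print them one value per line before running the tool. The sketch below does this, assuming the config file holds plain `key=value` lines like those shown above. The helper name `config_list` is hypothetical, and the sample file uses a shortened chromosome list for illustration only.

```shell
# Hypothetical helper: print the values of one property, one per line.
# Assumes plain key=value lines and keys without regex special characters.
config_list() {
    file=$1
    key=$2
    sed -n "s/^${key}=//p" "$file" | tr ',' '\n'
}

# Demo on a temporary sample file (shortened values, for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
probe_set=affy_ath1_121501
chromosome_name=1,2,3,X,MT
EOF
config_list "$sample" chromosome_name
rm -f "$sample"
```

Piping the result through `wc -l` gives a quick count of the entries, which can be compared against the expected number of chromosomes or data sources.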