forked from macieksmuga/bioapi-examples
ga4gh client (based off david4096/client) and directory structure
Showing 19 changed files with 6,563 additions and 1 deletion.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,8 @@ __pycache__/ | |
# C extensions | ||
*.so | ||
|
||
.idea | ||
|
||
# Distribution / packaging | ||
.Python | ||
env/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
Apache License | ||
Apache License | ||
Version 2.0, January 2004 | ||
http://www.apache.org/licenses/ | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# Biomedicine API Examples | ||
|
||
## Introduction | ||
Data sharing efforts and readily available computing resources are making bioinformatics over the Web possible. In the past, siloed data stores and obscure file formats made it difficult to synthesize and reproduce results across institutions. Here we present two biomedicine APIs, both currently under development, along with example usage. Some familiarity with Python is expected. | ||
|
||
*Get started!* | ||
|
||
``` | ||
pip install -r requirements.txt | ||
python hello_ga4gh.py | ||
``` | ||
|
||
## ExAC | ||
> Building on the existing ExAC application we opened up direct data access through straight forward web services. These services enable a user to integrate ExAC services into their own tools, querying the variant information and returning the data in an easy to programmatically use JSON format. | ||
https://github.com/hms-dbmi/exac_browser | ||
|
||
|
||
> The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a wide variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. | ||
> The data set provided on this website spans 60706 unrelated individuals sequenced as part of various disease-specific and population genetic studies. | ||
http://exac.broadinstitute.org | ||
|
||
The REST API for ExAC has been developed as part of Harvard’s Patient-centered Information Commons: Standardized Unification of Research Elements (PIC-SURE http://www.pic-sure.org/software). | ||
|
||
## GA4GH | ||
|
||
[GA4GH](https://genomicsandhealth.org) aims to standardize how bioinformatics data are shared over the web. A reference server with a subset of publicly available test data from the 1000 Genomes Project has been made available for these examples. | ||
|
||
The GA4GH reference server hosts bioinformatics data using an HTTP API. These data are backed by BAM and VCF files. The examples here only access an existing GA4GH server, but the server software is open source, and you can run your own instance by following [these instructions](http://ga4gh-reference-implementation.readthedocs.org/en/latest/demo.html). | ||
|
||
## What is an HTTP API? | ||
|
||
HTTP APIs allow web browsers and command line clients to use the same communication layer to exchange data with a server. A client can `GET` a resource from a server, `POST` a resource to a server, or `DELETE` a resource, among other operations. | ||
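As a rough illustration of these verbs, the sketch below builds (but does not send) a `GET` and a `POST` request with Python 3's standard library. The base URL is the one used in `combine_apis.py`; the `datasets/search` path and the `pageSize` field are assumptions made for illustration, so check the server's documentation for the actual endpoints.

```python
import json
import urllib.request

# Base URL as used in combine_apis.py; the endpoint path below is an assumption.
BASE_URL = "http://ga4gh-a1.westus.cloudapp.azure.com/ga4gh-example-data/"


def build_get(url):
    """Build a GET request: ask the server for a resource."""
    return urllib.request.Request(url, method="GET")


def build_post(url, payload):
    """Build a POST request whose body is a JSON document."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


get_request = build_get(BASE_URL)
post_request = build_post(BASE_URL + "datasets/search", {"pageSize": 1})

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)
# Nothing is sent until a request is passed to urllib.request.urlopen().
```

Building the request separately from sending it makes it easy to inspect the method, URL, and body before any network traffic happens.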
|
||
The documents that servers and clients pass back and forth are often in JavaScript Object Notation (JSON), which can flexibly describe complex data structures. For example, a variant in GA4GH is returned as a document with the form: | ||
|
||
{ | ||
"alternateBases": ["T"], | ||
"calls": [], | ||
"created": 1455236057000, | ||
"end": 4530, | ||
"id": "YnJjYTE6MWtnUGhhc2UzOnJlZl9icmNhMTo0NTI5OjllNjRkMDIzOTc5NzQ3M2MyNjk2NzFiNzczMjg1MWNj", | ||
"info": {}, | ||
"referenceBases": "C", | ||
"referenceName": "ref_brca1", | ||
"start": 4529, | ||
"updated": 1455236057000, | ||
"variantSetId": "YnJjYTE6MWtnUGhhc2Uz" | ||
} | ||
|
||
JSON objects use strings as keys. The values can be strings, numbers, booleans, null, arrays, or nested objects. | ||
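To make this concrete, here is a minimal sketch that parses a trimmed copy of the variant document shown above using Python's built-in `json` module. The `summary` string format is only an illustration, not something the API defines.

```python
import json

# A trimmed copy of the variant document shown above.
document = """
{
  "alternateBases": ["T"],
  "calls": [],
  "end": 4530,
  "info": {},
  "referenceBases": "C",
  "referenceName": "ref_brca1",
  "start": 4529
}
"""

variant = json.loads(document)

# JSON objects become dicts, arrays become lists, and numbers become ints or floats.
span = variant["end"] - variant["start"]
summary = "{}:{} {}>{}".format(
    variant["referenceName"],
    variant["start"],
    variant["referenceBases"],
    ",".join(variant["alternateBases"]),
)
print(summary)
print("spans", span, "base(s)")
```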
|
||
## Examples | ||
|
||
Each example includes inline comments that explain which requests are sent to a server and how the script manipulates the returned data. | ||
|
||
### hello_ga4gh.py | ||
|
||
Access a GA4GH reference server hosting bioinformatics data and see the basics of building a query. | ||
|
||
### hello_exac.py | ||
|
||
Access an API hosting population genomics data and a query service for finding variants in a gene. | ||
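The `combine_apis.py` script in this commit queries ExAC by concatenating a gene name onto a URL. A slightly more defensive sketch of the same query, assuming Python 3, is shown below. The `awesome` endpoint and the `query` and `service` parameters come from that script; `build_exac_url` is a hypothetical helper introduced here for illustration.

```python
from urllib.parse import urlencode

EXAC_BASE_URL = "http://exac.hms.harvard.edu/rest/"


def build_exac_url(gene_name, service="variants_in_gene"):
    """Return the ExAC REST URL for a gene query (hypothetical helper)."""
    # urlencode escapes characters that are not safe inside a query string.
    query_string = urlencode({"query": gene_name, "service": service})
    return EXAC_BASE_URL + "awesome?" + query_string


url = build_exac_url("OR4F5")
print(url)
```

The resulting URL can then be passed to `requests.get(url)`, as the example scripts do.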
|
||
### hello_ga4gh_client.py | ||
|
||
Access a GA4GH reference server using a (provided) client, making some operations easier. | ||
|
||
### visualize_ga4gh.py | ||
|
||
Get data from a remote web service and visualize it using matplotlib. | ||
|
||
### combine_apis.py | ||
|
||
Combine data from two web services to produce synthesized results. | ||
|
||
### simple_service.py | ||
|
||
Make the results of combining two APIs available as its own web service. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
""" | ||
Simple shim for running the client program during development. | ||
""" | ||
import ga4gh.cli | ||
|
||
if __name__ == "__main__": | ||
    ga4gh.cli.client_main() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
""" | ||
combine_apis.py | ||
An example of combining the results of interacting with | ||
both the ExAC and GA4GH APIs. | ||
""" | ||
|
||
# We'll need both the requests module and the GA4GH client. | ||
|
||
import ga4gh.client as client | ||
import requests | ||
|
||
EXAC_BASE_URL = "http://exac.hms.harvard.edu/rest/" | ||
GA4GH_BASE_URL = "http://ga4gh-a1.westus.cloudapp.azure.com/ga4gh-example-data/" | ||
|
||
def main(): | ||
    # Let's instantiate the GA4GH client first | ||
    c = client.HttpClient(GA4GH_BASE_URL) | ||
|
||
    # Since we've done it before, getting variants can be done | ||
    # in a one-liner. We're picking up the first variant set | ||
    # for the first dataset returned. | ||
|
||
    ga4gh_variants = [v for v in c.searchVariants( | ||
        next(c.searchVariantSets(next(c.searchDatasets()).id)).id, | ||
        start=0, | ||
        end=2**32, | ||
        referenceName="1")] | ||
|
||
    print(str(len(ga4gh_variants)) + " GA4GH variants.") | ||
|
||
    # Now we'll access the ExAC API in search of variants on | ||
    # the OR4F5 gene. See `hello_exac.py` | ||
|
||
    GENE_NAME = "OR4F5" | ||
|
||
    response = requests.get( | ||
        EXAC_BASE_URL + "awesome?query=" + GENE_NAME + "&service=variants_in_gene") | ||
|
||
    OR4F5_variants = response.json() | ||
|
||
    print(str(len(OR4F5_variants)) + " ExAC variants.") | ||
|
||
    # Let's find out if we have any matches on position. | ||
|
||
    matches = [] | ||
|
||
    for OR4F5_variant in OR4F5_variants: | ||
        for ga4gh_variant in ga4gh_variants: | ||
            # Note that GA4GH positions are 0-based so we add | ||
            # 1 to line it up with ExAC. | ||
            if (ga4gh_variant.start + 1) == OR4F5_variant['pos']: | ||
                print(OR4F5_variant['pos']) | ||
                print(ga4gh_variant.start) | ||
                matches.append((ga4gh_variant, OR4F5_variant)) | ||
|
||
    print("Found " + str(len(matches)) + " matches.") | ||
|
||
    for match in matches: | ||
        print(match[0].names) | ||
        print(match[1]['rsid']) | ||
        print(match[0].referenceBases, match[1]['ref']) | ||
        print(match[0].alternateBases, match[1]['alt']) | ||
|
||
if __name__ == "__main__": | ||
    main() |
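The nested loop in `combine_apis.py` compares every ExAC variant with every GA4GH variant. For larger result sets, one possible alternative is to index the GA4GH variants by their 1-based position first. The sketch below is not part of the commit: it uses plain dictionaries as stand-ins for the client's variant objects, `match_by_position` is a hypothetical helper, and the sample records are made up.

```python
from collections import defaultdict


def match_by_position(ga4gh_variants, exac_variants):
    """Pair GA4GH and ExAC variants that share a 1-based position.

    ga4gh_variants: iterable of dicts with a 0-based "start" key.
    exac_variants: iterable of dicts with a 1-based "pos" key.
    """
    by_position = defaultdict(list)
    for variant in ga4gh_variants:
        # Shift the 0-based GA4GH start to ExAC's 1-based coordinate.
        by_position[variant["start"] + 1].append(variant)

    matches = []
    for exac_variant in exac_variants:
        for ga4gh_variant in by_position.get(exac_variant["pos"], []):
            matches.append((ga4gh_variant, exac_variant))
    return matches


# Made-up records, for illustration only.
ga4gh_sample = [
    {"start": 99, "referenceBases": "C"},
    {"start": 250, "referenceBases": "G"},
]
exac_sample = [{"pos": 100, "ref": "C"}, {"pos": 300, "ref": "A"}]

matches = match_by_position(ga4gh_sample, exac_sample)
print("Found " + str(len(matches)) + " matches.")
```

Like the original loop, this matches on position only; comparing the reference and alternate bases as well would reduce coincidental matches.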
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
|
||
.. image:: http://genomicsandhealth.org/files/logo_ga.png | ||
|
||
============================== | ||
GA4GH Reference Implementation | ||
============================== | ||
|
||
.. image:: https://badges.gitter.im/Join%20Chat.svg | ||
:alt: Join the chat at https://gitter.im/ga4gh/server | ||
:target: https://gitter.im/ga4gh/server?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge | ||
|
||
This is the development version of the GA4GH reference implementation. | ||
If you would like to install the stable version of the server, please | ||
see the instructions on `the PyPI page <https://pypi.python.org/pypi/ga4gh>`_. | ||
|
||
The server is currently under heavy development, and many aspects of | ||
the layout and APIs will change as requirements are better understood. | ||
If you would like to help, please check out our list of | ||
`issues <https://github.com/ga4gh/server/issues>`_! | ||
|
||
The latest bleeding-edge documentation is available at `read-the-docs.org | ||
<http://ga4gh-reference-implementation.readthedocs.org/en/latest>`_. | ||
|
||
- For a quick start with the GA4GH API, please see our | ||
`demo <http://ga4gh-reference-implementation.readthedocs.org/en/latest/demo.html>`_. | ||
- To configure and deploy the GA4GH server in production | ||
please see the | ||
`installation | ||
<http://ga4gh-reference-implementation.readthedocs.org/en/latest/installation.html>`_ | ||
page. | ||
- If you would like to contribute to the project, please see the | ||
`development | ||
<http://ga4gh-reference-implementation.readthedocs.org/en/latest/development.html>`_ | ||
page. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
""" | ||
Reference implementation of the GA4GH APIs. | ||
""" | ||
|
||
__version__ = "undefined" | ||
try: | ||
    from . import _version | ||
    __version__ = _version.version | ||
except ImportError: | ||
    pass |