Asian Genome Project

This project is inspired by and builds upon Maria Nattestad's project: https://youtu.be/-PCKK_nwFdA

Asian populations are underrepresented in genomic datasets, which hinders the study of genetic variation, population structure, and disease manifestation in these groups. This project visualizes genotypic variation across Asian population groups by applying PCA and t-SNE to single-nucleotide variant data (phased VCF) for chromosome 22 from the 1000 Genomes Project (http://www.internationalgenome.org).
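As a rough illustration of the PCA step, here is a minimal sketch using NumPy only. It is not the notebook's code: the function name `genotype_pca`, the synthetic two-population data, and the 0/1/2 alternate-allele-count encoding (the sum of the two phased haplotypes, e.g. `0|1` becomes 1) are all assumptions made for this example.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project samples onto their top principal components.

    genotypes: (n_samples, n_variants) array of alternate-allele counts
    (0, 1 or 2), an assumed encoding for this sketch.
    """
    g = np.asarray(genotypes, dtype=float)
    # Variants that are identical in every sample carry no signal; drop them.
    g = g[:, g.std(axis=0) > 0]
    # Center each variant so PCA captures variation around the mean genotype.
    centered = g - g.mean(axis=0)
    # SVD of the centered matrix gives the principal axes and singular values.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Sample coordinates are the left singular vectors scaled by singular values.
    return u[:, :n_components] * s[:n_components], explained[:n_components]


# Synthetic stand-in for real data: two hypothetical populations whose
# allele frequencies have drifted apart.
rng = np.random.default_rng(0)
n_per_pop, n_variants = 50, 500
freq_a = rng.uniform(0.05, 0.95, n_variants)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_variants), 0.01, 0.99)
pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_variants))
pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_variants))

genotypes = np.vstack([pop_a, pop_b])
labels = np.array(["A"] * n_per_pop + ["B"] * n_per_pop)

coords, explained = genotype_pca(genotypes)
print(coords.shape)
```

With real data, the rows would be 1000 Genomes samples and `labels` their population codes. t-SNE could then be run on the same matrix or on the leading principal components, for example with scikit-learn's `TSNE`, and the resulting 2-D coordinates plotted colored by population.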

Link to the Colab notebook: https://colab.research.google.com/drive/1JUnItMIRTAJtMbbkLRTFJe-EbUOWOMgK?usp=sharing

References:

  1. GenomeAsia100K Consortium (2019). The GenomeAsia 100K Project enables genetic discoveries across Asia. Nature, 576(7785), 106–111. doi: 10.1038/s41586-019-1793-z.
  2. The 1000 Genomes Project Consortium (2015). A global reference for human genetic variation. Nature, 526, 68–74. doi: 10.1038/nature15393.
  3. McVean, G. (2009). A genealogical interpretation of principal components analysis. PLoS Genet, 5(10), e1000686. doi: 10.1371/journal.pgen.1000686.