51 changes: 51 additions & 0 deletions datasets/smaht.yaml
@@ -0,0 +1,51 @@
Name: Somatic Mosaicism across Human Tissues (SMaHT)
Description: |
The Somatic Mosaicism across Human Tissues (SMaHT) project is an NIH Common Fund consortium (2023-)
that aims to comprehensively characterize somatic variation (“mosaicism”) in normal human tissues. While
most genetic studies have relied on blood-derived DNA, SMaHT captures the full spectrum of DNA variation
across cell types, tissues, and organs from phenotypically normal individuals to better understand the
role of somatic mosaicism in human development, aging, and disease progression. In addition to generating
production data across ~20 tissue types from 150 post-mortem donors, SMaHT also produces datasets from
cell line and tissue homogenate samples to benchmark and develop new technologies and computational tools
for mosaic variant detection. The resulting data include high-coverage whole-genome and transcriptome data
using both short-read and long-read sequencing technologies from multiple platforms. SMaHT will also
generate comprehensive genome-wide catalogs of somatic variants.
Documentation: https://data.smaht.org/docs
Contact: [email protected]
ManagedBy: SMaHT Data Analysis Center (DAC)
UpdateFrequency: Bi-annually
Tags:
- biology
- bioinformatics
- genetic
- genomic
- imaging
- life sciences
- whole genome sequencing
- bam
License: NIH Genomic Data Sharing Policy: https://gdc.cancer.gov/access-data/data-access-policies
Citation: The SMaHT datasets were generated as part of the NIH Common Fund consortium initiative,
Somatic Mosaicism across Human Tissues (SMaHT). The SMaHT datasets are submitted under dbGaP
studies (http://www.ncbi.nlm.nih.gov/gap), with the study accession numbers, phs004193 for the
SMaHT Benchmarking data and phs004194 for the SMaHT Production data. The datasets were provided
by the SMaHT Data Analysis Center (DAC) [1UM1DA058230] on behalf of the SMaHT network. More
information about the SMaHT Network is available online at https://smaht.org/, about the SMaHT
Data Portal at https://data.smaht.org/, and about the types of data generated by the Network at
https://data.smaht.org/about/consortium/data
Resources:
- Description: Released and controlled access SMaHT data
ARN: TBD
Region: us-east-1
Type: S3 Bucket
ControlledAccess: TBD
DataAtWork:
Tools & Applications:
- Title: Somatic Mosaicism across Human Tissues Data Portal
URL: https://data.smaht.org/
AuthorName: SMaHT Data Analysis Center (DAC)
Publications:
- Title: The Somatic Mosaicism across Human Tissues Network
URL: https://www.nature.com/articles/s41586-025-09096-7
AuthorName: Coorens T, Oh J, Choi Y, Lim N, Zhao B, Voshall A et al.
ADXCategories:
- Healthcare & Life Sciences Data