
Install the DagsHub client:
pip install dagshub
To stream this data directly from DagsHub:
from dagshub.streaming import DagsHubFilesystem
fs = DagsHubFilesystem(".", repo_url="https://test.dagshub.com/DagsHub-Datasets/broad-gnomad-dataset")
fs.listdir("s3://gnomad-public-us-east-1")
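The listing call above can be wrapped in a small helper so that the results come back in a stable order. This is only a sketch: `make_filesystem`, `list_sorted`, `REPO_URL`, and `BUCKET` are hypothetical names introduced here for illustration, and the helper assumes that `fs.listdir` returns an iterable of entry names, as `os.listdir` does.

```python
from typing import List

# Values copied from the snippet above; the constant names are hypothetical.
REPO_URL = "https://test.dagshub.com/DagsHub-Datasets/broad-gnomad-dataset"
BUCKET = "s3://gnomad-public-us-east-1"


def make_filesystem():
    """Create the streaming filesystem from the snippet above (needs network access)."""
    # Imported lazily so that list_sorted() can be used without dagshub installed.
    from dagshub.streaming import DagsHubFilesystem

    return DagsHubFilesystem(".", repo_url=REPO_URL)


def list_sorted(fs, path: str = BUCKET) -> List[str]:
    """Return the entries under `path` as a sorted list of strings.

    `fs` can be any object with a `listdir(path)` method, which keeps the
    helper independent of the network connection.
    """
    return sorted(str(entry) for entry in fs.listdir(path))
```

With network access, `list_sorted(make_filesystem())` should return the top-level entries of the bucket; the exact contents depend on the current gnomAD release.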
Description
The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale human sequencing projects. The summary data provided here are released for the benefit of the wider scientific community without restriction on use. The v2 data set (GRCh37) spans 125,748 exome sequences and 15,708 whole-genome sequences from unrelated individuals. The v3 data set (GRCh38) spans 71,702 genomes, selected as in v2. A gnomAD mailing list is available for sign-up.
Additional information
Documentation
Update frequency
Data from new releases are made public as soon as they are available. New releases, including both minor and major versions, have historically been issued on the order of once per year.
Managed by
gnomAD Production Team at the Broad Institute