High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios
- PMID: 36055201
- PMCID: PMC9439720
- DOI: 10.1016/j.cell.2022.08.004
Abstract
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.
Keywords: 1000 Genomes Project; INDEL; SNV; population genetics; reference imputation panel; structural variation; trio sequencing; whole-genome sequencing.
Copyright © 2022 The Authors. Published by Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of interests: E.E.E. is a scientific advisory board (SAB) member of Variant Bio, Inc. P.F. is an SAB member of Fabric Genomics, Inc., and Eagle Genomics, Ltd.
Figures
![Graphical abstract](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/fx1.gif)
![Figure 1](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/gr1.gif)
![Figure S1](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs1.gif)
![Figure S2](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs2.gif)
![Figure S3](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs3.gif)
![Figure S4](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs4.gif)
![Figure 2](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/gr2.gif)
![Figure 3](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/gr3.gif)
![Figure 4](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/gr4.gif)
![Figure S5](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs5.gif)
![Figure 5](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/gr5.gif)
![Figure 6](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/gr6.gif)
![Figure S6](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs6.gif)
![Figure S7](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/9439720/bin/figs7.gif)
Comment in
- 1000 Genomes Project phase 4: The gift that keeps on giving. Cell. 2022 Sep 1;185(18):3286-3289. doi: 10.1016/j.cell.2022.08.001. PMID: 36055197. Clinical Trial.
Grants and funding
- R01 HG002898/HG/NHGRI NIH HHS/United States
- R03 HD099547/HD/NICHD NIH HHS/United States
- R35 GM138212/GM/NIGMS NIH HHS/United States
- UM1 HG008895/HG/NHGRI NIH HHS/United States
- UM1 HG008901/HG/NHGRI NIH HHS/United States
- R01 HD081256/HD/NICHD NIH HHS/United States
- R56 MH115957/MH/NIMH NIH HHS/United States
- WT_/Wellcome Trust/United Kingdom
- U24 HG007497/HG/NHGRI NIH HHS/United States
- R21 CA259309/CA/NCI NIH HHS/United States
- R01 MH115957/MH/NIMH NIH HHS/United States
- R01 CA261934/CA/NCI NIH HHS/United States
- UM1 HG008853/HG/NHGRI NIH HHS/United States