Identification of putative causal loci in whole-genome sequencing data via knockoff statistics
- PMID: 34035245
- PMCID: PMC8149672
- DOI: 10.1038/s41467-021-22889-4
Abstract
The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. The method can (1) prioritize causal variants over associations due to linkage disequilibrium, thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability, and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests to achieve the benefits proposed here. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and to COPDGene samples from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program, we show that our method, compared with conventional association tests, can lead to substantially more discoveries.
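The abstract does not spell out the test statistic or the selection rule, so the following is only a minimal sketch of the generic single-knockoff filter (the "knockoff+" threshold) on which knockoff methods are commonly built, not the paper's multiple-knockoff procedure. It assumes that an importance score has already been computed for each window or variant (`t_orig`) and for its synthetic knockoff copy (`t_knock`); these names, the function names, and the target FDR level `q` are illustrative assumptions.

```python
import numpy as np

def knockoff_plus_threshold(w, q):
    """Return the smallest t > 0 with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q.

    Returns infinity when no candidate threshold meets the target level.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = np.sort(np.unique(np.abs(w[w != 0])))
    for t in candidates:
        # Negative scores estimate how many false positives pass the threshold.
        fdp_hat = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_hat <= q:
            return float(t)
    return float("inf")

def knockoff_select(t_orig, t_knock, q=0.1):
    """Select features whose original score beats the knockoff score by >= tau."""
    # Feature statistic: positive when the original looks more important
    # than its knockoff copy (hypothetical choice: a simple difference).
    w = np.asarray(t_orig, dtype=float) - np.asarray(t_knock, dtype=float)
    tau = knockoff_plus_threshold(w, q)
    return np.flatnonzero(w >= tau)
```

Calling `knockoff_select(t_orig, t_knock, q=0.1)` returns the indices of the selected features. Because a knockoff copy mimics the correlation structure of the original but carries no signal, a score that is large only through linkage disequilibrium tends to be matched by its knockoff and is filtered out.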
Conflict of interest statement
The authors declare no competing interests.
Figures
![Fig. 1](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig1_HTML.gif)
![Fig. 2](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig2_HTML.gif)
![Fig. 3](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig3_HTML.gif)
![Fig. 4](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig4_HTML.gif)
![Fig. 5](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig5_HTML.gif)
![Fig. 6](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig6_HTML.gif)
![Fig. 7](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig7_HTML.gif)
![Fig. 8](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig8_HTML.gif)
![Fig. 9](https://cdn.statically.io/img/www.ncbi.nlm.nih.gov/pmc/articles/instance/8149672/bin/41467_2021_22889_Fig9_HTML.gif)
Similar articles
- Whole-Genome Sequencing Association Analyses of Stroke and Its Subtypes in Ancestrally Diverse Populations From Trans-Omics for Precision Medicine Project. Stroke. 2022 Mar;53(3):875-885. doi: 10.1161/STROKEAHA.120.031792. Epub 2021 Nov 3. PMID: 34727735 Free PMC article.
- A genome-wide scan statistic framework for whole-genome sequence data analysis. Nat Commun. 2019 Jul 9;10(1):3018. doi: 10.1038/s41467-019-11023-0. PMID: 31289270 Free PMC article.
- Genome-wide analysis of common and rare variants via multiple knockoffs at biobank scale, with an application to Alzheimer disease genetics. Am J Hum Genet. 2021 Dec 2;108(12):2336-2353. doi: 10.1016/j.ajhg.2021.10.009. Epub 2021 Nov 11. PMID: 34767756 Free PMC article.
- Methods for the Analysis and Interpretation for Rare Variants Associated with Complex Traits. Curr Protoc Hum Genet. 2019 Apr;101(1):e83. doi: 10.1002/cphg.83. Epub 2019 Mar 8. PMID: 30849219 Free PMC article. Review.
- Unique roles of rare variants in the genetics of complex diseases in humans. J Hum Genet. 2021 Jan;66(1):11-23. doi: 10.1038/s10038-020-00845-2. Epub 2020 Sep 18. PMID: 32948841 Free PMC article. Review.
Cited by
- Enhancing credit scoring accuracy with a comprehensive evaluation of alternative data. PLoS One. 2024 May 21;19(5):e0303566. doi: 10.1371/journal.pone.0303566. eCollection 2024. PMID: 38771812 Free PMC article.
- Key variants via the Alzheimer's Disease Sequencing Project whole genome sequence data. Alzheimers Dement. 2024 May;20(5):3290-3304. doi: 10.1002/alz.13705. Epub 2024 Mar 21. PMID: 38511601 Free PMC article.
- Controlled Variable Selection from Summary Statistics Only? A Solution via GhostKnockoffs and Penalized Regression. ArXiv [Preprint]. 2024 Feb 20:arXiv:2402.12724v1. PMID: 38463500 Free PMC article. Preprint.
- Knowledge domains and emerging trends of Genome-wide association studies in Alzheimer's disease: A bibliometric analysis and visualization study from 2002 to 2022. PLoS One. 2024 Jan 19;19(1):e0295008. doi: 10.1371/journal.pone.0295008. eCollection 2024. PMID: 38241287 Free PMC article.
- Estimating gene-level false discovery probability improves eQTL statistical fine-mapping precision. NAR Genom Bioinform. 2023 Oct 30;5(4):lqad090. doi: 10.1093/nargab/lqad090. eCollection 2023 Dec. PMID: 37915762 Free PMC article.
Grants and funding
- R01 AG023629/AG/NIA NIH HHS/United States
- U01 HL080295/HL/NHLBI NIH HHS/United States
- U01 AG052411/AG/NIA NIH HHS/United States
- HHSN268201500001I/HL/NHLBI NIH HHS/United States
- U01 HL096812/HL/NHLBI NIH HHS/United States
- HHSN268201100012C/HL/NHLBI NIH HHS/United States
- U01 HL130114/HL/NHLBI NIH HHS/United States
- R01 NS017950/NS/NINDS NIH HHS/United States
- U01 HL096917/HL/NHLBI NIH HHS/United States
- U01 AG052410/AG/NIA NIH HHS/United States
- HHSN268201100008C/HL/NHLBI NIH HHS/United States
- N01HC85080/HL/NHLBI NIH HHS/United States
- U01 AG052409/AG/NIA NIH HHS/United States
- R01 AG060747/AG/NIA NIH HHS/United States
- R01 MH106910/MH/NIMH NIH HHS/United States
- U01 AG049506/AG/NIA NIH HHS/United States
- HHSN268201100007C/HL/NHLBI NIH HHS/United States
- R01 HL120393/HL/NHLBI NIH HHS/United States
- HHSN268201200036C/HL/NHLBI NIH HHS/United States
- R01 HL089856/HL/NHLBI NIH HHS/United States
- RF1 AG072272/AG/NIA NIH HHS/United States
- U01 AG049508/AG/NIA NIH HHS/United States
- P30 AG066515/AG/NIA NIH HHS/United States
- R01 AG049607/AG/NIA NIH HHS/United States
- HHSN268201100011C/HL/NHLBI NIH HHS/United States
- U01 AG049505/AG/NIA NIH HHS/United States
- U01 AG016976/AG/NIA NIH HHS/United States
- HHSN268201500014C/HL/NHLBI NIH HHS/United States
- U01 AG057659/AG/NIA NIH HHS/United States
- N01HC85082/HL/NHLBI NIH HHS/United States
- U24 AG021886/AG/NIA NIH HHS/United States
- U01 HL089856/HL/NHLBI NIH HHS/United States
- N01HC55222/HL/NHLBI NIH HHS/United States
- R01 AG054076/AG/NIA NIH HHS/United States
- N01HC85079/HL/NHLBI NIH HHS/United States
- U01 HL096902/HL/NHLBI NIH HHS/United States
- N01HC85083/HL/NHLBI NIH HHS/United States
- U01 AG032984/AG/NIA NIH HHS/United States
- U24 AG041689/AG/NIA NIH HHS/United States
- N01HC85086/HL/NHLBI NIH HHS/United States
- R01 AG033040/AG/NIA NIH HHS/United States
- U01 HL096899/HL/NHLBI NIH HHS/United States
- UF1 AG047133/AG/NIA NIH HHS/United States
- N01HC25195/HL/NHLBI NIH HHS/United States
- U54 HG003067/HG/NHGRI NIH HHS/United States
- U54 AG052427/AG/NIA NIH HHS/United States
- U54 HG003273/HG/NHGRI NIH HHS/United States
- HHSN268201100005C/HL/NHLBI NIH HHS/United States
- R01 AG033193/AG/NIA NIH HHS/United States
- R01 HG008980/HG/NHGRI NIH HHS/United States
- HHSN268201100009C/HL/NHLBI NIH HHS/United States
- U01 HL120393/HL/NHLBI NIH HHS/United States
- HHSN268201100006C/HL/NHLBI NIH HHS/United States
- R01 AG066206/AG/NIA NIH HHS/United States
- U01 HL096814/HL/NHLBI NIH HHS/United States
- HHSN268200800007C/HL/NHLBI NIH HHS/United States
- N01HC85081/HL/NHLBI NIH HHS/United States
- R01 MH095797/MH/NIMH NIH HHS/United States
- U01 AG049507/AG/NIA NIH HHS/United States
- U54 HG003079/HG/NHGRI NIH HHS/United States
- U01 HL089897/HL/NHLBI NIH HHS/United States
- R01 HL117626/HL/NHLBI NIH HHS/United States
- HHSN268201100010C/HL/NHLBI NIH HHS/United States