Genome-wide analysis of common and rare variants via multiple knockoffs at biobank scale, with an application to Alzheimer disease genetics
- PMID: 34767756
- PMCID: PMC8715147
- DOI: 10.1016/j.ajhg.2021.10.009
Abstract
Knockoff-based methods have become increasingly popular due to their enhanced power for locus discovery and their ability to prioritize putative causal variants in a genome-wide analysis. However, because of the substantial computational cost of generating knockoffs, existing knockoff approaches cannot analyze millions of rare genetic variants in biobank-scale whole-genome sequencing and whole-genome imputed datasets. We propose a scalable knockoff-based method for the analysis of common and rare variants across the genome, KnockoffScreen-AL, that is applicable to biobank-scale studies with hundreds of thousands of samples and millions of genetic variants. The application of KnockoffScreen-AL to the analysis of Alzheimer disease (AD) in 388,051 WG-imputed samples from the UK Biobank resulted in 31 significant loci, including 14 loci that are missed by conventional association tests on these data. We perform replication studies in an independent meta-analysis of clinically diagnosed AD with 94,437 samples, and additionally leverage single-cell RNA-sequencing (scRNA-seq) data with 143,793 single-nucleus transcriptomes from 17 control subjects and AD-affected individuals, and proteomics data from 735 control subjects and affected individuals with AD and related disorders, to validate the genes at these significant loci. These multi-omics analyses show that 79.1% of the proximal genes at these loci and 76.2% of the genes at loci identified only by KnockoffScreen-AL exhibit at least suggestive signal (p < 0.05) in the scRNA-seq or proteomics analyses. We highlight a potentially causal gene in AD progression, EGFR, that shows significant differences in expression and protein levels between AD-affected individuals and healthy control subjects.
Keywords: Alzheimer disease; GWAS; knockoff statistics; omics; sequencing.
Copyright © 2021 The Author(s). Published by Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of interests: The authors declare no competing interests.
Grants and funding
- U24 AG021886/AG/NIA NIH HHS/United States
- U01 AG016976/AG/NIA NIH HHS/United States
- R01 HL105756/HL/NHLBI NIH HHS/United States
- R01 AG033193/AG/NIA NIH HHS/United States
- R01 HG008980/HG/NHGRI NIH HHS/United States
- WT_/Wellcome Trust/United Kingdom
- R01 AG066206/AG/NIA NIH HHS/United States
- R01 MH095797/MH/NIMH NIH HHS/United States
- MC_QA137853/MRC_/Medical Research Council/United Kingdom
- MC_PC_17228/MRC_/Medical Research Council/United Kingdom
- R01 AG060747/AG/NIA NIH HHS/United States
- RF1 AG072272/AG/NIA NIH HHS/United States
- U01 AG032984/AG/NIA NIH HHS/United States
- P30 AG066515/AG/NIA NIH HHS/United States
- 890650 MARIE SKŁODOWSKA-CURIE (HORIZON 2020)/ERC_/European Research Council/International