This is a preprint.
Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease
- PMID: 38585781
- PMCID: PMC10996727
- DOI: 10.1101/2024.03.22.24304565
Abstract
Rare structural variants (SVs), such as insertions, deletions, and complex rearrangements, can cause Mendelian disease, yet they remain difficult to detect and interpret accurately. We sequenced and analyzed Oxford Nanopore long-read genomes of 68 individuals from the Undiagnosed Disease Network (UDN) in whom short-read sequencing had not identified a diagnostic mutation. Using our optimized SV detection pipelines and 571 control long-read genomes, we detected an average of 716 rare (minor allele frequency, MAF < 0.01) SV alleles per genome from long reads, a 2.4x increase over short reads. To characterize the functional effects of rare SVs, we assessed their relationship with gene expression in blood or fibroblasts from the same individuals, and found that rare SVs overlapping enhancers were enriched (log odds ratio, LOR = 0.46) near expression outliers. We also evaluated tandem repeat expansions (TREs) and found 14 rare TREs per genome; notably, these TREs were also enriched near overexpression outliers. To prioritize candidate functional SVs, we developed Watershed-SV, a probabilistic model that integrates expression data with SV-specific genomic annotations and significantly outperforms baseline models that do not incorporate expression data. Watershed-SV identified a median of eight high-confidence functional SVs per UDN genome. Notably, these included compound heterozygous deletions in FAM177A1 shared by two siblings, which were likely causal for a rare neurodevelopmental disorder. Our observations demonstrate the promise of integrating long-read sequencing with gene expression to improve the prioritization of functional SVs and TREs in rare disease patients.
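As a rough illustration of the two quantities quoted in the abstract, the sketch below shows a rarity filter at MAF < 0.01 and a log odds ratio computed from a 2x2 table of genes (expression outlier or not, rare SV nearby or not). It is not the authors' pipeline: the function names, the 0.5 pseudocount (Haldane-Anscombe correction), the use of the natural logarithm, and the example counts are all assumptions made for illustration.

```python
import math


def is_rare(maf, threshold=0.01):
    """Return True if an allele frequency is below the rarity threshold.

    The 0.01 default mirrors the MAF < 0.01 cutoff quoted in the abstract.
    """
    return maf < threshold


def log_odds_ratio(a, b, c, d, pseudocount=0.5):
    """Log odds ratio and its standard error for a 2x2 table.

    Hypothetical layout (an assumption, not taken from the paper):
      a: outlier genes with a nearby rare SV
      b: outlier genes without a nearby rare SV
      c: non-outlier genes with a nearby rare SV
      d: non-outlier genes without a nearby rare SV

    A pseudocount is added to every cell so that empty cells do not
    produce a division by zero or log(0).
    """
    a, b, c, d = (x + pseudocount for x in (a, b, c, d))
    lor = math.log((a * d) / (b * c))       # natural log of the odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Wald standard error
    return lor, se


if __name__ == "__main__":
    # Made-up counts, purely to show the call; not data from the study.
    lor, se = log_odds_ratio(30, 970, 200, 9800)
    print(f"LOR = {lor:.2f} (SE = {se:.2f})")
    print(is_rare(0.004), is_rare(0.05))
```

A positive value means rare SVs are found near outlier genes more often than near non-outlier genes; a value of zero means no enrichment.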
Conflict of interest statement
SBM is an advisor to BioMarin, Myome, and Tenaya Therapeutics. AB is a co-founder of CellCipher, Inc, is a shareholder in Alphabet, Inc, and has consulted for Third Rock Ventures, LLC. EAA is the founder of Personalis, Deepcell, Svexa, RCD Co, and Parameter Health; is an advisor for SequenceBio, Foresite Labs, and PacBio; is a non-executive director at AstraZeneca; holds stock in Oxford Nanopore, Pacific Biosciences, and AstraZeneca; and offers collaborative support in kind to Illumina, Pacific Biosciences, and Oxford Nanopore.
Grants and funding
- U24 HG010263/HG/NHGRI NIH HHS/United States
- R01 AG048076/AG/NIA NIH HHS/United States
- R21 HG013397/HG/NHGRI NIH HHS/United States
- U01 AG072573/AG/NIA NIH HHS/United States
- R35 AG072290/AG/NIA NIH HHS/United States
- U01 CA253481/CA/NCI NIH HHS/United States
- U01 HG010218/HG/NHGRI NIH HHS/United States
- U01 HG011762/HG/NHGRI NIH HHS/United States
- R01 AG074339/AG/NIA NIH HHS/United States
- U01 HG012069/HG/NHGRI NIH HHS/United States
- T32 HG000044/HG/NHGRI NIH HHS/United States
- R01 AG066490/AG/NIA NIH HHS/United States
- R35 GM139580/GM/NIGMS NIH HHS/United States
- R01 MH125244/MH/NIMH NIH HHS/United States
- R01 NS072248/NS/NINDS NIH HHS/United States
- U01 NS134358/NS/NINDS NIH HHS/United States
- P30 AG066515/AG/NIA NIH HHS/United States
- R03 CA272952/CA/NCI NIH HHS/United States
- S10 OD025082/OD/NIH HHS/United States
- OT2 OD034190/OD/NIH HHS/United States