Genet Med Open. 2023;1(1):100823.
doi: 10.1016/j.gimo.2023.100823. Epub 2023 Jun 7.

Clinical variants in Caenorhabditis elegans expressing human STXBP1 reveal a novel class of pathogenic variants and classify variants of uncertain significance

Christopher E Hopkins et al. Genet Med Open. 2023.

Abstract

Purpose: Modeling disease variants in animals is useful for drug discovery, understanding disease pathology, and classifying variants of uncertain significance (VUS) as pathogenic or benign.

Methods: Using Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR), we performed a Whole-gene Humanized Animal Model procedure to replace the coding sequence of the animal model's unc-18 ortholog with the coding sequence of the human STXBP1 gene. Next, we used CRISPR to introduce precise point variants from 3 clinical categories (benign, pathogenic, and VUS) into the humanized STXBP1 locus. Twenty-six phenotypic features extracted from video recordings were used to train machine learning classifiers on 25 pathogenic and 32 benign variants.

Results: Using multiple models, we obtained a diagnostic sensitivity near 0.9. Twenty-three VUS were also interrogated, and 8 of 23 (34.8%) were observed to be functionally abnormal. Interestingly, unsupervised clustering identified 2 subsets of known pathogenic variants with distinct phenotypic features; both p.Tyr75Cys and p.Arg406Cys cluster away from other variants and show an increase in swim speed compared with hSTXBP1 worms. This leads to the hypothesis that the mechanism of disease for these 2 variants may differ from that in most patients with STXBP1 variants and may account for some of the clinical heterogeneity observed in the patient population.
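As a quick arithmetic check on the Results, the sketch below recomputes the reported VUS fraction (8 of 23) and shows how diagnostic sensitivity is defined. The function names and the confusion counts passed to `sensitivity` are hypothetical, chosen only to illustrate a value near 0.9; the abstract does not report the underlying counts.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly pathogenic variants that the classifier flags."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no pathogenic variants to evaluate")
    return true_positives / total


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Reported in the abstract: 8 of 23 VUS were functionally abnormal.
vus_abnormal_pct = percent(8, 23)

# Hypothetical counts (not from the paper), for illustration only.
example_sensitivity = sensitivity(true_positives=9, false_negatives=1)

print(vus_abnormal_pct)
print(example_sensitivity)
```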

Conclusion: We have demonstrated that automated analysis of a small animal system is an effective, scalable, and fast way to understand the functional consequences of variants in STXBP1 and to identify variant-specific intensities of aberrant activity, suggesting that a genotype-to-phenotype correlation is likely to occur among human clinical variants of STXBP1.

Keywords: CRISPR; Clinical variant; STXBP1; Variant of uncertain significance; unc-18.

Conflict of interest statement

Conflict of Interest: Christopher E. Hopkins, Kathryn McCormick, Trisha Brock, Kolt Mcbride, Christine Kim, and Jennifer A. Lawson are employees of In Vivo Biosystems. Matthew N. Bainbridge and Matthew Wood are employees of Codified Genomics. Ingo Helbig and Sarah Ruggiero declare no conflicts of interest.

Figures

Figure 1. Growing need for classification in reported variants of STXBP1.
A. Number of missense variants in ClinVar that are VUS (gray), pathogenic (red), likely pathogenic (pink), likely benign (light green), or benign (green) as a function of time. B. The role of STXBP1, the protein of focus in this paper, in coordinating vesicular release. Associated proteins are listed. C. In silico predicted pathogenicity from REVEL, mean score per amino acid (gray), and local regression smoothing (red) across STXBP1, with super family domain (blue), and known pathogenic variants (black dots); dot size is relative to number of pathogenic variants at that locus. D. As in panel C, but for SCN1A. VUS, variant of uncertain significance.
Figure 2. Schematic representation of the experiments performed.
A. The native homolog of STXBP1, unc-18, was removed from the genome in a full deletion knockout. Subsequently, a codon-optimized coding sequence encoding human STXBP1 was inserted into the same genomic location. B. Individual human variants were created in the STXBP1-expressing animals. The functional domains of STXBP1 as determined via crystallography are depicted, and the locations of individual variants are marked with shapes. Red triangles represent pathogenic missense variants in our training data set, black X’s represent pathogenic truncating variants, green circles represent benign variants, and yellow squares represent VUS. C. The generated strains were automatically assessed for 26 phenotypic features characterizing the animals’ movement and morphology. D. Two machine learning models, random forest and SVM, were trained on the resulting data set. E. Models were used to sort VUS into functionally normal and abnormal groups, representing functional predictors of pathogenicity. SVM, support vector machine; VUS, variant of uncertain significance.
Figure 3. Example data and unsupervised cluster analysis.
A. Straight-line speed measured across 2034 worms from 60 genotypes: 25 benign (cyan) and 32 pathogenic variants (brown) plus controls (blue: unc-18 full deletion null, hSTXBP1 whole-gene humanized, and N2 wild-type); the average strain value is marked by a black bar. B. Pathogenic missense (brown triangles), pathogenic truncating (black triangles), benign (cyan circles), and control (blue stars) variants were clustered using the k-means algorithm and plotted in 2 dimensions (principal components 1 and 2), which shows 3 distinct clusters (gray-bound regions). C. Inset bar graph shows linear speed for each pathogenic cluster, with standard error (whiskers) and speed of the control sample in each cluster (black circle). Asterisk indicates P value < .05 (t test). R:Arg; N:Asn; D:Asp; C:Cys; E:Glu; Q:Gln; G:Gly; H:His; I:Ile; L:Leu; K:Lys; M:Met; F:Phe; P:Pro; S:Ser; T:Thr; W:Trp; Y:Tyr; V:Val.
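The k-means step named in panel B can be illustrated with a minimal sketch of Lloyd's algorithm. This is not the authors' pipeline: the toy coordinates, the starting centroids, and the `kmeans` helper are all hypothetical, and the points are assumed to be already expressed in a 2-D space such as the first two principal components.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return assignments, centroids


# Toy 2-D coordinates (hypothetical, not data from the paper).
toy_points = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (-4.0, 3.0), (-4.2, 3.2)]
start = [toy_points[0], toy_points[2], toy_points[4]]
toy_labels, toy_centroids = kmeans(toy_points, start)
print(toy_labels)
```

Fixed starting centroids keep the toy run deterministic; real analyses usually rely on randomized or k-means++ initialization with several restarts.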
Figure 4. Model evaluation and functional predictions using supervised machine learning algorithms.
VUS predicted pathogenic (red text) and benign (green text) are shown; the length of the black bar (left for benign, right for pathogenic) indicates strength of classification. A. A support vector machine classifier trained on known benign and pathogenic variants achieved an AUC of 0.94 and classified 5 of 23 VUS as pathogenic. B. A random forest classifier achieved an AUC of 0.84 and classified 8 of 23 VUS as pathogenic. R:Arg; N:Asn; D:Asp; C:Cys; E:Glu; Q:Gln; G:Gly; H:His; I:Ile; L:Leu; K:Lys; M:Met; F:Phe; P:Pro; S:Ser; T:Thr; W:Trp; Y:Tyr; V:Val. AUC, area under curve.
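For readers unfamiliar with the AUC values quoted above, the sketch below computes AUC as the probability that a randomly chosen pathogenic variant receives a higher classifier score than a randomly chosen benign one, with ties counted as half. The `roc_auc` helper, the labels, and the scores are hypothetical illustrations, not the paper's data or code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = pathogenic, 0 = benign.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

In this toy case one pathogenic score (0.4) falls below one benign score (0.5), so 8 of the 9 pathogenic-benign pairs are ranked correctly.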
