Cell. 2020 Feb 6;180(3):568-584.e23.
doi: 10.1016/j.cell.2019.12.036. Epub 2020 Jan 23.

Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism

F Kyle Satterstrom et al. Cell.

Abstract

We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n = 35,584 total samples, 11,986 with ASD). Using an enhanced analytical framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate of 0.1 or less. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained to have severe neurodevelopmental delay, whereas 53 show higher frequencies in individuals ascertained to have ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In cells from the human cortex, expression of risk genes is enriched in excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory-inhibitory imbalance underlying ASD.

Keywords: autism spectrum disorder; cell type; cytoskeleton; excitatory neurons; excitatory-inhibitory balance; exome sequencing; genetics; inhibitory neurons; liability; neurodevelopment.

Conflict of interest statement

Declaration of Interests: B.M.N. is a member of the scientific advisory board at Deep Genomics and consults for Biogen, Camp4 Therapeutics Corporation, and Takeda Pharmaceutical. During the last 3 years, C.M. Freitag has been a consultant to Desitin and Roche and receives royalties for books on ASD, ADHD, and MDD.

Figures

Figure 1. Distribution of Rare Autosomal Protein-Coding Variants in ASD Cases and Controls
(A) The proportion of rare autosomal genetic variants split by predicted functional consequences, represented by color, is displayed for family-based (split into de novo and inherited variants) and case-control data. PTVs and missense variants are split into three tiers of predicted functional severity, represented by shade, based on the pLI and MPC metrics, respectively. (B) The relative difference in variant frequency (i.e., burden) between ASD cases and controls (top and bottom) or transmitted and untransmitted parental variants (center) is shown for the top two tiers of functional severity for PTVs (left and center) and the top tier of functional severity for missense variants (right). Next to the bar plot, the same data are shown divided by sex. (C) The relative difference in variant frequency shown in (B) is converted to a trait liability Z score, split by the same subsets used in (A). For context, a Z score of 2.18 would shift an individual from the population mean to the top 1.69% of the population (equivalent to an ASD threshold based on 1 in 68 children; Christensen et al., 2016). No significant difference in liability was observed between males and females for any analysis. Statistical tests: (B) and (C), binomial exact test (BET) for most contrasts; exceptions were “both” and “case-control,” for which Fisher’s method for combining BET p values for each sex and, for case-control, each population was used; p values corrected for 168 tests are shown.
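The liability Z score in (C) can be related to a prevalence threshold with a standard normal model. The following is a minimal sketch, not the authors' code: it assumes liability is standard normal in the population, and the function names `liability_threshold` and `risk_after_shift` are hypothetical. It shows how a prevalence of 1 in 68 maps to a threshold close to the Z score of 2.18 quoted above.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumed liability distribution: mean 0, SD 1


def liability_threshold(prevalence: float) -> float:
    """Z score that the top `prevalence` fraction of the population exceeds."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    return STANDARD_NORMAL.inv_cdf(1.0 - prevalence)


def risk_after_shift(z_shift: float, prevalence: float) -> float:
    """Probability of exceeding the threshold when mean liability moves by z_shift.

    Assumes the shift changes only the mean, not the variance (an illustrative
    simplification).
    """
    threshold = liability_threshold(prevalence)
    return 1.0 - STANDARD_NORMAL.cdf(threshold - z_shift)


threshold = liability_threshold(1 / 68)
print(f"Threshold for a prevalence of 1 in 68: Z = {threshold:.2f}")
print(f"Risk after a hypothetical shift of 1 SD: {risk_after_shift(1.0, 1 / 68):.3f}")
```

With no shift, `risk_after_shift` returns the population prevalence, which is a quick sanity check on the model.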
Figure 2. Gene Discovery in the ASC Cohort
(A) WES data from 35,584 samples are entered into a Bayesian analysis framework (TADA) that incorporates pLI score for PTVs and MPC score for missense variants. (B) The model identifies 102 autosomal genes associated with ASD at a false discovery rate (FDR) threshold of 0.1 or less, which is shown on the y axis of this Manhattan plot, with each point representing a gene. Of these, 78 pass the threshold FDR of 0.05 or less, and 26 pass the threshold family-wise error rate (FWER) of 0.05 or less. (C) The ASD trait liability analysis (Figure 1C) is repeated for variants observed within the 102 ASD-associated genes only. Statistical tests: (B), TADA; (C), BET for most contrasts; exceptions were “both” and “case-control,” for which Fisher’s method for combining BET p values for each sex and, for case-control, each population was used; p values corrected for 168 tests are shown.
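Fisher's method, named in the statistical tests above, combines independent p values into one. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the p values are assumed independent, and the chi-square tail is computed with the closed form that holds for even degrees of freedom so that only the standard library is needed.

```python
import math
from typing import Sequence


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with an even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values: Sequence[float]) -> float:
    """Combine independent p values: -2 * sum(ln p) ~ chi-square with 2k df."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))


# Hypothetical per-sex p values, for illustration only.
print(fisher_combine([0.04, 0.20]))
```

A single p value is returned unchanged, which is one way to check the implementation.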
Figure 3. Genetic Characterization of ASD Genes
(A) Count of PTVs versus missense variants (MPC ≥ 1) in cases for each ASD-associated gene (red points, selected genes labeled). These counts reflect the data used by TADA for association analysis: de novo and case-control data for PTVs; de novo only for missense. (B) Location of ASD de novo missense variants in DEAF1. The five ASD variants (marked in red) are in the SAND (Sp100, AIRE-1, NucP41/75, DEAF-1) DNA-binding domain (amino acids 193–273, spirals show α helices, arrows show β sheets, KDWK is the DNA-binding motif) alongside 10 variants observed in NDD, several of which have been shown to reduce DNA binding, including Q264P and Q264R (Chen et al., 2017; Heyne et al., 2018; Vulto-van Silfhout et al., 2014). (C) Location of ASD missense variants in KCNQ3. All four ASD variants are located in the voltage sensor (fourth of six transmembrane domains), with three in the same residue (R230), including the gain-of-function R230C mutation observed in NDD (Heyne et al., 2018; Miceli et al., 2015). Five inherited variants observed in benign infantile seizures are shown in the pore loop (Landrum et al., 2014; Maljevic et al., 2016). (D) Location of ASD missense variants in SCN1A alongside 17 de novo variants in NDD and epilepsy (Heyne et al., 2018). (E) Location of ASD missense variants in SLC6A1 alongside 31 de novo variants in NDD and epilepsy (Heyne et al., 2018; Johannesen et al., 2018). (F) Subtelomeric 2q37 deletions are associated with facial dysmorphisms, brachydactyly, high BMI, NDD, and ASD (Leroy et al., 2013). Although three genes within the locus have a pLI score of 0.995 or higher, only HDLBP is associated with ASD. (G) Deletions at the 11q13.2–q13.4 locus have been observed in NDD, ASD, and otodental dysplasia (Coe et al., 2014; Cooper et al., 2011). Five genes within the locus have a pLI score of 0.995 or higher, including two ASD genes: KMT5B and SHANK2.
(H) Assessment of gene-based enrichment, via MAGMA, of 102 ASD genes against genome-wide significant common variants from six GWASs. (I) Gene-based enrichment of 102 ASD genes in multiple GWASs as a function of effective cohort size. The GWAS used for each disorder in (I) has a black outline. Statistical tests: (F) and (G), TADA; (H) and (I), MAGMA.
Figure 4. Phenotypic and Functional Categories of ASD-Associated Genes
(A) Frequency of disruptive de novo variants (e.g., PTVs or missense variants with MPC ≥ 1) in ASD-ascertained and NDD-ascertained cohorts (Table S4) is shown for the 102 ASD-associated genes (selected genes labeled). Fifty genes with a higher frequency in ASD are designated ASD-predominant (ASDP), whereas the 49 genes more frequently mutated in NDD are designated as ASDNDD. Three genes marked with a star (UBR1, MAP1A, and NUP155) are included in the ASDP category on the basis of case-control data (Table S4), which are not shown here. Of the 26 FWER genes, 10 are ASDP and 16 are ASDNDD. Of the 102 genes, 13 demonstrate nominally significant heterogeneity between samples ascertained for ASD versus NDD (Table S4). (B) ASD cases with disruptive de novo variants in ASD genes show delayed walking compared with ASD cases without such de novo variants, and the effect is greater for those with disruptive de novo variants in ASDNDD genes. (C) Similarly, cases with disruptive de novo variants in ASDNDD genes and, to a lesser extent, ASDP genes have a lower full-scale IQ (FSIQ) than other ASD cases. (D) Despite the association between de novo variants in ASD genes and cognitive impairment shown in (C), an excess of disruptive de novo variants is observed in cases without intellectual disability (FSIQ ≥ 70) or with an IQ above the cohort mean (FSIQ ≥ 82). (E) Along with the phenotypic division (A), genes can also be classified functionally into four groups (gene expression regulation [GER], neuronal communication [NC], cytoskeleton, and other) based on Gene Ontology and research literature. The 102 ASD risk genes are shown in a mosaic plot divided by gene function and, from (A), the ASD versus NDD variant frequency, with the area of each box proportional to the number of genes. Statistical tests: (B) and (C), t test; (D), chi-square test with 1 degree of freedom.
Figure 5. Analysis of 102 ASD-Associated Genes in the Context of Gene Expression Data
(A) GTEx bulk RNA-seq data from 53 tissues were processed to identify genes enriched in specific tissues. Gene set enrichment was performed for the 102 ASD genes and four subsets (ASDP, ASDNDD, GER, and NC) for each tissue. Five representative tissues are shown here, including cortex, which has the greatest degree of enrichment (OR = 3.7; p = 2.6 × 10⁻⁶). (B) BrainSpan bulk RNA-seq data across 10 developmental stages were used to plot the normalized expression of the 101 cortically expressed ASD genes (excluding PAX5, which is not expressed in the cortex) across development, split by the four subsets. (C) A t-statistic was calculated, comparing prenatal with postnatal expression in the BrainSpan data. The t-statistic distribution of 101 ASD-associated genes shows a prenatal bias (p = 8 × 10⁻⁸) for GER genes (p = 9 × 10⁻¹⁵), whereas NC genes are postnatally biased (p = 0.03). (D) The cumulative number of ASD-associated genes expressed in RNA-seq data for 4,261 cells collected from human forebrain across prenatal development (Nowakowski et al., 2017). (E) t-SNE analysis identifies 19 clusters with unambiguous cell type in these single-cell expression data. (F) The enrichment of the 102 ASD-associated genes within cells of each type is represented by color. The most consistent enrichment is observed in maturing and mature excitatory (bottom center) and inhibitory (top right) neurons. (G) The developmental relationships of the 19 clusters are indicated by black arrows, with the inhibitory lineage shown on the left (cyan), excitatory lineage in the middle (magenta), and non-neuronal cell types on the right (gray). The proportion of the 102 ASD-associated genes observed in at least 25% of cells within the cluster is shown by the pie chart, whereas the log-transformed Bonferroni-corrected p value of gene set enrichment is shown by the size of the red circle.
(H) The relationship between the number of cells in the cluster (x axis) and the p value for ASD gene enrichment (y axis) is shown for the 19 cell type clusters. Linear regression indicates that clusters with few expressed genes (e.g., C23 newborn inhibitory neurons) have higher p values than clusters with many genes (e.g., C25 radial glia). (I) The relationship between the 19 cell type clusters using hierarchical clustering based on the 10% of genes with the greatest variability among cell types. Statistical tests: (A), t test; (C), Wilcoxon test; (E), (F), (H), and (I), FET.
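The FET (Fisher's exact test) listed above is a common way to score gene set enrichment in a cell type cluster. The sketch below shows a one-sided version as a hypergeometric upper tail. It is an assumption-laden illustration rather than the authors' analysis: the function name `enrichment_p_value` and all counts are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def enrichment_p_value(n_background: int, n_gene_set: int,
                       n_expressed: int, n_overlap: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    n_background: all genes considered
    n_gene_set:   genes in the set of interest (e.g., risk genes)
    n_expressed:  genes called as expressed in the cluster
    n_overlap:    genes that are both in the set and expressed
    Returns P(overlap >= n_overlap) under random sampling without replacement.
    """
    if not 0 <= n_gene_set <= n_background or not 0 <= n_expressed <= n_background:
        raise ValueError("set sizes must lie between 0 and n_background")
    upper = min(n_gene_set, n_expressed)
    if not 0 <= n_overlap <= upper:
        raise ValueError("n_overlap is not possible for these set sizes")
    total = comb(n_background, n_expressed)
    tail = sum(
        comb(n_gene_set, k) * comb(n_background - n_gene_set, n_expressed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
print(enrichment_p_value(n_background=17000, n_gene_set=102,
                         n_expressed=3000, n_overlap=40))
```

Exact integer arithmetic via `math.comb` avoids rounding error in the tail sum; the division to a float happens only once at the end.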

References

    Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, and Sunyaev SR (2010). A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249.
    Baio J, Wiggins L, Christensen DL, Maenner MJ, Daniels J, Warren Z, Kurzius-Spencer M, Zahorodny W, Robinson Rosenberg C, White T, et al. (2018). Prevalence of Autism Spectrum Disorder Among Children Aged 8 Years - Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2014. MMWR Surveill. Summ. 67, 1–23.
    Battle A, Brown CD, Engelhardt BE, and Montgomery SB; GTEx Consortium; Laboratory, Data Analysis & Coordinating Center (LDACC)—Analysis Working Group; Statistical Methods groups—Analysis Working Group; Enhancing GTEx (eGTEx) groups; NIH Common Fund; NIH/NCI; NIH/NHGRI; NIH/NIMH; NIH/NIDA; Biospecimen Collection Source Site—NDRI; Biospecimen Collection Source Site—RPCI; Biospecimen Core Resource—VARI; Brain Bank Repository—University of Miami Brain Endowment Bank; Leidos Biomedical—Project Management; ELSI Study; Genome Browser Data Integration & Visualization—EBI; Genome Browser Data Integration & Visualization—UCSC Genomics Institute, University of California Santa Cruz; Lead analysts; Laboratory, Data Analysis & Coordinating Center (LDACC); NIH program management; Biospecimen collection; Pathology; eQTL manuscript working group (2017). Genetic effects on gene expression across human tissues. Nature 550, 204–213.
    Ben-Shalom R, Keeshen CM, Berrios KN, An JY, Sanders SJ, and Bender KJ (2017). Opposing effects on NaV1.2 function underlie differences between SCN2A variants observed in individuals with autism spectrum disorder or infantile seizures. Biol. Psychiatry 82, 224–232.
    Bernier R, Golzio C, Xiong B, Stessman HA, Coe BP, Penn O, Witherspoon K, Gerdts J, Baker C, Vulto-van Silfhout AT, et al. (2014). Disruptive CHD8 mutations define a subtype of autism early in development. Cell 158, 263–276.
