Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2022 Nov;18(11):e11176.
doi: 10.15252/msb.202211176.

spliceJAC: transition genes and state-specific gene regulation from single-cell transcriptome data

Affiliations

spliceJAC: transition genes and state-specific gene regulation from single-cell transcriptome data

Federico Bocci et al. Mol Syst Biol. 2022 Nov.

Abstract

Extracting dynamical information from single-cell transcriptomics is a novel task with the promise to advance our understanding of cell state transition and interactions between genes. Yet, theory-oriented, bottom-up approaches that consider differences among cell states are largely lacking. Here, we present spliceJAC, a method to quantify the multivariate mRNA splicing from single-cell RNA sequencing (scRNA-seq). spliceJAC utilizes the unspliced and spliced mRNA count matrices to constructs cell state-specific gene-gene regulatory interactions and applies stability analysis to predict putative driver genes critical to the transitions between cell states. By applying spliceJAC to biological systems including pancreas endothelium development and epithelial-mesenchymal transition (EMT) in A549 lung cancer cells, we predict genes that serve specific signaling roles in different cell states, recover important differentially expressed genes in agreement with pre-existing analysis, and predict new transition genes that are either exclusive or shared between different cell state transitions.

Keywords: attractor linear stability; cell state transition; gene regulatory network; mRNA splicing; single-cell RNA sequencing.

PubMed Disclaimer

Figures

Figure 1
Figure 1. Overview of spliceJAC and main analysis output
  1. As input, spliceJAC requires unspliced and spliced mRNA count matrices as well as cell annotations.

  2. spliceJAC fits the mRNA count data within each cell state to a multivariate mRNA splicing model.

  3. The output of the model is a set of gene–gene interaction matrices that encode gene–gene interactions in each cell state.

  4. The switch between two cell states is interpreted as a transition on a high‐dimensional landscape shaped by the underlying state‐specific gene regulatory networks.

  5. By projecting the unstable eigenvalues of the inferred Jacobian matrix, spliceJAC predicts key transition genes (TG).

  6. Each cell state exhibits specific interactions between genes that are captured in a regulatory network. Downstream analysis of the network identifies important signaling hubs such as strong regulators that modulate many other genes and strong targets that receive inputs from multiple genes.

  7. A state transition coincides with a rearrangement of the state‐specific regulatory network.

Figure 2
Figure 2. Benchmarking spliceJAC against in silico circuits and comparison with other existing GRN inference methods
  1. A

    The phase space of a bistable toggle switch including nullclines (silver lines), stable fixed points (blue dots), and stochastic perturbation around stable fixed points (red lines). X‐ and y‐coordinates represent unspliced mRNA counts of genes X and Y.

  2. B

    Ground truth (top) and inferred (bottom) interaction matrices for the two stable fixed.

  3. C

    The phase space of a monostable circuit of three genes that activate each other in a loop.

  4. D

    Ground truth and inferred interaction matrix of the three genes circuit.

  5. E

    The EMT circuit proposed by Tian et al (2013). Green and red nodes highlight epithelial and mesenchymal genes, while pointing and t‐shaped arrows represent activation or inhibition, respectively.

  6. F

    Bifurcation diagram showing the available attractors as a function of TGF‐beta inducer. The red dotted line highlights a value leading to tristability used for spliceJAC testing thereafter.

  7. G, H

    Ground truth (G) and estimated (H) interaction matrices in the mesenchymal state.

  8. I, J

    Comparison with existing GRN inference methods based on the absolute difference between ground truth and prediction (I) and maximization of the fraction of correct matrix element signs (J). Green, orange, and red bars showcase results for the Epithelia, Hybrid E/M and mesenchymal states, respectively.

  9. K

    The AUPRC ratio for all methods in the Beeline pipeline and spliceJAC. For each circuit state, the AUPRC scores are normalized by the average score achieved for the state. NA = no output file was generated. NA* = an output file without any predicted edge was generated.

Figure EV1
Figure EV1. Spectral analysis of unstable fixed points
  1. A

    Left: The phase space of the toggle switch synthetic circuit. Right: The eigenvalues of the Jacobian matrices in the three fixed points, ranked in ascending order, inferred by spliceJAC.

  2. B–D

    The ground truth (red) and inferred (blue) eigenvalue spectrum in the three fixed points of the EMT circuit.

  3. E

    The inferred eigen‐spectrums cells sampled from multiple cell states are merged. Positive eigenvalues in the red circle indicate instability.

  4. F

    Comparison of largest eigenvector inferred by spliceJAC in the three stable states and in the cell mixtures. The boxplot central bands and boxes depict average, first to third quantile (Q1–Q3) range, respectively, while the whisker extension corresponds to 1.5× the interquartile range (IQR). The boxplot results are computed over 10 independent simulations.

Figure 3
Figure 3. Identification of cell state‐specific regulatory interactions in the pancreas epithelium
  1. UMAP embedding of the pancreas epithelium dataset from Bastidas‐Ponce et al (2019). Arrows indicate cell transitions identified with PAGA.

  2. A core gene regulatory network governing gene expression in Ductal state. Node size depicts gene expression level within the Ductal cluster while the color scale depicts betweenness centrality of the node.

  3. Scatterplot of the genes based on incoming interaction strength (x‐axis) and outgoing interaction strength (y‐axis).

  4. State‐to‐state variability of gene betweenness centrality across cell states quantified by the interquartile range. Leftmost genes have highest state‐to‐state variability, whereas rightmost genes have lowest state‐to‐state variability.

  5. A core GRN including the top Differentially Expressed Genes (DEG) of the Ngn3 high EP cell state and the top Transition Genes (TG) for the transition toward the Pre‐endocrine state.

  6. Change in incoming and outgoing signaling scores during the transitioning from Ngn3 high EP to pre‐endocrine.

  7. Flowchart highlighting the shared and unique Transition Genes (TG) for differentiation from Pre‐endocrine to Alpha, Beta, Delta, and Epsilon, respectively.

  8. Gene Ontology analysis for the top five identified transitioning genes leading to the Alpha, Beta, Delta, and Epsilon cluster, respectively.

Figure EV2
Figure EV2. The differential and conserved interactions in the Ngn3 high EP/Pre‐endocrine cell state transition
  1. A

    The differential GRN between the Ngn3 high EP and Pre‐endocrine cell states. Node colormap indicates the gene expression fold‐change between the cell states.

  2. B

    The top 10 differential interactions between the Ngn3 high EP and Pre‐endocrine cell states. Arrows depict the interaction strength change.

  3. C, D

    The conserved GRN (C) and top 10 conserved interactions (D) highlight the interactions that maintained similar strength from the Ngn3 high EP to the Pre‐endocrine cell state.

Figure EV3
Figure EV3. The GRN similarity along a developmental trajectory
  1. The area under the precision‐recall curve (AUPRC) when predicting the GRN of the observed GRN (y‐axis) using the GRN of the predictor state (x‐axis).

  2. The goodness of GRN prediction along a developmental trajectory. For each point, the red point shows the AUPRC between the GRNs of the starting and final state. For comparison, the blue dots and error bars show the average and standard deviation (SD) of AUPRC obtained when comparing the GRN of the starting state to the GRN of any other state in the dataset (n = 6 other states in the dataset excluding the final state).

Figure 4
Figure 4. Analysis of partial and complete EMT in A549 cells
  1. UMAP embedding of the A549 cells under TGFB induction(Cook & Vanderhyden, 2020). Arrows indicate cell transitions for partial and complete EMT.

  2. Average expression of epithelial and mesenchymal genes in the three identified cell states.

  3. Variability of total signaling across cell states quantified by the total signaling range across the three identified cell states. Leftmost genes have high state‐to‐state variability in their regulatory behavior, whereas rightmost genes have low state‐to‐state variability.

  4. Detail for the top five variable genes identified in panel (C). The bar plot showcases the total signaling of the genes in each of the three cell states.

  5. Same as (D) for the top five least variable genes.

  6. Top 10 transitioning genes in the “partial EMT” transition from epithelial to hybrid E/M states.

  7. Same as (F) for the “complete EMT” transition from hybrid E/M to mesenchymal states.

  8. A core GRN indicating the inferred interactions between top differentially expressed genes (DEG) of the epithelial state and the top transition genes of partial EM Transition.

  9. Same as (H) for the complete EMT transition.

  10. Gene Ontology analysis for the top 10 identified transitioning genes for partial and complete EMT. The analysis compares transitioning genes that are exclusive to partial EMT, exclusive to complete EMT, and shared between the two transitions.

Similar articles

Cited by

References

    1. Akers K, Murali TM (2021) Gene regulatory network inference in single‐cell biology. Curr Opin Syst Biol 26: 87–97
    1. Aldridge S, Teichmann SA (2020) Single cell transcriptomics comes of age. Nat Commun 11: 4307 - PMC - PubMed
    1. Aubin‐Frankowski P‐C, Vert J‐P (2020) Gene regulation inference from single‐cell RNA‐seq data with linear differential equations and velocity inference. Bioinformatics 36: 4774–4780 - PubMed
    1. Bargaje R, Trachana K, Shelton MN, McGinnis CS, Zhou JX, Chadick C, Cook S, Cavanaugh C, Huang S, Hood L (2017) Cell population structure prior to bifurcation predicts efficiency of directed differentiation in human induced pluripotent cells. Proc Natl Acad Sci U S A 114: 2271–2276 - PMC - PubMed
    1. Bastidas‐Ponce A, Tritschler S, Dony L, Scheibner K, Tarquis‐Medina M, Salinno C, Schirge S, Burtscher I, Böttcher A, Theis FJ et al (2019) Comprehensive single cell mRNA profiling reveals a detailed roadmap for pancreatic endocrinogenesis. Development 146: dev173849 - PubMed

Publication types