-
Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations
Authors:
Jaehong Chung,
Agnese Marcato,
Eric J. Guiltinan,
Tapan Mukerji,
Hari Viswanathan,
Yen Ting Lin,
Javier E. Santos
Abstract:
This study introduces a hybrid fluid simulation approach that integrates generative diffusion models with physics-based simulations, aiming to reduce the computational cost of flow simulations while still honoring all the physical properties of interest. These simulations enhance our understanding of applications such as assessing hydrogen and CO$_2$ storage efficiency in underground reservoirs. Nevertheless, they are computationally expensive, and the presence of nonunique solutions can require multiple simulations within a single geometry. To overcome the computational cost hurdle, we propose a hybrid method that couples generative diffusion models and physics-based modeling. We introduce a system to condition the diffusion model on a geometry of interest, allowing it to produce variable fluid saturations in the same geometry. While training the model, we simultaneously generate initial conditions and perform physics-based simulations using these conditions. This integrated approach enables us to receive real-time feedback on a single compute node equipped with both CPUs and GPUs. By efficiently managing these processes within one compute node, we can continuously evaluate performance and stop training when the desired criteria are met. To test our model, we generate realizations in a real Berea sandstone fracture, which show that our technique is up to 4.4 times faster than commonly used flow simulation initializations.
Submitted 27 June, 2024;
originally announced June 2024.
-
Liouville Flow Importance Sampler
Authors:
Yifeng Tian,
Nishant Panda,
Yen Ting Lin
Abstract:
We present the Liouville Flow Importance Sampler (LFIS), an innovative flow-based model for generating samples from unnormalized density functions. LFIS learns a time-dependent velocity field that deterministically transports samples from a simple initial distribution to a complex target distribution, guided by a prescribed path of annealed distributions. The training of LFIS utilizes a unique method that enforces the structure of a derived partial differential equation to neural networks modeling velocity fields. By considering the neural velocity field as an importance sampler, sample weights can be computed through accumulating errors along the sample trajectories driven by neural velocity fields, ensuring unbiased and consistent estimation of statistical quantities. We demonstrate the effectiveness of LFIS through its application to a range of benchmark problems, on many of which LFIS achieved state-of-the-art performance.
Submitted 9 June, 2024; v1 submitted 3 May, 2024;
originally announced May 2024.
-
Offline tagging of radon-induced backgrounds in XENON1T and applicability to other liquid xenon detectors
Authors:
E. Aprile,
J. Aalbers,
K. Abe,
S. Ahmed Maouloud,
L. Althueser,
B. Andrieu,
E. Angelino,
J. R. Angevaare,
D. Antón Martin,
F. Arneodo,
L. Baudis,
A. L. Baxter,
M. Bazyk,
L. Bellagamba,
R. Biondi,
A. Bismark,
E. J. Brookes,
A. Brown,
G. Bruno,
R. Budnik,
T. K. Bui,
J. M. R. Cardoso,
A. P. Cimental Chavez,
A. P. Colijn,
J. Conrad
, et al. (142 additional authors not shown)
Abstract:
This paper details the first application of a software tagging algorithm to reduce radon-induced backgrounds in liquid noble element time projection chambers, such as XENON1T and XENONnT. The convection velocity field in XENON1T was mapped out using $^{222}\text{Rn}$ and $^{218}\text{Po}$ events, and the root-mean-square convection speed was measured to be $0.30 \pm 0.01$ cm/s. Given this velocity field, $^{214}\text{Pb}$ background events can be tagged when they are followed by $^{214}\text{Bi}$ and $^{214}\text{Po}$ decays, or preceded by $^{218}\text{Po}$ decays. This was achieved by evolving a point cloud in the direction of a measured convection velocity field, and searching for $^{214}\text{Bi}$ and $^{214}\text{Po}$ decays or $^{218}\text{Po}$ decays within a volume defined by the point cloud. In XENON1T, this tagging system achieved a $^{214}\text{Pb}$ background reduction of $6.2^{+0.4}_{-0.9}\%$ with an exposure loss of $1.8\pm 0.2 \%$, despite the timescales of convection being smaller than the relevant decay times. We show that the performance can be improved in XENONnT, and that the performance of such a software-tagging approach can be expected to be further improved in a diffusion-limited scenario. Finally, a similar method might be useful to tag the cosmogenic $^{137}\text{Xe}$ background, which is relevant to the search for neutrinoless double-beta decay.
Submitted 19 June, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
Data-Driven Modeling of Dislocation Mobility from Atomistics using Physics-Informed Machine Learning
Authors:
Yifeng Tian,
Soumendu Bagchi,
Liam Myhill,
Giacomo Po,
Enrique Martinez,
Yen Ting Lin,
Nithin Mathew,
Danny Perez
Abstract:
Dislocation mobility, which dictates the response of dislocations to an applied stress, is a fundamental property of crystalline materials that governs the evolution of plastic deformation. Traditional approaches for deriving mobility laws rely on phenomenological models of the underlying physics, whose free parameters are in turn fitted to a small number of intuition-driven atomic-scale simulations under varying conditions of temperature and stress. This tedious and time-consuming approach becomes particularly cumbersome for materials with complex dependencies on stress, temperature, and local environment, such as body-centered cubic (BCC) metals and alloys. In this paper, we present a novel, uncertainty quantification-driven active learning paradigm for learning dislocation mobility laws from automated high-throughput large-scale molecular dynamics simulations, using Graph Neural Networks (GNN) with a physics-informed architecture. We demonstrate that this Physics-informed Graph Neural Network (PI-GNN) framework captures the underlying physics more accurately than existing phenomenological mobility laws in BCC metals.
Submitted 20 March, 2024;
originally announced March 2024.
-
Generating Multiphase Fluid Configurations in Fractures using Diffusion Models
Authors:
Jaehong Chung,
Agnese Marcato,
Eric J. Guiltinan,
Tapan Mukerji,
Yen Ting Lin,
Javier E. Santos
Abstract:
Pore-scale simulations accurately describe transport properties of fluids in the subsurface. These simulations enhance our understanding of applications such as assessing hydrogen storage efficiency and forecasting CO$_2$ sequestration processes in underground reservoirs. Nevertheless, they are computationally expensive due to their mesoscopic nature. In addition, their stationary solutions are not guaranteed to be unique, so multiple runs with different initial conditions must be performed to ensure sufficient sample coverage. These factors complicate the task of obtaining representative and reliable forecasts. To overcome the high computational cost hurdle, we propose a hybrid method that couples generative diffusion models and physics-based modeling. Upon training a generative model, we synthesize samples that serve as the initial conditions for physics-based simulations. We measure the relaxation time (to stationary solutions) of the simulations, which serves as a validation metric and early-stopping criterion. Our numerical experiments revealed that the hybrid method exhibits a speed-up of up to 8.2 times compared to commonly used initialization methods. This finding offers compelling initial support that the proposed diffusion model-based hybrid scheme has the potential to significantly decrease the time required for convergence of numerical simulations without compromising physical robustness.
Submitted 7 December, 2023;
originally announced December 2023.
-
Using Ornstein-Uhlenbeck Process to understand Denoising Diffusion Probabilistic Model and its Noise Schedules
Authors:
Javier E. Santos,
Yen Ting Lin
Abstract:
The aim of this short note is to show that the Denoising Diffusion Probabilistic Model (DDPM), a non-homogeneous discrete-time Markov process, can be represented by a time-homogeneous continuous-time Markov process observed at non-uniformly sampled discrete times. Surprisingly, this continuous-time Markov process is the well-known and well-studied Ornstein-Uhlenbeck (OU) process, which was developed in the 1930s for studying Brownian particles in harmonic potentials. We establish the formal equivalence between DDPM and the OU process using its analytical solution. We further demonstrate that the design problem of the noise scheduler for non-homogeneous DDPM is equivalent to designing observation times for the OU process. We present several heuristic designs for observation times based on principled quantities such as auto-variance and Fisher Information and connect them to ad hoc noise schedules for DDPM. Interestingly, we show that the Fisher-Information-motivated schedule corresponds exactly to the cosine schedule, which was developed without any theoretical foundation but is the current state-of-the-art noise schedule.
Submitted 29 November, 2023;
originally announced November 2023.
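The correspondence between a DDPM noise schedule and OU observation times can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the note's own derivation: it assumes the OU process dx = -x dt + sqrt(2) dW (unit stationary variance), whose exact transition over an interval Δ is x(t+Δ) = e^{-Δ} x(t) + sqrt(1 - e^{-2Δ}) ε, and it matches this to the DDPM step x_k = sqrt(1 - β_k) x_{k-1} + sqrt(β_k) ε, giving Δ_k = -½ ln(1 - β_k). The function names and the linear β schedule are chosen here for illustration only.

```python
import math

def ou_observation_times(betas):
    """Map DDPM noise levels beta_k to cumulative OU observation times t_k.

    Assumes the OU process dx = -x dt + sqrt(2) dW, so that a DDPM step with
    noise level beta corresponds to an OU interval of -0.5 * ln(1 - beta).
    """
    t = 0.0
    times = []
    for beta in betas:
        t += -0.5 * math.log(1.0 - beta)
        times.append(t)
    return times

def alpha_bar_from_times(times):
    """Cumulative signal fraction implied by the OU times: exp(-2 t_k)."""
    return [math.exp(-2.0 * t) for t in times]

# Illustrative linear schedule with 1000 steps from 1e-4 to 0.02.
betas = [1e-4 + (0.02 - 1e-4) * k / 999 for k in range(1000)]
times = ou_observation_times(betas)
alpha_bar = alpha_bar_from_times(times)
```

Under these assumptions, `alpha_bar[k]` equals the running product of `1 - beta` up to step `k`, so choosing a noise schedule and choosing a set of non-uniform observation times are two views of the same design problem.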
-
Mori-Zwanzig Modal Decomposition
Authors:
Michael Woodward,
Yifeng Tian,
Yen Ting Lin,
Christoph Hader,
Hermann Fasel,
Daniel Livescu
Abstract:
We introduce the Mori-Zwanzig (MZ) Modal Decomposition (MZMD), a novel technique for performing modal analysis of large scale spatio-temporal structures in complex dynamical systems, and show that it represents an efficient generalization of Dynamic Mode Decomposition (DMD). The MZ formalism provides a mathematical framework for constructing non-Markovian reduced-order models of resolved variables from high-dimensional dynamical systems, incorporating the effects of unresolved dynamics through the memory kernel and orthogonal dynamics. We present a formulation and analysis of the modes and spectrum from MZMD and compare it to DMD when applied to a complex flow: a Direct Numerical Simulation (DNS) data-set of laminar-turbulent boundary-layer transition flow over a flared cone at Mach 6. We show that the addition of memory terms by MZMD improves the resolution of spatio-temporal structures within the transitional/turbulent regime, which contains features that arise due to nonlinear mechanisms, such as the generation of the so-called "hot" streaks on the surface of the flared cone. As a result, compared to DMD, MZMD improves future state prediction accuracy, while requiring nearly the same computational cost.
Submitted 16 November, 2023; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
Authors:
Shih-Cheng Huang,
Pin-Zu Li,
Yu-Chi Hsu,
Kuang-Ming Chen,
Yu Tung Lin,
Shih-Kai Hsiao,
Richard Tzong-Han Tsai,
Hung-yi Lee
Abstract:
Recently, the development of open-source large language models (LLMs) has advanced rapidly. Nevertheless, due to data constraints, the capabilities of most open-source LLMs are primarily focused on English. To address this issue, we introduce the concept of $\textit{chat vector}$ to equip pre-trained language models with instruction following and human value alignment via simple model arithmetic. The chat vector is derived by subtracting the weights of a pre-trained base model (e.g. LLaMA2) from those of its corresponding chat model (e.g. LLaMA2-chat). By simply adding the chat vector to a continual pre-trained model's weights, we can endow the model with chat capabilities in new languages without the need for further training. Our empirical studies demonstrate the superior efficacy of the chat vector from three different aspects: instruction following, toxicity mitigation, and multi-turn dialogue. Moreover, to showcase the adaptability of our approach, we extend our experiments to encompass various languages, base models, and chat vectors. The results underscore the chat vector's simplicity, effectiveness, and wide applicability, making it a compelling solution for efficiently enabling conversational capabilities in pre-trained language models. Our code is available at https://github.com/aqweteddy/ChatVector.
Submitted 7 June, 2024; v1 submitted 7 October, 2023;
originally announced October 2023.
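The model arithmetic described in the Chat Vector abstract can be illustrated with a minimal sketch. It assumes each model is represented as a plain dictionary mapping parameter names to lists of floats with identical shapes across models; the function names `chat_vector` and `apply_chat_vector` and the toy weights are hypothetical and are not taken from the authors' repository.

```python
def chat_vector(base, chat):
    """Chat vector: chat-model weights minus base-model weights, per parameter."""
    return {
        name: [c - b for c, b in zip(chat[name], base[name])]
        for name in base
    }

def apply_chat_vector(target, vector):
    """Add the chat vector to a continually pre-trained model's weights."""
    return {
        name: [w + v for w, v in zip(target[name], vector[name])]
        for name in target
    }

# Toy weights (hypothetical values, chosen only for illustration).
base_model = {"w": [1.0, 2.0], "b": [0.5]}
chat_model = {"w": [1.5, 1.0], "b": [0.0]}
continual_model = {"w": [2.0, 2.0], "b": [1.0]}

tau = chat_vector(base_model, chat_model)
new_chat_model = apply_chat_vector(continual_model, tau)
```

With real checkpoints the same subtraction and addition would be applied tensor by tensor; this only works when all three models share the same architecture and parameter names, which is an assumption of the sketch.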
-
Data-Driven Mori-Zwanzig: Reduced Order Modeling of Sparse Sensors Measurements for Boundary Layer Transition
Authors:
Michael Woodward,
Yifeng Tian,
Yen Ting Lin,
Arvind Mohan,
Christoph Hader,
Hermann Fasel,
Michael Chertkov,
Daniel Livescu
Abstract:
Understanding, predicting and controlling laminar-turbulent boundary-layer transition is crucial for next-generation aircraft design. However, in real flight experiments or wind tunnel tests, often only sparse sensor measurements can be collected at fixed locations. Thus, in developing reduced models for predicting and controlling the flow at the sensor locations, the main challenge is in accounting for how the surrounding field of unobserved variables interacts with the observed variables at the fixed sensor locations. This makes the Mori-Zwanzig (MZ) formalism a natural choice, as it results in the Generalized Langevin Equation, which provides a framework for constructing non-Markovian reduced-order models that include the effects the unresolved variables have on the resolved variables. These effects are captured in the so-called memory kernel and orthogonal dynamics. In this work, we apply data-driven MZ formulations to two boundary layer flows obtained from DNS data: a low-speed incompressible flow and a high-speed compressible flow over a flared cone at Mach 6. An array of "sensors" is placed near the surface of the solid boundary, the MZ operators are learned, and the predictions are compared to those of Extended Dynamic Mode Decomposition (EDMD), both using delay-embedded coordinates. Further comparisons are made with Long Short-Term Memory (LSTM) networks and a regression-based projection framework using neural networks for the MZ operators. First, we compare the effects of including delay-embedded coordinates with EDMD and Mori-based MZ and provide evidence that using both memory and delay-embedded coordinates minimizes generalization errors on the relevant time scales. Next, we provide numerical evidence that the data-driven regression-based projection MZ model performs best with respect to prediction accuracy (minimum generalization error) on the relevant time scales.
Submitted 26 September, 2023;
originally announced September 2023.
-
On the combinatorics of Lotka-Volterra equations
Authors:
Francesco Caravelli,
Yen Ting Lin
Abstract:
We study an approach to obtaining the exact formal solution of the 2-species Lotka-Volterra equation based on combinatorics and generating functions. By employing a combination of Carleman linearization and Mori-Zwanzig reduction techniques, we transform the nonlinear equations into a linear system, allowing for the derivation of a formal solution. The Mori-Zwanzig reduction reduces to an expansion that we show can be interpreted as a directed and weighted lattice path walk, which we use to obtain a representation of the system dynamics as walks of fixed length. The exact solution is then shown to depend on the generator of weighted walks. We show that the generator can be obtained by solving a PDE, which in turn is equivalent to a particular Koopman evolution of nonlinear observables.
Submitted 25 August, 2023;
originally announced August 2023.
-
Improving Estimation of the Koopman Operator with Kolmogorov-Smirnov Indicator Functions
Authors:
Van A. Ngo,
Yen Ting Lin,
Danny Perez
Abstract:
It has become common to perform kinetic analysis using approximate Koopman operators that transform high-dimensional time series of observables into ranked dynamical modes. Key to the practical success of the approach is the identification of a set of observables which form a good basis in which to expand the slow relaxation modes. Good observables are, however, difficult to identify {\em a priori}, and sub-optimal choices can lead to significant underestimations of characteristic timescales. Leveraging the representation of slow dynamics in terms of a Hidden Markov Model (HMM), we propose a simple and computationally efficient clustering procedure to infer surrogate observables that form a good basis for slow modes. We apply the approach to an analytically solvable model system, as well as to three protein systems of different complexities. We consistently demonstrate that the inferred indicator functions can significantly improve the estimation of the leading eigenvalues of the Koopman operators and correctly identify key states and transition timescales of stochastic systems, even when good observables are not known {\em a priori}.
Submitted 9 June, 2023;
originally announced June 2023.
-
Blackout Diffusion: Generative Diffusion Models in Discrete-State Spaces
Authors:
Javier E Santos,
Zachary R. Fox,
Nicholas Lubbers,
Yen Ting Lin
Abstract:
Typical generative diffusion models rely on a Gaussian diffusion process for training the backward transformations, which can then be used to generate samples from Gaussian noise. However, real-world data often live in discrete-state spaces, including in many scientific applications. Here, we develop a theoretical formulation for arbitrary discrete-state Markov processes in the forward diffusion process using exact (as opposed to variational) analysis. We relate the theory to the existing continuous-state Gaussian diffusion as well as other approaches to discrete diffusion, and identify the corresponding reverse-time stochastic process and score function in the continuous-time setting, and the reverse-time mapping in the discrete-time setting. As an example of this framework, we introduce "Blackout Diffusion", which learns to produce samples from an empty image instead of from noise. Numerical experiments on the CIFAR-10, Binarized MNIST, and CelebA datasets confirm the feasibility of our approach. Generalizing from specific (Gaussian) forward processes to discrete-state processes without a variational approximation sheds light on how to interpret diffusion models, which we discuss.
Submitted 18 May, 2023;
originally announced May 2023.
-
Data-Driven Mori-Zwanzig: Approaching a Reduced Order Model for Hypersonic Boundary Layer Transition
Authors:
Michael Woodward,
Yifeng Tian,
Arvind Mohan,
Yen Ting Lin,
Christoph Hader,
Hermann Fasel,
Misha Chertkov,
Daniel Livescu
Abstract:
In this work, we apply, for the first time to spatially inhomogeneous flows, a recently developed data-driven learning algorithm of Mori-Zwanzig (MZ) operators, which is based on a generalized Koopman's description of dynamical systems. The MZ formalism provides a mathematically exact procedure for constructing non-Markovian reduced-order models of resolved variables from high-dimensional dynamical systems, where the effects due to the unresolved dynamics are captured in the memory kernel and orthogonal dynamics. The algorithm developed in this work applies Mori's linear projection operator and an SVD-based compression to the selection of the resolved variables (equivalently, a low-rank approximation of the two-time covariance matrices). We show that this MZ decomposition not only identifies the same spatio-temporal structures found by DMD, but can also be used to extract spatio-temporal structures of the hysteresis effects present in the memory kernels. We perform an analysis of these structures in the context of a laminar-turbulent boundary-layer transition flow over a flared cone at Mach 6, and show the dynamical relevance of the memory kernels. Additionally, by including these memory terms learned in our data-driven MZ approach, we show improvement in prediction accuracy over DMD at the same level of truncation and at a similar computational cost. Furthermore, an analysis of the spatio-temporal structures of the MZ operators shows identifiable structures associated with the nonlinear generation of the so-called "hot" streaks on the surface of the flared cone, which have previously been observed in experiments and direct numerical simulations.
Submitted 17 January, 2023;
originally announced January 2023.
-
Regression-based projection for learning Mori-Zwanzig operators
Authors:
Yen Ting Lin,
Yifeng Tian,
Danny Perez,
Daniel Livescu
Abstract:
We propose to adopt statistical regression as the projection operator to enable data-driven learning of the operators in the Mori-Zwanzig formalism. We present a principled method to extract the Markov and memory operators for any regression model. We show that the choice of linear regression results in a recently proposed data-driven learning algorithm based on Mori's projection operator, which is a higher-order approximate Koopman learning method. We show that more expressive nonlinear regression models naturally fill in the gap between the highly idealized and computationally efficient Mori's projection operator and the optimal yet computationally infeasible Zwanzig's projection operator. We performed numerical experiments and extracted the operators for an array of regression-based projections, including linear, polynomial, spline, and neural-network-based regressions, showing a progressive improvement as the complexity of the regression model increased. Our proposition provides a general framework to extract memory-dependent corrections and can be readily applied to an array of data-driven learning methods for stationary dynamical systems in the literature.
Submitted 20 April, 2023; v1 submitted 10 May, 2022;
originally announced May 2022.
-
Gene expression noise accelerates the evolution of a biological oscillator
Authors:
Yen Ting Lin,
Nicolas E. Buchler
Abstract:
Gene expression is a biochemical process, where stochastic binding and un-binding events naturally generate fluctuations and cell-to-cell variability in gene dynamics. These fluctuations typically have destructive consequences for proper biological dynamics and function (e.g., loss of timing and synchrony in biological oscillators). Here, we show that gene expression noise counter-intuitively accelerates the evolution of a biological oscillator and, thus, can impart a benefit to living organisms. We used computer simulations to evolve two mechanistic models of a biological oscillator at different levels of gene expression noise. We first show that gene expression noise induces oscillatory-like dynamics in regions of parameter space that cannot oscillate in the absence of noise. We then demonstrate that these noise-induced oscillations generate a fitness landscape whose gradient robustly and quickly guides evolution by mutation towards robust and self-sustaining oscillation. These results suggest that noise can help dynamical systems evolve or learn new behavior by revealing cryptic dynamic phenotypes outside the bifurcation point.
Submitted 21 March, 2022;
originally announced March 2022.
-
Challenges for quantum computation of nonlinear dynamical systems using linear representations
Authors:
Yen Ting Lin,
Robert B. Lowrie,
Denis Aslangil,
Yiğit Subaşı,
Andrew T. Sornborger
Abstract:
A number of recent studies have proposed that linear representations are appropriate for solving nonlinear dynamical systems with quantum computers, which fundamentally act linearly on a wave function in a Hilbert space. Linear representations, such as the Koopman representation and Koopman von Neumann mechanics, have regained attention from the dynamical-systems research community. Here, we aim to present a unified theoretical framework, currently missing in the literature, with which one can compare and relate existing methods, their conceptual basis, and their representations. We also aim to show that, despite the fact that quantum simulation of nonlinear classical systems may be possible with such linear representations, a necessary projection into a feasible finite-dimensional space will in practice eventually induce numerical artifacts which can be hard to eliminate or even control. As a result, a practical, reliable and accurate way to use quantum computation for solving general nonlinear dynamical systems is still an open problem.
Submitted 8 July, 2024; v1 submitted 4 February, 2022;
originally announced February 2022.
-
A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks
Authors:
Xiaoyu Ma,
Sylvain Sardy,
Nick Hengartner,
Nikolai Bobenko,
Yen Ting Lin
Abstract:
To fit sparse linear associations, a LASSO sparsity-inducing penalty with a single hyperparameter provably allows recovery of the important features (needles) with high probability in certain regimes, even if the sample size is smaller than the dimension of the input vector (haystack). More recently, learners known as artificial neural networks (ANN) have shown great success in many machine learning tasks, in particular fitting nonlinear associations. A small learning rate, the stochastic gradient descent algorithm and a large training set help to cope with the explosion in the number of parameters present in deep neural networks. Yet few ANN learners have been developed and studied to find needles in nonlinear haystacks. Driven by a single hyperparameter, our ANN learner, as for sparse linear associations, exhibits a phase transition in the probability of retrieving the needles, which we do not observe with other ANN learners. To select our penalty parameter, we generalize the universal threshold of Donoho and Johnstone (1994), which is a better rule than the conservative (too many false detections) and expensive cross-validation. In the spirit of simulated annealing, we propose a warm-start sparsity-inducing algorithm to solve the high-dimensional, non-convex and non-differentiable optimization problem. We perform precise Monte Carlo simulations to show the effectiveness of our approach.
Submitted 21 January, 2022;
originally announced January 2022.
-
Implementation of a practical Markov chain Monte Carlo sampling algorithm in PyBioNetFit
Authors:
Jacob Neumann,
Yen Ting Lin,
Abhishek Mallela,
Ely F. Miller,
Joshua Colvin,
Abell T. Duprat,
Ye Chen,
William S. Hlavacek,
Richard G. Posner
Abstract:
Bayesian inference in biological modeling commonly relies on Markov chain Monte Carlo (MCMC) sampling of a multidimensional and non-Gaussian posterior distribution that is not analytically tractable. Here, we present the implementation of a practical MCMC method in the open-source software package PyBioNetFit (PyBNF), which is designed to support parameterization of mathematical models for biological systems. The new MCMC method, am, incorporates an adaptive move proposal distribution. For warm starts, sampling can be initiated at a specified location in parameter space and with a multivariate Gaussian proposal distribution defined initially by a specified covariance matrix. Multiple chains can be generated in parallel using a computer cluster. We demonstrate that am can be used to successfully solve real-world Bayesian inference problems, including forecasting of new Coronavirus Disease 2019 case detection with Bayesian quantification of forecast uncertainty. PyBNF version 1.1.9, the first stable release with am, is available at PyPI and can be installed using the pip package-management system on platforms that have a working installation of Python 3. PyBNF relies on libRoadRunner and BioNetGen for simulations (e.g., numerical integration of ordinary differential equations defined in SBML or BNGL files) and Dask.Distributed for task scheduling on Linux computer clusters.
Submitted 29 September, 2021;
originally announced September 2021.
-
Data Driven Learning of Mori-Zwanzig Operators for Isotropic Turbulence
Authors:
Yifeng Tian,
Yen Ting Lin,
Marian Anghel,
Daniel Livescu
Abstract:
Developing reduced-order models for turbulent flows, which contain dynamics over a wide range of scales, is an extremely challenging problem. In statistical mechanics, the Mori-Zwanzig (MZ) formalism provides a mathematically formal procedure for constructing reduced-order representations of high-dimensional dynamical systems, where the effect due to the unresolved dynamics are captured in the memory kernel and orthogonal dynamics. Turbulence models based on MZ formalism have been scarce due to the limited knowledge of the MZ operators, which originates from the difficulty in deriving MZ kernels for complex nonlinear dynamical systems. In this work, we apply a recently developed data-driven learning algorithm, which is based on Koopman's description of dynamical systems and Mori's linear projection operator, on a set of fully-resolved isotropic turbulence datasets to extract the Mori-Zwanzig operators. With data augmentation using known turbulence symmetries, the extracted Markov term, memory kernel, and orthogonal dynamics are statistically converged and the Generalized Fluctuation-Dissipation Relation can be verified. The properties of the memory kernel and orthogonal dynamics, and their dependence on the choices of observables are investigated to address the modeling assumptions that are commonly used in MZ-based models. A series of numerical experiments are then constructed using the extracted kernels to evaluate the memory effects on predictions. Results show that the prediction errors are strongly affected by the choice of observables and can be further reduced by including the past history of the observables in the memory kernel.
Submitted 30 August, 2021;
originally announced August 2021.
-
Data-driven learning for the Mori-Zwanzig formalism: a generalization of the Koopman learning framework
Authors:
Yen Ting Lin,
Yifeng Tian,
Marian Anghel,
Daniel Livescu
Abstract:
A theoretical framework which unifies the conventional Mori-Zwanzig formalism and the approximate Koopman learning is presented. In this framework, the Mori-Zwanzig formalism, developed in statistical mechanics to tackle the hard problem of construction of reduced-order dynamics for high-dimensional dynamical systems, can be considered as a natural generalization of the Koopman description of the dynamical system. We next show that similar to the approximate Koopman learning methods, data-driven methods can be developed for the Mori-Zwanzig formalism with Mori's linear projection operator. We developed two algorithms to extract the key operators, the Markov and the memory kernel, using time series of a reduced set of observables in a dynamical system. We adopted the Lorenz `96 system as a test problem and solved for the operators, which exhibit complex behaviors which are unlikely to be captured by traditional modeling approaches, in Mori-Zwanzig analysis. The nontrivial Generalized Fluctuation Dissipation relationship, which relates the memory kernel with the two-time correlation statistics of the orthogonal dynamics, was numerically verified as a validation of the solved operators. We present numerical evidence that the Generalized Langevin Equation, a key construct in the Mori-Zwanzig formalism, is more advantageous in predicting the evolution of the reduced set of observables than the conventional approximate Koopman operators.
Submitted 26 July, 2021; v1 submitted 11 January, 2021;
originally announced January 2021.
-
Data-driven Optimized Control of the COVID-19 Epidemics
Authors:
Afroza Shirin,
Yen Ting Lin,
Francesco Sorrentino
Abstract:
Optimizing the impact on the economy of control strategies aiming at containing the spread of COVID-19 is a critical challenge. We use daily new case counts of COVID-19 patients reported by local health administrations from different Metropolitan Statistical Areas (MSAs) within the US to parametrize a model that well describes the propagation of the disease in each area. We then introduce a time-varying control input that represents the level of social distancing imposed on the population of a given area and solve an optimal control problem with the goal of minimizing the impact of social distancing on the economy in the presence of relevant constraints, such as a desired level of suppression for the epidemics at a terminal time. We find that with the exception of the initial time and of the final time, the optimal control input is well approximated by a constant, specific to each area, which contrasts with the implemented system of reopening `in phases'. For all the areas considered, this optimal level corresponds to stricter social distancing than the level estimated from data. Proper selection of the time period for application of the control action optimally is important: depending on the particular MSA this period should be either short or long or intermediate. We also consider the case that the transmissibility increases in time (due e.g. to increasingly colder weather), for which we find that the optimal control solution yields progressively stricter measures of social distancing. We finally compute the optimal control solution for a model modified to incorporate the effects of vaccinations on the population and we see that depending on a number of factors, social distancing measures could be optimally reduced during the period over which vaccines are administered to the population.
Submitted 10 March, 2021; v1 submitted 4 September, 2020;
originally announced September 2020.
-
Daily Forecasting of New Cases for Regional Epidemics of Coronavirus Disease 2019 with Bayesian Uncertainty Quantification
Authors:
Yen Ting Lin,
Jacob Neumann,
Ely Miller,
Richard G. Posner,
Abhishek Mallela,
Cosmin Safta,
Jaideep Ray,
Gautam Thakur,
Supriya Chinthavali,
William S. Hlavacek
Abstract:
To increase situational awareness and support evidence-based policy-making, we formulated two types of mathematical models for COVID-19 transmission within a regional population. One is a fitting function that can be calibrated to reproduce an epidemic curve with two timescales (e.g., fast growth and slow decay). The other is a compartmental model that accounts for quarantine, self-isolation, social distancing, a non-exponentially distributed incubation period, asymptomatic individuals, and mild and severe forms of symptomatic disease. Using Bayesian inference, we have been calibrating our models daily for consistency with new reports of confirmed cases from the 15 most populous metropolitan statistical areas in the United States and quantifying uncertainty in parameter estimates and predictions of future case reports. This online learning approach allows for early identification of new trends despite considerable variability in case reporting. We infer new significant upward trends for five of the metropolitan areas starting between 19-April-2020 and 12-June-2020.
Submitted 20 July, 2020;
originally announced July 2020.
-
What needles do sparse neural networks find in nonlinear haystacks
Authors:
Sylvain Sardy,
Nicolas W Hengartner,
Nikolai Bonenko,
Yen Ting Lin
Abstract:
Using a sparsity inducing penalty in artificial neural networks (ANNs) avoids over-fitting, especially in situations where noise is high and the training set is small in comparison to the number of features. For linear models, such an approach provably also recovers the important features with high probability in regimes for a well-chosen penalty parameter. The typical way of setting the penalty parameter is by splitting the data set and performing the cross-validation, which is (1) computationally expensive and (2) not desirable when the data set is already small to be further split (for example, whole-genome sequence data). In this study, we establish the theoretical foundation to select the penalty parameter without cross-validation based on bounding with a high probability the infinite norm of the gradient of the loss function at zero under the zero-feature assumption. Our approach is a generalization of the universal threshold of Donoho and Johnstone (1994) to nonlinear ANN learning. We perform a set of comprehensive Monte Carlo simulations on a simple model, and the numerical results show the effectiveness of the proposed approach.
Submitted 7 June, 2020;
originally announced June 2020.
-
The Novel Coronavirus, 2019-nCoV, is Highly Contagious and More Infectious Than Initially Estimated
Authors:
Steven Sanche,
Yen Ting Lin,
Chonggang Xu,
Ethan Romero-Severson,
Nicolas W. Hengartner,
Ruian Ke
Abstract:
The novel coronavirus (2019-nCoV) is a recently emerged human pathogen that has spread widely since January 2020. Initially, the basic reproductive number, R0, was estimated to be 2.2 to 2.7. Here we provide a new estimate of this quantity. We collected extensive individual case reports and estimated key epidemiology parameters, including the incubation period. Integrating these estimates and high-resolution real-time human travel and infection data with mathematical models, we estimated that the number of infected individuals during the early epidemic doubles every 2.4 days, and the R0 value is likely to be between 4.7 and 6.6. We further show that quarantine and contact tracing of symptomatic individuals alone may not be effective and early, strong control measures are needed to stop transmission of the virus.
Submitted 8 February, 2020;
originally announced February 2020.
-
Scaling methods for accelerating kinetic Monte Carlo simulations of chemical reaction networks
Authors:
Yen Ting Lin,
Song Feng,
William S. Hlavacek
Abstract:
Various kinetic Monte Carlo algorithms become inefficient when some of the population sizes in a system are large, which gives rise to a large number of reaction events per unit time. Here, we present a new acceleration algorithm based on adaptive and heterogeneous scaling of reaction rates and stoichiometric coefficients. The algorithm is conceptually related to the commonly used idea of accelerating a stochastic simulation by considering a sub-volume $λΩ$ ($0<λ<1$) within a system of interest, which reduces the number of reaction events per unit time occurring in a simulation by a factor $1/λ$ at the cost of greater error in unbiased estimates of first moments and biased overestimates of second moments. Our new approach offers two unique benefits. First, scaling is adaptive and heterogeneous, which eliminates the pitfall of overaggressive scaling. Second, there is no need for an \emph{a priori} classification of populations as discrete or continuous (as in a hybrid method), which is problematic when discreteness of a chemical species changes during a simulation. The method requires specification of only a single algorithmic parameter, $N_c$, a global critical population size above which populations are effectively scaled down to increase simulation efficiency. The method, which we term partial scaling, is implemented in the open-source BioNetGen software package. We demonstrate that partial scaling can significantly accelerate simulations without significant loss of accuracy for several published models of biological systems. These models characterize activation of the mitogen-activated protein kinase ERK, prion protein aggregation, and T-cell receptor signaling.
Submitted 10 May, 2019; v1 submitted 20 March, 2019;
originally announced March 2019.
-
Accelerated Bayesian inference of gene expression models from snapshots of single-cell transcripts
Authors:
Yen Ting Lin,
Nicolas E. Buchler
Abstract:
Understanding how stochastic gene expression is regulated in biological systems using snapshots of single-cell transcripts requires state-of-the-art methods of computational analysis and statistical inference. A Bayesian approach to statistical inference is the most complete method for model selection and uncertainty quantification of kinetic parameters from single-cell data. This approach is impractical because current numerical algorithms are too slow to handle typical models of gene expression. To solve this problem, we first show that time-dependent mRNA distributions of discrete-state models of gene expression are dynamic Poisson mixtures, whose mixing kernels are characterized by a piece-wise deterministic Markov process. We combined this analytical result with a kinetic Monte Carlo algorithm to create a hybrid numerical method that accelerates the calculation of time-dependent mRNA distributions by 1000-fold compared to current methods. We then integrated the hybrid algorithm into an existing Monte Carlo sampler to estimate the Bayesian posterior distribution of many different, competing models in a reasonable amount of time. We validated our method of accelerated Bayesian inference on several synthetic data sets. Our results show that kinetic parameters can be reasonably constrained for modestly sampled data sets, if the model is known \textit{a priori}. If the model is unknown, the Bayesian evidence can be used to rigorously quantify the likelihood of a model relative to other models from the data. We demonstrate that Bayesian evidence selects the true model and outperforms approximate metrics, e.g., Bayesian Information Criterion (BIC) or Akaike Information Criterion (AIC), often used for model selection.
Submitted 7 December, 2018;
originally announced December 2018.
-
Prediction of Optimal Drug Schedules for Controlling Autophagy
Authors:
Afroza Shirin,
Isaac Klickstein,
Song Feng,
Yen Ting Lin,
William S. Hlavacek,
Francesco Sorrentino
Abstract:
The effects of molecularly targeted drug perturbations on cellular activities and fates are difficult to predict using intuition alone because of the complex behaviors of cellular regulatory networks. An approach to overcoming this problem is to develop mathematical models for predicting drug effects. Such an approach beckons for co-development of computational methods for extracting insights useful for guiding therapy selection and optimizing drug scheduling. Here, we present and evaluate a generalizable strategy for identifying drug dosing schedules that minimize the amount of drug needed to achieve sustained suppression or elevation of an important cellular activity/process, the recycling of cytoplasmic contents through (macro)autophagy. Therapeutic targeting of autophagy is currently being evaluated in diverse clinical trials but without the benefit of a control engineering perspective. Using a nonlinear ordinary differential equation (ODE) model that accounts for activating and inhibiting influences among protein and lipid kinases that regulate autophagy (MTORC1, ULK1, AMPK and VPS34) and methods guaranteed to find locally optimal control strategies, we find optimal drug dosing schedules (open-loop controllers) for each of six classes of drugs and drug pairs. Our approach is generalizable to designing monotherapy and multi therapy drug schedules that affect different cell signaling networks of interest.
Submitted 12 January, 2019; v1 submitted 31 July, 2018;
originally announced August 2018.
-
Model reduction methods for classical stochastic systems with fast-switching environments: reduced master equations, stochastic differential equations, and applications
Authors:
Peter G. Hufton,
Yen Ting Lin,
Tobias Galla
Abstract:
We study classical stochastic systems with discrete states, coupled to switching external environments. For fast environmental processes we derive reduced dynamics for the system itself, focusing on corrections to the adiabatic limit of infinite time scale separation. In some cases, this leads to master equations with negative transition `rates' or bursting events. We devise a simulation algorithm in discrete time to unravel these master equations into sample paths, and provide an interpretation of bursting events. Focusing on stochastic population dynamics coupled to external environments, we discuss a series of approximation schemes combining expansions in the inverse switching rate of the environment, and a Kramers--Moyal expansion in the inverse size of the population. This places the different approximations in relation to existing work on piecewise-deterministic and piecewise-diffusive Markov processes. We apply the model reduction methods to different examples including systems in biology and a model of crack propagation.
Submitted 7 March, 2018;
originally announced March 2018.
-
Generalizing Gillespie's direct method to enable network-free simulations
Authors:
Ryan Suderman,
Eshan D. Mitra,
Yen Ting Lin,
Keesha E. Erickson,
Song Feng,
William S. Hlavacek
Abstract:
Gillespie's direct method for stochastic simulation of chemical kinetics is a staple of computational systems biology research. However, the algorithm requires explicit enumeration of all reactions and all chemical species that may arise in the system. In many cases, this is not feasible due to the combinatorial explosion of reactions and species in biological networks. Rule-based modeling frameworks provide a way to exactly represent networks containing such combinatorial complexity, and generalizations of Gillespie's direct method have been developed as simulation engines for rule-based modeling languages. Here, we provide both a high-level description of the algorithms underlying the simulation engines, termed network-free simulation algorithms, and how they have been applied in systems biology research. We also define a generic rule-based modeling framework and describe a number of technical details required for adapting Gillespie's direct method for network-free simulation. Finally, we briefly discuss potential avenues for advancing network-free simulation and the role they continue to play in modeling dynamical systems in biology.
Submitted 30 January, 2018;
originally announced January 2018.
-
Efficient analysis of stochastic gene dynamics in the non-adiabatic regime using piecewise deterministic Markov processes
Authors:
Yen Ting Lin,
Nicolas E. Buchler
Abstract:
Single-cell experiments show that gene expression is stochastic and bursty, a feature that can emerge from slow switching between promoter states with different activities. One source of long-lived promoter states is the slow binding and unbinding kinetics of transcription factors to promoters, i.e. the non-adiabatic binding regime. Here, we introduce a simple analytical framework, known as a piecewise deterministic Markov process (PDMP), that accurately describes the stochastic dynamics of gene expression in the non-adiabatic regime. We illustrate the utility of the PDMP on a non-trivial dynamical system by analyzing the properties of a titration-based oscillator in the non-adiabatic limit. We first show how to transform the underlying Chemical Master Equation into a PDMP where the slow transitions between promoter states are stochastic, but whose rates depend upon the faster deterministic dynamics of the transcription factors regulated by these promoters. We show that the PDMP accurately describes the observed periods of stochastic cycles in activator and repressor-based titration oscillators. We then generalize our PDMP analysis to more complicated versions of titration-based oscillators to explain how multiple binding sites lengthen the period and improve coherence. Last, we show how noise-induced oscillation previously observed in a titration-based oscillator arises from non-adiabatic and discrete binding events at the promoter site.
Submitted 25 October, 2017;
originally announced October 2017.
-
A stochastic and dynamical view of pluripotency in mouse embryonic stem cells
Authors:
Yen Ting Lin,
Peter G. Hufton,
Esther J. Lee,
Davit A. Potoyan
Abstract:
Pluripotent embryonic stem cells are of paramount importance for biomedical research thanks to their innate ability for self-renewal and differentiation into all major cell lines. The fateful decision to exit or remain in the pluripotent state is regulated by a complex genetic regulatory network. Latest advances in transcriptomics have made it possible to infer basic topologies of pluripotency governing networks. The inferred network topologies, however, only encode boolean information while remaining silent about the roles of dynamics and molecular noise in gene expression. These features are widely considered essential for functional decision making. Herein we developed a framework for extending the boolean level networks into models accounting for individual genetic switches and promoter architecture which allows mechanistic interrogation of the roles of molecular noise, external signaling, and network topology. We demonstrate the pluripotent state of the network to be a broad attractor which is robust to variations of gene expression. Dynamics of exiting the pluripotent state, on the other hand, is significantly influenced by the molecular noise originating from genetic switching events which makes cells more responsive to extracellular signals. Lastly we show that the steady state probability landscape can be significantly remodeled by global gene switching rates alone which can be taken as a proxy for how global epigenetic modifications exert control over stability of pluripotent states.
Submitted 23 October, 2017;
originally announced October 2017.
-
Phenotypic switching of populations of cells in a stochastic environment
Authors:
Peter G. Hufton,
Yen Ting Lin,
Tobias Galla
Abstract:
In biology phenotypic switching is a common bet-hedging strategy in the face of uncertain environmental conditions. Existing mathematical models often focus on periodically changing environments to determine the optimal phenotypic response. We focus on the case in which the environment switches randomly between discrete states. Starting from an individual-based model we derive stochastic differential equations to describe the dynamics, and obtain analytical expressions for the mean instantaneous growth rates based on the theory of piecewise deterministic Markov processes. We show that optimal phenotypic responses are non-trivial for slow and intermediate environmental processes, and systematically compare the cases of periodic and random environments. The best response to random switching is more likely to be heterogeneity than in the case of deterministic periodic environments; net growth rates tend to be higher under stochastic environmental dynamics. The combined system of environment and population of cells can be interpreted as host-pathogen interaction, in which the host tries to choose environmental switching so as to minimise growth of the pathogen, and in which the pathogen employs a phenotypic switching optimised to increase its growth rate. We discuss the existence of Nash-like mutual best-response scenarios for such host-pathogen games.
Submitted 5 January, 2018; v1 submitted 23 June, 2017;
originally announced June 2017.
-
Mechanisms of stochastic onset and termination of atrial fibrillation episodes: Insights using a cellular automaton model
Authors:
Yen Ting Lin,
Eugene TY Chang,
Julie Eatock,
Tobias Galla,
Richard H Clayton
Abstract:
Mathematical models of cardiac electrical excitation are increasingly complex, with multiscale models seeking to represent and bridge physiological behaviours across temporal and spatial scales. The increasing complexity of these models makes it computationally expensive to both evaluate long term (>60 seconds) behaviour and determine sensitivity of model outputs to inputs. This is particularly relevant in models of atrial fibrillation (AF), where individual episodes last from seconds to days, and inter-episode waiting times can be minutes to months. Potential mechanisms of transition between sinus rhythm and AF have been identified but are not well understood, and it is difficult to simulate AF for long periods of time using state-of-the-art models. In this study, we implemented a Moe-type cellular automaton on a novel, topologically correct surface geometry of the left atrium. We used the model to simulate stochastic initiation and spontaneous termination of AF, arising from bursts of spontaneous activation near pulmonary veins. The simplified representation of atrial electrical activity reduced computational cost, and so permitted us to investigate AF mechanisms in a probabilistic setting. We computed large numbers (~10^5) of sample paths of the model, to infer stochastic initiation and termination rates of AF episodes using different model parameters. By generating statistical distributions of model outputs, we demonstrated how to propagate uncertainties of inputs within our microscopic level model up to a macroscopic level. Lastly, we investigated spontaneous termination in the model and found a complex dependence on its past AF trajectory, the mechanism of which merits future investigation.
Submitted 11 December, 2016;
originally announced December 2016.
-
Intrinsic noise in systems with switching environments
Authors:
Peter G. Hufton,
Yen Ting Lin,
Tobias Galla,
Alan J. McKane
Abstract:
We study individual-based dynamics in finite populations, subject to randomly switching environmental conditions. These are inspired by models in which genes transition between on and off states, regulating underlying protein dynamics. Similarly, switches between environmental states are relevant in bacterial populations and in models of epidemic spread. Existing piecewise-deterministic Markov process (PDMP) approaches focus on the deterministic limit of the population dynamics while retaining the randomness of the switching. Here we go beyond this approximation and explicitly include effects of intrinsic stochasticity at the level of the linear-noise approximation. Specifically, we derive the stationary distributions of a number of model systems, in good agreement with simulations. This improves on existing approaches, which are limited to the regimes of fast and slow switching.
Submitted 17 March, 2016; v1 submitted 2 December, 2015;
originally announced December 2015.
-
Assessing Measures of Atrial Fibrillation Clustering via Stochastic Models of Episode Recurrence and Disease Progression
Authors:
Julie Eatock,
Yen Ting Lin,
Eugene T. Y. Chang,
Tobias Galla,
Richard H. Clayton
Abstract:
Atrial fibrillation (AF) is a leading cause of morbidity and mortality. AF prevalence increases with age, which is attributed to pathophysiological changes that aid AF initiation and perpetuation. Current state-of-the-art models are only capable of simulating short periods of atrial activity at high spatial resolution, whilst the majority of clinical recordings are based on infrequent temporal datasets of limited spatial resolution. Being able to estimate disease progression informed by both modelling and clinical data would be of significant interest. In addition, an analysis of the temporal distribution of recorded fibrillation episodes (AF density) can provide insights into recurrence patterns. We present an initial analysis of the AF density measure using a simplified idealised stochastic model of a binary time series representing AF episodes. The future aim of this work is to develop robust clinical measures of progression which will be tested on models that generate long-term synthetic data. These measures would then be of clinical interest in deciding treatment strategies.
Submitted 2 October, 2015;
originally announced October 2015.
-
Gene expression dynamics with stochastic bursts: exact results for a coarse-grained model
Authors:
Yen Ting Lin,
Charles R. Doering
Abstract:
We present a theoretical framework to analyze the dynamics of gene expression with stochastic bursts. Beginning with an individual-based model which fully accounts for the messenger RNA (mRNA) and protein populations, we propose a novel expansion of the master equation for the joint process. The resulting coarse-grained model reduces the dimensionality of the system, describing only the protein population while fully accounting for the effects of discrete and fluctuating mRNA population. Closed form expressions for the stationary distribution of the protein population and mean first-passage times of the coarse-grained model are derived and large-scale Monte Carlo simulations show that the analysis accurately describes the individual-based process accounting for mRNA population, in contrast to the failure of commonly proposed diffusion-type models.
Submitted 12 August, 2015;
originally announced August 2015.
-
Bursting noise in gene expression dynamics: Linking microscopic and mesoscopic models
Authors:
Yen Ting Lin,
Tobias Galla
Abstract:
The dynamics of short-lived mRNA results in bursts of protein production in gene regulatory networks. We investigate the propagation of bursting noise between different levels of mathematical modelling, and demonstrate that conventional approaches based on diffusion approximations can fail to capture bursting noise. An alternative coarse-grained model, the so-called piecewise deterministic Markov process (PDMP), is seen to outperform the diffusion approximation in biologically relevant parameter regimes. We provide a systematic embedding of the PDMP model into the landscape of existing approaches, and we present analytical methods to calculate its stationary distribution and switching frequencies.
Submitted 3 August, 2015;
originally announced August 2015.
-
Modelling the progression of atrial fibrillation: A stochastic individual-based approach
Authors:
Eugene TY Chang,
Yen Ting Lin,
Tobias Galla,
Richard H Clayton,
Julie Eatock
Abstract:
We propose a stochastic individual-based model of the progression of atrial fibrillation (AF). The model operates at patient level over a lifetime and is based on elements of the physiology and biophysics of AF, making contact with existing mechanistic models. The outputs of the model are times when the patient is in normal rhythm and AF, and we carry out a population-level analysis of the statistics of disease progression. While the model is stylised at present and not directly predictive, future improvements are proposed to tighten the gap between existing mechanistic models of AF, and epidemiological data, with a view towards model-based personalised medicine.
Submitted 28 July, 2015; v1 submitted 27 July, 2015;
originally announced July 2015.
-
Formation and Dissolution of Bacterial Colonies
Authors:
Christoph Weber,
Yen Ting Lin,
Nicolas Biais,
Vasily Zaburdaev
Abstract:
Many organisms form colonies for a transient period of time to withstand environmental pressure. Bacterial biofilms are a prototypical example of such behavior. Despite significant interest across disciplines, physical mechanisms governing the formation and dissolution of bacterial colonies are still poorly understood. Starting from a kinetic description of motile and interacting cells we derive a hydrodynamic equation for their density on a surface. We use it to describe formation of multiple colonies with sizes consistent with experimental data and to discuss their dissolution.
Submitted 3 June, 2015;
originally announced June 2015.
-
The Blanco Cosmology Survey: Data Acquisition, Processing, Calibration, Quality Diagnostics and Data Release
Authors:
S. Desai,
R. Armstrong,
J. J. Mohr,
D. R. Semler,
J. Liu,
E. Bertin,
S. S. Alam,
W. A. Barkhouse,
G. Bazin,
E. J. Buckley-Geer,
M. C. Cooper,
S. M. Hansen,
F. W. High,
H. Lin,
Y. T. Lin,
C. -C. Ngeow,
A. Rest,
J. Song,
D. Tucker,
A. Zenteno
Abstract:
The Blanco Cosmology Survey (BCS) is a 60 night imaging survey of $\sim$80 deg$^2$ of the southern sky located in two fields: ($α$, $δ$) = (5 hr, $-55^{\circ}$) and (23 hr, $-55^{\circ}$). The survey was carried out between 2005 and 2008 in $griz$ bands with the Mosaic2 imager on the Blanco 4m telescope. The primary aim of the BCS survey is to provide the data required to optically confirm and measure photometric redshifts for Sunyaev-Zel'dovich effect selected galaxy clusters from the South Pole Telescope and the Atacama Cosmology Telescope. We process and calibrate the BCS data, carrying out PSF corrected model fitting photometry for all detected objects. The median 10$σ$ galaxy (point source) depths over the survey in $griz$ are approximately 23.3 (23.9), 23.4 (24.0), 23.0 (23.6) and 21.3 (22.1), respectively. The astrometric accuracy relative to the USNO-B survey is $\sim$45 milli-arcsec. We calibrate our absolute photometry using the stellar locus in $grizJ$ bands, and thus our absolute photometric scale derives from 2MASS which has $\sim$2% accuracy. The scatter of stars about the stellar locus indicates a systematics floor in the relative stellar photometric scatter in $griz$ that is $\sim$1.9%, $\sim$2.2%, $\sim$2.7% and $\sim$2.7%, respectively. A simple cut in the AstrOmatic star-galaxy classifier {\tt spread\_model} produces a star sample with good spatial uniformity. We use the resulting photometric catalogs to calibrate photometric redshifts for the survey and demonstrate scatter $δz/(1+z)=0.054$ with an outlier fraction $η<5$% to $z\sim1$. We highlight some selected science results to date and provide a full description of the released data products.
Submitted 6 August, 2012; v1 submitted 5 April, 2012;
originally announced April 2012.
-
Reversible metal-insulator transition in LaAlO3 thin films mediated by intragap defects: An alternative mechanism for resistive switching
Authors:
Z. Q. Liu,
D. P. Leusink,
W. M. Lü,
X. Wang,
X. P. Yang,
K. Gopinadhan,
Y. T. Lin,
A. Annadi,
Y. L. Zhao,
A. Roy Barman,
S. Dhar,
Y. P. Feng,
H. B. Su,
G. Xiong,
T. Venkatesan,
Ariando
Abstract:
We report on the electric-field-induced reversible metal-insulator transition (MIT) of the insulating LaAlO3 thin films observed in metal/LaAlO3/Nb-SrTiO3 heterostructures. The switching voltage depends strongly on the thickness of the LaAlO3 thin film which indicates that a minimum thickness is required for the MIT. A constant opposing voltage is required to deplete the charges from the defect states. Our experimental results exclude the possibility of diffusion of the metal electrodes or oxygen vacancies into the LaAlO3 layer. Instead, the phenomenon is attributed to the formation of a quasi-conduction band (QCB) in the defect states of LaAlO3 that forms a continuum state with the conduction band of the Nb-SrTiO3. Once this continuum (metallic) state is formed, the state remains stable even when the voltage bias is turned off. The thickness dependent reverse switch-on voltage and the constant forward switch-off voltage are consistent with our model. The viewpoint proposed here can provide an alternative mechanism for resistive switching in complex oxides.
Submitted 7 October, 2011;
originally announced October 2011.
-
Fluctuations and stability in front propagation
Authors:
E. Khain,
Y. T. Lin,
L. M. Sander
Abstract:
Propagating fronts arising from bistable reaction-diffusion equations are a purely deterministic effect. Stochastic reaction-diffusion processes also show front propagation which coincides with the deterministic effect in the limit of small fluctuations (usually, large populations). However, for larger fluctuations propagation can be affected. We give an example, based on the classic spruce-budworm model, where the direction of wave propagation, i.e., the relative stability of two phases, can be reversed by fluctuations.
Submitted 29 September, 2010;
originally announced September 2010.
-
The Harmonic Measure for critical Potts clusters
Authors:
David A. Adams,
Yen Ting Lin,
Leonard M. Sander,
Robert M. Ziff
Abstract:
We present a technique, which we call "etching," which we use to study the harmonic measure of Fortuin-Kasteleyn clusters in the Q-state Potts model for Q=1-4. The harmonic measure is the probability distribution of random walkers diffusing onto the perimeter of a cluster. We use etching to study regions of clusters which are extremely unlikely to be hit by random walkers, having hitting probabilities down to 10^(-4600). We find good agreement between the theoretical predictions of Duplantier and our numerical results for the generalized dimension D(q), including regions of small and negative q.
Submitted 28 August, 2009; v1 submitted 14 July, 2009;
originally announced July 2009.