We apply deep learning techniques to the late-time turbulent regime in a post-inflationary model where a real scalar inflaton field and the standard model Higgs doublet interact with renormalizable couplings between them. After inflation, the inflaton decays into the Higgs through a trilinear coupling and the Higgs field subsequently thermalizes with gauge bosons via its $SU(2)\times U(1)$ gauge interaction. Depending on the strength of the trilinear interaction and the Higgs self-coupling, the effective mass squared of Higgs can become negative, leading to the tachyonic production of Higgs particles. These produced Higgs particles would then share their energy with gauge bosons, potentially indicating thermalization. Since the model entails different non-perturbative effects, it is necessary to resort to numerical and semi-classical techniques. However, simulations require significant costs in terms of time and computational resources depending on the model used. Particularly, when $SU(2)$ gauge interactions are introduced, this becomes evident as the gauge field redistributes particle energies through rescattering processes, leading to an abundance of UV modes that disrupt simulation stability. This necessitates very small lattice spacings, resulting in exceedingly long simulation runtimes. Furthermore, the late-time behavior of preheating dynamics exhibits a universal form by wave kinetic theory. Therefore, we analyze patterns in the flow of particle numbers and predict future behavior using CNN-LSTM (Convolutional Neural Network combined with Long Short-Term Memory) time series analysis. In this way, we can reduce our dependence on simulations by orders of magnitude in terms of time and computational resources.

1 Introduction

Cosmic inflation has been demonstrated to play a pivotal role in establishing the initial conditions of the early universe within modern cosmology [1, 2, 3, 4]. These characteristics can be realized by a real scalar field, known as the inflaton. It allows for the exploration of various intriguing phenomenologies due to the interplay between the inflaton and other fields [5, 6, 7]. However, in any post-inflationary model, we must eventually explain a thermal universe where the constituents of thermal plasma possess distinctive properties such as thermal mass, necessitating a deeper comprehension of thermal field theory. During the initial stages of thermalization, defining thermal quantities becomes challenging, particularly with a low number of plasma constituents [8]. In the initially under-occupied systems, low-momentum particles form a bath, thermalize, and then dominate the system by absorbing the energy from the hard excitations. This process is dominated by number-changing interactions, which can interfere and therefore it is important to take into account modification of the interaction rate so-called the Landau–Pomeranchuk–Migdal (LPM) effect [9, 10, 11]. When the rate of plasma constituent production is sufficiently low, interaction dynamics can be treated perturbatively [12, 13], but when the rate is large, the system falls into the non-perturbative regime, where fields redistribute their energies by turbulent rescattering. This late-time phenomenon is known as free turbulence, where the dynamics are described by kinetic theories [14, 15]. Given the inherent challenges of demonstrating free turbulence and tracking thermalization using lattice simulations, it is tempting to infer patterns in the occupation number of fields, represented as self-similarity implied by the kinetic theories and analyze them using deep learning methods for its proficiency in pattern recognition.

Machine learning and deep learning have revolutionized various fields, including physics, by providing powerful tools to analyze complex systems and phenomena [16, 17, 18]. Machine learning involves algorithms that enable computers to learn from data and make predictions or decisions without being explicitly programmed, while deep learning utilizes neural networks with multiple layers to learn complex patterns and representations from data. Convolutional Neural Networks (CNNs) are particularly effective for processing spatial data, such as images, while Long Short-Term Memory (LSTM) networks specialize in capturing temporal dependencies in sequential data [19].

Therefore, combining Convolutional 1D (Conv1D) layers with LSTM networks offers a powerful solution for analyzing time series data in physics. Conv1D layers effectively capture local patterns in sequential data, while LSTM networks model long-term dependencies and temporal dynamics, enabling accurate analysis of time-dependent datasets across various domains.

In this study, we apply deep learning techniques to the following example model. We consider the post-inflationary dynamics of standard model Higgs particles abundantly produced by the inflaton, where the Higgs subsequently decays into gauge bosons through $SU(2)\times U(1)$ interactions, leading to the formation of a thermal sector. During the inflaton oscillation period, numerical simulations are often necessary since the production of scalar or gauge bosons after inflation involves non-perturbative dynamics (e.g. preheating)[20, 21, 22, 23, 24, 25, 26, 27, 28]. Since lattice simulations designed to analyze the preheating system have limitations in observing late-time dynamics up to reheating, we attempt to overcome these limitations by employing deep learning techniques.

Our paper is structured as follows: In Section 2, we examine the tachyonic production mechanism of Higgs via its trilinear interaction with the inflaton. In Appendix A, we investigate how post-inflationary $SU(2)\times U(1)$ gauge interactions are implemented on the lattice, and the simulation results will be presented in Section 3. We discuss the deep learning technique that we applied to our results in Section 4 and draw conclusions in Section 5. Appendix B provides an introduction to the hyperparameters used in deep learning.

2 Tachyonic production of Higgs

The minimal interactions between the inflaton $\phi$ and the Higgs $H$ are of the form [29]

\Delta V=\frac{1}{2}{m^{2}_{\phi}}\,\phi^{2}+\frac{1}{4}{\lambda_{\phi}}\,\phi% ^{4}+\frac{1}{2}{\lambda_{\phi h}}\,\phi^{2}\,H^{\dagger}H+{\sigma_{\phi h}}\,% \phi\,H^{\dagger}H-{m^{2}_{h}}\,H^{\dagger}H+{\lambda_{h}}\,(H^{\dagger}H)^{2}\;,

(1)

where we assume $\lambda_{\phi}=0,\,\lambda_{\phi h}=0,\,$ and $m_{h}=0$ for simplicity.¹¹1In the case of the quartic inflaton potential, to elevate the global minimum of the potential from negative values to zero, one should introduce an undesirable offset term [31]. In unitary gauge

H=\frac{1}{\sqrt{2}}\begin{pmatrix}0\\ h\end{pmatrix}\,,

(2)

the potential simplifies to

V=\frac{1}{2}{m^{2}_{\phi}}\,\phi^{2}+\frac{\sigma_{\phi h}}{2}\,\phi\,h^{2}+% \frac{\lambda_{h}}{4}\,h^{4}\,.

(3)

By rewriting Eq. (3) in the form

V=\frac{1}{2}\left({m_{\phi}}\,\phi+\frac{\sigma_{\phi h}}{2m_{\phi}}h^{2}% \right)^{2}+\frac{1}{4}\,\left(\lambda_{h}-\frac{\sigma^{2}_{\phi h}}{2m^{2}_{% \phi}}\right)h^{4}\,,

(4)

one finds that $\lambda_{h}$ must be larger than $\sigma^{2}_{\phi h}/2m^{2}_{\phi}$ to make the potential bounded from below [30]. The equation of motion of $h$ is

\Box h+V_{,h}=0

(5)

Quantizing $h$ with the eigenmodes $h_{k}(t)e^{i\mathbb{k}\vec{x}}$ and $\mathbb{k}$ the comoving momentum, one may derive linear mode equations in momentum space²²2The modes are rescaled to factor out the scale factor, $h_{k}\rightarrow a^{3/2}h_{k}$ .

\ddot{h}_{k}+\omega^{2}_{k}h_{k}=0,

(6)

where

\omega^{2}_{k}=\frac{k^{2}}{a^{2}}+\sigma_{\phi h}\phi(t)+3\lambda_{h}\langle h% ^{2}\rangle-3\dot{a}^{2}/4a^{2}-3\ddot{a}/2a\,.

(7)

Given that negative values of the inflaton background can make the entire frequency term negative, we expect a tachyonic growth of the Higgs mode during the inflaton oscillation period. While one can theoretically examine the mathematical structure of the mode equations and quantitatively determine the number of produced particles, practical implementation proves to be challenging for various reasons [31]. Each mode in the equation experiences movement across the instability band with varying Floquet quotient due to the expansion of space. Eq. (6) becomes linearized in terms of $k$ only when we disregard the rescattering of modes with different $k$ . This is not the case in the presence of the strong self-interaction and gauge interaction of Higgs, even at the onset of the inflaton oscillation period. To deal with the turbulent particle interactions, we opt to solve the equations of motion directly in spacetime through a semi-classical approach (see [32] for studies on gravitational wave production in a similar framework).

3 Post-inflationary weak interactions

Given the gauge interactions in the standard model, Higgs particles will interact with gauge bosons upon production. Previous studies have explored scenarios where considerable Higgs condensates generate gauge bosons via resonant channels [27, 28]. We examine a system in which field constituents, particularly focusing on the Higgs directly produced by the inflaton, interact with other fields in a turbulent manner.

Refer to caption — Figure 1: Energy densities of interacting fields during preheating. Left: energy density of fields normalized by $m_{\phi}^{4}$ . The $x$ -axis represents time rescaled by the inflaton mass, $m_{\phi}$ . Right: ratio of the energy density of each field to the total energy over the scale factor.

Using a publicly available code, CosmoLattice [33, 34], we conduct lattice simulations to model post-inflationary weak interactions of the Higgs (see A). We set $g_{2}=g_{1}=0.6$ and $\lambda_{h}=10^{-2}$ for the coupling constants at the inflationary physics scale to be not too different from those of the standard model. As in [32], the inflaton field starts to roll down to the minimum of potential with the bare mass $m_{\phi}=1.6\times 10^{13}$ GeV, $\phi_{init}=1\,m_{p}$ and $\dot{\phi}_{init}=-0.71\,m_{\phi}m_{p}$ , where the Planck mass $m_{p}\approx 2.435\times 10^{18}\text{ GeV}$ . Here, $init$ refers to the initial values input into the lattice simulations. While we do not specify an inflationary model, it is assumed to converge to a quadratic inflaton potential some time after the end of inflation.³³3Around the end of inflation where $\phi>0$ , the potential of the form Eq. (4) reaches its minimum for $h=0$ . Parameters may vary within certain limits to accommodate running coupling constants or experimental constraints.

The energy densities of fields are defined

$\displaystyle\rho_{\phi}$	$\displaystyle\equiv$	$\displaystyle K_{\phi}+G_{\phi}+V(\phi)\,$	(8)
$\displaystyle\rho_{H}$	$\displaystyle\equiv$	$\displaystyle K_{H}+G_{H}+V(H)\,$	(9)
$\displaystyle\rho_{B}$	$\displaystyle\equiv$	$\displaystyle K_{U(1)}+G_{U(1)}\,$	(10)
$\displaystyle\rho_{W}$	$\displaystyle\equiv$	$\displaystyle K_{SU(2)}+G_{SU(2)}\,,$	(11)

where $V(\phi)$ and $V(H)$ represent the potential of the inflaton and Higgs doublet, respectively. Fig. 1 shows the energy densities of fields over time for the case where $\sigma_{\phi h}=10^{9}\,\text{GeV}$ . The total energy density of the system is still dominated by the energy of the inflaton field, while the Higgs contributes only a small fraction of the total energy. Due to the strong self-interaction of the Higgs ( $\lambda_{h}=10^{-2}$ ), tachyonic production is highly suppressed but its interaction with the inflaton is sufficiently strong to withstand the dilution by the quadratic inflaton potential ( $\rho_{R}\propto a^{-4}$ for radiation while $\rho_{\phi}\propto a^{-3}$ for the massive inflaton field).

For a qualitative analysis, we illustrate the energy ratio of the Higgs and the $SU(2)$ gauge field to the $U(1)$ gauge field on the left of Fig. 2. We anticipate observing the equipartition of energy in the gauge sector as an indication of thermalization, in a ratio of 4:6:2 (or equivalently 2:3:1) corresponding to the degrees of freedom of the Higgs, $SU(2)$ gauge fields, and $U(1)$ gauge boson, respectively. We have captured the moment of energy equipartition shortly after the end of inflation ( $a\sim 60$ or $m_{\phi}t\sim 700$ ).

We excited initial particle modes for the comoving momentum in the range $5\leq k/m_{\phi}\leq 10$ and evolved the system in the extended range $5\leq k/m_{\phi}\lesssim 690$ . It should be noted that our focus is on the tail end of spectra in the UV, which could potentially underestimate Higgs production in the IR. Expanding the momentum window in the IR direction leads to increased energy exchange due to the increased scattering of particles, necessitating further expansion in the UV direction as well. This results in a significant increase in simulation runtime and it is one of the key motivations for us to leverage deep learning techniques. For the subsequent discussion, we define the occupation number for the Higgs doublet as⁴⁴4Zero initial occupation number for scalar fields in theory corresponds to $n_{k,init}=1/2$ in our definition. For the Higgs doublet, we take into account 4 Higgs d.o.f and 1/2 for the normalization factor, resulting in $n^{H}_{k,init}=1/2\times 4\times 1/2=1$ .

\displaystyle n^{H}_{k}=\sum_{i}n^{i}_{k}\equiv\sum_{i}\frac{\omega^{i}_{k}}{2% }\left(|{(h^{i}_{k})^{\prime}}|^{2}/({\omega^{i}_{k}})^{2}+|h^{i}_{k}|^{2}% \right)\,,

(12)

where $i$ represents the Higgs $d.o.f$ and the frequency is

\displaystyle\omega^{i}_{k}\equiv\sqrt{k^{2}+a^{2}\,|d^{2}V(H)/d{h^{i}}^{2}|}\,.

(13)

The prime denotes the conformal time derivative $a(d/dt)$ . For gauge bosons, since the occupation number is gauge-dependent, gauge fixing must be performed at each measurement [28, 35, 36, 37]. Due to its technical difficulty, we leave this aspect for future work.

The late-time classical reheating process of self-interacting fields or non-abelian fields exhibits a turbulent and self-similar evolution of distribution functions towards equilibrium [15, 38]. The shape of the spectra, along with the self-similar dynamics, can be comprehended within the framework of wave kinetic theory.⁵⁵5See [39, 40] for possible IR- and UV-cascades in the context of non-thermal fixed points in scalar field theories. This suggests the presence of a consistent trend in the evolution of the occupation number, offering a foundation for exploring the occupation number using deep learning methods.

4 CNN-LSTM time series analysis

The CNN-LSTM model integrates two fundamental components: the Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) network. Each component plays a distinct role in processing the input data to extract features and capture temporal dependencies.

The CNN component is responsible for extracting spatial features from the input data. It operates by applying convolutional filters over the input data to capture spatial patterns, such as edges, textures, and shapes. This process is facilitated by multiple convolutional layers followed by pooling layers, which help reduce spatial dimensions while retaining essential information.

On the other hand, the LSTM component specializes in modeling temporal dependencies in sequential data. It consists of stacked LSTM layers, each equipped with memory cells capable of retaining information over extended time periods. These memory cells are governed by gates that control the flow of information, enabling the model to selectively update, forget, or output information based on the input sequence.

In the parallel CNN-LSTM model, the input data are processed separately by the CNN and LSTM components. Once the CNN and LSTM components have processed their respective data streams, the output features are concatenated and fed into a linear layer (or layers) for further processing. This linear layer serves as the final step in the model, where the extracted features are transformed and mapped to the desired output space.

In our CNN-LSTM time series analysis, the input data have the shape (batch_size, seq_length, input_size), which corresponds to (the number of time points, the length of sequence, the number of $k$ modes) for the preheating model that we discuss. Both the input and output contain $n_{k}(t)$ of the Higgs, but the model output predicts values occurring after one sequence.⁶⁶6In a time series context, the term ‘time’ typically denotes the sequential order of data points rather than implying that the input variables must directly represent time intervals. We implement the model using PyTorch [41], a popular deep learning library in Python. The model architecture consists of a Conv1D layer followed by an LSTM layer. The Conv1D layer captures local patterns in the sequential data, while the LSTM layer captures long-term dependencies.

The model is trained using the training data, and the training process involves optimizing a loss function (mean absolute error) using the Adam optimizer. The training loop iterates over epochs, updating the model parameters to minimize the loss. Predictions are made for future time steps beyond the training data and compared with the actual test data. The test data are completely separated from the model training and forecasting process.

The free turbulence starts approximately around $m_{\phi}t\approx 400$ , so we divided the data into a training set for $400<m_{\phi}t<1000$ and a test set for $1000<m_{\phi}t<1500$ . We make predictions up to $m_{\phi}t\approx 2000$ .

The model’s predictions are visualized alongside the actual data as in Fig. 3. It represents the actual data used for training, the trained predictions, the test data not used for training, and the predictions for future time steps for two momentum modes among 15 modes ( $5\leq k/m_{\phi}\lesssim 400$ ). It is remarkable that our deep learning model managed to learn patterns and trends in the limited amount of data and make predictions similar to actual simulation results.

The predictions for each mode are collected and reshaped in Fig. 4. At the end of our predictions, around $m_{\phi}t\approx 2000$ , it appears that we have already reached a point where the system departs from the classical regime and quantum effects become significant, indicating proximity to thermalization. By further developing the deep learning model, we can implement a more realistic cosmological model that correctly incorporates the production of the Higgs, which has been underestimated in the IR regime.

Still, we may use the predicted values to verify the self-similar evolution. Rewriting Eq. (4) from [15] yields

\displaystyle\tilde{k}^{4}n(k=\tilde{k}\tau^{p},t)\tau^{q}=\tilde{k}^{4}n(k=% \tilde{k},t=t_{0})~{},

(14)

where $\tilde{k}\equiv k\tau^{-p}$ , $\tau\equiv t/t_{0}$ , and $t_{0}$ denotes a fixed point in time. $p$ and $q$ represent the fitting exponents, and $p$ determines how quickly the system approaches equilibrium. Since the right-hand side of Eq. (14) is stationary, it was used as data for fitting, resulting in a good fit when $p\approx 0.16$ and $q\approx 0.63$ , as illustrated in Fig. 4. For physical insights on the exponents, see [15, 38]. However, the aim of this work is not to precisely determine the exponents, but to apply deep learning techniques based on the consistent patterns in the distributions, referred to as self-similar evolution or non-thermal fixed points. The exponents may vary depending on the details of the self- or gauge interaction of the Higgs field as well as the fitting conditions. In fact, the Higgs field interacts not only through self- and gauge interactions but also with fermions, and intense fermion reactions are expected during the preheating period [42, 43]. Once the code is updated to include fermions, it will be interesting to directly observe how this affects the self-evolution or non-thermal fixed points in the Higgs distribution.

5 Conclusion

We consider a simple post-inflationary scenario where after inflation, the Standard Model Higgs decays into gauge bosons via weak interactions while forming a quasi-thermal state in the gauge sector. The non-perturbative nature of the system is mainly due to the $SU(2)$ gauge interaction ( $g=0.6$ ) and Higgs self-coupling ( $\lambda_{h}=10^{-2}$ ) along with the trilinear interaction ( $\sigma_{\phi h}\phi H^{\dagger}H$ where $\sigma_{\phi h}=10^{9}\,$ GeV) between Higgs and the inflaton. Analyzing the production and decay of Higgs in this model appears challenging without relying on numerical and semi-classical approaches. Demonstrating thermalization through simulation is also very challenging, primarily because simulations are costly in practical terms and quantum effects become significant in later stages. We introduced deep learning techniques as an effort to overcome the limitations of classical lattice simulations and ultimately reproduce the thermal universe. The later stage of preheating is referred to as the regime of free turbulence, where the flow of particle numbers can be analyzed through kinetic theory, making it a suitable material for applying deep learning techniques. LSTM networks are commonly used in time series analysis, but they have limitations in predicting beyond the trained range. And since the momentum modes of the fields in the free turbulence regime have localized influence rather than behaving independently, local patterns in the momentum space can play an important role. CNNs are useful for capturing localized patterns observed in images and similar data. Therefore, we utilized a deep learning model that combines LSTM with CNN, allowing a fully connected linear layer to learn the outputs of both networks simultaneously. We applied this model to the realistic post-inflationary preheating scenario and predicted the spectra of the Higgs field beyond simulation results. While still in its early stages, this study has demonstrated its potential for advancement.

Acknowledgements. J.Y. would like to thank Oleg Lebedev for useful discussions. This work was performed using HPC resources from the ‘‘Mésocentre’’ computing center of CentraleSupélec, École Normale Supérieure Paris-Saclay and Université Paris-Saclay supported by CNRS and Région Île-de-France (https://mesocentre.universite-paris-scalay.fr/). This project has received support from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 860881-HIDDeN, and the CNRS-IRP UCMN.

Appendix A $SU(2)\times U(1)$ on the lattice

Let us start with the $SU(2)\times U(1)$ gauge Lagrangian

\mathcal{L}_{\textrm{g}auge}=-\frac{1}{4}{(W^{a}_{\mu\nu})^{2}}\,-\frac{1}{4}{% B^{2}_{\mu\nu}}\,+(D_{\mu}H)^{\dagger}(D_{\mu}H)+{m^{2}_{h}}\,H^{\dagger}H-{% \lambda_{h}}\,(H^{\dagger}H)^{2}\;,

(15)

where $B_{\mu}$ is the $U(1)$ gauge boson, with $B_{\mu\nu}=\partial_{\mu}B_{\nu}-\partial_{\nu}B_{\mu}$ , and $W^{a}_{\mu}$ are the $SU(2)$ gauge bosons, with $W^{a}_{\mu\nu}$ their field strengths. The covariant derivative of Higgs doublet $H$ is

D_{\mu}H=\partial_{\mu}H-ig_{2}W^{a}_{\mu}\tau^{a}H-\frac{1}{2}ig_{1}B_{\mu}H,

(16)

where $g_{2}$ and $g_{1}$ are the $SU(2)$ and $U(1)$ couplings, respectively.

To present the equations of motion in an expanding universe, we extract useful expressions from [33]. First, we assume a flat Friedmann-Lemaître-Robertson-Walker (FLRW) metric, characterized by the line element:

\text{d}s^{2}=g_{\mu\nu}\text{d}x^{\mu}\text{d}x^{\nu}=-a(\eta)^{2\alpha}\text% {d}\eta^{2}+a(\eta)^{2}\delta_{ij}\text{d}x^{i}\text{d}x^{j}\ ,

(17)

Here, $a(\eta)$ represents the scale factor, $\delta_{ij}$ denotes the Euclidean metric, and $\alpha$ is a constant parameter chosen conveniently depending on the scenario.

The EOMs in an expanding universe are given by

$\displaystyle\phi^{\prime\prime}-a^{-2(1-\alpha)}{\vv\nabla}^{\,2}\phi+(3-% \alpha)\frac{{a^{\prime}}}{a}{\phi^{\prime}}$	$\displaystyle=$	$\displaystyle-a^{2\alpha}V_{,\phi}\ ,$	(18)
$\displaystyle H^{\prime\prime}-a^{-2(1-\alpha)}{\vv D}^{\,2}H+(3-\alpha)\frac{% {a^{\prime}}}{a}{H^{\prime}}$	$\displaystyle=$	$\displaystyle-\frac{a^{2\alpha}V_{,\|H\|}}{2}\frac{H}{\|H\|}\ ,$	(19)
$\displaystyle\partial_{0}F_{0i}-a^{-2(1-\alpha)}\partial_{j}F_{ji}+(1-\alpha)% \frac{{a^{\prime}}}{a}F_{0i}$	$\displaystyle=$	$\displaystyle a^{2\alpha}J^{B}_{i}\ ,$	(20)
$\displaystyle(\mathcal{D}_{0})_{ab}(G_{0i})^{b}-a^{-2(1-\alpha)}(\mathcal{D}_{% j})_{ab}(G_{ji})^{b}+(1-\alpha)\frac{{a^{\prime}}}{a}(G_{0i})^{b}$	$\displaystyle=$	$\displaystyle a^{2\alpha}(J_{i})_{a}\ ,$	(21)
$\displaystyle\partial_{i}F_{0i}$	$\displaystyle=$	$\displaystyle a^{2}J^{B}_{0}\ ,$	(22)
$\displaystyle(\mathcal{D}_{i})_{ab}(G_{0i})^{b}$	$\displaystyle=$	$\displaystyle a^{2}(J_{0})_{a}\,,$	(23)

where the prime ^′ denotes $\alpha$ -time derivative $d/d\eta$ and $(\mathcal{D}_{\nu}O)_{a}=(\mathcal{D}_{\nu})_{ab}O_{b}\equiv(\delta_{ab}% \partial_{\nu}-f_{abc}W_{\nu}^{c})O_{b}$ . The field strength tensors are defined as

	$\displaystyle F_{\mu\nu}$	$\displaystyle\equiv$	$\displaystyle\partial_{\mu}B_{\nu}-\partial_{\nu}B_{\mu}\ ,$		(24)
	$\displaystyle G_{\mu\nu}$	$\displaystyle\equiv$	$\displaystyle\partial_{\mu}W_{\nu}-\partial_{\nu}W_{\mu}-ig_{2}[W_{\mu},W_{\nu% }]\,,$		(25)

and the currents are

	$\displaystyle\hskip 51.21504pt(J^{B})^{\mu}$	$\displaystyle\equiv$	$\displaystyle g_{1}\mathcal{I}m[H^{\dagger}(D^{\mu}H)]\,,$		(26)
	$\displaystyle\hskip 51.21504ptJ_{a}^{\mu}$	$\displaystyle\equiv$	$\displaystyle 2g_{2}\mathcal{I}m[H^{{\dagger}}T_{a}(D^{\mu}H)]\,.$		(27)

Eqs. (22) and (23) are the $U(1)$ and $SU(2)$ Gauss constraints in an expanding background that can be used to check the consistency of numerical simulations. The evolution of the scale factor obeys the Friedmann equations

$\displaystyle\left({a^{\prime}\over a}\right)^{2}$	$\displaystyle=$	$\displaystyle\frac{a^{2\alpha}}{3m_{p}^{2}}\left\langle{K}_{\phi}+{K}_{H}+{G}_% {\phi}+{G}_{H}+{K}_{U(1)}+{G}_{U(1)}+{K}_{SU(2)}+{G}_{SU(2)}+{V}\right\rangle\,,$	(28)
$\displaystyle{a^{\prime\prime}\over a}$	$\displaystyle=$	$\displaystyle\frac{a^{2\alpha}}{3m_{p}^{2}}\left\langle(\alpha-2)({K}_{\phi}+{% K}_{H})+\alpha({G}_{\phi}+{G}_{H})+(\alpha+1)V\right.$
		$\displaystyle\hskip 56.9055pt\left.+~{}(\alpha-1)({K}_{U(1)}+{G}_{U(1)}+{K}_{% SU(2)}+{G}_{SU(2)})\right\rangle\,,$

where $\langle...\rangle$ represents values averaged over space. One of the two Friedmann equations is redundant and used to check consistency as Gauss constraints. The kinetic ( $K$ ) and gradient ( $G$ ) energy densities are

\displaystyle\begin{array}[]{lcl}{K}_{\phi}&=&\frac{1}{2a^{2\alpha}}{\phi^{% \prime}}^{2}\vspace{0.1cm}\vspace{0.1cm}\\ {K}_{H}&=&\frac{1}{a^{2\alpha}}(D_{0}H)^{\dagger}(D_{0}H)\vspace{0.1cm}\\ {K}_{U(1)}&=&\frac{1}{2a^{2+2\alpha}}\sum_{i}F_{0i}^{2}\vspace{0.1cm}\\ {K}_{SU(2)}&=&\frac{1}{2a^{2+2\alpha}}\sum_{a,i}(G_{0i}^{a})^{2}\vspace{0.1cm}% \\ \end{array}\hskip 2.84544pt;\hskip 11.38092pt\begin{array}[]{lcl}{G}_{\phi}&=&% \frac{1}{2a^{2}}\sum_{i}(\partial_{i}\phi)^{2}\vspace{0.1cm}\\ {G}_{H}&=&\frac{1}{a^{2}}\sum_{i}(D_{i}H)^{\dagger}(D_{i}H)\vspace{0.1cm}\\ {G}_{U(1)}&=&\frac{1}{2a^{4}}\sum_{i,j<i}F_{ij}^{2}\vspace{0.1cm}\\ {G}_{SU(2)}&=&\frac{1}{2a^{4}}\sum_{a,i,j<i}(G_{ij}^{a})^{2}\,.\vspace*{0.2cm}% \\ \end{array}

(38)

One can solve the above equations in the temporal gauge with different equation-solving algorithms. We implement our model on the lattice using a publicly available code called CosmoLattice [33, 34].

Program variables used for our physics model are rescaled as follows:

	$\displaystyle\alpha$	$\displaystyle=$	$\displaystyle 0~{}~{},~{}~{}f_{}=1.6\cdot 10^{13}\,\text{GeV}~{}~{},~{}~{}% \omega_{}=1.6\cdot 10^{13}\,\text{GeV}~{}~{},~{}~{}d\tau=a^{-\alpha}\omega_{*% }dt~{}~{},$		(39)
	$\displaystyle\tilde{f}$	$\displaystyle\equiv$	$\displaystyle f/f_{}~{}~{},~{}~{}\tilde{V}\equiv V/(f^{2}_{}\omega^{2}_{*})~% {}~{},$		(40)

where variables with tildes represent dimensionless program variables. Important input parameters used for our model are set as follows

evolver=VV2 , N=160 , dt=0.005 , kIR = 5 , kCutOff = 10.0 .

(41)

We evolve the field equations using the second kind of Velocity-Verlet algorithm on 160 lattice points per dimension. dt represents the time step in program unit.

Appendix B Hyperparameters

Hyperparameters are parameters whose values are set before the learning process begins. They control aspects of the learning process itself and are not learned from the data. These parameters are external to the model and must be set based on heuristics, prior experience, or trial and error. Common hyperparameters include the learning rate, batch size, number of hidden layers, number of neurons in each layer, dropout rate, regularization strength, and optimizer choice. Proper selection and tuning of hyperparameters are crucial for achieving good performance and generalization of machine learning models. If hyperparameters are not tuned, the results obtained through the model can either underfit or overfit the data. Therefore, it is important to compute and monitor the train loss and validation loss. Typically, the train loss decreases over epochs, so by solely examining the validation loss, we can discern whether the model is underfitting or overfitting. If the validation loss continues to decrease, it indicates underfitting, suggesting room for improvement through further training. However, if the validation loss does not decrease, it implies overfitting, where the model merely memorizes the data without understanding beyond it. Thus, it is advisable to perform early stopping at this intermediate stage, as illustrated in Fig 5.

References

[1] A. A. Starobinsky, Phys. Lett. B 91 (1980) 99-102.
[2] A. H. Guth, Phys. Rev. D 23 (1981) 347-356.
[3] A. D. Linde, Phys. Lett. B 108 (1982) 389-393; Phys. Lett. B 129 (1983), 177-181.
[4] V. F. Mukhanov and G. V. Chibisov, JETP Lett. 33, 532-535 (1981).
[5] O. Lebedev and J. H. Yoon, Phys. Lett. B 821, 136614 (2021).
[6] O. Lebedev, T. Nerdi, T. Solomko and J. H. Yoon, Phys. Rev. D 106, no.4, 043537 (2022).
[7] O. Lebedev, T. Solomko and J. H. Yoon, JCAP 02, 035 (2023).
[8] A. Kurkela and G. D. Moore, JHEP 12, 044 (2011) doi:10.1007/JHEP12(2011)044 [arXiv:1107.5050 [hep-ph]].
[9] L. D. Landau and I. Pomeranchuk, Dokl. Akad. Nauk Ser. Fiz. 92, 535-536 (1953)
[10] A. B. Migdal, Phys. Rev. 103, 1811-1820 (1956) doi:10.1103/PhysRev.103.1811
[11] M. Gyulassy and X. n. Wang, Nucl. Phys. B 420, 583-614 (1994) doi:10.1016/0550-3213(94)90079-5 [arXiv:nucl-th/9306003 [nucl-th]].
[12] K. Harigaya and K. Mukaida, JHEP 05, 006 (2014) doi:10.1007/JHEP05(2014)006 [arXiv:1312.3097 [hep-ph]].
[13] K. Mukaida and M. Yamada, [arXiv:2402.14054 [hep-ph]].
[14] R. Micha and I. I. Tkachev, Phys. Rev. Lett. 90, 121301 (2003) doi:10.1103/PhysRevLett.90.121301 [arXiv:hep-ph/0210202 [hep-ph]].
[15] R. Micha and I. I. Tkachev, Phys. Rev. D 70, 043538 (2004) doi:10.1103/PhysRevD.70.043538 [arXiv:hep-ph/0403101 [hep-ph]].
[16] Y. LeCun, Y. Bengio and G. Hinton, Nature 521, 436-444 (2015) doi:10.1038/nature14539
[17] J. Schmidhuber, Neural Networks 61, 85-117 (2015) doi:10.1016/j.neunet.2014.09.003
[18] T. Plehn, A. Butter, B. Dillon, T. Heimel, C. Krause and R. Winterhalder, [arXiv:2211.01421 [hep-ph]].
[19] S. Hochreiter and J. Schmidhuber, Neural Comput. 9, no.8, 1735-1780 (1997) doi:10.1162/neco.1997.9.8.1735
[20] L. Kofman, A. D. Linde and A. A. Starobinsky, Phys. Rev. Lett. 73, 3195-3198 (1994) doi:10.1103/PhysRevLett.73.3195 [arXiv:hep-th/9405187 [hep-th]].
[21] L. Kofman, A. D. Linde and A. A. Starobinsky, Phys. Rev. D 56, 3258-3295 (1997) doi:10.1103/PhysRevD.56.3258 [arXiv:hep-ph/9704452 [hep-ph]].
[22] P. B. Greene, L. Kofman, A. D. Linde and A. A. Starobinsky, Phys. Rev. D 56, 6175-6192 (1997).
[23] A. Rajantie, P. M. Saffin and E. J. Copeland, Phys. Rev. D 63, 123512 (2001) doi:10.1103/PhysRevD.63.123512 [arXiv:hep-ph/0012097 [hep-ph]].
[24] J. I. Skullerud, J. Smit and A. Tranberg, JHEP 08, 045 (2003) doi:10.1088/1126-6708/2003/08/045 [arXiv:hep-ph/0307094 [hep-ph]].
[25] R. Micha and I. I. Tkachev, doi:10.1142/9789812704498_0020 [arXiv:hep-ph/0301249 [hep-ph]].
[26] J. Berges, S. Schlichting and D. Sexty, Phys. Rev. D 86, 074006 (2012) doi:10.1103/PhysRevD.86.074006 [arXiv:1203.4646 [hep-ph]].
[27] D. G. Figueroa, J. Garcia-Bellido and F. Torrenti, Phys. Rev. D 92, no.8, 083511 (2015) doi:10.1103/PhysRevD.92.083511 [arXiv:1504.04600 [astro-ph.CO]].
[28] K. Enqvist, S. Nurmi, S. Rusak and D. Weir, JCAP 02, 057 (2016) doi:10.1088/1475-7516/2016/02/057 [arXiv:1506.06895 [astro-ph.CO]].
[29] O. Lebedev, Prog. Part. Nucl. Phys. 120, 103881 (2021).
[30] J. F. Dufaux, G. N. Felder, L. Kofman, M. Peloso and D. Podolsky, JCAP 07, 006 (2006) doi:10.1088/1475-7516/2006/07/006 [arXiv:hep-ph/0602144 [hep-ph]].
[31] A. A. Abolhasani, H. Firouzjahi and M. M. Sheikh-Jabbari, Phys. Rev. D 81, 043524 (2010) doi:10.1103/PhysRevD.81.043524 [arXiv:0912.1021 [hep-th]].
[32] C. Cosme, D. G. Figueroa and N. Loayza, JCAP 05, 023 (2023) doi:10.1088/1475-7516/2023/05/023 [arXiv:2206.14721 [astro-ph.CO]].
[33] D. G. Figueroa, A. Florio, F. Torrenti and W. Valkenburg, JCAP 04, 035 (2021).
[34] D. G. Figueroa, A. Florio, F. Torrenti and W. Valkenburg, [arXiv:2102.01031 [astro-ph.CO]].
[35] L. Giusti, M. L. Paciello, C. Parrinello, S. Petrarca and B. Taglienti, Int. J. Mod. Phys. A 16, 3487-3534 (2001) doi:10.1142/S0217751X01004281 [arXiv:hep-lat/0104012 [hep-lat]].
[36] G. S. Bali, V. Bornyakov, M. Muller-Preussker and F. Pahl, Nucl. Phys. B Proc. Suppl. 42, 852-854 (1995) doi:10.1016/0920-5632(95)00401-T [arXiv:hep-lat/9412027 [hep-lat]].
[37] M. Schröck and H. Vogt, Comput. Phys. Commun. 184, 1907-1919 (2013) doi:10.1016/j.cpc.2013.03.021 [arXiv:1212.5221 [hep-lat]].
[38] J. Berges, K. Boguslavski, S. Schlichting and R. Venugopalan, Phys. Rev. D 89, no.11, 114007 (2014) doi:10.1103/PhysRevD.89.114007 [arXiv:1311.3005 [hep-ph]].
[39] J. Berges, A. Rothkopf and J. Schmidt, Phys. Rev. Lett. 101, 041603 (2008) doi:10.1103/PhysRevLett.101.041603 [arXiv:0803.0131 [hep-ph]].
[40] J. Berges and D. Sexty, Phys. Rev. Lett. 108, 161601 (2012) doi:10.1103/PhysRevLett.108.161601 [arXiv:1201.0687 [hep-ph]].
[41] Adam Paszke et al., CoRR, vol. abs/1912.01703, 2019. [Online]. Available: http://arxiv.org/abs/1912.01703
[42] J. Garcia-Bellido, D. G. Figueroa and J. Rubio, Phys. Rev. D 79, 063531 (2009) doi:10.1103/PhysRevD.79.063531 [arXiv:0812.4624 [hep-ph]].
[43] J. Berges, D. Gelfand and D. Sexty, Phys. Rev. D 89, no.2, 025001 (2014) doi:10.1103/PhysRevD.89.025001 [arXiv:1308.2180 [hep-ph]].